Evolving Messaging For Microservices: A Retrospective from Building Workiva’s Messaging Platform

Workiva’s original product, which supported the mundane task of filing documents with the SEC, was so innovative that within its first five years it was being used by more than 65 percent of the Fortune 500 and generating more than $100 million in annual revenue. During that explosion in growth, the software development team focused solely on supporting and expanding the existing software stack. However, after several years of growth and expansion, maintaining and extending that single code base became unsustainable. ...

June 28, 2017 · 21 min · Kevin Sookocheff

How to create a functional VPC using CloudFormation

This tutorial walks through how to create a fully functional Virtual Private Cloud in AWS using CloudFormation. At the end of the tutorial, you will have a reproducible way to create a virtual cloud with three subnets, a security group, and an internet gateway with SSH access for your IP address. I’ve found this template useful for creating an isolated environment to develop and test software. Full code for this tutorial is available on GitHub. ...

June 7, 2017 · 5 min · Kevin Sookocheff

Compiling private Java code from Leiningen

For a recent project, I wanted to verify the correctness of a distributed queue implementation based on Amazon SQS. For this, I turned to the Jepsen library for verifying distributed systems. Jepsen is written in Clojure, and the first task was to get Jepsen to compile with a Java library hosted on our internal Maven repository. I googled for a while, asked around, and assembled instructions from a few different places. Here, then, is a single blog post summarizing the solution for future use. ...

May 31, 2017 · 2 min · Kevin Sookocheff

Uploading Large Payloads through API Gateway

API Gateway supports a reasonable payload size limit of 10 MB. One way to work within this limit, but still offer a means of importing large datasets to your backend, is to allow uploads through S3. This article shows how to use AWS Lambda to expose an S3 signed URL in response to an API Gateway request. Effectively, this gives users a way to securely upload data directly to S3, triggered by the API Gateway. ...

May 10, 2017 · 6 min · Kevin Sookocheff

Getting Started With TLA+

This post shows how to write your first simple TLA+ specification. What is a specification? In software, the behaviour of a system is described as a sequence of states. Mathematically, each state is expressed as a function F(t), which represents the state of a system at time t. To completely specify a system, we write out each state to fully define the system’s behaviour. A simple clock This example comes from Chapter 2 of the book Specifying Systems by the creator of TLA+, Leslie Lamport. If you are unfamiliar with the math used here, refer to the Basic Math for TLA+ article. ...

April 20, 2017 · 6 min · Kevin Sookocheff

Basic Math for TLA+

At its most basic, TLA+ is a written description of what a system is supposed to do. More specifically, TLA+ is a specification language for formally defining the behavioural properties of a system. TLA+ is based on temporal logic, which is built on top of first-order logic and set theory, and provides some conveniences for working with large specifications for complex systems. You can think of a TLA+ specification as mostly ordinary math and logic, glued together with temporal logic for parts requiring it. ...

April 19, 2017 · 4 min · Kevin Sookocheff

First Musings on TLA+

I’ve been helping define some concurrent algorithms and I’m struggling with a number of issues: concurrent algorithms are difficult to design, they are often difficult to implement, and, even after they are designed, it is difficult to guarantee that they are correct. This led me into some research on formal specifications for algorithms, and how they can help. Finally, I settled on TLA+ as a viable tool for just such a problem. ...

April 18, 2017 · 2 min · Kevin Sookocheff

Paper Review: WebTables: Exploring the Power of Tables on the Web

Title and Author of Paper WebTables: Exploring the Power of Tables on the Web. M.J. Cafarella et al. Summary WebTables is a project to extract and process HTML tables from Google’s search index. It attempts to answer two questions: what are some effective techniques for searching structured data at search engine scale, and what can be derived from analyzing a large corpus of HTML tables? Web documents often contain structured and relational data embedded in HTML tables. The WebTables project extracted 14.1 billion English-language HTML tables and further filtered those down to 154 million tables that contain structured data. From this data, we have the potential to determine semantic information embedded in the web, create visualizations, and integrate web documents into new applications. ...

March 29, 2017 · 4 min · Kevin Sookocheff

Paper Review: Combining Systems and Databases: A Search Engine Retrospective

Title and Author of Paper Combining Systems and Databases: A Search Engine Retrospective. Eric A. Brewer. Summary Search engines manage data and respond to queries, which gives them some similarities to databases. However, a search engine is really an application-specific system built to handle large datasets. This system can leverage databases, or not, depending on the system goals. This paper describes a search engine design that leverages the ideas and vocabulary of the database community. ...

March 27, 2017 · 6 min · Kevin Sookocheff

Publish-Subscribe Messaging Using Amazon SQS

Amazon’s Simple Queue Service (SQS) provides durable messaging guarantees and is an excellent backbone for messaging services. However, SQS does not support “fan-out” of messages so that multiple consuming services can each receive a copy of a message. This means that true publish-subscribe messaging requires some additional work. This post describes some architectural choices that provide durable publish-subscribe messaging using SQS by tracking messaging subscribers using a database, and matching published messages to interested subscribers. ...

March 24, 2017 · 5 min · Kevin Sookocheff