Kevin Sookocheff

Paper Review: Dynamo: Amazon’s Highly Available Key-value Store

Title and Author of Paper Dynamo: Amazon’s Highly Available Key-value Store, DeCandia et al. Summary Dynamo, as the title of the paper suggests, is Amazon’s highly available key-value storage system. Dynamo only supports primary-key access to data, which is useful for services such as shopping carts and session management. Dynamo’s use case for these services is providing a highly-available system that always accepts writes. This requirement forces the complexity of conflict resolution to data readers. Writes are never rejected. ...

0 to Message in 60 Seconds: Getting Started with Amazon SQS

Amazon Simple Queue Service (SQS) is a message queue service that allows applications to reliably queue messages from one system component to be consumed by another component. Adding a queue between application components allows them to run independently, effectively decoupling applications by providing a buffer between producers of data and their consumers. Up and Running This section provides a guide to getting up and running with SQS using a fictional music store as an example. This music store has the ability to notify clients of any upcoming events via SQS. ...

Paper Review: Cap Twelve Years Later: How the “Rules” Have Changed

Title and Author of Paper Cap Twelve Years Later: How the “Rules” Have Changed. Eric Brewer. Summary This article provides an exploration of the CAP Theorem and how it relates to database system design. The author argues that, since partitions are likely to happen, the system designer can introduce methods for safely recovering from partitions to compensate. This strategy allows the database to continue to provide availability during a partition, and enforce consistency once the partition is resolved. ...

Integrating Applications: From RPC to Messaging

When integrating two independent services you have two options: making a remote procedure call (RPC), or sending a message. Which should you choose? What is a Remote Procedure Call? An organization often needs to share data and processes between multiple independent processes in a responsive way. For example, updating a user’s name in a shipping system may trigger updates in a billing system. The shipping system could update the billing system’s data directly by modifying a shared database, but this approach requires the shipping system to know far too much about the internal processes of the billing system. Instead, the business system can encapsulate the update operation behind an interface in the form of a remote procedure call, or RPC. ...

Integrating Applications

A big portion of any software engineering revolves around integrating multiple, disparate applications into a cohesive and functional whole. These apps may be built in house or third-party, or they may run on your network or distributed geographically, or they may be microservices designed to integrate. In any of these cases, you have several different options for integration, each with pros and cons. The integration patterns listed here are ordered by least to most sophisticated, but also by least to most complex. ...

Tests are Never Enough

The request was fairly simple — iterate through some data and update some dates. Something as simple as this. for entry in db.query(): entry.date = entry.date + timedelta(days=10) entry.put() Unfortunately, the NoSQL database we were using did not support schemas and, unbeknownst to me, some data had been changed in unexpected ways on production. Something as simple as this. Notice the Outlier? My code seemed correct. Unit tests passed. Code review passed. It ran flawlessly on the test environment. QA signed off on the change. So we deployed it. And things broke. ...

The Problem with Point to Point Communication

Point-to-point communication between servers usually works just fine when one or two instances are communicating. One or two instances communicating However, if you increase the number of applications, the total number of connections increases. Three instances communicating In fact, the total number of connections increases as a square of the number of application instances. So, if you are running 100 instances, you will be maintaining O(100^2) connections. The scaling becomes quite painful. ...

A Guided Tour Through Thrift

After reading the Thrift whitepaper and sending your first message, you may still have some questions about how Thrift actually works. This article helps answer those questions by providing a guided tour through the Apache Thrift architecture, highlighting the protocols, transports, and compiler, and how they interact with each other. Thrift from 10,000 feet At a high-level, Thrift is organized into several layers as in Figure 1. The layers highlighted in yellow represent application code that is written by a user. The portions in red represent code generated by the Thrift compiler from an interface definition defined in an IDL file. The layers in orange are portions of Thrift available as library code imported into your application as a dependency. Lastly, the device layer in blue represents the physical device transmitting messages. ...

Stability Anti-Patterns

I recently finished reading the excellent book “Release It!” by Michael Nygard. One of the key points that I wanted to remember was the stability anti-patterns. So, this post will serve as a reminder of architectural smells to look out for when designing production systems. This list of anti-patterns are common forces that will create or accelerate failures in production systems. Given the nature of distributed systems, avoiding these patterns is not possible. You must accept that they will happen, and program your application to be resilient to these failures. ...

Metrics-Driven Development

Metrics-Driven Development is an emerging term developing from the practices of continuous integration, continuous delivery, dev ops, and agile software methodologies. This article serves to define what metrics-driven development is, why it is useful, and how to use it to drive software changes. Let’s start with a definition of metrics-driven development. Metrics-Driven Development (MDD) The use of real-time metrics to drive rapid, precise, and granular software iterations. This definition is simple and straightforward, but does leave room for interpretation. Let’s dive deeper and break the definition down, bit-by-bit. ...