Getting Started With TLA+

This post shows how to write your first simple TLA+ specification. What is a specification? In software, the behaviour of a system is described as a sequence of states. Mathematically, each state is expressed as a function F(t), which represents the state of a system at time t. To completely specify a system, we write out each state to fully define the systems behaviour. A simple clock This example comes from Chapter 2 of the book Specifying Systems by the creator of TLA+, Leslie Lamport. [Read More]

Basic Math for TLA+

At its most basic, TLA+ is a written description of what a system is supposed to do. More specifically, TLA+ is a specification language for formally defining the behavioural properties of a system. TLA+ is based on temporal logic, which is built on top of first-order logic and set theory, and provides some conveniences for working with large specifications for complex systems. You can think of a TLA+ specification as mostly ordinary math and logic, glued together with temporal logic for parts requiring it. [Read More]

First Musings on TLA+

I’ve been helping define some concurrent algorithms and I’m struggling with a number of issues: concurrent algorithms are difficult to design, they are often difficult to implement, and, even after they are designed, are difficult to guarantee that they are correct. This lead me into some research on formal specifications for algorithms, and how they can help. Finally, I settled on TLA+ as a viable tool for just such a problem. [Read More]

Paper Review: WebTables: Exploring the Power of Tables on the Web

Title and Author of Paper WebTables: Exploring the Power of Tables on the Web. M.J. Cafarella et al. Summary WebTables is a project to extract and process HTML tables from Google’s serach index. It attempts to answer two questions: what are some effective techniques for searching structured data at search engine scale, and what can be derived from analyzing a large corpus of HTML tables? Web documents often contain structured and relational data embedded in HTML tables. [Read More]

Paper Review: Combining Systems and Databases: A Search Engine Retrospective

Title and Author of Paper Combining Systems and Databases: A Search Engine Retrospective. Eric A. Brewer. Summary Search engines manage data and respond to queries, which provides some similarities to databases. However, search engines are really an application-specific system built to handle large datasets. This system can leverage databases, or not, depending on the system goals. This paper describes a search engine design that leverages the ideas and vocabulary of the database community. [Read More]

Publish-Subscribe Messaging Using Amazon SQS

Amazon’s Simple Queue Service (SQS) provides durable messaging guarantees and is an excellent backbone for messaging services. However, SQS does not support “fan-out” of messages so that multiple consuming services can each receive a copy of a message. This means that true publish-subscribe messaging requires some additional work. This post describes some architectural choices that provide durable publish-subscribe messaging using SQS by tracking messaging subscribers using a database, and matching published messages to interested subscribers. [Read More]

Paper Review: The Anatomy of a Large-Scale Hypertextual Web Search Engine

Title and Author of Paper The Anatomy of a Large-Scale Hypertextual Web Search Engine. Sergey Brin and Lawrence Page. Summary This paper describes the underpinnings of the Google search engine. The paper presents the initial Google prototype and describes the challenges in scaling search engine technology to handle large datasets. At the time of writing, the main goal of Google is to improve the quality of web searches by taking advantage of the existing link data embedded in web pages to calculate the quality of a page. [Read More]

Paper Review: Consistency Analysis in Bloom: a CALM and Collected Approach

Title and Author of Paper Consistency Analysis in Bloom: a CALM and Collected Approach. Alvaro et al. Summary Distributed programming is difficult for even experienced developers to get correct. Understanding the tradeoff between consistency, availability, and latency, while guaranteeing data correctness, provides a wealth of problems for the application developer. This paper presents a language and method for programmatically verifying distributed consistency. CALM - Consistency and Logical Monotonicity There is a connection between distributed consistency algorithms and logical monotonicity, that is, our programs must be correct even in the face of the delay and re-ordering of messages and data across different nodes in a system. [Read More]

Comparing Kinesis and SQS

Amazon Kinesis has a feature set that makes it tempting to use for a variety of applications. However, it is really designed for a particular set of use cases, and those use cases must be carefully considered before adopting Kinesis. In this article, we compare Kinesis with Amazon’s Simple Queue Service (SQS), showing the benefits and drawbacks of each system, and comparing their ideal uses. Kinesis - Streaming Data Kinesis’ primary use case is collecting, storing and processing real-time continuous data streams. [Read More]

Software Architecture as Business Analysis

Architecture is the bridge between (often abstract) business goals and the final (concrete) resulting system. Software Architecture in Practice A software architect should act as a bridge between business stakeholders and technical stakeholders. To be this bridge requires understanding the business problem being solved, and being able to distill that problem into a technical solution that a software team can implement. In essence, the architect acts as a technical business analyst that helps to define the needs of an organization and recommend solutions that deliver value to stakeholders. [Read More]