Kevin Sookocheff

Deploying a static website with rsync

This is one of those things that we all kinda know but that it’s good to write down – how to deploy static files using rsync. For purposes of illustration, this article will describe how to deploy a statically generated website. ...

Thoughts On Google Cloud Platform Next

I was fortunate enough to attend Google Cloud Platform Next last week and wanted to summarize a few of my thoughts on the conference. As I sat down to analyze the event, I found a few distinct themes that I would like to expand on. Multi-Cloud: Google is serious about multi-cloud support for their monitoring and integration products. As a cynic, I would point out that if Google wants to steal customers from Amazon, offering tools to aid the transition is in their best interest. As a realist, I expect most companies to continue operating in a multi-cloud environment to take advantage of the strengths of each platform, and it’s nice to see Google recognizing this fact and working within it. Either way, I think being open about supporting AWS is a good thing. ...

Paper Review: Access Path Selection in a Relational Database Management System

Title and Author of Paper: Access Path Selection in a Relational Database Management System. P. G. Selinger et al. Summary: This paper describes the methods the SQL query optimizer uses to determine the cost of satisfying a query. It also describes how the optimizer chooses among several competing access paths. What are the motivations for this work? SQL is a high-level language where requests for data are stated non-procedurally. The user is not expected to need any knowledge of how the data is stored in the database or how it is retrieved. Thus, it is up to the DBMS to choose an appropriate access path for data retrieval on the user’s behalf. By designing the database in this fashion, we preserve data independence, where a user’s view of the data is independent of the database’s view of the data. ...

Paper Review: Eddies: Continuously Adaptive Query Processing

Title and Author of Paper: Eddies: Continuously Adaptive Query Processing. Ron Avnur and Joseph M. Hellerstein. Summary: Eddies describes a query optimization system that continuously reorders operators in a query plan as it runs. This insight is based on the observation that assumptions made about the database at the time that a query is submitted will rarely hold throughout the duration of query processing. Query plans can be reordered using two criteria: synchronization barriers and moments of symmetry. Synchronization barriers exist whenever an operator is waiting for a table scan to complete before making forward progress. In general, these barriers limit concurrency, and one goal of the eddies system is to avoid or reduce these barriers by selecting an appropriate join algorithm. ...

Paper Review: The Gamma Database Machine Project

Title and Author of Paper: The Gamma Database Machine Project. David J. DeWitt et al. Summary: This paper presents the research undertaken at the University of Wisconsin-Madison to develop a scalable database architecture. The paper presents novel methods for scaling a database cluster using a shared-nothing architecture, and for using hash-based join algorithms to parallelize the workload across the cluster. What are the motivations for this work? The motivation behind Gamma was to support a horizontally scalable database built from commodity parts. ...

Local Development with the Kinesis Client Library

I’ve been working on an application to read data from Kinesis using the Kinesis Client Library for Java. One requirement was to be able to run the application locally during development. This requires configuring Kinesis, DynamoDB, and CloudWatch to work locally.
Dynalite: Getting a local copy of DynamoDB running is easily done using the dynalite library.
> npm install -g dynalite
> dynalite --port 7000
Kinesalite: A local copy of Kinesis is available through the kinesalite library. ...

Paper Review: The Design of POSTGRES

Title and Author of Paper: The Design of POSTGRES. Michael Stonebraker and Lawrence A. Rowe. Summary: Postgres started as a research project to extend the standard database architecture to support several additional concepts: complex objects as values, user-defined data types and procedures, and alerting and triggers. This paper describes the system architecture designed to achieve these goals while retaining the functionality of the relational model. Although the design incorporates additional ideas such as time-varying data, I will focus my review on the user-defined types and alerting scenarios. ...

Paper Review: System R: Relational Approach to Database Management

Title and Author of Paper: System R: Relational Approach to Database Management. M. M. Astrahan et al. Summary: It’s hard to overstate the influence that the System R project had on database design and implementation. After reading this paper, it is clear that traditional database architecture has not significantly changed since the System R project. System R provided the first implementation of SQL, the first demonstration of performant transactions, and the foundational groundwork in concurrency control and query optimization. ...

The Five Stages of NoSQL

Imagine a fledgling software startup consisting of one or two developers. They are following the lean startup methodology by throwing ideas and implementations at a wall to see what sticks. This methodology demands keeping your application as simple as possible until you find the optimum market. An ongoing concern for the developers is finding a simple, flexible way to store application data: NoSQL or SQL? The NoSQL database offers a premium out-of-the-box experience: install a package, start the database, and post and retrieve data using a JSON API. And by eschewing the details of a schema and the inconvenience of data modelling, the NoSQL database allows for fast iteration. ...

Paper Review: Architecture of a Database System

Title and Author of Paper: Architecture of a Database System. Joseph M. Hellerstein, Michael Stonebraker, James Hamilton. Summary: Architecture of a Database System provides an explanation of how to implement a relational database. It begins with an architectural overview of the main parts of a database system as viewed through the life of an SQL query. This includes how the query is received, parsed, and optimized, and how the resulting data is returned from storage as part of a transaction. ...