Title and Author of Paper
The Design of POSTGRES, Michael Stonebraker and Lawrence A. Rowe.
Postgres started as a research project to extend the standard database architecture to support several additional concepts: complex objects as values, user-defined data types and procedures, and alerting and triggers.
This paper describes the system architecture designed to achieve these goals, while retaining functionality of the relational model. Although the design incorporates additional ideas such as time varying data, I will focus my review on the user-defined types and alerting scenarios.
What are the motivations for this work?
At the time of writing, databases shipped with support for a handful of data types. Additional types of data needed to be integrated by database vendors, making extension of the system impractical for end-users. To solve this problem required two additions to database architecture. First, the ability to store arbitrary complex objects. Second, to allow users to define their own types and the ways that those types are accessed.
Another common requirement of modern databases is to support stored triggers and alerts for changes in data. Triggers ensure that data changes are correctly reflected in dependent tables, while alerts notify an application of data changes.
Together, these two additions to database architecture provide the foundation of required functionality of any modern DBMS.
What is the proposed solution?
The proposed solution involves both user-defined types and alerting and triggers. Each is covered separately.
User-Defined Data Types
The design of Postgres allows for storage of unbounded arrays, SQL statements, and procedures of arbitrary code — storing complex objects is assumed by storing arbitrary code or SQL statements.
By allowing users to define their own operators that satisfy the dependencies for navigating a B-tree, user-defined types act as first-class citizens in the database. Indexes built using user-defined operators act the same as indexes built from explicitly supported data types as long as they define access operators correctly.
To allow for storing arbitrary procedures requires two pre-computation steps for performance: compiling an access plan, and executing the access plan. In Postgres, the result of executed code is placed in the tuple field itself for efficient future retrieval. The pre-computed values may become invalid if dependent data changes. To handle this fact, a database lock will be added to any fields accessed during pre-computation. If such a lock is held on an object during a write, the lock signifies that pre-computation must be done again.
Alerting and Triggers
When a command is run to start alerting on a field, another kind of lock is set on that field. Whenever a transaction writes to data that has a lock set on it, the alert is triggered.
What are the contributions?
The contributions of this paper are support for user-defined types, alerting and triggers, and time-varying data. In addition, Postgres has been continually evolving and is one of the leading open-source relational databases in the world. The future contribution of this paper has been immense.
What are future directions for this research?
The future directions of this paper have been realized with the open-source Postgres database.
What questions are you left with?
A comparison of Postgres as designed and Postgres as implemented — what changes were necessary and what comprises needed to be made.
What is your take-away message from this paper?
Architecting a database for extensibility provides a means of future proofing your database against change.
- Paper Review: The CQL continuous query language: semantic foundations and query execution
- Paper Review: BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
- Paper Review: Informix under CONTROL: Online Query Processing
- Paper Review: An Array-Based Algorithm for Simultaneous Multidimensional Aggregates