Paper Review: What Goes Around Comes Around

Title and Author of Paper

What Goes Around Comes Around. Michael Stonebraker and Joseph M. Hellerstein.

Summary

What Goes Around Comes Around summarizes several methods for modelling data within a database system. Each data model is described, and its benefits and drawbacks are listed as lessons learned from research into that model. The authors clearly present their opinions on each model and help readers unfamiliar with past modelling attempts understand the history of this area of research.

What are motivations for this work?

To summarize the research efforts into data modelling techniques applicable to databases, and to give implementers of data models both guidance and caution to consider before undertaking the effort.

What is the proposed solution?

This paper does not prescribe a solution to the data modelling problem, but the authors’ bias toward the object-relational model does show through. The assumed, rather than explicit, solution is to continue using the object-relational model as the standard for database systems.

What are the contributions?

The benefits and drawbacks of each model are explained at a high enough level that they are understandable without diving into the details, which makes the past research efforts in this area much easier to follow. The majority of the models reviewed in this paper were developed before my education in Computer Science began, so the paper served me well as an introduction to these modelling ideas.

What are future directions for this research?

JSON. This is touched on in Chapter 1 of the red book. It is clear that JSON is a hierarchical model and that its suitability for general purpose data modelling is questionable. It would be nice to see JSON evaluated in the same manner as the other models in the paper, although it may be too early to list the lessons learned from this model.
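One reason a hierarchical model raises doubts is that a record shared by several parents has to be stored once under each of them. Here is a minimal sketch of that problem in Python. The supplier and part data, along with the field names `name`, `parts`, `pno` and `color`, are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical data: part 1 is sold by two suppliers, so a strictly nested
# (hierarchical) JSON layout has to store it once under each parent.
doc = json.loads("""
{"suppliers": [
  {"name": "Acme", "parts": [{"pno": 1, "color": "red"}]},
  {"name": "Bolt", "parts": [{"pno": 1, "color": "red"},
                             {"pno": 2, "color": "blue"}]}
]}
""")

# Count how many physical copies of part 1 exist in the document.
copies = [p for s in doc["suppliers"] for p in s["parts"] if p["pno"] == 1]

# Update the colour through one parent only; the other copy goes stale.
doc["suppliers"][0]["parts"][0]["color"] = "green"
colors = {p["color"] for s in doc["suppliers"] for p in s["parts"] if p["pno"] == 1}
```

After the update, `colors` holds two different values for what is logically a single part. A flat table keyed on the part number would store that part only once and avoid this kind of inconsistency.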

What questions are you left with?

The paper shows some data models to be superior, yet they have not been widely deployed because of market forces or model complexity. Is the object-relational model really the best model, and how long will it stay the best?

Schema-last data models, where arbitrary data can be inserted into the database without regard to a schema and a schema is applied only on retrieval, are discussed in the context of XML. I would like to see this discussion evaluate the benefits and drawbacks of schema-free document stores such as MongoDB. This is touched on in Chapter 1 of the red book in the context of data lakes, but I would like to see a full examination of this topic.
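To make the schema-last idea concrete, here is a minimal sketch in Python. It uses a hypothetical in-memory store, not MongoDB or any real database API. The names `store`, `insert` and `read_with_schema` are my own, and the "schema" is assumed to be nothing more than a mapping from field name to expected Python type.

```python
import json

# Hypothetical in-memory "document store": records are kept as raw JSON text
# and nothing about their shape is checked at insert time.
store = []

def insert(record):
    """Accept any JSON-serializable record without validating it."""
    store.append(json.dumps(record))

def read_with_schema(schema):
    """Apply a schema at read time: keep only the records that have every
    required field with the expected type, projected onto those fields."""
    rows = []
    for raw in store:
        doc = json.loads(raw)
        if all(isinstance(doc.get(field), ftype) for field, ftype in schema.items()):
            rows.append({field: doc[field] for field in schema})
    return rows

insert({"name": "Ada", "age": 36})
insert({"name": "Grace", "email": "grace@example.com"})  # no "age" field
insert({"title": "untyped note"})                         # unrelated shape

people = read_with_schema({"name": str, "age": int})
```

All three inserts succeed, but only the first record fits the schema supplied at read time. The other two stay in the store without being reported, which shows both the appeal of this approach and its risk.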

What is your take-away message from this paper?

Always do your research before working on a new idea. To avoid reinventing the wheel, consider what has come before and why it succeeded or failed. As the authors state:

"To avoid repeating history, it is always wise to stand on the shoulders of those who went before, rather than on their feet."
