The database that refused to die: How Postgres survived its own creators

Postgres, now a cornerstone of modern database systems, had a rather unremarkable beginning. It grew out of the work of Michael Stonebraker, who had earlier created Ingres, the predecessor that gave Postgres its name: “Post-Ingres.” At a recent PGDay conference in Boston, Stonebraker shared insights into the intricate history of Postgres, which emerged long before the concept of open source was formally recognized.

Stonebraker emphasized that “Postgres is the epitome of open source software, because it doesn’t belong to anybody.” After stepping away from Postgres in the mid-1990s, he expected it to fade into obscurity. Instead, it was revitalized by a passionate community of open source contributors who enhanced the codebase by integrating standard SQL while maintaining its innovative extensible architecture. Today, this resilient database system underpins much of the cloud infrastructure we rely on.

Data should be relational

The journey of relational databases began with British computer scientist Ted Codd, who, in 1970, proposed that all data should be organized in tables and accessed through a high-level query language. IBM took up this challenge, implementing Codd’s ideas in System R and creating SQL as its query language; that work eventually fed into IBM’s DB2. Stonebraker, then an assistant professor at UC Berkeley, also embraced Codd’s vision, leading a team that built not only a working prototype but also a full-scale implementation. This endeavor culminated in the commercial launch of Ingres, which used QUEL, a query language distinct from SQL.

In the mid-1980s, Stonebraker pivoted from Ingres to develop Postgres, recognizing the need for databases to accommodate more complex data types beyond basic integers and strings. As industries began to require storage for intricate data, such as the geometry used in computer-aided design (CAD) and geographic information systems (GIS), Stonebraker and his team understood that extending the database system was essential. This led to the introduction of user-defined data types, operators, and functions, culminating in support for abstract data types (ADTs), a feature that has become a hallmark of modern database systems.

Beyond Ingres: Postgres

Stonebraker’s ambitions for Postgres extended to incorporating new concepts of referential integrity and a rules engine to monitor changes within the database. While those features did not materialize as planned, the foundational work on ADTs proved to be a significant success and contributed to Stonebraker receiving the prestigious 2014 A.M. Turing Award. Postgres was first commercialized through a startup named Illustra, which was eventually acquired by Informix. Informix integrated the technology into its database server, while an open source version continued under a different name.

The architecture that refused to die

In 1995, two graduate students from Berkeley, Andrew Yu and Jolly Chen, resurrected Postgres from its last academic release. They replaced the underperforming rules engine and disaster recovery features and switched the query language from QUEL to the more widely accepted SQL. The result was Postgres95, which later evolved into PostgreSQL. Stonebraker noted that he was unaware of the volunteer developers who took it upon themselves to advance the project over the ensuing decades. This community-driven approach ensured that Postgres remained accessible and modifiable, paving the way for its adoption across various platforms, including Amazon Web Services, Microsoft Azure, and Google Cloud, all of which offer Postgres-compatible database services.

Top of the heap

Today, Postgres ranks among the most popular database systems globally, just behind Oracle, MySQL, and Microsoft SQL Server. Unlike its competitors, Postgres continues to gain traction in the market. Tom Kincaid, a vice president at EDB, highlighted several factors contributing to Postgres’s success, particularly its extensibility, which has allowed it to adapt to evolving data storage needs. The introduction of ADTs facilitated its entry into the geospatial and document database markets, enabling developers to efficiently store and retrieve diverse data types.

Moreover, the high standards of the Postgres codebase and its robust optimizer have attracted top-tier developers. The permissive licensing model has also encouraged startups to innovate without the fear of legal repercussions, further solidifying Postgres’s position in the industry.

Why Postgres still doesn’t have file-level encryption

Despite its many accolades, Postgres still lacks certain features that are standard in commercial database systems. During a recent PGDay discussion, long-time contributor Bruce Momjian outlined several missing functionalities, including 64-bit transaction IDs and support for columnar storage, the latter of which matters for large-scale data analysis. However, the most notable absence is file-level encryption, also known as transparent data encryption (TDE), which is now a requirement for storing financial transaction data under the latest Payment Card Industry Data Security Standard (PCI DSS) specifications.

Currently, Postgres relies on the operating system for encryption, and efforts to implement file-level encryption have encountered significant challenges. Momjian remarked that the complexity of the necessary code changes has stalled progress, as modifications would need to be made across various system functions. He emphasized that while commercial entities may prioritize such features due to customer demands, the Postgres development team is cautious about adding features that do not provide substantial technical value.

Ultimately, the absence of TDE may reflect Postgres’s overarching philosophy: focusing on the needs of general users rather than catering solely to the demands of large enterprises. This approach may well define the essence of success for an open source project.
