Postgres Pro TDE

Data Protection in Databases

Databases are repositories of vast amounts of information, much of which is sensitive and requires robust protection against unauthorized access. At the SQL level, this protection is handled mainly through access control lists (ACLs), which are managed with commands such as GRANT and REVOKE.

However, the challenge intensifies when considering the possibility of an attacker gaining direct access to database files. This scenario can arise if backups are stored in insecure locations or if a database is hosted in the cloud, where physical access to servers may be granted to unknown administrators. While encrypting backups is a common practice supported by most modern backup systems, the process can be cumbersome, especially when dealing with large volumes of data, such as 10 TB. The encryption process can significantly extend the time required for both copying and restoring data, particularly when using cryptographic standards like GOST, which lack hardware acceleration on popular CPU architectures.

Moreover, an attacker with file system access, such as a hardware administrator in a data center, does not need SQL access to compromise the database: a single cp command is enough to copy it in its entirety. Disk encryption offers little protection in this case, since files on a mounted volume are readable by anyone with file system access, and encrypted backups protect only the backup copies. Thus, the need for a more robust defense mechanism arises: encrypting the database files themselves, a process referred to here as protective transformation or encoding.

Client-side Encoding with pgcrypto

The most straightforward method of protecting sensitive data is to encode it before it is stored in the database, ensuring that plaintext data is never saved. This approach requires that encoding keys be securely managed on the client side or through a proxy that mediates SQL queries and results. PostgreSQL offers an extension called pgcrypto that facilitates this encoding. Despite its potential, this method has not gained widespread adoption due to several inherent challenges:

  • Keys must be distributed, synchronized, and securely stored on every client.
  • All clients require the appropriate keys to access protected tables.
  • If a key is compromised or rotated, re-encryption of all data is necessary, along with redistributing new keys.
  • SQL cannot filter or search pgcrypto-protected tables, as it only sees unreadable data blobs.
  • Constraints cannot be applied since SQL lacks knowledge of the contents of those columns.
  • Queries must be rewritten to incorporate pgcrypto functions explicitly.
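
The search limitation can be illustrated with a minimal Python sketch. It does not call pgcrypto: `keystream_xor`, `client_encode`, and `client_decode` are hypothetical helpers, and the HMAC-based keystream is a toy stand-in for a real cipher, assumed here only so the example runs with the standard library. Because every value is encoded with a fresh random IV, the server cannot match a stored blob against a probe value, and the client has to fetch and decode every row itself.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only, not for production use):
    XOR data with HMAC-SHA256(key, iv || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, iv + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def client_encode(key: bytes, plaintext: str) -> bytes:
    """Encode a value on the client; the random IV is stored with the blob."""
    iv = secrets.token_bytes(16)
    return iv + keystream_xor(key, iv, plaintext.encode("utf-8"))


def client_decode(key: bytes, blob: bytes) -> str:
    """Decode a blob fetched from the server."""
    iv, body = blob[:16], blob[16:]
    return keystream_xor(key, iv, body).decode("utf-8")


key = secrets.token_bytes(32)  # known to the client only

# What the server stores: opaque blobs, even for identical values.
stored_rows = [client_encode(key, name) for name in ["alice", "bob", "alice"]]

# A server-side equality filter compares blobs, so it finds nothing.
probe = client_encode(key, "alice")
server_side_matches = [row for row in stored_rows if row == probe]

# The client must fetch all rows and filter after decoding.
client_side_matches = [
    row for row in stored_rows if client_decode(key, row) == "alice"
]
```

The same reasoning applies to constraints and indexes: the server only ever compares ciphertext, so uniqueness or range checks on the underlying values are out of its reach.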

To address these limitations, a more seamless solution known as Transparent Data Encryption (TDE) has emerged, which operates transparently for both SQL and users.

Transparent Data Encoding

Transparent Data Encoding allows SQL operations such as INSERT, UPDATE, and SELECT to function as usual, while the actual data stored on disk is rendered unreadable. In this model, the database management system (DBMS) manages the keys and the encoding/decoding processes. Various implementations exist, ranging from simple to complex.

Percona pg_tde_basic

The initial TDE implementation from Percona, now discontinued, aimed to integrate encoding and decoding into the memory buffer manager. This approach ensured that data was encrypted before being written to table files or the Write-Ahead Log (WAL), and decrypted upon retrieval. However, this method presented several challenges:

  • Indexes were not protected, as they did not reside within the pg_tde framework.
  • Custom Table Access Methods (TAMs) remained unprotected.
  • Performance was significantly impacted, with queries experiencing substantial slowdowns.

These limitations underscored the necessity for a deeper integration of encoding and decoding processes.

Cybertec PGEE and EnterpriseDB

Cybertec’s and EnterpriseDB’s TDE implementations took a different approach by encrypting the entire cluster, including all tables, indexes, and the system catalog. This method simplifies the encryption process, ensuring consistent behavior across the database. However, it also introduces drawbacks, such as:

  • Reduced performance due to the overhead of encrypting non-sensitive tables.
  • Harder technical support, since files from an encrypted cluster are difficult to hand over to support teams for diagnosis.

Pangolin SE, Fujitsu EP, and Percona Distribution

To mitigate the issues associated with indiscriminate encryption, Pangolin and Fujitsu proposed marking specific tables for protection by creating a “protected” tablespace. This selective approach encrypts only the tables that require it, thereby preserving performance. However, Fujitsu’s implementation is invasive, which complicates integration with upstream (vanilla) PostgreSQL.

Percona’s recent TDE implementation also follows this selective encryption model, marking protected tables with a special TAM. However, it currently lacks support for non-standard TAMs, which limits its applicability in certain scenarios.

Postgres Pro Enterprise TDE

In developing its TDE implementation, Postgres Professional aimed to address key issues such as key rotation and protection against time-based analysis. The solution stores a key index in each table page, which allows keys to be rotated without re-encrypting the whole table. New data is encrypted with the latest key, while pages written earlier remain readable with the key their index points to.
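
The per-page key index can be sketched roughly as follows. This is an illustration of the idea, not Postgres Pro's actual page layout or API: `Page`, `KeyRing`, `write_page`, and `read_page` are hypothetical names, the keys are kept in a plain in-memory list, and the HMAC-based keystream is a toy stand-in for a real cipher.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR with HMAC-SHA256 blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, iv + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


@dataclass
class Page:
    key_id: int   # index of the key this page was written with
    iv: bytes     # per-page random IV (assumed for this sketch)
    body: bytes   # encrypted page contents


class KeyRing:
    """Hypothetical key store: old keys stay available for reading."""

    def __init__(self) -> None:
        self.keys = []

    def rotate(self) -> int:
        """Add a new key and return its index; existing pages are untouched."""
        self.keys.append(secrets.token_bytes(32))
        return len(self.keys) - 1


def write_page(ring: KeyRing, plaintext: bytes) -> Page:
    """Always encrypt with the newest key and record its index in the page."""
    key_id = len(ring.keys) - 1
    iv = secrets.token_bytes(16)
    return Page(key_id, iv, keystream_xor(ring.keys[key_id], iv, plaintext))


def read_page(ring: KeyRing, page: Page) -> bytes:
    """Decrypt with whichever key the page's index points to."""
    return keystream_xor(ring.keys[page.key_id], page.iv, page.body)


ring = KeyRing()
ring.rotate()                                   # key 0
old_page = write_page(ring, b"row written before rotation")
ring.rotate()                                   # key 1; old_page is not rewritten
new_page = write_page(ring, b"row written after rotation")
```

In this sketch rotation costs one list append, and a page moves to the new key only the next time it is written.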

Additionally, the implementation incorporates random initialization vectors (IVs) to enhance security against time-based analysis, ensuring that each encrypted page appears unique and indistinguishable from random noise. The use of a Message Authentication Code (MAC) further bolsters data integrity, providing a robust safeguard against tampering.
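
As a rough illustration of how a random IV and a MAC work together, the sketch below seals a page with an encrypt-then-MAC construction. The source does not say which cipher or MAC Postgres Pro uses, so HMAC-SHA256, the toy keystream, and the names `seal_page` and `open_page` are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR with HMAC-SHA256 blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, iv + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def seal_page(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt under a fresh random IV, then append a MAC over IV + ciphertext."""
    iv = secrets.token_bytes(16)
    body = keystream_xor(enc_key, iv, plaintext)
    tag = hmac.new(mac_key, iv + body, hashlib.sha256).digest()
    return iv + body + tag


def open_page(enc_key: bytes, mac_key: bytes, sealed: bytes) -> bytes:
    """Verify the MAC first; decrypt only if the page is intact."""
    iv, body, tag = sealed[:16], sealed[16:-32], sealed[-32:]
    expected = hmac.new(mac_key, iv + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("page failed integrity check")
    return keystream_xor(enc_key, iv, body)


enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)
page = b"same page contents"

# Sealing the same page twice gives different bytes because the IV differs.
first = seal_page(enc_key, mac_key, page)
second = seal_page(enc_key, mac_key, page)

# Flipping a single bit of the stored page is detected on read.
tampered = bytearray(first)
tampered[20] ^= 0x01
try:
    open_page(enc_key, mac_key, bytes(tampered))
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

Checking the tag before decrypting means a modified page is rejected outright instead of being returned as corrupted plaintext.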

Furthermore, Postgres Pro Enterprise allows for full database encryption, enabling comprehensive protection for all tables and the system catalog within a cluster. This flexibility is crucial for managing sensitive data effectively.

Future Directions

Work on TDE continues. Planned improvements include support for protecting tables stored in compressed tablespaces and a refined approach to WAL encryption that makes the data easier for utilities to access without compromising security.

Tech Optimizer