PostgreSQL 16, released in September 2023, brings significant performance improvements and a host of new features to the popular open-source relational database. These enhancements span various aspects of data management, including ingestion speed, data distribution options, and query performance insights. You can try out PostgreSQL 16 on both Amazon Aurora PostgreSQL-Compatible Edition and Amazon Relational Database Service (Amazon RDS) for PostgreSQL.
New features and enhancements to logical replication
Logical replication in PostgreSQL 16 introduces several new features, including logical decoding on standbys and parallel application of logical replication. These enhancements provide more granular control over data replication and reduce latency.
Parallel apply of large transactions
In previous versions, large transactions were sent in multiple streams, causing replication delays. PostgreSQL 16 introduces a parallel apply option, which writes data directly to the target table using multiple background worker processes, significantly reducing latency. This option can be enabled when creating the subscription, and the substream column in the pg_subscription catalog table indicates whether the parallel option is enabled.
Example: Replicating a transaction with 10 million rows
We set up RDS databases, enabled logical replication parameters, and configured replication through the pub/sub method. The following code demonstrates the setup:
psql -d postgres -h primary_rds_endpoint -p 5432 -c "CREATE PUBLICATION logical_pub FOR TABLE large_tabl;"
For the subscription server:
psql -d postgres -h secondary_rds_endpoint -p 5432 -c "CREATE SUBSCRIPTION logical_sub connection 'dbname=postgres host=primary_rds_endpoint port=5432 password=password user=postgres' PUBLICATION logical_pub;"
We inserted data into the source table, which took about 1 minute to complete:
INSERT INTO large_tabl
VALUES(generate_series(1,10000000),md5(random()::text),now(),md5(random()::text));
INSERT 0 10000000
Time: 63447.643 ms (01:03.448)
Without the parallel option, it took around 2.5 minutes to replicate the data. With the parallel option enabled, it took around 1 minute. This demonstrates the significant reduction in replication time with the parallel apply option.
Logical replication from a standby instance
PostgreSQL 16 allows logical replication from a standby instance, reducing the load on the primary instance. This feature is illustrated by setting up three databases running on different ports, with connectivity between the standby and another instance using logical replication.
New SQL/JSON functionality
PostgreSQL 16 enhances its JSON capabilities, introducing features like the IS JSON predicate, which checks the validity of JSON objects based on their structure. The IS JSON predicate includes tests such as IS JSON VALUE, IS JSON ARRAY, IS JSON OBJECT, and IS JSON SCALAR.
Additionally, PostgreSQL 16 supports SQL/JSON standard-conforming constructors for JSON types like JSON_ARRAY(), JSON_ARRAYAGG(), JSON_OBJECT(), and JSON_OBJECTAGG(). These constructors allow for more precise and efficient handling of JSON data.
Speed up concurrent data ingestion with COPY
PostgreSQL 16 introduces a significant upgrade to its bulk extension feature for relations, allowing multiple blocks to be extended simultaneously. This reduces overhead and enhances performance, particularly for concurrent COPY operations. Benchmarking tests show a substantial increase in speed for both single-threaded and multi-threaded configurations.
Benchmark with pgbench
The performance test was conducted using pgbench, with 16 concurrent users. The results show a 1.8 times faster performance improvement in PostgreSQL 16 compared to PostgreSQL 15 for single-threaded configurations, and a 1.6 times faster improvement for multi-threaded configurations.
Subtransaction performance improvements
PostgreSQL 16 enhances the mechanism for looking up transaction identifiers in snapshots using SIMD (Single Instruction, Multiple Data) instructions. This significantly improves efficiency, particularly for operations on large datasets, and is beneficial in environments with a high number of concurrent writers.
Benchmark with heavy concurrent writes
A benchmarking exercise focused on heavy concurrent writes shows notable performance improvements in PostgreSQL 16 compared to PostgreSQL 15. The results indicate a substantial increase in transactions per second (TPS) with a higher number of write clients.
Track when a table or index was last used
PostgreSQL 16 introduces a feature to track the timestamp of the last scan of a relation, useful for index evaluation and monitoring relation usage over time. Additionally, a new counter, n_tup_newpage_upd, tracks row updates that result in a new heap page, aiding in precise database performance tuning.
Monitor I/O with pg_stat_io
PostgreSQL 16 introduces the pg_stat_io view for more comprehensive I/O statistics. This view provides detailed information on I/O operations, allowing for better performance monitoring and optimization.
New security features
PostgreSQL 16 introduces the SYSTEM_USER SQL keyword, which returns the authenticated identity associated with the current database session. This is particularly useful in scenarios where the authenticated identity differs from the SESSION_USER, such as in GSS and peer authentication.
GSS authentication
Using GSS authentication, the SYSTEM_USER keyword provides the relevant authentication information, ensuring that the authenticated identity is visible and can be retrieved.
Peer authentication
Similarly, with peer authentication, the SYSTEM_USER keyword helps retrieve the authenticated identity, enhancing the completeness of information available in the current session.
Row-level security policy
The SYSTEM_USER functionality can also be used to create row-level security policies that restrict access based on the system user details, ensuring that users can only access their own data.
PostgreSQL 16 continues to improve on what’s possible with database management, introducing significant enhancements and features that boost performance and monitoring capabilities. The release brings forward vector operations for XID and sub-XID searches, a new system view for tracking I/O statistics, and significant improvements in logical replication and SQL/JSON constructors. These offerings make PostgreSQL 16 a compelling choice for managing complex data workloads while providing robust security.
You can use these advanced features with PostgreSQL 16 on Amazon Aurora and Amazon RDS, and experience the potential for even more innovation in future releases. Stay connected with the community mailing lists to keep up to date on the latest developments in one of the most powerful and reliable database management systems available.
About the authors
Jonathan Katz is a Principal Product Manager – Technical on the Amazon RDS team and is based in New York. He is a Core Team member of the open-source PostgreSQL project and an active open-source contributor.
Rajesh Madiwale is a Lead Consultant with Amazon Web Services. He has deep expertise on database development and administration on Amazon RDS for PostgreSQL, Amazon Aurora PostgreSQL, Amazon Redshift, MySQL, and Greenplum databases. He is an ardent member of the PostgreSQL community and has been working on PostgreSQL his entire tenure. He has also delivered several sessions at PostgreSQL conferences.
Baji Shaik is a Sr. Database Consultant with AWS Professional Services. He is a Database Migration Expert and has developed many successful database solutions addressing challenging business requirements for moving databases from on premises Oracle, SQL Server to Amazon RDS and Amazon Aurora PostgreSQL/MySQL. He is an eminent author, having written several books on PostgreSQL. A few of his recent works include “Procedural Programming with PostgreSQL PL/pgSQL,” “PostgreSQL Configuration,” “Beginning PostgreSQL on the Cloud,” and “PostgreSQL Development Essentials.” Furthermore, he has delivered several conference and workshop sessions.
Swanand Kshirsagar is a Lead Consultant within the Professional Services division at Amazon Web Services. He specializes in collaborating with clients to architect and implement scalable, robust, and security-compliant solutions within the AWS Cloud environment. His primary expertise lies in orchestrating seamless migrations, encompassing both homogenous and heterogeneous transitions, facilitating the relocation of on-premises databases to Amazon RDS and Amazon Aurora PostgreSQL with efficiency.