PostgreSQL 18 on Amazon Aurora and Amazon RDS: Security, monitoring, and developer enhancements

In the latest iteration of PostgreSQL, version 18, significant strides have been made in enhancing security, monitoring capabilities, developer productivity, and logical replication. These improvements are designed to streamline operational efficiency and elevate the overall developer experience.

Deprecation of MD5 Password

PostgreSQL 18 marks the deprecation of MD5 password authentication, advocating for the adoption of more secure methods such as SCRAM-SHA-256. While MD5 authentication remains functional in this version, it is slated for removal in future releases. Users of Amazon Relational Database Service (Amazon RDS) or Amazon Aurora PostgreSQL-Compatible Edition will find that the default password_encryption is already configured to scram-sha-256. For those still utilizing MD5 or having existing MD5-hashed passwords, it is imperative to transition to the recommended method. This can be accomplished by modifying the password_encryption parameter in your DB parameter group, which can be done dynamically without requiring a reboot.

Additionally, PostgreSQL 18 introduces a new GUC parameter, md5_password_warnings, which is enabled by default. This dynamic parameter generates deprecation warnings in the engine logs whenever a password is stored using MD5 encryption, serving as a helpful reminder to migrate affected roles to SCRAM-SHA-256.

Enhanced Monitoring of Parallel Worker Activity

This version enhances monitoring capabilities by adding two new columns to pg_stat_database and pg_stat_statements that track parallel worker activity:

Column Description
parallel_workers_to_launch Number of parallel workers the planner intended to launch
parallel_workers_launched Number of parallel workers actually launched

The disparity between these two metrics serves as a crucial indicator of system performance. If the planner requests workers but the system cannot provide them due to limits being reached, queries may revert to fewer workers or even serial execution—a situation that was previously opaque to users. The new columns provide visibility into this aspect of query execution, allowing for better resource management.

For users of Aurora PostgreSQL, it’s noteworthy that the default value for the max_parallel_workers_per_gather parameter has been adjusted to 0, effectively disabling parallel queries by default. This change necessitates user intervention to re-enable parallel queries based on specific workload requirements.

Database-Level Monitoring

The pg_stat_database view now aggregates parallel worker activity across queries for the database. A consistent worker_shortage indicates that the system is regularly reaching its max_parallel_workers or max_worker_processes limits, prompting a review of resource allocation.

SELECT datname,
       parallel_workers_to_launch,
       parallel_workers_launched,
       parallel_workers_to_launch - parallel_workers_launched AS worker_shortage
FROM pg_stat_database
WHERE datname = current_database();

Query-Level Monitoring

Similarly, pg_stat_statements provides insights at the query level, enabling users to identify which specific queries are most impacted by worker availability:

SELECT LEFT(query, 60) AS query,
       calls,
       parallel_workers_to_launch,
       parallel_workers_launched,
       ROUND(100.0 * parallel_workers_launched / NULLIF(parallel_workers_to_launch, 0), 2) AS launch_success_rate
FROM pg_stat_statements
WHERE parallel_workers_to_launch > 0
ORDER BY parallel_workers_to_launch DESC
LIMIT 5;

Finding Problematic Queries

To pinpoint queries where worker launches consistently fail, a targeted query can be executed. This allows for strategic adjustments, such as increasing max_parallel_workers or disabling parallelism for specific queries:

SELECT LEFT(query, 80) AS query_preview,
       calls,
       parallel_workers_to_launch,
       parallel_workers_launched,
       parallel_workers_to_launch - parallel_workers_launched AS workers_failed
FROM pg_stat_statements
WHERE parallel_workers_to_launch > 0
  AND parallel_workers_launched < parallel_workers_to_launch
ORDER BY workers_failed DESC
LIMIT 10;

Replication Monitoring Improvements in pg_stat_subscription_stats

PostgreSQL 18 enhances replication monitoring by adding new columns to pg_stat_subscription_stats, allowing for more granular tracking of conflict types encountered during logical replication:

New Conflict Columns

SELECT column_name
FROM information_schema.columns
WHERE table_name = 'pg_stat_subscription_stats'
ORDER BY ordinal_position;

The newly introduced confl_* columns provide detailed insights into various conflict scenarios, facilitating easier diagnosis of replication issues.

Querying Conflict Statistics

SELECT subname,
       apply_error_count,
       confl_insert_exists,
       confl_update_missing,
       confl_delete_missing
FROM pg_stat_subscription_stats;

Why These Matter

Prior to PostgreSQL 18, identifying the nature of conflicts during logical replication required sifting through log files. The new columns allow for immediate recognition of conflict patterns, enabling users to take corrective actions swiftly.

Optimizer Statistics Preservation During pg_upgrade

Historically, major PostgreSQL upgrades neglected to carry over optimizer statistics, resulting in degraded query performance post-upgrade. PostgreSQL 18 addresses this by automatically transferring optimizer statistics during the upgrade process, ensuring continuity in query performance.

What Changes in PostgreSQL 18

The pg_upgrade process now includes automatic transfer of optimizer statistics, eliminating the need for manual intervention. For dump/restore upgrades, users can explicitly export statistics using pg_dump --with-statistics.

What Is Not Preserved

While optimizer statistics are preserved, extended statistics created with CREATE STATISTICS are not. Users are advised to run a targeted analyze post-upgrade to collect any missing statistics efficiently.

Generate Timestamp-Ordered UUIDs with uuidv7()

PostgreSQL 18 introduces the uuidv7() function, which generates UUID version 7, combining a Unix timestamp with random bits. This innovation allows for globally unique and naturally sortable UUIDs, addressing the limitations of previous UUID versions.

The Problem with UUIDv4

UUIDv4, while unique, suffers from performance issues in high-insert workloads due to its random nature, leading to index bloat and degraded write performance.

UUIDv7: Timestamp-Ordered

UUIDv7 resolves these issues by embedding a timestamp within the UUID itself, allowing for efficient indexing and insertion order preservation.

Time-Shifted UUIDs

The uuidv7() function also supports an optional interval to shift the embedded timestamp, offering flexibility in UUID generation.

Using uuidv7() as a Primary Key

Due to its monotonically increasing nature, uuidv7() can be effectively used as a primary key, simplifying data retrieval and enhancing performance.

Default Streaming Changed to Parallel

With PostgreSQL 18, the default setting for the streaming option in CREATE SUBSCRIPTION has shifted from off to parallel, optimizing logical replication throughput and reducing lag during large transactions.

Background

This change reflects the maturity of the parallel option, which allows for simultaneous processing of changes, thereby enhancing performance and resource utilization.

Impact on Existing Subscriptions

Existing subscriptions retain their previous settings post-upgrade, necessitating a review and potential update to leverage the benefits of parallel streaming.

Inactive Replication Slot Auto-Invalidation

PostgreSQL 18 introduces the idle_replication_slot_timeout parameter, which automatically invalidates inactive replication slots after a specified duration, mitigating the risk of disk exhaustion due to abandoned slots.

The New Parameter

This parameter offers a safeguard against the operational hazards associated with inactive replication slots, providing greater control over replication management.

Configuring in Amazon RDS for PostgreSQL

Users can easily set this parameter in their parameter group without requiring a reboot, enhancing operational efficiency.

COPY Command Enhancements

PostgreSQL 18 introduces two enhancements to the COPY command: REJECT_LIMIT and a new LOG_VERBOSITY level called silent. These improvements provide users with greater control over error handling during bulk data loads.

REJECT_LIMIT for Controlled Error Tolerance

The introduction of REJECT_LIMIT allows users to set a maximum number of errors tolerated during a COPY FROM operation, enhancing data integrity during bulk loads.

LOG_VERBOSITY Silent

The new silent level suppresses messages about discarded rows, reducing log noise during large data loads while maintaining error control.

OLD and NEW in RETURNING Clauses (INSERT, UPDATE, DELETE, and MERGE)

PostgreSQL 18 introduces OLD and NEW aliases in the RETURNING clause for various DML commands, allowing for the retrieval of both previous and current values in a single statement. This enhancement simplifies application code and reduces the need for additional queries or triggers.

These advancements in PostgreSQL 18 collectively enhance security, monitoring, and developer productivity, providing a robust framework for managing modern database workloads. Users are encouraged to explore these features and consider upgrading their systems to leverage the full benefits of this latest release.

Tech Optimizer
PostgreSQL 18 on Amazon Aurora and Amazon RDS: Security, monitoring, and developer enhancements