In the current data-centric environment, organizations are increasingly challenged to maintain the performance of their mission-critical databases under demanding workloads. The Amazon Relational Database Service (Amazon RDS) provides a managed solution that is both cost-effective and highly scalable, catering to these needs effectively.
Within PostgreSQL, transactions are initially recorded in a Write-Ahead Log (WAL) buffer before being stored in a dedicated WAL segment. This logging mechanism is essential for data recovery and replication. The performance of these transactions is closely tied to the storage system housing the WAL segments. By default, PostgreSQL stores all files, including WAL, in a single location. However, the PostgreSQL documentation suggests that separating the WAL storage can significantly enhance performance. In line with this, Amazon has introduced the Dedicated Log Volume (DLV) feature, which allows users to store WAL segments on a distinct volume, fixed at 1,024 GiB with 3,000 Provisioned IOPS.
One of the standout features of DLV is its capability for write coalescing, which effectively reduces the overall IOPS demand for the WAL by consolidating log writes into larger, more efficient IO operations. In conventional storage configurations, log and data writes tend to be intermingled in smaller, frequent operations. In contrast, DLV maintains a separation that permits logs to be written in larger batches, thereby enhancing throughput and IO efficiency. This optimization can lead to reduced costs by lessening the reliance on high-performance storage while still maintaining robust performance levels.
Solution Overview
The benchmarking solution involves executing a script on an Amazon Elastic Compute Cloud (Amazon EC2) instance. This script evaluates performance across two distinct RDS for PostgreSQL instances: one with DLV enabled and the other without.
Prerequisites
To initiate the benchmarking process, the following prerequisites must be met:
- Launch an Amazon EC2 Linux instance. It is advisable to deploy the instance within the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
- Install
pgbench
on the EC2 instance. The PostgreSQL server installation package includes thepgbench
utility by default, which can be installed from the Amazon extras library:sudo amazon-linux-extras enable postgresql15 sudo yum install postgresql-server -y
- Ensure you have a psql client that can connect to your Amazon RDS for PostgreSQL instance. The PostgreSQL server installation also includes the
psql
client by default. - Launch two Amazon RDS for PostgreSQL instances—one with DLV enabled and the other without.
To create the instances, utilize the AWS Command Line Interface (AWS CLI) command, ensuring to input your specific AWS Region and account number. The parameter --dedicated-log-volume
activates the DLV feature, while --no-dedicated-log-volume
disables it. For additional details, refer to the documentation on Enabling DLV when creating a DB instance.
Performance Benchmarking
This section delves into the benchmarks conducted to compare the performance of an RDS for PostgreSQL instance with DLV enabled against one without DLV.
Infrastructure
The infrastructure of the EC2 instance used for the benchmarking script is summarized in the table below:
Instance class | Operating System | vCPU | Memory (GiB) |
r5.4xlarge | Amazon Linux 2023 | 16 | 128 |
The following table outlines the infrastructure specifics for each RDS instance:
Instance name | Instance class | vCPU | Memory (GiB) | Storage | Dedicated Log Volume |
dlv | db.r5.16xlarge | 64 | 512 | IO2 (10,000 PIOPS) | On |
withoutdlv | db.r5.16xlarge | 64 | 512 | IO2 (10,000 PIOPS) | Off |
Database Configuration Parameters
The performance of Amazon RDS for PostgreSQL is influenced by various database configuration parameters, particularly those related to commit, checkpoint, and WAL configuration. To gather statistics for WAL, the track_wal_io_timing
parameter must be enabled. The following SQL query retrieves key parameters and their settings:
SELECT name, setting
FROM pg_settings
WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
Benchmarking Script
To ensure accurate performance comparisons, we benchmarked the DLV and non-DLV instances independently using a specific script. Prior to the benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared
function to ensure that results reflect the specific test conditions rather than previous activity. Our performance evaluation employed the pgbench
tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against a data scale (-s
) of 10,000.
The benchmarking script is as follows:
#!/bin/bash # Function to print usage information print_usage() { echo "Usage:" echo "To load data ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content using
tag for a sub-heading. You can use only
,
,
,
- , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements. In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance. In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases,
pgbench
repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled. Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!Solution overview
The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled. The following diagram illustrates the high-level architecturePrerequisites
To get started, you must complete the following prerequisites:
- Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
- Install
pgbench
on the EC2 instance. Since the PostgreSQL server installation package includes thepgbench
utility by default, you can install PostgreSQL from the amazon extras library.sudo amazon-linux-extras enable postgresql15 sudo yum install postgresql-server -y
- You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the
psql
client by default.
- Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.
To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter
--dedicated-log-volume
enables the DLV feature, and--no-dedicated-log-volume
disables it. Refer to Enabling DLV when you create a DB instance for more information.
The following code creates an RDS for PostgreSQL instance with DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2 --dedicated-log-volume
The
--no-dedicated-log-volume
flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2
We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.
Performance benchmarking
This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.
Infrastructure
The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.
Instance class Operating System vCPU Memory (GiB) r5.4xlarge Amazon Linux 2023 16 128 The following table summarizes the infrastructure of each RDS instance.
Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off Database configuration parameters
Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the
track_wal_io_timing
parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:SELECT name, setting FROM pg_settings WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
Output: name | setting ------------------------+---------- checkpoint_timeout | 300 commit_delay | 0 shared_buffers | 16324847 synchronous_commit | on track_wal_io_timing | on wal_buffers | 8192 wal_compression | zstd wal_segment_size | 67108864 wal_writer_delay | 200 wal_writer_flush_after | 128
Benchmarking script
To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the
pg_stat_reset_shared
function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used thepgbench
tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s
) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
The following is the script:#!/bin/bash # Function to print usage information print_usage() { echo "Usage:" echo "To load data ==> $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To run benchmarks ==> $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To load data and run benchmarks ==> $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" exit 1 } # Function to check if an RDS instance exists check_instance_exists() { INSTANCE=$1 AWS_REGION=$2 aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1 if [ $? -ne 0 ]; then echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'." exit 1 fi } # Function to retrieve the PostgreSQL password from AWS Secrets Manager get_pg_password() { SECRET_ID="rds/db_master_password" AWS_REGION=$1 aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password' } # Check the command and set the required number of arguments accordingly if [ "$#" -lt 2 ]; then print_usage fi COMMAND=$1 shift # Create logs directory mkdir -p $HOME/DLV/logs # Redirect stdout and stderr to a log file exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1 # Log the start time and parameters echo "Command: $COMMAND" load_data() { SCALE=$1 AWS_ACCOUNT_ALIAS=$2 AWS_REGION=$3 DLV_INSTANCE=$4 NON_DLV_INSTANCE=$5 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Load data into dlv instance nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Load data into withoutdlv instance nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for data loading to complete wait echo "Data loaded" } run_benchmarks() { NUM_TRANSACTIONS=$1 NUM_CLIENTS=$2 NUM_THREADS=$3 AWS_ACCOUNT_ALIAS=$4 AWS_REGION=$5 DLV_INSTANCE=$6 NON_DLV_INSTANCE=$7 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Clear WAL statistics echo "Clearing WAL statistics" echo "Clearing WAL statistics for dlv" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" echo "Clearing WAL statistics for withoutdlv" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" # Start benchmarks echo "Starting benchmarks" nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for benchmarks to complete wait # Retrieve WAL statistics echo "Getting WAL statistics" echo "WAL statistics for dlv instance" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "WAL statistics for withoutdlv instance" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "All done" } if [ "$COMMAND" == "load" ]; then # Check if the required arguments for load are provided if [ "$#" -ne 5 ]; then print_usage fi load_data $@ elif [ "$COMMAND" == "run" ]; then # Check if the required arguments for run are provided if [ "$#" -ne 7 ]; then print_usage fi run_benchmarks $@ elif [ "$COMMAND" == "load_run" ]; then # Check if the required arguments for load_run are provided if [ "$#" -ne 8 ]; then print_usage fi SCALE=$1 NUM_TRANSACTIONS=$2 NUM_CLIENTS=$3 NUM_THREADS=$4 AWS_ACCOUNT_ALIAS=$5 AWS_REGION=$6 DLV_INSTANCE=$7 NON_DLV_INSTANCE=$8 load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests echo "sleeping" echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")" sleep 600 echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")" run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE else echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both." print_usage fi
Save the script to a file named
dlv_bm.sh
and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE> &
- SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
- NUM_TRANSACTIONS = Number of transactions within the interval
- NUM_CLIENTS = Number of concurrent database sessions
- NUM_THREADS = Number of worker threads
- AWS_ACCOUNT_ALIAS = AWS account alias
- AWS_REGION = The region where your resources are located.
- DLV_INSTANCE = The name of your RDS instance with DLV
- NON_DLV_INSTANCE = The name of your RDS instance without DLV
For example:
nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &
Upon completion of the script, the following three log files are generated for further analysis:
The two key log files for the comparison are the ones with the
dlv-blog-run-dlv-*.log
anddlv-blog-run-withoutdlv-*.log
patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
By running adiff
command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
For example:diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "
Following is the output from the benchmark logs we did:
Monitoring DLV
This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.
Monitor DLV using Amazon CloudWatch
Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.
Monitor DLV using PostgreSQL’s pg_stat_wal
In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.
Evaluating performance
To assess the impact of DLV on database performance, we reviewed following metrics:
- Runtime
- Transactions per second (TPS) and latency average recorded by
pgbench
- PostgreSQL’s cumulative statistic,
wal_sync_time
andwal_write_time
, captured frompg_stat_wal
- Graphs from Amazon RDS Performance Insights
Runtime
The runtime represents the duration required for processing all 32 million transactions using
pgbench
, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.
Metric Without DLV With DLV % Improvement Time taken by the benchmark (minutes) 77 40 92.5% TPS and latency
The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.
Metric Without DLV With DLV % Improvement TPS 6984 13397 91.83% Latency average (milliseconds) 8.8 4.5 95.56% WAL statistics
We closely examined key WAL statistics from
pg_stat_wal
likewal_write_time
(total time spent writing WAL buffers to disk) andwal_sync_time
(total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.
Metric Without DLV With DLV % Improvement wal_write_time
(milliseconds)56656.796 51882.614 9.2 wal_sync_time
(milliseconds)4431621.33 2186910.078 102.6 RDS Performance Insights
The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
The first graph shows database load and top waits with DLV.The following graph show database load and top waits without DLV.
The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock
). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
The wait details are as follows:
- LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
- IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.
Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.
Clean up
If you are testing this solution, complete the following steps to remove resources and avoid charges
- On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
- On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.
Conclusion
In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.
About the Authors
Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
" temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] load "
echo "To run benchmarks ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content usingtag for a sub-heading. You can use only
,
,
,
- , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements.
In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS.
One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance.
In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases,pgbench
repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled.
Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!Solution overview
The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled.
The following diagram illustrates the high-level architecturePrerequisites
To get started, you must complete the following prerequisites:
- Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
- Install
pgbench
on the EC2 instance. Since the PostgreSQL server installation package includes thepgbench
utility by default, you can install PostgreSQL from the amazon extras library.sudo amazon-linux-extras enable postgresql15 sudo yum install postgresql-server -y
- You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the
psql
client by default.
- Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.
To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter
--dedicated-log-volume
enables the DLV feature, and--no-dedicated-log-volume
disables it. Refer to Enabling DLV when you create a DB instance for more information.
The following code creates an RDS for PostgreSQL instance with DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2 --dedicated-log-volume
The
--no-dedicated-log-volume
flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2
We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.
Performance benchmarking
This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.
Infrastructure
The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.
Instance class Operating System vCPU Memory (GiB) r5.4xlarge Amazon Linux 2023 16 128 The following table summarizes the infrastructure of each RDS instance.
Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off Database configuration parameters
Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the
track_wal_io_timing
parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:SELECT name, setting FROM pg_settings WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
Output: name | setting ------------------------+---------- checkpoint_timeout | 300 commit_delay | 0 shared_buffers | 16324847 synchronous_commit | on track_wal_io_timing | on wal_buffers | 8192 wal_compression | zstd wal_segment_size | 67108864 wal_writer_delay | 200 wal_writer_flush_after | 128
Benchmarking script
To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the
pg_stat_reset_shared
function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used thepgbench
tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s
) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
The following is the script:#!/bin/bash # Function to print usage information print_usage() { echo "Usage:" echo "To load data ==> $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To run benchmarks ==> $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To load data and run benchmarks ==> $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" exit 1 } # Function to check if an RDS instance exists check_instance_exists() { INSTANCE=$1 AWS_REGION=$2 aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1 if [ $? -ne 0 ]; then echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'." exit 1 fi } # Function to retrieve the PostgreSQL password from AWS Secrets Manager get_pg_password() { SECRET_ID="rds/db_master_password" AWS_REGION=$1 aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password' } # Check the command and set the required number of arguments accordingly if [ "$#" -lt 2 ]; then print_usage fi COMMAND=$1 shift # Create logs directory mkdir -p $HOME/DLV/logs # Redirect stdout and stderr to a log file exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1 # Log the start time and parameters echo "Command: $COMMAND" load_data() { SCALE=$1 AWS_ACCOUNT_ALIAS=$2 AWS_REGION=$3 DLV_INSTANCE=$4 NON_DLV_INSTANCE=$5 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Load data into dlv instance nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Load data into withoutdlv instance nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for data loading to complete wait echo "Data loaded" } run_benchmarks() { NUM_TRANSACTIONS=$1 NUM_CLIENTS=$2 NUM_THREADS=$3 AWS_ACCOUNT_ALIAS=$4 AWS_REGION=$5 DLV_INSTANCE=$6 NON_DLV_INSTANCE=$7 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Clear WAL statistics echo "Clearing WAL statistics" echo "Clearing WAL statistics for dlv" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" echo "Clearing WAL statistics for withoutdlv" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" # Start benchmarks echo "Starting benchmarks" nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for benchmarks to complete wait # Retrieve WAL statistics echo "Getting WAL statistics" echo "WAL statistics for dlv instance" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "WAL statistics for withoutdlv instance" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "All done" } if [ "$COMMAND" == "load" ]; then # Check if the required arguments for load are provided if [ "$#" -ne 5 ]; then print_usage fi load_data $@ elif [ "$COMMAND" == "run" ]; then # Check if the required arguments for run are provided if [ "$#" -ne 7 ]; then print_usage fi run_benchmarks $@ elif [ "$COMMAND" == "load_run" ]; then # Check if the required arguments for load_run are provided if [ "$#" -ne 8 ]; then print_usage fi SCALE=$1 NUM_TRANSACTIONS=$2 NUM_CLIENTS=$3 NUM_THREADS=$4 AWS_ACCOUNT_ALIAS=$5 AWS_REGION=$6 DLV_INSTANCE=$7 NON_DLV_INSTANCE=$8 load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests echo "sleeping" echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")" sleep 600 echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")" run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE else echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both." print_usage fi
Save the script to a file named
dlv_bm.sh
and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE> &
- SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
- NUM_TRANSACTIONS = Number of transactions within the interval
- NUM_CLIENTS = Number of concurrent database sessions
- NUM_THREADS = Number of worker threads
- AWS_ACCOUNT_ALIAS = AWS account alias
- AWS_REGION = The region where your resources are located.
- DLV_INSTANCE = The name of your RDS instance with DLV
- NON_DLV_INSTANCE = The name of your RDS instance without DLV
For example:
nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &
Upon completion of the script, the following three log files are generated for further analysis:
The two key log files for the comparison are the ones with the
dlv-blog-run-dlv-*.log
anddlv-blog-run-withoutdlv-*.log
patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
By running adiff
command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
For example:diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "
Following is the output from the benchmark logs we did:
Monitoring DLV
This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.
Monitor DLV using Amazon CloudWatch
Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.
Monitor DLV using PostgreSQL’s pg_stat_wal
In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.
Evaluating performance
To assess the impact of DLV on database performance, we reviewed following metrics:
- Runtime
- Transactions per second (TPS) and latency average recorded by
pgbench
- PostgreSQL’s cumulative statistic,
wal_sync_time
andwal_write_time
, captured frompg_stat_wal
- Graphs from Amazon RDS Performance Insights
Runtime
The runtime represents the duration required for processing all 32 million transactions using
pgbench
, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.
Metric Without DLV With DLV % Improvement Time taken by the benchmark (minutes) 77 40 92.5% TPS and latency
The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.
Metric Without DLV With DLV % Improvement TPS 6984 13397 91.83% Latency average (milliseconds) 8.8 4.5 95.56% WAL statistics
We closely examined key WAL statistics from
pg_stat_wal
likewal_write_time
(total time spent writing WAL buffers to disk) andwal_sync_time
(total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.
Metric Without DLV With DLV % Improvement wal_write_time
(milliseconds)56656.796 51882.614 9.2 wal_sync_time
(milliseconds)4431621.33 2186910.078 102.6 RDS Performance Insights
The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
The first graph shows database load and top waits with DLV.The following graph show database load and top waits without DLV.
The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock
). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
The wait details are as follows:
- LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
- IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.
Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.
Clean up
If you are testing this solution, complete the following steps to remove resources and avoid charges
- On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
- On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.
Conclusion
In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.
About the Authors
Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
" temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] run "
echo "To load data and run benchmarks ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content usingtag for a sub-heading. You can use only
,
,
,
- , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements.
In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS.
One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance.
In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases,pgbench
repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled.
Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!Solution overview
The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled.
The following diagram illustrates the high-level architecturePrerequisites
To get started, you must complete the following prerequisites:
- Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
- Install
pgbench
on the EC2 instance. Since the PostgreSQL server installation package includes thepgbench
utility by default, you can install PostgreSQL from the amazon extras library.sudo amazon-linux-extras enable postgresql15 sudo yum install postgresql-server -y
- You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the
psql
client by default.
- Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.
To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter
--dedicated-log-volume
enables the DLV feature, and--no-dedicated-log-volume
disables it. Refer to Enabling DLV when you create a DB instance for more information.
The following code creates an RDS for PostgreSQL instance with DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2 --dedicated-log-volume
The
--no-dedicated-log-volume
flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:aws rds create-db-instance --db-instance-identifier dlv --db-instance-class db.r5.16xlarge --engine postgres --engine-version 15.8 --master-username postgres --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') --allocated-storage 1024 --enable-performance-insights --performance-insights-retention-period 31 --monitoring-interval 1 --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role --region <Your_RDS_region> --iops 10000 --storage-type io2
We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.
Performance benchmarking
This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.
Infrastructure
The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.
Instance class Operating System vCPU Memory (GiB) r5.4xlarge Amazon Linux 2023 16 128 The following table summarizes the infrastructure of each RDS instance.
Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off Database configuration parameters
Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the
track_wal_io_timing
parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:SELECT name, setting FROM pg_settings WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
Output: name | setting ------------------------+---------- checkpoint_timeout | 300 commit_delay | 0 shared_buffers | 16324847 synchronous_commit | on track_wal_io_timing | on wal_buffers | 8192 wal_compression | zstd wal_segment_size | 67108864 wal_writer_delay | 200 wal_writer_flush_after | 128
Benchmarking script
To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the
pg_stat_reset_shared
function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used thepgbench
tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s
) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
The following is the script:#!/bin/bash # Function to print usage information print_usage() { echo "Usage:" echo "To load data ==> $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To run benchmarks ==> $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" echo "To load data and run benchmarks ==> $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>" exit 1 } # Function to check if an RDS instance exists check_instance_exists() { INSTANCE=$1 AWS_REGION=$2 aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1 if [ $? -ne 0 ]; then echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'." exit 1 fi } # Function to retrieve the PostgreSQL password from AWS Secrets Manager get_pg_password() { SECRET_ID="rds/db_master_password" AWS_REGION=$1 aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password' } # Check the command and set the required number of arguments accordingly if [ "$#" -lt 2 ]; then print_usage fi COMMAND=$1 shift # Create logs directory mkdir -p $HOME/DLV/logs # Redirect stdout and stderr to a log file exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1 # Log the start time and parameters echo "Command: $COMMAND" load_data() { SCALE=$1 AWS_ACCOUNT_ALIAS=$2 AWS_REGION=$3 DLV_INSTANCE=$4 NON_DLV_INSTANCE=$5 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Load data into dlv instance nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Load data into withoutdlv instance nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for data loading to complete wait echo "Data loaded" } run_benchmarks() { NUM_TRANSACTIONS=$1 NUM_CLIENTS=$2 NUM_THREADS=$3 AWS_ACCOUNT_ALIAS=$4 AWS_REGION=$5 DLV_INSTANCE=$6 NON_DLV_INSTANCE=$7 PGPASSWORD=$(get_pg_password $AWS_REGION) export PGPASSWORD # Check if instances exist check_instance_exists $DLV_INSTANCE $AWS_REGION check_instance_exists $NON_DLV_INSTANCE $AWS_REGION echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE" # Clear WAL statistics echo "Clearing WAL statistics" echo "Clearing WAL statistics for dlv" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" echo "Clearing WAL statistics for withoutdlv" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');" # Start benchmarks echo "Starting benchmarks" nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 & # Wait for benchmarks to complete wait # Retrieve WAL statistics echo "Getting WAL statistics" echo "WAL statistics for dlv instance" psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "WAL statistics for withoutdlv instance" psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF x SELECT * FROM pg_stat_wal; x EOF echo "All done" } if [ "$COMMAND" == "load" ]; then # Check if the required arguments for load are provided if [ "$#" -ne 5 ]; then print_usage fi load_data $@ elif [ "$COMMAND" == "run" ]; then # Check if the required arguments for run are provided if [ "$#" -ne 7 ]; then print_usage fi run_benchmarks $@ elif [ "$COMMAND" == "load_run" ]; then # Check if the required arguments for load_run are provided if [ "$#" -ne 8 ]; then print_usage fi SCALE=$1 NUM_TRANSACTIONS=$2 NUM_CLIENTS=$3 NUM_THREADS=$4 AWS_ACCOUNT_ALIAS=$5 AWS_REGION=$6 DLV_INSTANCE=$7 NON_DLV_INSTANCE=$8 load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests echo "sleeping" echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")" sleep 600 echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")" run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE else echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both." print_usage fi
Save the script to a file named
dlv_bm.sh
and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE> &
- SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
- NUM_TRANSACTIONS = Number of transactions within the interval
- NUM_CLIENTS = Number of concurrent database sessions
- NUM_THREADS = Number of worker threads
- AWS_ACCOUNT_ALIAS = AWS account alias
- AWS_REGION = The region where your resources are located.
- DLV_INSTANCE = The name of your RDS instance with DLV
- NON_DLV_INSTANCE = The name of your RDS instance without DLV
For example:
nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &
Upon completion of the script, the following three log files are generated for further analysis:
The two key log files for the comparison are the ones with the
dlv-blog-run-dlv-*.log
anddlv-blog-run-withoutdlv-*.log
patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
By running adiff
command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
For example:diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "
Following is the output from the benchmark logs we did:
Monitoring DLV
This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.
Monitor DLV using Amazon CloudWatch
Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.
Monitor DLV using PostgreSQL’s pg_stat_wal
In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.
Evaluating performance
To assess the impact of DLV on database performance, we reviewed following metrics:
- Runtime
- Transactions per second (TPS) and latency average recorded by
pgbench
- PostgreSQL’s cumulative statistic,
wal_sync_time
andwal_write_time
, captured frompg_stat_wal
- Graphs from Amazon RDS Performance Insights
Runtime
The runtime represents the duration required for processing all 32 million transactions using
pgbench
, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.
Metric Without DLV With DLV % Improvement Time taken by the benchmark (minutes) 77 40 92.5% TPS and latency
The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.
Metric Without DLV With DLV % Improvement TPS 6984 13397 91.83% Latency average (milliseconds) 8.8 4.5 95.56% WAL statistics
We closely examined key WAL statistics from
pg_stat_wal
likewal_write_time
(total time spent writing WAL buffers to disk) andwal_sync_time
(total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.
Metric Without DLV With DLV % Improvement wal_write_time
(milliseconds)56656.796 51882.614 9.2 wal_sync_time
(milliseconds)4431621.33 2186910.078 102.6 RDS Performance Insights
The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
The first graph shows database load and top waits with DLV.The following graph show database load and top waits without DLV.
The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock
). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
The wait details are as follows:
- LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
- IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.
Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.
Clean up
If you are testing this solution, complete the following steps to remove resources and avoid charges
- On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
- On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.
Conclusion
In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.
About the Authors
Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
" temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] load_run "
exit 1
}
# Function to check if an RDS instance exists
check_instance_exists() {
INSTANCE=
AWS_REGION=
aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1
if [ $? -ne 0 ]; then
echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'."
exit 1
fi
}
# Function to retrieve the PostgreSQL password from AWS Secrets Manager
get_pg_password() {
SECRET_ID="rds/db_master_password"
AWS_REGION=
aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password'
}
# Check the command and set the required number of arguments accordingly
if [ "$#" -lt 2 ]; then
print_usage
fi
COMMAND=
shift
# Create logs directory
mkdir -p $HOME/DLV/logs
# Redirect stdout and stderr to a log file
exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1
# Log the start time and parameters
echo "Command: $COMMAND"
load_data() {
SCALE=
AWS_ACCOUNT_ALIAS=
AWS_REGION=
DLV_INSTANCE=
NON_DLV_INSTANCE=
PGPASSWORD=$(get_pg_password $AWS_REGION)
export PGPASSWORD
# Check if instances exist
check_instance_exists $DLV_INSTANCE $AWS_REGION
check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
# Load data into dlv instance
nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
# Load data into withoutdlv instance
nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
# Wait for data loading to complete
wait
echo "Data loaded"
}
run_benchmarks() {
NUM_TRANSACTIONS=
NUM_CLIENTS=
NUM_THREADS=
AWS_ACCOUNT_ALIAS=
AWS_REGION=
DLV_INSTANCE=
NON_DLV_INSTANCE=
PGPASSWORD=$(get_pg_password $AWS_REGION)
export PGPASSWORD
# Check if instances exist
check_instance_exists $DLV_INSTANCE $AWS_REGION
check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
# Clear WAL statistics
echo "Clearing WAL statistics"
echo "Clearing WAL statistics for dlv"
psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
echo "Clearing WAL statistics for withoutdlv"
psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
# Start benchmarks
echo "Starting benchmarks"
nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
# Wait for benchmarks to complete
wait
# Retrieve WAL statistics
echo "Getting WAL statistics"
echo "WAL statistics for dlv instance"
psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
x
SELECT * FROM pg_stat_wal;
x
EOF
echo "WAL statistics for withoutdlv instance"
psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
x
SELECT * FROM pg_stat_wal;
x
EOF
echo "All done"
}
if [ "$COMMAND" == "load" ]; then
# Check if the required arguments for load are provided
if [ "$#" -ne 5 ]; then
print_usage
fi
load_data $@
elif [ "$COMMAND" == "run" ]; then
# Check if the required arguments for run are provided
if [ "$#" -ne 7 ]; then
print_usage
fi
run_benchmarks $@
elif [ "$COMMAND" == "load_run" ]; then
# Check if the required arguments for load_run are provided
if [ "$#" -ne 8 ]; then
print_usage
fi
SCALE=
NUM_TRANSACTIONS=
NUM_CLIENTS=
NUM_THREADS=
AWS_ACCOUNT_ALIAS=
AWS_REGION=
DLV_INSTANCE=
NON_DLV_INSTANCE=
load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
# Pause for accurate visual inspection of Performance Insights
echo "sleeping"
echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")"
sleep 600
echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")"
run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
else
echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both."
print_usage
fiSave the script as
dlv_bm.sh
and make it executable. To execute the script, use the following command, adjusting the benchmarking parameters as necessary:nohup ./dlv_bm.sh load_run &
- SCALE = scale factor, i.e., the number of rows generated multiplied by the scale factor
- NUM_TRANSACTIONS = number of transactions within the interval
- NUM_CLIENTS = number of concurrent database sessions
- NUM_THREADS = number of worker threads
- AWS_ACCOUNT_ALIAS = AWS account alias
- AWS_REGION = the region where your resources are located
- DLV_INSTANCE = the name of your RDS instance with DLV
- NON_DLV_INSTANCE = the name of your RDS instance without DLV
For example:
nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &
Upon completion of the script, three log files will be generated for further analysis. The two primary log files for comparison are those with the dlv-blog-run-dlv-*.log
and dlv-blog-run-withoutdlv-*.log
patterns, representing the runs with Dedicated Log Volume enabled and disabled, respectively. A diff
command can be executed on these two log files to directly compare the transaction performance metrics reported by pgbench
.
Monitoring DLV
This section discusses various methods for monitoring an RDS for PostgreSQL instance with DLV enabled.
Monitor DLV using Amazon CloudWatch
Monitoring the DLV that stores transaction logs (WALs) is as crucial as observing other database volumes. The speed of WAL writing directly influences transaction performance, making it essential to monitor IOPS, latency, and throughput of the DLV. These metrics can be tracked using Amazon CloudWatch. The post Enhance database performance with Amazon RDS dedicated log volumes outlines relevant CloudWatch metrics for monitoring DLV usage.
Monitor DLV using PostgreSQL’s pg_stat_wal
In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can assist in monitoring WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics regarding WAL activity.
Evaluating Performance
To evaluate the impact of DLV on database performance, we reviewed the following metrics:
- Runtime
- Transactions per second (TPS) and average latency recorded by
pgbench
- PostgreSQL’s cumulative statistics,
wal_sync_time
andwal_write_time
, captured frompg_stat_wal
- Graphs from Amazon RDS Performance Insights
Runtime
The runtime indicates the duration required to process all 32 million transactions using pgbench
, with each client handling half a million transactions. The runtime saw a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This translates to a 92.5% improvement in processing time, underscoring the substantial benefits provided by the DLV-enabled instance.
Metric | Without DLV | With DLV | % Improvement |
Time taken by the benchmark (minutes) | 77 | 40 | 92.5% |
TPS and Latency
The following table illustrates the performance improvements observed in the instance with DLV enabled. TPS increased by 91.83%, nearly doubling the system’s capacity to process transactions. Concurrently, latency, a measure of response time, was reduced by 95.56%, resulting in a more responsive and efficient system. Overall, the implementation of DLV has led to significant enhancements in system performance, enabling it to handle a higher volume of tasks effectively.
Metric | Without DLV | With DLV | % Improvement |
TPS | 6984 | 13397 | 91.83% |
Latency average (milliseconds) | 8.8 | 4.5 | 95.56% |
WAL Statistics
Key WAL statistics from pg_stat_wal
, such as wal_write_time
(total time spent writing WAL buffers to disk) and wal_sync_time
(total time spent syncing WAL files), were closely examined. The DLV feature demonstrated improvements in both metrics: 102.6% for sync time and 9.2% for write time during our benchmark. This indicates faster transaction commits and a noticeable enhancement in overall database responsiveness, making DLV an excellent choice for optimizing PostgreSQL in high-demand scenarios.
Metric | Without DLV | With DLV | % Improvement |
wal_write_time (milliseconds) |
56656.796 | 51882.614 | 9.2 |
wal_sync_time (milliseconds) |
4431621.33 | 2186910.078 | 102.6 |
RDS Performance Insights
The graphs captured from RDS Performance Insights during the benchmarking period reveal significant differences in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in overall resource demand on the database. This improvement is attributed to DLV’s ability to offload the intensive task of WAL segment writing to a separate volume, thereby freeing up the main database storage for other operations.
The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock
), signaling I/O contention where the database struggles to meet data write demands. Conversely, the DLV-enabled instance experienced a significant reduction in these wait events, facilitating more efficient data retrieval.
- LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), underscoring DLV’s effectiveness in alleviating WAL-related bottlenecks.
- IO:WALSync – This wait event was slightly higher in the non-DLV instance at 0.83% of AAS compared to the DLV instance at 0.65% of AAS, further supporting DLV’s role in enhancing WAL write performance.
The benchmarking results validate DLV’s effectiveness in mitigating WAL-related wait events, demonstrating its success in offloading WAL writes and thereby enhancing overall database performance.
Clean Up
To avoid incurring charges after testing this solution, please follow these steps to remove resources:
- On the Amazon EC2 console, select the Linux EC2 instance and choose “Terminate instance” from the Instance state menu.
- On the Amazon RDS console, select the RDS for PostgreSQL instances and choose “Delete” from the Actions menu.