Benchmark Amazon RDS for PostgreSQL with Dedicated Log Volumes

In the current data-centric environment, organizations are increasingly challenged to maintain the performance of their mission-critical databases under demanding workloads. The Amazon Relational Database Service (Amazon RDS) provides a managed solution that is both cost-effective and highly scalable, catering to these needs effectively.

Within PostgreSQL, transactions are initially recorded in a Write-Ahead Log (WAL) buffer before being stored in a dedicated WAL segment. This logging mechanism is essential for data recovery and replication. The performance of these transactions is closely tied to the storage system housing the WAL segments. By default, PostgreSQL stores all files, including WAL, in a single location. However, the PostgreSQL documentation suggests that separating the WAL storage can significantly enhance performance. In line with this, Amazon has introduced the Dedicated Log Volume (DLV) feature, which allows users to store WAL segments on a distinct volume, fixed at 1,024 GiB with 3,000 Provisioned IOPS.

One of the standout features of DLV is its capability for write coalescing, which effectively reduces the overall IOPS demand for the WAL by consolidating log writes into larger, more efficient IO operations. In conventional storage configurations, log and data writes tend to be intermingled in smaller, frequent operations. In contrast, DLV maintains a separation that permits logs to be written in larger batches, thereby enhancing throughput and IO efficiency. This optimization can lead to reduced costs by lessening the reliance on high-performance storage while still maintaining robust performance levels.

Solution Overview

The benchmarking solution involves executing a script on an Amazon Elastic Compute Cloud (Amazon EC2) instance. This script evaluates performance across two distinct RDS for PostgreSQL instances: one with DLV enabled and the other without.

Prerequisites

To initiate the benchmarking process, the following prerequisites must be met:

  1. Launch an Amazon EC2 Linux instance. It is advisable to deploy the instance within the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
  2. Install pgbench on the EC2 instance. The PostgreSQL server installation package includes the pgbench utility by default, which can be installed from the Amazon extras library:
    sudo amazon-linux-extras enable postgresql15
    sudo yum install postgresql-server -y
  3. Ensure you have a psql client that can connect to your Amazon RDS for PostgreSQL instance. The PostgreSQL server installation also includes the psql client by default.
  4. Launch two Amazon RDS for PostgreSQL instances—one with DLV enabled and the other without.

To create the instances, utilize the AWS Command Line Interface (AWS CLI) command, ensuring to input your specific AWS Region and account number. The parameter --dedicated-log-volume activates the DLV feature, while --no-dedicated-log-volume disables it. For additional details, refer to the documentation on Enabling DLV when creating a DB instance.

Performance Benchmarking

This section delves into the benchmarks conducted to compare the performance of an RDS for PostgreSQL instance with DLV enabled against one without DLV.

Infrastructure

The infrastructure of the EC2 instance used for the benchmarking script is summarized in the table below:

Instance class Operating System vCPU Memory (GiB)
r5.4xlarge Amazon Linux 2023 16 128

The following table outlines the infrastructure specifics for each RDS instance:

Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume
dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On
withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off

Database Configuration Parameters

The performance of Amazon RDS for PostgreSQL is influenced by various database configuration parameters, particularly those related to commit, checkpoint, and WAL configuration. To gather statistics for WAL, the track_wal_io_timing parameter must be enabled. The following SQL query retrieves key parameters and their settings:

SELECT name, setting
FROM pg_settings
WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');

Benchmarking Script

To ensure accurate performance comparisons, we benchmarked the DLV and non-DLV instances independently using a specific script. Prior to the benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared function to ensure that results reflect the specific test conditions rather than previous activity. Our performance evaluation employed the pgbench tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against a data scale (-s) of 10,000.

The benchmarking script is as follows:

#!/bin/bash
# Function to print usage information
print_usage() {
    echo "Usage:"
    echo "To load data ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content using 

tag for a sub-heading. You can use only

,

    ,
      ,
    • , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements. In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance. In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases, pgbench repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled. Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!

      Solution overview

      The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled. The following diagram illustrates the high-level architecture

      Prerequisites

      To get started, you must complete the following prerequisites:
      1. Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
      2. Install pgbench on the EC2 instance. Since the PostgreSQL server installation package includes the pgbench utility by default, you can install PostgreSQL from the amazon extras library.
        sudo amazon-linux-extras enable postgresql15
        sudo yum install postgresql-server -y
      1. You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the psql client by default.
      1. Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.

      To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter --dedicated-log-volume enables the DLV feature, and --no-dedicated-log-volume disables it. Refer to Enabling DLV when you create a DB instance for more information.
      The following code creates an RDS for PostgreSQL instance with DLV:

      aws rds create-db-instance 
          --db-instance-identifier dlv 
          --db-instance-class db.r5.16xlarge 
          --engine postgres 
          --engine-version 15.8 
          --master-username postgres 
          --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
          --allocated-storage 1024 
          --enable-performance-insights 
          --performance-insights-retention-period 31 
          --monitoring-interval 1 
          --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
          --region <Your_RDS_region>   
          --iops 10000      
          --storage-type io2 
          --dedicated-log-volume

      The --no-dedicated-log-volume flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:

      aws rds create-db-instance 
          --db-instance-identifier dlv 
          --db-instance-class db.r5.16xlarge 
          --engine postgres 
          --engine-version 15.8 
          --master-username postgres 
          --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
          --allocated-storage 1024 
          --enable-performance-insights 
          --performance-insights-retention-period 31 
          --monitoring-interval 1 
          --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
          --region <Your_RDS_region>   
          --iops 10000      
          --storage-type io2

      We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.

      Performance benchmarking

      This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.

      Infrastructure

      The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.

      Instance class Operating System vCPU Memory (GiB)
      r5.4xlarge Amazon Linux 2023 16 128

      The following table summarizes the infrastructure of each RDS instance.

      Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume
      dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On
      withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off

      Database configuration parameters

      Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the track_wal_io_timing parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:

      SELECT name, setting
      FROM pg_settings
      WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
      Output:
                name          | setting
      ------------------------+----------
       checkpoint_timeout     | 300
       commit_delay           | 0
       shared_buffers         | 16324847
       synchronous_commit     | on
       track_wal_io_timing    | on
       wal_buffers            | 8192
       wal_compression        | zstd
       wal_segment_size       | 67108864
       wal_writer_delay       | 200
       wal_writer_flush_after | 128

      Benchmarking script

      To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used the pgbench tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
      The following is the script:

      #!/bin/bash
      # Function to print usage information
      print_usage() {
              echo "Usage:"
              echo "To load data ==>  $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
              echo "To run benchmarks ==>  $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
              echo "To load data and run benchmarks ==>  $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
              exit 1
      }
      # Function to check if an RDS instance exists
      check_instance_exists() {
              INSTANCE=$1
              AWS_REGION=$2
              aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1
              if [ $? -ne 0 ]; then
                      echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'."
                      exit 1
              fi
      }
      # Function to retrieve the PostgreSQL password from AWS Secrets Manager
      get_pg_password() {
              SECRET_ID="rds/db_master_password"
              AWS_REGION=$1
              aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password'
      }
      # Check the command and set the required number of arguments accordingly
      if [ "$#" -lt 2 ]; then
              print_usage
      fi
      COMMAND=$1
      shift
      # Create logs directory
      mkdir -p $HOME/DLV/logs
      # Redirect stdout and stderr to a log file
      exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1
      # Log the start time and parameters
      echo "Command: $COMMAND"
      load_data() {
              SCALE=$1
              AWS_ACCOUNT_ALIAS=$2
              AWS_REGION=$3
              DLV_INSTANCE=$4
              NON_DLV_INSTANCE=$5
              PGPASSWORD=$(get_pg_password $AWS_REGION)
              export PGPASSWORD
              # Check if instances exist
              check_instance_exists $DLV_INSTANCE $AWS_REGION
              check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
              echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
              # Load data into dlv instance
              nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Load data into withoutdlv instance
              nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Wait for data loading to complete
              wait
              echo "Data loaded"
      }
      run_benchmarks() {
              NUM_TRANSACTIONS=$1
              NUM_CLIENTS=$2
              NUM_THREADS=$3
              AWS_ACCOUNT_ALIAS=$4
              AWS_REGION=$5
              DLV_INSTANCE=$6
              NON_DLV_INSTANCE=$7
              PGPASSWORD=$(get_pg_password $AWS_REGION)
              export PGPASSWORD
              # Check if instances exist
              check_instance_exists $DLV_INSTANCE $AWS_REGION
              check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
              echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
              # Clear WAL statistics
              echo "Clearing WAL statistics"
              echo "Clearing WAL statistics for dlv"
              psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
              echo "Clearing WAL statistics for withoutdlv"
              psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
              # Start benchmarks
              echo "Starting benchmarks"
              nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Wait for benchmarks to complete
              wait
              # Retrieve WAL statistics
              echo "Getting WAL statistics"
              echo "WAL statistics for dlv instance"
              psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
      x
      SELECT * FROM pg_stat_wal;
      x
      EOF
              echo "WAL statistics for withoutdlv instance"
              psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
      x
      SELECT * FROM pg_stat_wal;
      x
      EOF
              echo "All done"
      }
      if [ "$COMMAND" == "load" ]; then
              # Check if the required arguments for load are provided
              if [ "$#" -ne 5 ]; then
                      print_usage
              fi
              load_data $@
      elif [ "$COMMAND" == "run" ]; then
              # Check if the required arguments for run are provided
              if [ "$#" -ne 7 ]; then
                      print_usage
              fi
              run_benchmarks $@
      elif [ "$COMMAND" == "load_run" ]; then
              # Check if the required arguments for load_run are provided
              if [ "$#" -ne 8 ]; then
                      print_usage
              fi
              SCALE=$1
              NUM_TRANSACTIONS=$2
              NUM_CLIENTS=$3
              NUM_THREADS=$4
              AWS_ACCOUNT_ALIAS=$5
              AWS_REGION=$6
              DLV_INSTANCE=$7
              NON_DLV_INSTANCE=$8
              load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
              # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests
              echo "sleeping"
              echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")"
              sleep 600
              echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")"
              run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
      else
              echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both."
              print_usage
      fi

      Save the script to a file named dlv_bm.sh and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:

      nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION>  <DLV_INSTANCE> <NON_DLV_INSTANCE> &
      • SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
      • NUM_TRANSACTIONS = Number of transactions within the interval
      • NUM_CLIENTS = Number of concurrent database sessions
      • NUM_THREADS = Number of worker threads
      • AWS_ACCOUNT_ALIAS = AWS account alias
      • AWS_REGION = The region where your resources are located.
      • DLV_INSTANCE = The name of your RDS instance with DLV
      • NON_DLV_INSTANCE = The name of your RDS instance without DLV

      For example:

      nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &amp;

      Upon completion of the script, the following three log files are generated for further analysis:

      The two key log files for the comparison are the ones with the dlv-blog-run-dlv-*.log and dlv-blog-run-withoutdlv-*.log patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
      By running a diff command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
      For example:

      diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "

      Following is the output from the benchmark logs we did:

      Monitoring DLV

      This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.

      Monitor DLV using Amazon CloudWatch

      Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.

      Monitor DLV using PostgreSQL’s pg_stat_wal

      In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.

      Evaluating performance

      To assess the impact of DLV on database performance, we reviewed following metrics:

      • Runtime
      • Transactions per second (TPS) and latency average recorded by pgbench
      • PostgreSQL’s cumulative statistic, wal_sync_time and wal_write_time, captured from pg_stat_wal
      • Graphs from Amazon RDS Performance Insights

      Runtime

      The runtime represents the duration required for processing all 32 million transactions using pgbench, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.

      Metric Without DLV With DLV % Improvement
      Time taken by the benchmark (minutes) 77 40 92.5%

      TPS and latency

      The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.

      Metric Without DLV With DLV % Improvement
      TPS 6984 13397 91.83%
      Latency average (milliseconds) 8.8 4.5 95.56%

      WAL statistics

      We closely examined key WAL statistics from pg_stat_wal like wal_write_time (total time spent writing WAL buffers to disk) and wal_sync_time (total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.

      Metric Without DLV With DLV % Improvement
      wal_write_time (milliseconds) 56656.796 51882.614 9.2
      wal_sync_time (milliseconds) 4431621.33 2186910.078 102.6

      RDS Performance Insights

      The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
      The first graph shows database load and top waits with DLV.

      The following graph show database load and top waits without DLV.

      The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
      The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
      The wait details are as follows:

      • LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
      • IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.

      Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.

      Clean up

      If you are testing this solution, complete the following steps to remove resources and avoid charges

      1. On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
      2. On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.

      Conclusion

      In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
      Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
      DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
      Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.


      About the Authors

      Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
      Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
      Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
      " temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] load "
      echo "To run benchmarks ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content using

      tag for a sub-heading. You can use only

      ,

        ,

          ,

        • , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements.
          In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS.
          One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance.
          In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases, pgbench repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled.
          Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!

          Solution overview

          The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled.
          The following diagram illustrates the high-level architecture

          Prerequisites

          To get started, you must complete the following prerequisites:

          1. Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
          2. Install pgbench on the EC2 instance. Since the PostgreSQL server installation package includes the pgbench utility by default, you can install PostgreSQL from the amazon extras library.
            sudo amazon-linux-extras enable postgresql15
            sudo yum install postgresql-server -y
          1. You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the psql client by default.
          1. Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.

          To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter --dedicated-log-volume enables the DLV feature, and --no-dedicated-log-volume disables it. Refer to Enabling DLV when you create a DB instance for more information.
          The following code creates an RDS for PostgreSQL instance with DLV:

          aws rds create-db-instance 
              --db-instance-identifier dlv 
              --db-instance-class db.r5.16xlarge 
              --engine postgres 
              --engine-version 15.8 
              --master-username postgres 
              --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
              --allocated-storage 1024 
              --enable-performance-insights 
              --performance-insights-retention-period 31 
              --monitoring-interval 1 
              --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
              --region <Your_RDS_region>   
              --iops 10000      
              --storage-type io2 
              --dedicated-log-volume

          The --no-dedicated-log-volume flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:

          aws rds create-db-instance 
              --db-instance-identifier dlv 
              --db-instance-class db.r5.16xlarge 
              --engine postgres 
              --engine-version 15.8 
              --master-username postgres 
              --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
              --allocated-storage 1024 
              --enable-performance-insights 
              --performance-insights-retention-period 31 
              --monitoring-interval 1 
              --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
              --region <Your_RDS_region>   
              --iops 10000      
              --storage-type io2

          We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.

          Performance benchmarking

          This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.

          Infrastructure

          The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.

          Instance class Operating System vCPU Memory (GiB)
          r5.4xlarge Amazon Linux 2023 16 128

          The following table summarizes the infrastructure of each RDS instance.

          Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume
          dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On
          withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off

          Database configuration parameters

          Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the track_wal_io_timing parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:

          SELECT name, setting
          FROM pg_settings
          WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
          Output:
                    name          | setting
          ------------------------+----------
           checkpoint_timeout     | 300
           commit_delay           | 0
           shared_buffers         | 16324847
           synchronous_commit     | on
           track_wal_io_timing    | on
           wal_buffers            | 8192
           wal_compression        | zstd
           wal_segment_size       | 67108864
           wal_writer_delay       | 200
           wal_writer_flush_after | 128

          Benchmarking script

          To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used the pgbench tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
          The following is the script:

          #!/bin/bash
          # Function to print usage information
          print_usage() {
                  echo "Usage:"
                  echo "To load data ==>  $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                  echo "To run benchmarks ==>  $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                  echo "To load data and run benchmarks ==>  $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                  exit 1
          }
          # Function to check if an RDS instance exists
          check_instance_exists() {
                  INSTANCE=$1
                  AWS_REGION=$2
                  aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1
                  if [ $? -ne 0 ]; then
                          echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'."
                          exit 1
                  fi
          }
          # Function to retrieve the PostgreSQL password from AWS Secrets Manager
          get_pg_password() {
                  SECRET_ID="rds/db_master_password"
                  AWS_REGION=$1
                  aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password'
          }
          # Check the command and set the required number of arguments accordingly
          if [ "$#" -lt 2 ]; then
                  print_usage
          fi
          COMMAND=$1
          shift
          # Create logs directory
          mkdir -p $HOME/DLV/logs
          # Redirect stdout and stderr to a log file
          exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1
          # Log the start time and parameters
          echo "Command: $COMMAND"
          load_data() {
                  SCALE=$1
                  AWS_ACCOUNT_ALIAS=$2
                  AWS_REGION=$3
                  DLV_INSTANCE=$4
                  NON_DLV_INSTANCE=$5
                  PGPASSWORD=$(get_pg_password $AWS_REGION)
                  export PGPASSWORD
                  # Check if instances exist
                  check_instance_exists $DLV_INSTANCE $AWS_REGION
                  check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
                  echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
                  # Load data into dlv instance
                  nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                  # Load data into withoutdlv instance
                  nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                  # Wait for data loading to complete
                  wait
                  echo "Data loaded"
          }
          run_benchmarks() {
                  NUM_TRANSACTIONS=$1
                  NUM_CLIENTS=$2
                  NUM_THREADS=$3
                  AWS_ACCOUNT_ALIAS=$4
                  AWS_REGION=$5
                  DLV_INSTANCE=$6
                  NON_DLV_INSTANCE=$7
                  PGPASSWORD=$(get_pg_password $AWS_REGION)
                  export PGPASSWORD
                  # Check if instances exist
                  check_instance_exists $DLV_INSTANCE $AWS_REGION
                  check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
                  echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
                  # Clear WAL statistics
                  echo "Clearing WAL statistics"
                  echo "Clearing WAL statistics for dlv"
                  psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
                  echo "Clearing WAL statistics for withoutdlv"
                  psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
                  # Start benchmarks
                  echo "Starting benchmarks"
                  nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                  nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                  # Wait for benchmarks to complete
                  wait
                  # Retrieve WAL statistics
                  echo "Getting WAL statistics"
                  echo "WAL statistics for dlv instance"
                  psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
          x
          SELECT * FROM pg_stat_wal;
          x
          EOF
                  echo "WAL statistics for withoutdlv instance"
                  psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
          x
          SELECT * FROM pg_stat_wal;
          x
          EOF
                  echo "All done"
          }
          if [ "$COMMAND" == "load" ]; then
                  # Check if the required arguments for load are provided
                  if [ "$#" -ne 5 ]; then
                          print_usage
                  fi
                  load_data $@
          elif [ "$COMMAND" == "run" ]; then
                  # Check if the required arguments for run are provided
                  if [ "$#" -ne 7 ]; then
                          print_usage
                  fi
                  run_benchmarks $@
          elif [ "$COMMAND" == "load_run" ]; then
                  # Check if the required arguments for load_run are provided
                  if [ "$#" -ne 8 ]; then
                          print_usage
                  fi
                  SCALE=$1
                  NUM_TRANSACTIONS=$2
                  NUM_CLIENTS=$3
                  NUM_THREADS=$4
                  AWS_ACCOUNT_ALIAS=$5
                  AWS_REGION=$6
                  DLV_INSTANCE=$7
                  NON_DLV_INSTANCE=$8
                  load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
                  # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests
                  echo "sleeping"
                  echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")"
                  sleep 600
                  echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")"
                  run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
          else
                  echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both."
                  print_usage
          fi

          Save the script to a file named dlv_bm.sh and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:

          nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION>  <DLV_INSTANCE> <NON_DLV_INSTANCE> &
          • SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
          • NUM_TRANSACTIONS = Number of transactions within the interval
          • NUM_CLIENTS = Number of concurrent database sessions
          • NUM_THREADS = Number of worker threads
          • AWS_ACCOUNT_ALIAS = AWS account alias
          • AWS_REGION = The region where your resources are located.
          • DLV_INSTANCE = The name of your RDS instance with DLV
          • NON_DLV_INSTANCE = The name of your RDS instance without DLV

          For example:

          nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &amp;

          Upon completion of the script, the following three log files are generated for further analysis:

          The two key log files for the comparison are the ones with the dlv-blog-run-dlv-*.log and dlv-blog-run-withoutdlv-*.log patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
          By running a diff command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
          For example:

          diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "

          Following is the output from the benchmark logs we did:

          Monitoring DLV

          This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.

          Monitor DLV using Amazon CloudWatch

          Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.

          Monitor DLV using PostgreSQL’s pg_stat_wal

          In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.

          Evaluating performance

          To assess the impact of DLV on database performance, we reviewed following metrics:

          • Runtime
          • Transactions per second (TPS) and latency average recorded by pgbench
          • PostgreSQL’s cumulative statistic, wal_sync_time and wal_write_time, captured from pg_stat_wal
          • Graphs from Amazon RDS Performance Insights

          Runtime

          The runtime represents the duration required for processing all 32 million transactions using pgbench, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.

          Metric Without DLV With DLV % Improvement
          Time taken by the benchmark (minutes) 77 40 92.5%

          TPS and latency

          The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.

          Metric Without DLV With DLV % Improvement
          TPS 6984 13397 91.83%
          Latency average (milliseconds) 8.8 4.5 95.56%

          WAL statistics

          We closely examined key WAL statistics from pg_stat_wal like wal_write_time (total time spent writing WAL buffers to disk) and wal_sync_time (total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.

          Metric Without DLV With DLV % Improvement
          wal_write_time (milliseconds) 56656.796 51882.614 9.2
          wal_sync_time (milliseconds) 4431621.33 2186910.078 102.6

          RDS Performance Insights

          The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
          The first graph shows database load and top waits with DLV.

          The following graph show database load and top waits without DLV.

          The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
          The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
          The wait details are as follows:

          • LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
          • IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.

          Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.

          Clean up

          If you are testing this solution, complete the following steps to remove resources and avoid charges

          1. On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
          2. On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.

          Conclusion

          In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
          Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
          DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
          Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.


          About the Authors

          Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
          Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
          Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
          " temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] run "
          echo "To load data and run benchmarks ==> [cyberseo_openai model="gpt-4o-mini" prompt="Rewrite a news story for a technical publication, in a calm style with creativity and flair based on text below, making sure it reads like human-written text in a natural way. The article shall NOT include a title, introduction and conclusion. The article shall NOT start from a title. Response language English. Generate HTML-formatted content using

          tag for a sub-heading. You can use only

          ,

            ,

              ,

            • , and HTML tags if necessary. Text: In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements.
              In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS.
              One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance.
              In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases, pgbench repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled.
              Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!

              Solution overview

              The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled.
              The following diagram illustrates the high-level architecture

              Prerequisites

              To get started, you must complete the following prerequisites:

              1. Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
              2. Install pgbench on the EC2 instance. Since the PostgreSQL server installation package includes the pgbench utility by default, you can install PostgreSQL from the amazon extras library.
                sudo amazon-linux-extras enable postgresql15
                sudo yum install postgresql-server -y
              1. You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the psql client by default.
              1. Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.

              To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter --dedicated-log-volume enables the DLV feature, and --no-dedicated-log-volume disables it. Refer to Enabling DLV when you create a DB instance for more information.
              The following code creates an RDS for PostgreSQL instance with DLV:

              aws rds create-db-instance 
                  --db-instance-identifier dlv 
                  --db-instance-class db.r5.16xlarge 
                  --engine postgres 
                  --engine-version 15.8 
                  --master-username postgres 
                  --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
                  --allocated-storage 1024 
                  --enable-performance-insights 
                  --performance-insights-retention-period 31 
                  --monitoring-interval 1 
                  --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
                  --region <Your_RDS_region>   
                  --iops 10000      
                  --storage-type io2 
                  --dedicated-log-volume

              The --no-dedicated-log-volume flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:

              aws rds create-db-instance 
                  --db-instance-identifier dlv 
                  --db-instance-class db.r5.16xlarge 
                  --engine postgres 
                  --engine-version 15.8 
                  --master-username postgres 
                  --master-user-password $(aws secretsmanager get-secret-value --secret-id rds/db_master_password --query SecretString --output text | jq -r '.master_user_password') 
                  --allocated-storage 1024 
                  --enable-performance-insights 
                  --performance-insights-retention-period 31 
                  --monitoring-interval 1 
                  --monitoring-role-arn arn:aws:iam::<your_aws_account_number>:role/rds-monitoring-role 
                  --region <Your_RDS_region>   
                  --iops 10000      
                  --storage-type io2

              We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.

              Performance benchmarking

              This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.

              Infrastructure

              The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.

              Instance class Operating System vCPU Memory (GiB)
              r5.4xlarge Amazon Linux 2023 16 128

              The following table summarizes the infrastructure of each RDS instance.

              Instance name Instance class vCPU Memory (GiB) Storage Dedicated Log Volume
              dlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) On
              withoutdlv db.r5.16xlarge 64 512 IO2 (10,000 PIOPS) Off

              Database configuration parameters

              Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the track_wal_io_timing parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:

              SELECT name, setting
              FROM pg_settings
              WHERE name IN ('synchronous_commit', 'commit_delay', 'checkpoint_timeout', 'wal_writer_delay', 'wal_buffers', 'shared_buffers', 'wal_compression', 'wal_writer_flush_after', 'wal_segment_size','track_wal_io_timing');
              Output:
                        name          | setting
              ------------------------+----------
               checkpoint_timeout     | 300
               commit_delay           | 0
               shared_buffers         | 16324847
               synchronous_commit     | on
               track_wal_io_timing    | on
               wal_buffers            | 8192
               wal_compression        | zstd
               wal_segment_size       | 67108864
               wal_writer_delay       | 200
               wal_writer_flush_after | 128

              Benchmarking script

              To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used the pgbench tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
              The following is the script:

              #!/bin/bash
              # Function to print usage information
              print_usage() {
                      echo "Usage:"
                      echo "To load data ==>  $0 load <SCALE> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                      echo "To run benchmarks ==>  $0 run <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                      echo "To load data and run benchmarks ==>  $0 load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION> <DLV_INSTANCE> <NON_DLV_INSTANCE>"
                      exit 1
              }
              # Function to check if an RDS instance exists
              check_instance_exists() {
                      INSTANCE=$1
                      AWS_REGION=$2
                      aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1
                      if [ $? -ne 0 ]; then
                              echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'."
                              exit 1
                      fi
              }
              # Function to retrieve the PostgreSQL password from AWS Secrets Manager
              get_pg_password() {
                      SECRET_ID="rds/db_master_password"
                      AWS_REGION=$1
                      aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password'
              }
              # Check the command and set the required number of arguments accordingly
              if [ "$#" -lt 2 ]; then
                      print_usage
              fi
              COMMAND=$1
              shift
              # Create logs directory
              mkdir -p $HOME/DLV/logs
              # Redirect stdout and stderr to a log file
              exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1
              # Log the start time and parameters
              echo "Command: $COMMAND"
              load_data() {
                      SCALE=$1
                      AWS_ACCOUNT_ALIAS=$2
                      AWS_REGION=$3
                      DLV_INSTANCE=$4
                      NON_DLV_INSTANCE=$5
                      PGPASSWORD=$(get_pg_password $AWS_REGION)
                      export PGPASSWORD
                      # Check if instances exist
                      check_instance_exists $DLV_INSTANCE $AWS_REGION
                      check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
                      echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
                      # Load data into dlv instance
                      nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                      # Load data into withoutdlv instance
                      nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                      # Wait for data loading to complete
                      wait
                      echo "Data loaded"
              }
              run_benchmarks() {
                      NUM_TRANSACTIONS=$1
                      NUM_CLIENTS=$2
                      NUM_THREADS=$3
                      AWS_ACCOUNT_ALIAS=$4
                      AWS_REGION=$5
                      DLV_INSTANCE=$6
                      NON_DLV_INSTANCE=$7
                      PGPASSWORD=$(get_pg_password $AWS_REGION)
                      export PGPASSWORD
                      # Check if instances exist
                      check_instance_exists $DLV_INSTANCE $AWS_REGION
                      check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
                      echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
                      # Clear WAL statistics
                      echo "Clearing WAL statistics"
                      echo "Clearing WAL statistics for dlv"
                      psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
                      echo "Clearing WAL statistics for withoutdlv"
                      psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
                      # Start benchmarks
                      echo "Starting benchmarks"
                      nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                      nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
                      # Wait for benchmarks to complete
                      wait
                      # Retrieve WAL statistics
                      echo "Getting WAL statistics"
                      echo "WAL statistics for dlv instance"
                      psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
              x
              SELECT * FROM pg_stat_wal;
              x
              EOF
                      echo "WAL statistics for withoutdlv instance"
                      psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
              x
              SELECT * FROM pg_stat_wal;
              x
              EOF
                      echo "All done"
              }
              if [ "$COMMAND" == "load" ]; then
                      # Check if the required arguments for load are provided
                      if [ "$#" -ne 5 ]; then
                              print_usage
                      fi
                      load_data $@
              elif [ "$COMMAND" == "run" ]; then
                      # Check if the required arguments for run are provided
                      if [ "$#" -ne 7 ]; then
                              print_usage
                      fi
                      run_benchmarks $@
              elif [ "$COMMAND" == "load_run" ]; then
                      # Check if the required arguments for load_run are provided
                      if [ "$#" -ne 8 ]; then
                              print_usage
                      fi
                      SCALE=$1
                      NUM_TRANSACTIONS=$2
                      NUM_CLIENTS=$3
                      NUM_THREADS=$4
                      AWS_ACCOUNT_ALIAS=$5
                      AWS_REGION=$6
                      DLV_INSTANCE=$7
                      NON_DLV_INSTANCE=$8
                      load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
                      # Pause to allow for more accurate visual inspection of Performance Insights for benchmarking tests
                      echo "sleeping"
                      echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")"
                      sleep 600
                      echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")"
                      run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
              else
                      echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both."
                      print_usage
              fi

              Save the script to a file named dlv_bm.sh and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:

              nohup ./dlv_bm.sh load_run <SCALE> <NUM_TRANSACTIONS> <NUM_CLIENTS> <NUM_THREADS> <AWS_ACCOUNT_ALIAS> <AWS_REGION>  <DLV_INSTANCE> <NON_DLV_INSTANCE> &
              • SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
              • NUM_TRANSACTIONS = Number of transactions within the interval
              • NUM_CLIENTS = Number of concurrent database sessions
              • NUM_THREADS = Number of worker threads
              • AWS_ACCOUNT_ALIAS = AWS account alias
              • AWS_REGION = The region where your resources are located.
              • DLV_INSTANCE = The name of your RDS instance with DLV
              • NON_DLV_INSTANCE = The name of your RDS instance without DLV

              For example:

              nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &amp;

              Upon completion of the script, the following three log files are generated for further analysis:

              The two key log files for the comparison are the ones with the dlv-blog-run-dlv-*.log and dlv-blog-run-withoutdlv-*.log patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
              By running a diff command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
              For example:

              diff dlv-blog-run-withoutdlv-t-500000-64-32-28-10-2024-0249.log dlv-blog-run-dlv-t-500000-64-32-28-10-2024-0249.log -y|grep "tps = "

              Following is the output from the benchmark logs we did:

              Monitoring DLV

              This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.

              Monitor DLV using Amazon CloudWatch

              Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.

              Monitor DLV using PostgreSQL’s pg_stat_wal

              In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.

              Evaluating performance

              To assess the impact of DLV on database performance, we reviewed following metrics:

              • Runtime
              • Transactions per second (TPS) and latency average recorded by pgbench
              • PostgreSQL’s cumulative statistic, wal_sync_time and wal_write_time, captured from pg_stat_wal
              • Graphs from Amazon RDS Performance Insights

              Runtime

              The runtime represents the duration required for processing all 32 million transactions using pgbench, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.

              Metric Without DLV With DLV % Improvement
              Time taken by the benchmark (minutes) 77 40 92.5%

              TPS and latency

              The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.

              Metric Without DLV With DLV % Improvement
              TPS 6984 13397 91.83%
              Latency average (milliseconds) 8.8 4.5 95.56%

              WAL statistics

              We closely examined key WAL statistics from pg_stat_wal like wal_write_time (total time spent writing WAL buffers to disk) and wal_sync_time (total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.

              Metric Without DLV With DLV % Improvement
              wal_write_time (milliseconds) 56656.796 51882.614 9.2
              wal_sync_time (milliseconds) 4431621.33 2186910.078 102.6

              RDS Performance Insights

              The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
              The first graph shows database load and top waits with DLV.

              The following graph show database load and top waits without DLV.

              The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
              The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
              The wait details are as follows:

              • LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
              • IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.

              Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.

              Clean up

              If you are testing this solution, complete the following steps to remove resources and avoid charges

              1. On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
              2. On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.

              Conclusion

              In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
              Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
              DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
              Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.


              About the Authors

              Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
              Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
              Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
              " temperature="0.3" top_p="1.0" best_of="1" presence_penalty="0.1" ] load_run "
              exit 1
              }
              # Function to check if an RDS instance exists
              check_instance_exists() {
              INSTANCE=
              AWS_REGION=
              aws rds describe-db-instances --db-instance-identifier $INSTANCE --region $AWS_REGION >/dev/null 2>&1
              if [ $? -ne 0 ]; then
              echo "Error: RDS instance '$INSTANCE' does not exist in region '$AWS_REGION'."
              exit 1
              fi
              }
              # Function to retrieve the PostgreSQL password from AWS Secrets Manager
              get_pg_password() {
              SECRET_ID="rds/db_master_password"
              AWS_REGION=
              aws secretsmanager get-secret-value --secret-id $SECRET_ID --region $AWS_REGION --query SecretString --output text | jq -r '.master_user_password'
              }
              # Check the command and set the required number of arguments accordingly
              if [ "$#" -lt 2 ]; then
              print_usage
              fi
              COMMAND=
              shift
              # Create logs directory
              mkdir -p $HOME/DLV/logs
              # Redirect stdout and stderr to a log file
              exec >$HOME/DLV/logs/dlv-blog-$COMMAND-$(date +"%d-%m-%Y-%H%M").log 2>&1
              # Log the start time and parameters
              echo "Command: $COMMAND"
              load_data() {
              SCALE=
              AWS_ACCOUNT_ALIAS=
              AWS_REGION=
              DLV_INSTANCE=
              NON_DLV_INSTANCE=
              PGPASSWORD=$(get_pg_password $AWS_REGION)
              export PGPASSWORD
              # Check if instances exist
              check_instance_exists $DLV_INSTANCE $AWS_REGION
              check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
              echo "Load parameters: SCALE=$SCALE, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
              # Load data into dlv instance
              nohup pgbench -i -q -s $SCALE -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-dlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Load data into withoutdlv instance
              nohup pgbench -i -q -s $SCALE -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -d postgres >$HOME/DLV/logs/dlv-blog-load-withoutdlv-t-$SCALE-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Wait for data loading to complete
              wait
              echo "Data loaded"
              }
              run_benchmarks() {
              NUM_TRANSACTIONS=
              NUM_CLIENTS=
              NUM_THREADS=
              AWS_ACCOUNT_ALIAS=
              AWS_REGION=
              DLV_INSTANCE=
              NON_DLV_INSTANCE=
              PGPASSWORD=$(get_pg_password $AWS_REGION)
              export PGPASSWORD
              # Check if instances exist
              check_instance_exists $DLV_INSTANCE $AWS_REGION
              check_instance_exists $NON_DLV_INSTANCE $AWS_REGION
              echo "Run parameters: NUM_TRANSACTIONS=$NUM_TRANSACTIONS, NUM_CLIENTS=$NUM_CLIENTS, NUM_THREADS=$NUM_THREADS, AWS_ACCOUNT_ALIAS=$AWS_ACCOUNT_ALIAS, AWS_REGION=$AWS_REGION, DLV_INSTANCE=$DLV_INSTANCE, NON_DLV_INSTANCE=$NON_DLV_INSTANCE"
              # Clear WAL statistics
              echo "Clearing WAL statistics"
              echo "Clearing WAL statistics for dlv"
              psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
              echo "Clearing WAL statistics for withoutdlv"
              psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres -c "SELECT pg_stat_reset_shared('wal');"
              # Start benchmarks
              echo "Starting benchmarks"
              nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-dlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              nohup pgbench postgres -c $NUM_CLIENTS -j $NUM_THREADS -t $NUM_TRANSACTIONS -N -b tpcb-like -P 60 -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres >$HOME/DLV/logs/dlv-blog-run-withoutdlv-t-$NUM_TRANSACTIONS-$NUM_CLIENTS-$NUM_THREADS-$(date +"%d-%m-%Y-%H%M").log 2>&1 &
              # Wait for benchmarks to complete
              wait
              # Retrieve WAL statistics
              echo "Getting WAL statistics"
              echo "WAL statistics for dlv instance"
              psql -h $DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
              x
              SELECT * FROM pg_stat_wal;
              x
              EOF
              echo "WAL statistics for withoutdlv instance"
              psql -h $NON_DLV_INSTANCE.$AWS_ACCOUNT_ALIAS.$AWS_REGION.rds.amazonaws.com -U postgres <<EOF
              x
              SELECT * FROM pg_stat_wal;
              x
              EOF
              echo "All done"
              }
              if [ "$COMMAND" == "load" ]; then
              # Check if the required arguments for load are provided
              if [ "$#" -ne 5 ]; then
              print_usage
              fi
              load_data $@
              elif [ "$COMMAND" == "run" ]; then
              # Check if the required arguments for run are provided
              if [ "$#" -ne 7 ]; then
              print_usage
              fi
              run_benchmarks $@
              elif [ "$COMMAND" == "load_run" ]; then
              # Check if the required arguments for load_run are provided
              if [ "$#" -ne 8 ]; then
              print_usage
              fi
              SCALE=
              NUM_TRANSACTIONS=
              NUM_CLIENTS=
              NUM_THREADS=
              AWS_ACCOUNT_ALIAS=
              AWS_REGION=
              DLV_INSTANCE=
              NON_DLV_INSTANCE=
              load_data $SCALE $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
              # Pause for accurate visual inspection of Performance Insights
              echo "sleeping"
              echo "sleep starts: $(date +"%d-%m-%Y-%H%M%S")"
              sleep 600
              echo "sleep ends: $(date +"%d-%m-%Y-%H%M%S")"
              run_benchmarks $NUM_TRANSACTIONS $NUM_CLIENTS $NUM_THREADS $AWS_ACCOUNT_ALIAS $AWS_REGION $DLV_INSTANCE $NON_DLV_INSTANCE
              else
              echo "Invalid command: $COMMAND. Use 'load' to load data, 'run' to run benchmarks, or 'load_run' to do both."
              print_usage
              fi

              Save the script as dlv_bm.sh and make it executable. To execute the script, use the following command, adjusting the benchmarking parameters as necessary:

              nohup ./dlv_bm.sh load_run         &
              • SCALE = scale factor, i.e., the number of rows generated multiplied by the scale factor
              • NUM_TRANSACTIONS = number of transactions within the interval
              • NUM_CLIENTS = number of concurrent database sessions
              • NUM_THREADS = number of worker threads
              • AWS_ACCOUNT_ALIAS = AWS account alias
              • AWS_REGION = the region where your resources are located
              • DLV_INSTANCE = the name of your RDS instance with DLV
              • NON_DLV_INSTANCE = the name of your RDS instance without DLV

              For example:

              nohup ./dlv_bm.sh load_run 10000 500000 64 32 ab24x43b223s us-west-1 dlv withoutdlv &

              Upon completion of the script, three log files will be generated for further analysis. The two primary log files for comparison are those with the dlv-blog-run-dlv-*.log and dlv-blog-run-withoutdlv-*.log patterns, representing the runs with Dedicated Log Volume enabled and disabled, respectively. A diff command can be executed on these two log files to directly compare the transaction performance metrics reported by pgbench.

              Monitoring DLV

              This section discusses various methods for monitoring an RDS for PostgreSQL instance with DLV enabled.

              Monitor DLV using Amazon CloudWatch

              Monitoring the DLV that stores transaction logs (WALs) is as crucial as observing other database volumes. The speed of WAL writing directly influences transaction performance, making it essential to monitor IOPS, latency, and throughput of the DLV. These metrics can be tracked using Amazon CloudWatch. The post Enhance database performance with Amazon RDS dedicated log volumes outlines relevant CloudWatch metrics for monitoring DLV usage.

              Monitor DLV using PostgreSQL’s pg_stat_wal

              In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can assist in monitoring WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics regarding WAL activity.

              Evaluating Performance

              To evaluate the impact of DLV on database performance, we reviewed the following metrics:

              • Runtime
              • Transactions per second (TPS) and average latency recorded by pgbench
              • PostgreSQL’s cumulative statistics, wal_sync_time and wal_write_time, captured from pg_stat_wal
              • Graphs from Amazon RDS Performance Insights

              Runtime

              The runtime indicates the duration required to process all 32 million transactions using pgbench, with each client handling half a million transactions. The runtime saw a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This translates to a 92.5% improvement in processing time, underscoring the substantial benefits provided by the DLV-enabled instance.

              Metric Without DLV With DLV % Improvement
              Time taken by the benchmark (minutes) 77 40 92.5%

              TPS and Latency

              The following table illustrates the performance improvements observed in the instance with DLV enabled. TPS increased by 91.83%, nearly doubling the system’s capacity to process transactions. Concurrently, latency, a measure of response time, was reduced by 95.56%, resulting in a more responsive and efficient system. Overall, the implementation of DLV has led to significant enhancements in system performance, enabling it to handle a higher volume of tasks effectively.

              Metric Without DLV With DLV % Improvement
              TPS 6984 13397 91.83%
              Latency average (milliseconds) 8.8 4.5 95.56%

              WAL Statistics

              Key WAL statistics from pg_stat_wal, such as wal_write_time (total time spent writing WAL buffers to disk) and wal_sync_time (total time spent syncing WAL files), were closely examined. The DLV feature demonstrated improvements in both metrics: 102.6% for sync time and 9.2% for write time during our benchmark. This indicates faster transaction commits and a noticeable enhancement in overall database responsiveness, making DLV an excellent choice for optimizing PostgreSQL in high-demand scenarios.

              Metric Without DLV With DLV % Improvement
              wal_write_time (milliseconds) 56656.796 51882.614 9.2
              wal_sync_time (milliseconds) 4431621.33 2186910.078 102.6

              RDS Performance Insights

              The graphs captured from RDS Performance Insights during the benchmarking period reveal significant differences in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in overall resource demand on the database. This improvement is attributed to DLV’s ability to offload the intensive task of WAL segment writing to a separate volume, thereby freeing up the main database storage for other operations.

              The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock), signaling I/O contention where the database struggles to meet data write demands. Conversely, the DLV-enabled instance experienced a significant reduction in these wait events, facilitating more efficient data retrieval.

              • LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), underscoring DLV’s effectiveness in alleviating WAL-related bottlenecks.
              • IO:WALSync – This wait event was slightly higher in the non-DLV instance at 0.83% of AAS compared to the DLV instance at 0.65% of AAS, further supporting DLV’s role in enhancing WAL write performance.

              The benchmarking results validate DLV’s effectiveness in mitigating WAL-related wait events, demonstrating its success in offloading WAL writes and thereby enhancing overall database performance.

              Clean Up

              To avoid incurring charges after testing this solution, please follow these steps to remove resources:

              1. On the Amazon EC2 console, select the Linux EC2 instance and choose “Terminate instance” from the Instance state menu.
              2. On the Amazon RDS console, select the RDS for PostgreSQL instances and choose “Delete” from the Actions menu.

Tech Optimizer
Benchmark Amazon RDS for PostgreSQL with Dedicated Log Volumes