Managing upgrades for the Aurora PostgreSQL-Compatible Edition across multiple database clusters can often be a labor-intensive and error-prone task when approached manually. This article explores a method to automate Amazon Aurora PostgreSQL upgrades across your entire database fleet, which can reduce manual effort by as much as 80% while also minimizing the risks associated with downtime through consistent and repeatable procedures.
This piece is the second installment in our series on AWS database upgrade automation. Building on our previous discussion titled “Automate Amazon RDS for PostgreSQL major or minor version upgrade using AWS Systems Manager and Amazon EC2,” we adapt the same automation framework specifically for Aurora PostgreSQL upgrades. While the fundamental need for version management remains the same, the unique cluster architecture of Aurora PostgreSQL, along with its Copy-on-Write cloning capability, offers enhanced rollback options during the upgrade process. Although this solution does require application downtime during cluster upgrades—unlike blue/green deployments—the ability to swiftly create cluster clones provides a robust safety net.
In this guide, you will learn to construct a comprehensive automation solution that:
- Automatically identifies upgrade candidates using database tags.
- Creates safety backups with Copy-on-Write clones prior to any changes.
- Manages the entire upgrade process with minimal human intervention.
- Offers real-time monitoring and email notifications throughout the process.
Solution overview
The solution leverages AWS Systems Manager to orchestrate upgrades, Amazon Elastic Compute Cloud (Amazon EC2) to execute the automation scripts, and AWS Secrets Manager to securely manage database credentials. The program is structured around two primary modules: PREUPGRADE, which conducts readiness checks and prepares the database to identify potential issues, and UPGRADE, which oversees the actual database upgrade process, commencing with the creation of a Copy-on-Write clone for safety. Tags are employed to pinpoint Aurora clusters that qualify for an upgrade, while AWS Systems Manager executes automation documents containing Unix shell scripts that run psql and AWS Command Line Interface (AWS CLI) commands from an Amazon EC2 instance.
This solution accommodates both minor and major version upgrades and does not interfere with Aurora’s built-in Automatic minor version upgrades feature. It has been validated in a single Amazon Virtual Private Cloud (Amazon VPC) and AWS Region environment for one AWS account. For organizations with requirements for multi-VPC, cross-account, or cross-Region deployments, this architecture can be extended, although additional considerations for networking, security (including AWS Identity and Access Management (IAM) roles and permissions), replication, and data consistency must be addressed.
The workflow consists of the following steps:
- The user logs into the Systems Manager console and initiates the automation job.
- The job downloads the upgrade shell script from Amazon Simple Storage Service (Amazon S3) to the EC2 instance.
- The job connects to the EC2 instance and identifies Aurora PostgreSQL clusters based on the cluster tag. For instance, we might use a tag name: UpgradeDB and tag value: Y. For each identified Aurora PostgreSQL cluster, the job also modifies the cluster to publish PostgreSQL logs to Amazon CloudWatch logs if it hasn’t been configured already.
- The job utilizes Aurora cluster tags to locate and retrieve credentials from AWS Secrets Manager. Each Aurora cluster must have a tag where:
- Tag name:
aurora-maintenance-user-secret. - Tag value:
-maintenance-user-secret.
- Tag name:
- The job executes either pre-upgrade checks or the actual upgrade based on user input.
- The job uploads log files to Amazon S3.
- The job sends an email notification through Amazon Simple Notification Service (Amazon SNS).
Prerequisites
To implement this solution, complete the following prerequisite steps:
- Familiarize yourself with the Amazon Aurora upgrade process by consulting Upgrading Amazon Aurora PostgreSQL DB clusters.
- Ensure you have an AWS Identity and Access Management (IAM) user with appropriate permissions to manage Amazon Relational Database Service (Amazon RDS), Amazon EC2, Amazon S3, Amazon SNS, AWS Secrets Manager, and AWS Systems Manager.
- Clone the GitHub repository to your local device:
git clone https://github.com/aws-samples/sample-aurora-postgres-upgrade - Prepare the Aurora PostgreSQL cluster:
- Create a database user with the necessary privileges using psql or another Postgres client.
CREATE USER aurora_maintenance_user WITH PASSWORD 'mypassword'; GRANT rds_superuser TO aurora_maintenance_user; - Create a secret in AWS Secrets Manager for the database user. The secret name must follow this format:
-maintenance-user-secret.aws secretsmanager create-secret --name "-maintenance-user-secret" --description "Maintenance user credentials for Aurora PostgreSQL cluster" --secret-string "{"username":"aurora_maintenance_user","password":"mypassword"}" - Add the required tags to your Aurora cluster.
aws rds add-tags-to-resource --resource-name --tags Key=UpgradeDB,Value=Y Key=aurora-maintenance-user-secret,Value=-maintenance-user-secretReplace
andwith the actual values. If you have multiple clusters, consider using a Unix shell and AWS CLI-based script, a Python script, or a CI/CD pipeline to automate the tagging.
- Create a database user with the necessary privileges using psql or another Postgres client.
- Create an S3 bucket to store the shell script and log files.
- Create an SNS topic and subscription for email notifications.
- Create the necessary IAM policy and IAM role, then attach policies to the role.
- Create a custom policy using AWS CLI.
aws iam create-policy --policy-name --policy-document file://auroraupgradeinstancepolicy.jsonWhere
auroraupgradeinstancepolicy.jsonis a JSON file saved in the current directory with the following content:{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "rds:DescribeDBEngineVersions", "rds:DescribeOrderableDBInstanceOptions", "rds:DescribeDBParameterGroups", "rds:DescribeDBClusterParameterGroups" ], "Resource": "*" }, { "Effect": "Allow", "Action": [ "rds:DescribeDBClusters", "rds:DescribeDBInstances", "rds:ModifyDBCluster", "rds:ModifyDBInstance", "rds:RebootDBInstance", "rds:CreateDBClusterSnapshot", "rds:DescribeDBClusterSnapshots", "rds:DescribeDBSnapshots", "rds:RestoreDBClusterToPointInTime", "rds:CreateDBParameterGroup", "rds:CreateDBClusterParameterGroup", "rds:ModifyDBParameterGroup", "rds:ModifyDBClusterParameterGroup", "rds:DescribeDBParameters", "rds:DescribeDBClusterParameters", "rds:DescribePendingMaintenanceActions", "rds:ApplyPendingMaintenanceAction", "rds:AddTagsToResource", "rds:ListTagsForResource" ], "Resource": [ "arn:aws:rds:::cluster:*", "arn:aws:rds:::cluster-pg:*", "arn:aws:rds:::cluster-snapshot:*" ] }, { "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::", "arn:aws:s3:::/*" ] }, { "Effect": "Allow", "Action": [ "sns:Publish" ], "Resource": "" }, { "Effect": "Allow", "Action": [ "secretsmanager:GetSecretValue" ], "Resource": [ "arn:aws:secretsmanager:::secret:rds*" ] } ] } - Create an IAM role with a trust policy using the AWS CLI:
aws iam create-role --role-name --assume-role-policy-document file://trust-policy.jsonWhere
trust-policy.jsonis a JSON file saved in the current directory with the following content:{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" }, "Action": "sts:AssumeRole" } ] } - Attach
AmazonSSMManagedInstanceCore(AWS managed policy) to the role.aws iam attach-role-policy --role-name --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore - Attach the custom policy created in this section to the role.
aws iam attach-role-policy --role-name --policy-arn
- Create a custom policy using AWS CLI.
- Launch a new EC2 instance in the same VPC as your Aurora clusters. If you have Aurora clusters that require upgrades across different VPCs, you can launch the EC2 instance in its own VPC and then set up VPC peering. For more details, refer to Connect VPCs using VPC peering. This EC2 instance will host and run the upgrade script, necessitating AWS CLI, PostgreSQL client,
bc, andjqlibrary (both AWS CLI andjqare pre-installed on the Amazon Linux 2023 AMI). To install the required software on Amazon Linux, use the following commands: - Create an instance profile.
- Add the role created in step 7 to the instance profile.
- Associate the EC2 instance with the instance profile.
- Configure the connection between the database and the EC2 instance. Add an inbound rule to the security group of your Aurora writer instance (Type: PostgreSQL, Port: 5432, Source: the EC2 IP address). For guidance, refer to Automatically connecting an EC2 instance and an Aurora DB cluster.
Quick set up
To test this solution in a new environment, you can utilize our provided AWS CloudFormation template (create_aurora_psql_cluster_cfn.yaml). After deploying the template:
- Retrieve the required passwords from AWS Secrets Manager:
- Master user password
- aurora_maintenance_user password
- Connect to your Amazon EC2 instance using Session Manager.
- Create the maintenance user by executing these commands:
CREATE USER aurora_maintenance_user WITH PASSWORD 'mypassword'; GRANT rds_superuser TO aurora_maintenance_user;
Implementation steps
- Review the README file in the
aurora-postgres-upgradedirectory after cloning the Git repository. It contains comprehensive setup instructions for both existing environments and new deployments. For complete instructions on setting up a new environment, refer to the Testing section in the README. - Upload
aurora_psql_patch.shto your S3 bucket. - Create an AWS Systems Manager automation document using the provided
create_ssm_aurora_patch_automation_document.yamlas a template. You can create a stack from the CloudFormation console or execute the following AWS CLI command: - To create and manage the SSM automation document, you need the following permissions:
ssm:CreateDocumentssm:UpdateDocumentssm:DeleteDocumentssm:GetDocumentssm:ListDocuments
The CloudFormation stack creates two key components: the required IAM role and the AWS Systems Manager automation document. You can find the role ARN and the name of the automation document by navigating to the Outputs tab of your CloudFormation stack.
- Now you’re ready to execute the automation document from AWS Systems Manager.
- On the AWS Systems Manager console, choose Documents under Change management tools in the left navigation pane.
- Choose Owned by me.
- Select the document you created in step 3 to open it.
- Choose Execute automation.
- Keep the Execution mode at its default value of Simple execution.
- Enter the required input parameters:
- For EC2InstanceId, select the EC2 instance that you configured in the prerequisite section.
- For RunPreUpgradeTasks, choose
PREUPGRADEto run pre-upgrade checks or selectUPGRADEto perform the actual upgrade. It is advisable to run pre-upgrade checks first, review the logs, and resolve any issues before proceeding with theUPGRADE. - For ScriptsDIROnEC2, specify the directory on the EC2 instance where the shell script will be saved.
- For S3BucketName, provide the name and directory of the S3 bucket containing the
aurora_psql_patch.shscript. - For TargetEngineVersion, enter a valid target Aurora PostgreSQL version for the upgrade, such as 17.4.
- For SnsTopicArnEmail, input the SNS topic ARN for notifications.
- Leave the remaining parameters as default.
- Choose Execute.
Upon submitting your execution job, the job status will initially display as In progress. If all steps complete successfully for all Aurora clusters, the automation job status will be marked as Success. Conversely, if any cluster encounters an error, the automation job status will be marked as Failed. The logs will clearly indicate which specific clusters failed to upgrade, allowing you to investigate and potentially retry just those clusters.
To view the upgrade details, select the link under Step ID. An example of the output is shown in the following screenshot.
Monitoring and notifications
The shell script generates comprehensive logging throughout the upgrade process. During the pre-upgrade phase, logs are created for general pre-upgrade checks, including replication slot status reports and VACUUM operations. It is essential to drop the replication slots before a major version upgrade. During the upgrade phase, the script creates a cluster clone and logs configuration backups, replication slot status, ANALYZE operations, PostgreSQL extensions, and general upgrade progress. All logs are stored in LOGS_DIR on the EC2 instance and uploaded to Amazon S3 upon completion.
Summary of logs for Pre-Upgrade:
| Log File Type | Sample File Name | Directory Path | Frequency | Purpose |
| Cluster List | PREUPGRADE-cluster_list.txt | is the directory where “rds-psql-patch.sh” is saved | Each run | List of all the clusters in scope |
| Master Log | PREUPGRADE-master-20251114-21-55-51.log | /logs | Each run | General information on pre-upgrade job tasks |
| Pre-upgrade Status log | PREUPGRADE-status | /logs/ | Each run | Pre-upgrade Job status |
| Pre-upgrade Execution Log | PREUPGRADE-20251114-21-51-48.log | /logs/ | Each run | Detail view of all pre-upgrade tasks |
| Freeze Task Log | PREUPGRADE-run_db_task_freeze-20251114-21-55-52.log | /logs/ | Each run | Log on Vacuum Freeze |
| Replication Slot Log | PREUPGRADE-replication_slot_20251114-21-55-52.log | /logs/ | For Major Version Upgrade only | Current replication slot status and recommendations on actions to take before major version upgrade |
Summary of logs for Upgrade:
| Log File Type | Sample File Name | Directory Path | Frequency | Purpose/Error information |
| Cluster list | UPGRADE-cluster_list.txt | /logs | Each run | List of all the clusters in scope |
| Master Log | UPGRADE-master-20251114-22-16-11.log | /logs | Each run | General information on upgrade tasks |
| Upgrade Status log | UPGRADE-status | /logs/ | Each run | Upgrade Job Status |
| Upgrade Execution Log | UPGRADE-20251114-22-16-11.log | /logs/ | Each run | Detail view of all Upgrade tasks |
| Current cluster Configuration Backup | cluster_current_config_backup_aurora-postgresql16-20251114-19-03-32.txt | /logs/ | Each run | Backup of current cluster configuration |
| Clone on Write information | clone-info.txt | /logs/ | Each run | High level Clone cluster log |
| Operation logging | aurora-operations-summary.log | /logs/ | Each run | Database operation during the job run |
| Job Summary | aurora-session-summary.log | /logs/ | Each run | High level summary on upgrade processes |
| Replication Slot Log | UPGRADE-aurora_replication_slot_20251114-22-16-12.log | /logs/ | For Major Version Upgrade Only | Current replication slot status and recommendations on actions to take before major version upgrade |
| Extension Update Log | UPGRADE-update_aurora_db_extensions_20251114-22-16-12.log | /logs/ | Each run | Log on PostgreSQL extension updates |
| Analyze Task Log | UPGRADE-run_aurora_db_task_analyze-20251114-22-16-12.log | /logs/ | Each run | Log on ANALYZE command execution |
You can download the logs from the S3 bucket using the aws s3 cp command or through the Amazon S3 console. The script also sends email notifications via Amazon SNS to subscribed users.
Clean up
To prevent incurring future charges, it is advisable to remove the resources created during this walkthrough. If you utilized the provided scripts to create the resources, empty the S3 bucket through the Amazon S3 console, and then delete the CloudFormation stack via the AWS CloudFormation console. For any resources created manually (such as the IAM policy, IAM role, EC2 instance, SNS topic, or Secrets Manager secrets), identify and delete them individually through their respective consoles.
About the authors
Contributors to this article are experienced professionals in cloud computing and database management, with a focus on AWS technologies and solutions.