The AWS Database Migration Service offers a comprehensive managed solution for migrating and replicating databases to Amazon Web Services (AWS), ensuring both data security and integrity throughout the process. It supports homogeneous migrations, where the source and target databases share the same engine, as well as heterogeneous migrations that facilitate transitions between different database environments.
AWS DMS enables seamless data migration from PostgreSQL databases to any supported target and allows for migration from any supported source to a PostgreSQL database. This functionality provides businesses with a robust pathway to transition their data infrastructure to the cloud.
Solution overview
Open-source PostgreSQL regularly releases new minor and major versions that address common bugs, security vulnerabilities, and data corruption issues. Amazon RDS typically aims to support new engine versions within five months of their release. Upgrading your PostgreSQL instances is crucial when a version reaches its end of support, or if you wish to implement improvements or maintain compliance to safeguard your data.
When planning to upgrade your PostgreSQL database configured as a source or target for an ongoing AWS DMS task, it is essential to incorporate this into your upgrade strategy. This article explores best practices for managing AWS DMS tasks during PostgreSQL upgrades to both minor and major versions.
Prerequisites
To effectively test the solution discussed here, you will need the following resources:
Understanding version upgrades in PostgreSQL
Before examining the impact of PostgreSQL upgrades on AWS DMS tasks, it is important to clarify the differences between major and minor version upgrades. Minor versions primarily focus on patching security vulnerabilities and fixing bugs, without introducing new functionalities. They maintain compatibility with earlier and later minor releases of the same major version number. For instance, version 14.10 is compatible with both 14.9 and 14.16.
Conversely, major releases can alter the internal format of system tables, data files, and data storage formats. Amazon RDS for PostgreSQL utilizes the native pg_upgrade utility for upgrading instances to new major versions. For further details on upgrades, refer to Upgrading the PostgreSQL DB engine for Amazon RDS.
Both minor and major version upgrades entail downtime and should ideally occur within a designated maintenance window. It is advisable to schedule this maintenance during periods of low database activity.
AWS DMS interaction with PostgreSQL
Consider a scenario where AWS DMS is employed to migrate data from a PostgreSQL source to a PostgreSQL target. During a full load, AWS DMS connects to the source PostgreSQL database and executes a select *
command on the tables specified in the table mappings to extract data. The retrieved data is then written to a CSV file within a replication instance for PostgreSQL targets. For these targets, AWS DMS leverages the COPY
command to load data from the CSV file into the target PostgreSQL tables.
To capture ongoing changes during migration, AWS DMS establishes a logical replication slot on the source PostgreSQL database. This slot serves as a stream of changes that can be replayed in the order they were made. DMS employs either test_decoding or pglogical plugins for logical decoding of changes from the replication slot. If the pglogical
plugin is available, DMS will create a replication slot using it; otherwise, it defaults to the test_decoding
plugin. Changes read from the source are then processed by the sorter component on the replication instance, which organizes transactions in commit order before applying them to the target database, either sequentially or in batch mode, depending on your DMS task configuration.
Replication slots play a vital role in both full load plus CDC and CDC-only tasks, as they are responsible for retaining the necessary write-ahead log (WAL) files on the source PostgreSQL database. If a replication slot is removed from the source database, DMS will be unable to process ongoing changes.
How PostgreSQL upgrades affect AWS DMS tasks
The following sections outline how to manage your DMS tasks during minor or major version upgrades of your source or target PostgreSQL database.
When the source PostgreSQL database is upgraded
Full load-only DMS tasks are intended for one-time data migrations and can be safely restarted after either minor or major version upgrades of the source PostgreSQL database. However, for full load plus CDC and CDC-only DMS tasks, which continuously replicate ongoing changes, it is crucial to follow the best practices outlined below during PostgreSQL upgrades.
Minor release or version upgrade
Before initiating a minor version upgrade, stop your ongoing AWS DMS replication tasks. Once the upgrade is complete, you may resume your DMS task.
Major version upgrade
As of the current writing, DMS supports PostgreSQL versions 9.4 and higher (including 9.x, 10.x, 11.x, 12.x, 13.x, 14.x, 15.x, and 16.x). When performing a major version upgrade, ensure that your replication instance is compatible with the new PostgreSQL version. To execute the major version upgrade using pg_upgrade
, it is necessary to drop the replication slots on the source PostgreSQL database. Neglecting to do so may hinder the upgrade process, leading to failure messages in the pg_upgrade_precheck.log
indicating that the instance is blocked by one or more logical replication slots. However, dropping the replication slots will invalidate your AWS DMS tasks, preventing the resumption of ongoing replication tasks.
To effectively manage ongoing replication tasks during major version upgrades, follow these steps:
- Cease all application connections to your PostgreSQL database. Monitor active connections using:
select * from pg_stat_activity where datname = 'database_name';
If necessary, terminate any remaining connections with:
select pg_terminate_backend(pid) from pg_stat_activity where datname = 'database_name' and pid pg_backend_pid();
- Monitor AWS DMS task metrics to ensure that both
CDCLatencySource
andCDCLatencyTarget
are near zero, confirming that DMS tasks are replicating changes without delay. Utilizeawsdms_txn_state
in the target to check task status (this can be enabled with the task settingTaskRecoveryTableEnabled = True
).
- Once latency is minimal, stop all active ongoing replication DMS tasks.
- Remove the existing replication slots from the source PostgreSQL database.
postgres=> select * from pg_replication_slots;
To drop a replication slot, execute:
SELECT pg_drop_replication_slot('slot_name');
- Verify the removal of the replication slots.
postgres=> select * from pg_replication_slots;
- Complete the in-place upgrade on the PostgreSQL database.
- Confirm the successful completion of the upgrade process. Conduct database-level validation checks to ensure the database operates as expected post-upgrade. Before restarting your application, proceed to either
step 8
orstep 9
to manage your DMS tasks.
- Create a new CDC-only task. Under task settings, select Disable custom CDC start mode for CDC start mode for source transactions. Define additional task settings and table mappings similar to your previous task.
After creating the task, initiate the CDC-only task, which will establish a new replication slot on your source PostgreSQL database and begin migrating changes from the moment the replication slot was created.
- Alternatively, manually create the replication slot on your source PostgreSQL database, using the DMS CDC-only task starting from the specified log sequence number (LSN). Document the
confirmed_flush_lsn
.
The confirmed_flush_lsn
indicates the last LSN that the logical slot consumer has confirmed receiving data to the PostgreSQL engine. Data corresponding to transactions committed before this LSN will no longer be accessible.
- Start your DMS task and verify that changes are being migrated to the target database without issues.
- Restart your applications and monitor your DMS CDC replication.
When the target PostgreSQL database is upgraded
AWS DMS CDC remains unaffected by a minor version upgrade of the target PostgreSQL database. Prior to upgrading a PostgreSQL database configured as a target for DMS, stop your DMS task and resume it once the minor version upgrade is successfully completed.
In the case of a major version upgrade on the target PostgreSQL database, ensure the following:
- Verify that your current replication instance engine version supports the new PostgreSQL version.
- If the new engine version is supported, stop the AWS DMS task, complete the major version upgrade, and then resume your DMS task.
- If the new engine version is not supported, halt your DMS task, execute the major version upgrade on the target PostgreSQL database, and upgrade your replication instance to a version compatible with the current version of your target PostgreSQL database. After both the target and source databases are updated to compatible major versions, you can resume your DMS tasks.
Clean up
To revert changes and avoid incurring ongoing charges, delete the resources created for this post:
- Delete the RDS for PostgreSQL instances and EC2 instance that are no longer required and were created for testing this solution.
- Delete the AWS DMS tasks created for testing this solution.
- Delete the AWS DMS source and target endpoints.
- Delete the AWS DMS replication instance.
About the Authors
Veeramani A is a Cloud Database Engineer at Amazon Web Services, specializing in AWS Database Migration Service and Amazon RDS for PostgreSQL. With over 15 years of experience in diverse database technologies, he provides strategic guidance and technical expertise to customers navigating their database migration journey to the AWS Cloud.
Manoj Ponnurangam serves as a Cloud Database Engineer at Amazon Web Services, focusing on Amazon RDS for Oracle, Amazon RDS for PostgreSQL, and AWS DMS. With 15 years of experience in relational databases, Manoj offers guidance and technical support to customers on various database and migration projects.