How BCM One migrated data from an unencrypted Amazon RDS for PostgreSQL database instance to a new encrypted instance using AWS DMS | Amazon Web Services

This post is co-authored with Kate Fike, Software Engineer at BCM One.

BCM One stands as a prominent global provider of NextGen Communications and Managed Services, catering to IT leaders and channel resellers. Among its various NextGen Communications brands is Flowroute, which specializes in SIP trunking and a business messaging platform tailored for mission-critical voice applications.

In this article, we delve into the process Flowroute undertook to encrypt their existing Amazon Relational Database Service (Amazon RDS) for PostgreSQL database instances utilizing the AWS Database Migration Service (AWS DMS).

Storage encryption and migration challenges

Encrypting storage is crucial for safeguarding against unauthorized access to disk data, aligning with best practices as outlined in Security in Amazon RDS. This encryption employs a key stored in AWS Key Management Services (AWS KMS) to secure access to the disk.

Since existing Amazon RDS instances cannot be directly encrypted, a new encrypted instance must be created. To facilitate this with minimal downtime, Flowroute leveraged Amazon RDS snapshots, AWS DMS, and Amazon RDS Proxy. The migration strategy outlined here builds upon the methodology found in Encrypt an existing Amazon RDS for PostgreSQL DB instance.

The architecture before encryption is depicted in the following diagram.

The architecture after encryption is illustrated in the subsequent diagram.

A near-zero downtime migration can be achieved through full load and change data capture (CDC) AWS DMS tasks. However, given the homogeneous nature of this migration, a more efficient approach is to take a snapshot of the entire database and replicate only the changes that occur post-snapshot. This article details how to execute a near-zero downtime migration using a single CDC-only AWS DMS task.

Solution overview

This post outlines the essential steps and considerations for migrating an unencrypted RDS for PostgreSQL database to an encrypted RDS for PostgreSQL database. Currently, there is no direct encryption path available. By employing a combination of Amazon RDS utilities and AWS DMS, the migration to an encrypted RDS database instance can be accomplished.

The high-level steps for implementing this solution are as follows:

  1. Create an empty RDS instance, AWS DMS replication instance, AWS DMS endpoints, and a CDC-only AWS DMS task.
  2. Initiate the CDC-only AWS DMS task.
  3. Cease the AWS DMS task.
  4. Remove the empty RDS instance.
  5. Create a snapshot of the existing unencrypted RDS instance.
  6. Generate an encrypted snapshot from the unencrypted snapshot created in step 5.
  7. Restore a new encrypted RDS database instance from the encrypted snapshot created in step 6.
  8. Resume the CDC-only AWS DMS task.
  9. Validate the data on the encrypted target RDS instance using a validation-only AWS DMS task.
  10. Utilize RDS Proxy to modify the target group, ensuring a seamless transition to the newly encrypted RDS instance for application connectivity.

The following diagram illustrates the solution and workflow.

Prerequisites

In this solution, the existing unencrypted RDS for PostgreSQL DB instance connects to an RDS proxy. This proxy minimizes downtime during the cutover process. Instead of direct database connections, services connect through the proxy. For setup instructions, refer to Using Amazon RDS Proxy.

The RDS proxy is particularly effective for managing the switchover to the encrypted database, facilitating a smooth transition between client-to-proxy and proxy-to-database connections. When the target group of the RDS proxy is altered, it connects seamlessly to the new target group without disrupting existing connections. This approach simplifies the process of restarting services individually, ensuring synchronization throughout. The cutover time is typically swift, usually taking less than a minute.

Using the RDS proxy for cutover is more efficient than alternative methods, such as CNAME records. The RDS proxy continues to accept connections even during database downtime or switchover, making it a widely recommended approach for ensuring a smooth transition once the new encrypted database is operational.

An IAM role is necessary for accessing the secret associated with the proxy in AWS Secrets Manager for each user (the PostgreSQL login role) needing database access. The secret ARN is incorporated into the proxy’s authentication configuration. Prior to encrypting our storage, database user credentials were stored in Parameter Store, a feature of AWS Systems Manager, or other third-party secret managers. This migration provided an opportunity to standardize credential storage, consolidating all credentials within Secrets Manager.

It is advisable for applications connecting through the proxy to utilize a login role (user) distinct from the administrative user, adhering to the principle of least privilege as a best practice.

For a comprehensive list of prerequisites, consult Using a PostgreSQL database as an AWS DMS source.

Migration steps

To migrate your database, follow these steps:

  1. On the existing unencrypted database, create a dms login role:
CREATE ROLE dms WITH LOGIN NOSUPERUSER INHERIT NOCREATEDB NOCREATEROLE NOREPLICATION;
GRANT rds_replication, rds_superuser TO dms;
Enter password for new role: "password"
Enter it again: "password"
  1. Create an empty RDS DB instance.
  2. For DB instance identifier, assign a name to your DB instance. The empty instance must share the same instance identifier as the new encrypted instance. In this example, we use example-target.
  1. For Initial database name, specify a name for your empty instance. This instance must have the same database name you intend to migrate from the existing cluster. In this case, we migrate a database named postgres, the default database created for a new PostgreSQL instance.
  1. On the unencrypted database, create a dms login role using the same password as the dms login role on the unencrypted database.
  2. Configure the unencrypted database’s RDS parameter group for logical replication. For guidance, see Working with AWS-managed PostgreSQL databases as a DMS source.
  3. Create an AWS DMS replication instance. For instructions, see Creating a replication instance.
  4. Create an AWS DMS endpoint for the unencrypted database, serving as the source endpoint. For instructions, see Creating source and target endpoints.
  5. Create an AWS DMS endpoint for the empty database, which will serve as the target endpoint.
  6. Test the connectivity of both the source and target endpoints to the AWS DMS replication instance. The test should succeed, and the endpoints should display as active on the AWS DMS console.
  7. Create a CDC-only AWS DMS task using the default task configuration settings.
  1. Start the AWS DMS task.
  1. Verify the success of the replication process. This can be accomplished by inserting, updating, or deleting a row.
kfike@ubuntu:~$ psql -h example-source -d postgres -U postgres
flowroute=> INSERT INTO country (country) VALUES ('Suriname');
INSERT 0 1
postgres=> SELECT * FROM country ORDER BY country_id DESC LIMIT 1;
country_id | country | last_update
------------+----------+----------------------------
111 | Suriname | 2023-12-12 03:30:20.747883
(1 row)
q
kfike@ubuntu:~$ psql -h example-target -d postgres -U postgres
postgres=> SELECT * FROM country ORDER BY country_id DESC LIMIT 1;
country_id | country | last_update
------------+----------+----------------------------
111 | Suriname | 2023-12-12 03:30:20.747883
(1 row)
  1. Stop the AWS DMS task.

Even though the task is halted, it will continue to queue CDC replication events.

To illustrate how the AWS DMS task queues CDC events, a row was inserted prior to the snapshot creation:

kfike@ubuntu:~$ psql -h example-source -d postgres -U postgres
flowroute=> INSERT INTO country (country) VALUES ('Bahamas');
INSERT 0 1
postgres=> SELECT * FROM country ORDER BY country_id DESC LIMIT 1;
country_id | country | last_update
------------+----------+----------------------------
112 | Bahamas | 2023-12-12 03:40:49.293128
(1 row)
q
  1. Delete the empty database. It is imperative that the AWS DMS task is stopped before the empty database is removed.
  2. Create a snapshot of the unencrypted RDS instance. Another row is inserted following the snapshot creation.
kfike@ubuntu:~$ psql -h example-source -d postgres -U postgres
flowroute=> INSERT INTO country (country) VALUES ('Moldova');
INSERT 0 1
postgres=> SELECT * FROM country ORDER BY country_id DESC LIMIT 1;
country_id | country | last_update
------------+----------+----------------------------
113 | Moldova | 2023-12-12 03:45:06.449679
(1 row)
q
  1. Create an encrypted snapshot from the unencrypted snapshot. For further details, see Encrypt an existing Amazon RDS for PostgreSQL DB instance.

The key used for encrypting the snapshot will also be employed to encrypt the new database. One option is to utilize a KMS customer managed key (CMK). In this example, we opted for an AWS-managed KMS key: aws/rds. For more information on CMKs and AWS-managed keys, refer to AWS KMS Keys.

  1. Create an encrypted RDS instance from the encrypted snapshot.
  1. For DB instance identifier, specify a name for your instance. It is crucial that the actual database shares the same RDS instance ID as the empty database. In this case, it was example-target.
  1. For AWS KMS key, select your AWS-managed key.
  1. Confirm that the encrypted database is accessible by connecting to it:
kfike@ubuntu:~$ psql -h example-target -d postgres -U postgres
Password for user postgres:
psql (15.3 (Ubuntu 15.3-1.pgdg22.04+1), server 12.14)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, compression: off)
Type "help" for help.
postgres=>
  1. Resume the AWS DMS task.

Ensure that the AWS DMS task does not remain in an error state, as this could deplete the source’s CPU resources.

  1. Create a validation-only AWS DMS task.
  2. Verify that the encrypted database is synchronized with the unencrypted database.

To check if the Moldova row in the country table replicated correctly, query the encrypted database:

  1. kfike@ubuntu:~$ psql -h example-target -d postgres -U postgres
    postgres=> SELECT * FROM country ORDER BY country_id DESC LIMIT 1;
    country_id | country | last_update
    ------------+----------+----------------------------
    113 | Moldova | 2023-12-12 03:45:06.449679
    (1 row)
    q

    Complete the switchover using RDS proxy to connect to the encrypted database.

The following screenshot depicts the proxy configuration for connecting to the unencrypted database.

Ensure that the proxy’s AWS Identity and Access Management (IAM) role possesses the necessary permissions to retrieve the primary user credentials for the encrypted database and utilize the KMS key for decrypting the secret. Without these permissions, the target group will be inaccessible.

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "kms:Decrypt",
			"Resource": "arn:aws::key/d1674ca0-653f-4785-99f2-fcfc19837b9e"
		},
		{
			"Effect": "Allow",
			"Action": "secretsmanager:GetSecretValue",
			"Resource": "arn:aws:secretsmanager::secret:rds!db-f80f9ab5-1682-449d-bed3-f268678cc80c-BVviV7"
		}
	]
}
  1. Utilize the Amazon RDS console to update the proxy authentication, incorporating the primary user secret for the encrypted database.

When employing the AWS Command Line Interface (AWS CLI) modify-db-proxy command, be aware that this command does not append secrets to the existing list; it replaces the proxy’s authentication list with only those specified in the modify-db-proxy command’s --auth option. If not all secrets are explicitly included each time the command is executed, existing secrets may be inadvertently removed.

  1. Edit the default target group to point to the encrypted database and save the changes.

The proxy authentication will fail if the same login role name is used for both unencrypted and encrypted databases, even if the passwords differ. For instance, if the primary user for the unencrypted database is the postgres role with a password of 1234, and the encrypted database’s primary user is also postgres but with a password of 5678, toggling the proxy connection from the source to the target will result in an authentication error due to caching issues. To circumvent this, delete the primary user credentials for the unencrypted database after adding the primary user credentials for the encrypted database to the proxy’s authentication.

  1. Once the migration has been successfully tested and completed, clean up DMS resources by terminating the replication instance and deleting the target groups.

Primary keys and unique constraints

In the previous example, the Bahamas row in the country table was inserted between the halting of the AWS DMS task and the snapshot creation. Consequently, this row was present in both the CDC event queue and the snapshot. Upon resuming the AWS DMS task, a duplicate insert of the Bahamas row was attempted. The following diagram illustrates this scenario.

However, due to the primary key constraint on country.country_id, the second attempt to insert the Bahamas row was rejected, as evidenced in the table created by AWS DMS named awsdms_apply_exceptions:

kfike@ubuntu:~$ psql -h example-target -d postgres -U postgres -x
postgres=> SELECT * FROM awsdms_apply_exceptions;
-[ RECORD 1]--------------------------------------------------------------
TASK_NAME | example-cdc-only
TABLE_OWNER | public
TABLE_NAME | country
ERROR_TIME | 2023-12-14 22:54:14.873954
STATEMENT | INSERT INTO "public"."country" ( "country_id","country","last_update" ) VALUES (112,'Bahamas','2023-12-12 03:40:49.293128')
ERROR | RetCode: SQL_ERROR SqlState: 23505 NativeError: 1 Message: ERROR: duplicate key value violates unique constraint "country_pkey";+
| Error while executing the query

In cases where a table lacks a primary key or unique constraint, duplicate rows may be applied to the target table. In the following example, a table without a primary key or unique constraint had rows inserted at various timepoints:

kfike@ubuntu:~$ psql -h example-source -d postgres -U postgres
postgres=> INSERT INTO no_key_tbl (descr) VALUES ('Verify replication to the dummy');
INSERT 0 1
postgres=> INSERT INTO no_key_tbl (descr) VALUES ('After stopping the AWS DMS task');
INSERT 0 1
postgres=> INSERT INTO no_key_tbl (descr) VALUES ('After taking the snapshot');
INSERT 0 1
postgres=> INSERT INTO no_key_tbl (descr) VALUES ('After resuming the AWS DMS task');
INSERT 0 1

On the target, a duplicate row now exists: “After stopping the AWS DMS task,” which should have been rejected:

kfike@ubuntu:~$ psql -h example-target -d postgres -U postgres
postgres=> SELECT * FROM no_key_tbl;
descr
-----------------------------
Verify replication to the dummy
After stopping the DMS task
After stopping the DMS task
After taking the snapshot
After resuming the DMS task
(4 rows)
Tech Optimizer
How BCM One migrated data from an unencrypted Amazon RDS for PostgreSQL database instance to a new encrypted instance using AWS DMS | Amazon Web Services