Amazon Aurora PostgreSQL-Compatible Edition supports managed blue/green deployments, which reduce downtime and risk during updates. A blue/green deployment creates a staging environment (green) that mirrors the production database (blue) and is kept in sync through logical replication. Updates are applied to the green environment while the blue environment continues to serve production traffic. Once the changes have been validated, the green environment is promoted to production through a switchover, and the application endpoint does not change.
Because the managed blue/green deployment feature has no built-in rollback, a rollback plan is essential in case problems surface after the upgrade. One option is a manually created rollback cluster that uses self-managed logical replication to stay synchronized with the new version after the switchover.
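As a rough illustration of the self-managed logical replication described above, the following Python sketch only builds the SQL statements; it does not connect to any database. The publication name, subscription name, and connection string are hypothetical. The `copy_data = false` option is an assumption: it presumes the rollback cluster already holds the same data as the upgraded cluster and that writes stay stopped until the subscription exists.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a string literal by doubling single quotes.

    Assumes standard_conforming_strings = on (the PostgreSQL default).
    """
    return "'" + value.replace("'", "''") + "'"


def publication_sql(publication: str) -> str:
    """Statement to run on the upgraded cluster (the publisher)."""
    return f"CREATE PUBLICATION {quote_ident(publication)} FOR ALL TABLES;"


def subscription_sql(subscription: str, publication: str, conninfo: str) -> str:
    """Statement to run on the rollback cluster (the subscriber).

    copy_data = false is an assumption: the subscriber is expected to
    already contain the publisher's data, so no initial copy is made.
    """
    return (
        f"CREATE SUBSCRIPTION {quote_ident(subscription)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident(publication)} "
        "WITH (copy_data = false);"
    )


if __name__ == "__main__":
    # Hypothetical endpoint and role; supply credentials securely in practice.
    conninfo = "host=new-blue.example.internal port=5432 dbname=appdb user=repl_user"
    print(publication_sql("rollback_pub"))
    print(subscription_sql("rollback_sub", "rollback_pub", conninfo))
```

Generating the statements as strings keeps the sketch independent of any database driver; in practice they would be run with a client such as `psql` against the respective cluster.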
Before the switchover, two clusters exist: the blue cluster (production) and the green cluster (staging). After the switchover, three clusters are present: the old blue cluster (the original production), the new blue cluster (the former green cluster, now the updated production), and the blue prime cluster (the rollback cluster, a clone of the old blue cluster).
The solution has two prerequisites: a cluster parameter group for the new database version with logical replication enabled, and familiarity with the Aurora cloning feature. The process consists of six steps: create a blue/green deployment, stop traffic on the blue cluster, perform the switchover, delete the blue/green deployment, clone the old blue cluster to create the blue prime cluster, and establish logical replication from the new blue cluster to the blue prime cluster.
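The first five steps above can be sketched in Python against the boto3 RDS client. This is a minimal outline under stated assumptions, not a definitive implementation: `rds` is expected to behave like `boto3.client("rds")`, the cluster identifiers and deployment name are hypothetical, and `stop_traffic` and `wait_for_status` are hypothetical callbacks the caller must supply (for example, polling `describe_blue_green_deployments` until the given status is reached). Establishing logical replication is done in SQL and is not covered here.

```python
from typing import Any, Callable


def upgrade_and_clone(
    rds: Any,
    *,
    source_cluster_arn: str,
    target_engine_version: str,
    target_parameter_group: str,
    old_blue_cluster_id: str,
    blue_prime_cluster_id: str,
    stop_traffic: Callable[[], None],
    wait_for_status: Callable[[str, str], None],
    deployment_name: str = "pg-upgrade",  # hypothetical name
) -> str:
    """Run the switchover and create the rollback clone; return the deployment ID."""
    # 1. Create the blue/green deployment. The target parameter group is
    #    assumed to have logical replication enabled.
    created = rds.create_blue_green_deployment(
        BlueGreenDeploymentName=deployment_name,
        Source=source_cluster_arn,
        TargetEngineVersion=target_engine_version,
        TargetDBClusterParameterGroupName=target_parameter_group,
    )
    deployment_id = created["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"]
    wait_for_status(deployment_id, "AVAILABLE")

    # 2. Stop application traffic on the blue cluster (caller-supplied).
    stop_traffic()

    # 3. Switch over: the green cluster becomes the new blue cluster.
    rds.switchover_blue_green_deployment(BlueGreenDeploymentIdentifier=deployment_id)
    wait_for_status(deployment_id, "SWITCHOVER_COMPLETED")

    # 4. Delete the deployment object; the clusters themselves remain.
    rds.delete_blue_green_deployment(BlueGreenDeploymentIdentifier=deployment_id)

    # 5. Clone the old blue cluster to create the blue prime cluster.
    #    A clone has no DB instances yet; adding one is omitted here.
    rds.restore_db_cluster_to_point_in_time(
        DBClusterIdentifier=blue_prime_cluster_id,
        SourceDBClusterIdentifier=old_blue_cluster_id,
        RestoreType="copy-on-write",
        UseLatestRestorableTime=True,
    )
    return deployment_id
```

Passing the client and the two callbacks in as arguments keeps the ordering of the steps visible and lets the sequence be exercised against a stub before it is pointed at a real account.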
This approach has limitations: certain DDL operations are not replicated, and endpoint changes must be handled manually if a rollback is required. Setting up the rollback cluster also incurs additional downtime, since application traffic is stopped before the switchover and the blue prime cluster and its replication are created only afterwards.
Rolling back to the blue prime cluster requires four actions: stop application traffic, update the application or DNS records to point to the blue prime cluster, drop the subscription on the blue prime cluster, and manually update sequence values if necessary. This process is not automatic and requires careful planning and testing.
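The database side of the rollback can be sketched as follows. Sequence values need manual attention because PostgreSQL logical replication does not carry sequence state. The sketch only generates SQL for the blue prime cluster; the subscription name and sequence names are hypothetical, and the sequence values are assumed to have been read from the new blue cluster (for instance with the `pg_sequences` query shown) after traffic was stopped.

```python
from typing import Dict, List, Optional

# Query to run on the new blue cluster to collect current sequence values.
SEQUENCE_QUERY = "SELECT schemaname, sequencename, last_value FROM pg_sequences;"


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def rollback_sql(subscription: str, sequences: Dict[str, Optional[int]]) -> List[str]:
    """Build the statements to run on the blue prime cluster during rollback.

    `sequences` maps a schema-qualified sequence name to its last_value on
    the new blue cluster; None means the sequence was never used and is skipped.
    """
    # By default DROP SUBSCRIPTION also contacts the publisher to remove
    # the replication slot, so the publisher should still be reachable.
    statements = [f"DROP SUBSCRIPTION {quote_ident(subscription)};"]
    for name, last_value in sorted(sequences.items()):
        if last_value is None:
            continue
        # is_called = true: the next nextval() returns last_value + increment.
        statements.append(
            f"SELECT setval({quote_literal(name)}, {int(last_value)}, true);"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical subscription and sequence names.
    for stmt in rollback_sql("rollback_sub", {"public.orders_id_seq": 1042}):
        print(stmt)
```

Reviewing the generated statements before running them is one way to rehearse the rollback during testing.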
In production, it is advisable to retain the blue prime cluster until all applications have transitioned successfully. The old blue cluster can be backed up for compliance purposes before it is deleted. In a test setup, delete all clusters afterwards to avoid additional charges.