How Heroku migrated hundreds of thousands of self-managed PostgreSQL databases to Amazon Aurora

Heroku has migrated its multi-tenant PostgreSQL database fleet from a self-managed environment on Amazon Elastic Compute Cloud (EC2) to Amazon Aurora PostgreSQL-Compatible Edition. The migration was carried out without customer impact, and it increased platform reliability while reducing the operational burden on Heroku’s engineering teams.

An Overview of Heroku

Heroku is a platform as a service (PaaS) built on Amazon Web Services (AWS). Founded in 2007 and acquired by Salesforce in 2010, Heroku now underpins over 13 million applications, with customers ranging from startups to large enterprises.

Heroku simplifies application deployment and scaling through Dynos, which are secure, scalable containers for the application runtime. Through Heroku Data Services, it also offers fully managed data add-ons, including Heroku Postgres, Apache Kafka on Heroku, and Heroku Key-Value Store. This lets developers focus on application development rather than on managing data infrastructure.

Heroku’s Previous Self-Managed PostgreSQL Architecture and Challenges

Heroku’s legacy architecture involved multiple control planes to manage customer resources effectively. The primary control plane handled requests from customers, creating the necessary resources for a managed database experience. This included provisioning virtual private clouds (VPCs), EC2 instances, Amazon Elastic Block Store (EBS) volumes, and Amazon Simple Storage Service (S3) paths based on specific add-on requirements.

While this architecture served Heroku well for over a decade, the increasing complexity of managing a fleet of database instances began to weigh heavily on the engineering team. The responsibility for maintaining infrastructure availability, security, and updates required extensive coding and monitoring efforts, diverting focus from enhancing customer experiences.

Heroku’s New PostgreSQL Architecture With Aurora

The new architecture, built on Amazon Aurora, lets Heroku use AWS services more efficiently. Heroku’s control plane is modeled on a finite-state machine: each resource moves through a set of defined states, and a change in a resource’s status triggers the next automated action. With Aurora, Heroku no longer manages server infrastructure such as custom Amazon Machine Images (AMIs) or operating system updates, so it can concentrate on delivering customer value.
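The finite-state machine idea can be illustrated with a minimal sketch. The state names, event names, and transition table below are hypothetical and chosen only for illustration; the article does not describe Heroku's actual states or events.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical lifecycle states for a managed database resource.
    REQUESTED = auto()
    PROVISIONING = auto()
    AVAILABLE = auto()
    FAILED = auto()


# Hypothetical transition table: (current state, observed event) -> next state.
TRANSITIONS = {
    (State.REQUESTED, "accepted"): State.PROVISIONING,
    (State.PROVISIONING, "cluster_ready"): State.AVAILABLE,
    (State.PROVISIONING, "error"): State.FAILED,
    (State.FAILED, "retry"): State.PROVISIONING,
}


def step(state, event):
    """Return the next state, or the current state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.REQUESTED):
    """Feed a sequence of status-change events through the machine."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    final = run(["accepted", "error", "retry", "cluster_ready"])
    print(final.name)
```

Keeping the transitions in a single table makes the allowed state changes easy to audit, and events that do not apply to the current state are simply ignored rather than causing an error.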

How Heroku Migrated Over 200,000 Databases

The migration of over 200,000 self-managed PostgreSQL databases to Aurora was a formidable challenge that Heroku Data tackled with strategic planning and execution. In collaboration with AWS, Heroku’s engineering team received specialized training on Amazon Aurora, equipping them with the knowledge necessary for a smooth transition.

Using a dual control plane approach, the team built a dedicated transfer system around pgcopydb, an open source tool that copies a PostgreSQL database between two running servers. Because pgcopydb copies tables and builds indexes in parallel, it is considerably faster than a sequential dump and restore. Comprehensive testing kept the migration reliable at an average rate of 2,000 databases per day.
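As a rough illustration of what a parallel pgcopydb copy looks like, the sketch below only builds the command line for `pgcopydb clone` with its `--table-jobs` and `--index-jobs` options. The helper name, connection strings, and job counts are assumptions made for this example; the article does not describe how Heroku's transfer system actually invokes pgcopydb.

```python
import shlex


def build_pgcopydb_clone(source_uri, target_uri, table_jobs=4, index_jobs=4):
    """Build (but do not run) a pgcopydb clone command as an argument list.

    table_jobs and index_jobs set how many tables are copied and how many
    indexes are built concurrently. The defaults here are illustrative only.
    """
    if table_jobs < 1 or index_jobs < 1:
        raise ValueError("job counts must be positive")
    return [
        "pgcopydb", "clone",
        "--source", source_uri,
        "--target", target_uri,
        "--table-jobs", str(table_jobs),
        "--index-jobs", str(index_jobs),
    ]


if __name__ == "__main__":
    # Placeholder connection strings, not real endpoints.
    cmd = build_pgcopydb_clone(
        "postgres://user@source.example/app",
        "postgres://user@target.example/app",
        table_jobs=8,
    )
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string means the command can be passed to `subprocess.run` without shell quoting problems if it is ever executed.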

Customers were offered two migration paths: a self-serve option allowing them to initiate the process at their convenience, and an automated migration that systematically transitioned databases to minimize disruption.

Advantages and Benefits of the New Architecture

The migration to Amazon Aurora is a strategic investment by Heroku that reinforces its commitment to customer experience. It relieves Heroku engineers of much of the operational burden they previously carried, so they can focus on innovation rather than infrastructure management.

With the new architecture, Heroku Data is poised to introduce advanced features such as AI-enabled database administration, auto-scaling capabilities, and enhanced security measures, all while maintaining the simplicity that Heroku is known for. The integration of Amazon Aurora not only enhances security through built-in encryption and automated patching but also provides comprehensive audit capabilities via AWS CloudTrail.

Tech Optimizer