In this post, we demonstrate the process of migrating spatial (geospatial) data from self-managed PostgreSQL, Amazon RDS for PostgreSQL, or Amazon Aurora PostgreSQL-Compatible Edition to Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition utilizing the AWS Database Migration Service (AWS DMS). Spatial data plays a crucial role in applications such as mapping, routing, asset tracking, and geographic visualization. We will guide you through the steps of setting up your environment, configuring AWS DMS, and validating the successful migration of spatial datasets.
Geospatial data encapsulates physical network paths, building locations, and various elements that traverse space. This type of data is essential for identifying geographic locations and generating insights for diverse applications. Represented through a 2D coordinate system, it can take multiple forms, including:
- Points (e.g., GPS coordinates)
- Lines (e.g., roads or rivers)
- Polygons (e.g., borders and land use zones)
- Collections of these features
AWS DMS treats these columns as large objects (LOBs) rather than native spatial types. Without proper configuration for LOB handling and enabling PostGIS support on the target database, replication tasks may encounter failures or result in incomplete data migration.
Amazon RDS for PostgreSQL spatial data
PostgreSQL databases can leverage the PostGIS extension to effectively manage geospatial data types and functions. This extension facilitates the storage and manipulation of geometry and geography spatial types while offering specialized spatial functions. When dealing with spatial data, PostGIS enables operations such as ST_Within for conducting radius searches and distance calculations between geometries. The extension supports various geometric objects and can execute complex spatial operations, such as verifying if a polygon is contained within another polygon using simple SQL functions like ST_Contains.
Amazon Relational Database Service (Amazon RDS) for PostgreSQL offers full integration with PostGIS functionality, making it an ideal choice for applications that require geospatial operations.
Solution overview
This solution utilizes AWS DMS to replicate geospatial data managed by the PostGIS extension. Both the source and target databases are Amazon RDS for PostgreSQL. While they can operate on different PostgreSQL versions, it is essential to ensure compatibility of the PostGIS extension versions between the two databases. AWS DMS supports cross-version PostgreSQL migrations.
Prerequisites
Before configuring the PostGIS extension for RDS for PostgreSQL in both the source and target databases, ensure you meet the following prerequisites:
- AWS setup:
- An AWS account with permissions to create and manage AWS DMS resources.
- Amazon RDS for PostgreSQL instances serving as both source and target. For testing, a single RDS instance can be used to replicate data between different databases within that instance to minimize costs.
- AWS DMS replication instance, endpoints, and an AWS DMS task created.
- Database configuration:
- PostGIS must be installed on both source and target databases.
- To install PostGIS, connect to each database using your preferred SQL client and execute the following command:
CREATE EXTENSION postgis;Note: Without the PostGIS extension on the target database, the AWS DMS task will fail with errors such as “type geometry does not exist.”
- Tables must have primary key or unique key constraints:
CREATE TABLE public.province_boundary ( province_id SERIAL PRIMARY KEY, -- Required: Primary key for DMS replication name TEXT NULL, osm_id INT4 NULL, country_code TEXT NULL, concave_hull GEOMETRY(GEOMETRY, 3857) NULL, -- Nullable for LOB handling admin_level INT4 NULL );Note: Geometry columns must be nullable on target tables for LOB handling. AWS DMS will remove LOB columns if tables lack a primary key or unique constraint. This is critical for capturing LOB changes during CDC. Geometry columns must be nullable on target tables; if your source table has NOT NULL constraints on geometry columns, migration will fail. You can reapply NOT NULL constraints after migration completes.
- Additional considerations:
- Review Managing spatial data with the PostGIS extension in the Amazon RDS documentation.
- Evaluate which additional PostGIS extensions might be necessary for your use case.
- Ensure the database instance has adequate resources for PostGIS operations. Monitor key Amazon CloudWatch metrics: CPUUtilization (25% available), ReadIOPS/WriteIOPS, and FreeStorageSpace.
Best practices when configuring PostGIS extension
When configuring PostGIS, consider the following best practices:
- Validate PostgreSQL and PostGIS version compatibility. To confirm the extension was created, use the following command:
SELECT * FROM pg_extension where extname= 'postgis'; - Assess additional PostGIS extensions based on workload requirements.
- Monitor resource usage during the execution of complex spatial queries.
- Establish appropriate backup strategies for spatially enabled databases.
Implementation steps
Step 1: Start by creating a spatial table in the source Amazon RDS for PostgreSQL database. To define spatial tables and load sample data, follow these steps. The example below includes a geometry column defined using the 3857 spatial reference system (SRID):
- Create the sample table in your source database using the DDL provided in the prerequisites section (2c).
- Insert sample data into the
province_boundarytable with the following query:INSERT INTO province_boundary ( name, osm_id, country_code, concave_hull, admin_level ) SELECT 'lionkingbaby', 123456, 'TN', ST_SetSRID( ST_MakeLine( ARRAY( SELECT ST_MakePoint(x * 0.01, sin(x * 0.01)) -- creates curved path FROM generate_series(1, 400000) AS x -- adjust this for size ~400KB ) ), 3857 ), 4; - Execute the following query to measure the geometry size (in bytes):
SELECT name, pg_column_size(concave_hull) AS hull_size_bytes FROM province_boundary;
Step 2: Prepare the Target Database and Configure AWS DMS
- Before initiating the AWS DMS task, connect to your target Amazon RDS for PostgreSQL database and create the PostGIS extension:
CREATE EXTENSION postgis;Note: AWS DMS automatically creates the target table if it doesn’t exist, regardless of the table preparation mode selected. However, the PostGIS extension must be installed beforehand.
- Configure the AWS DMS replication task by creating a new AWS DMS task with the following settings:
- Migration type: Full load and CDC
- Target table preparation mode: Do nothing
- LOB mode: Limited LOB mode
- Maximum LOB size: 2048 KB
Note: “The maximum LOB size for geospatial data cannot exceed 980,000 KB.” - To determine the appropriate LOB size for your spatial data, refer to the queries provided in the AWS DMS documentation for determining LOB column sizes.
- Enable CloudWatch logs: Essential for monitoring migration progress, troubleshooting issues, and validating successful data transfer.
- Activate log context for detailed monitoring.
- Add selection rules to include your spatial tables (e.g., schema: public, table: province_boundary).
- After configuring all settings, start the AWS DMS task to commence the migration.
- Monitor the migration process.
Once the task begins, keep an eye on the CloudWatch logs to verify successful data transfer. You should observe log entries similar to the following:
2025-08-08T12:06:55 [SOURCE_UNLOAD] I: Unload finished for table 'public'.'province_boundary' (Id = 1). 7 rows sent. (streamcomponent.c:4010)
2025-08-08T12:06:55 [TARGET_LOAD] I: Load finished for table 'public'.'province_boundary' (Id = 1). 7 rows received. 0 rows skipped. Volume transferred 29203256. {connectionId:22438} (streamcomponent.c:4303)
Common errors encountered during replication
During spatial data migration, you may encounter errors related to LOB handling or PostGIS data types. A typical error when AWS DMS fails to process geometry data is illustrated below:
00220859: 2025-04-03T16:51:27 [TASK_MANAGER] W: Table 'public'.'cluster' was errored/suspended (subtask 3 thread 1). Command failed to load data with exit error code 0 and exitwhy 1. Please check target database logs for more information.; Failed to wait for previous run; Failed to start load process for file '2'; Failed to load file '2' (replicationtask.c:3068)
Upon reviewing the database logs, an ‘invalid input json’ error was identified.
Generate test data on the source database
To test the spatial data migration with a larger dataset, follow this procedure to generate synthetic spatial data:
- Connect to your source Amazon RDS for PostgreSQL database using your preferred SQL client (e.g., psql, pgAdmin).
- Run the following script to generate test data. You can modify the value of `v_rows` to control how many rows are generated (the current script generates 10 rows):
BEGIN; DO $$ DECLARE v_pts_per_row integer := 20000; v_hull_pct float := 0.90; i integer; g geometry; BEGIN FOR i IN 1..v_rows LOOP SELECT ST_SetSRID( ST_ConcaveHull( ST_Collect(ARRAY( SELECT ST_MakePoint( i * 20000 + x * 2 + (random() * 50.0), i * 20000 + sin(x / 20.0) * 300.0 + (random() * 50.0) ) FROM generate_series(1, v_pts_per_row) AS x )), v_hull_pct, TRUE ), 3857 ) INTO g; INSERT INTO public.province_boundary(name, osm_id, country_code, concave_hull, admin_level) VALUES ( format('synthetic_%s', i), 900000 + i, 'TN', g, 4 ); END LOOP; END$$; CREATE INDEX IF NOT EXISTS idx_province_boundary_geom ON public.province_boundary USING GIST (concave_hull); COMMIT; - Verify the data was created successfully by checking the row count:
SELECT COUNT (*) FROM public.province_boundary;
Post-migration validation
To validate your data after migration, follow these steps:
- Compare row counts between source and target tables.
- Verify that geometry data types are preserved correctly. Run the following query on both source and target databases to compare geometry columns:
CREATE TABLE public.province_boundary ( province_id SERIAL PRIMARY KEY, -- Required: Primary key for DMS replication name TEXT NULL, osm_id INT4 NULL, country_code TEXT NULL, concave_hull GEOMETRY(GEOMETRY, 3857) NULL, -- Nullable for LOB handling admin_level INT4 NULL );Ensure the SRID and geometry type match between source and target.
- Rebuild spatial indexes as needed.
- Validate the functionality of PostGIS spatial functions (for example,
ST_Within,ST_Contains). - Reapply NOT NULL constraints to geometry columns if they were temporarily made nullable.
Consider the following best practices for your migration:
- Implement table parallelism for efficient data loading.
- Monitor task progress through CloudWatch logs.
Clean up
To prevent ongoing charges, ensure you clean up all resources after migration, including AWS DMS tasks, the AWS DMS replication instance, endpoints, and databases. For guidance on deleting AWS DMS tasks, refer to delete-replication-task. To delete the replication instance, consult the documentation on deleting a replication instance. If you created test databases specifically for this migration and no longer require them, delete the Amazon RDS for PostgreSQL instances.