Inserting 2 million records per second into Postgres is not just a theoretical exercise; it is achievable in practice. However, rather than fixating on micro-benchmarks, it is worth stepping back to ask a more significant question: which abstractions align best with our specific workload requirements?
This exploration will delve into five distinct methods for inserting data into Postgres using Python. The objective is not merely to identify the fastest method but to comprehend the trade-offs involved in terms of abstraction, safety, convenience, and performance.
By the end of this analysis, readers will gain insights into:
- the strengths and weaknesses of ORM, Core, and driver-level inserts
- when performance truly matters
- how to select the appropriate tool without succumbing to over-engineering
Why Fast Inserts Matter
High-volume insert workloads are commonplace in various scenarios:
- loading millions of records
- syncing data from external APIs
- backfilling analytics tables
- ingesting events or logs into warehouses
Even minor inefficiencies can accumulate rapidly. Transforming a 3-minute insert job into a 10-second operation can significantly alleviate system load, liberate resources, and enhance overall throughput.
However, it is essential to recognize that faster does not inherently equate to better. In scenarios involving smaller workloads, sacrificing clarity and safety for marginal speed gains often proves counterproductive. The true aim lies in understanding when performance is critical and why it matters.
Which Tools Do We Use to Insert?
To interact with our Postgres database, we need a database driver. Here, we use psycopg3, with SQLAlchemy as an additional layer on top. Here's how the two differ:
Psycopg3 (the Driver)
psycopg3 is a low-level PostgreSQL driver for Python. This minimal abstraction communicates directly with Postgres, leaving the developer responsible for writing SQL, managing batching, and ensuring correctness.
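For illustration, here is a minimal driver-level sketch; the connection string and the users table are assumptions for this example, not part of the benchmark:

```python
import psycopg

rows = [("Ada", 36), ("Grace", 45)]  # plain tuples: the driver's native shape

# The psycopg3 connection context manager commits on a clean exit.
with psycopg.connect("postgresql://localhost/demo") as conn:
    with conn.cursor() as cur:
        # You write the SQL and placeholders yourself; the driver binds
        # the parameters and handles the round trips.
        cur.executemany(
            "INSERT INTO users (name, age) VALUES (%s, %s)",
            rows,
        )
```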
SQLAlchemy
SQLAlchemy operates atop database drivers like psycopg3, offering two distinct layers:
1) SQLAlchemy Core
This layer provides a SQL abstraction and execution framework that is database-agnostic, allowing developers to write Python expressions that Core translates into the appropriate SQL dialect while safely binding parameters.
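A minimal Core sketch, again assuming a hypothetical users table; the data arrives as dictionaries, and Core compiles and binds the INSERT for us:

```python
from sqlalchemy import MetaData, Table, Column, Integer, String, create_engine, insert

metadata = MetaData()
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("age", Integer),
)

engine = create_engine("postgresql+psycopg://localhost/demo")

# Passing a list of dicts triggers SQLAlchemy's "executemany" path:
# one compiled INSERT, safely bound parameters, no raw SQL strings.
with engine.begin() as conn:
    conn.execute(
        insert(users),
        [{"name": "Ada", "age": 36}, {"name": "Grace", "age": 45}],
    )
```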
2) SQLAlchemy ORM
The ORM, built on Core, offers even greater abstraction by mapping Python classes to database tables, tracking object states, and managing relationships. While the ORM enhances productivity and safety, it introduces overhead, particularly during bulk operations.
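A minimal ORM sketch in SQLAlchemy 2.0 style, with a hypothetical User model mapped to the same assumed table:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    age: Mapped[int]

engine = create_engine("postgresql+psycopg://localhost/demo")

with Session(engine) as session:
    # The unit of work tracks each object's state, emits the INSERTs,
    # and refreshes primary keys on commit.
    session.add_all([User(name="Ada", age=36), User(name="Grace", age=45)])
    session.commit()
```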
In essence, these three options exist along a spectrum:
- ORM simplifies the use of Core
- Core enhances the safety of using the Driver while maintaining database agnosticism
The Benchmark
To ensure a fair benchmarking process:
- each method receives data in its intended format (ORM objects for ORM, dictionaries for Core, tuples for the Driver)
- only the time spent transferring data from Python to Postgres is measured
- no method incurs penalties for conversion tasks
- the database operates within the same environment as our Python script, preventing bottlenecks from upload speeds
The aim is not to identify the fastest insert method but to comprehend what each approach excels at.
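To make the measurement rule concrete, here is a rough sketch of the timing loop; insert_fn and payload are hypothetical stand-ins for each method under test and its pre-converted data:

```python
import time

def measure(insert_fn, payload):
    # payload is prepared up front so only the Python-to-Postgres
    # transfer lands inside the timed window.
    start = time.perf_counter()
    insert_fn(payload)
    return time.perf_counter() - start
```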
1) Is Faster Always Better?
What is better? A Ferrari or a Jeep?
The answer hinges on the problem you’re trying to solve. For navigating a forest, the Jeep is preferable. If speed is your goal, the Ferrari takes the lead.
This analogy extends to data insertion. Reducing a 10-second insert by 300 milliseconds may not warrant the added complexity and risk. Conversely, in other scenarios, such a gain could be invaluable.
Notably, the fastest method on paper may prove to be the slowest when considering:
- maintenance costs
- correctness guarantees
- cognitive load
2) What is Your Starting Point?
The right insertion strategy depends less on row count and more on the shape your data already has.
ORM, Core, and the driver are not competing tools; they are optimized for different objectives:
| Method | Purpose |
| --- | --- |
| ORM (add_all) | Business logic, correctness, small batches |
| ORM (bulk_save_objects) | ORM objects at scale |
| Core (execute) | Structured data, light abstraction |
| Driver (executemany) | Raw rows, high throughput |
| Driver (COPY) | Bulk ingestion, ETL, firehose workloads |
The ORM excels in CRUD-heavy applications where clarity and safety are paramount, such as websites and APIs. Here, performance is typically “good enough,” and clarity takes precedence.
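The two ORM rows in the table above differ mainly in bookkeeping: add_all runs the full unit of work, while bulk_save_objects (a legacy bulk API in modern SQLAlchemy) skips most of it. A sketch, reusing the hypothetical User model and engine from earlier:

```python
from sqlalchemy.orm import Session

with Session(engine) as session:
    users = [User(name=f"user_{i}", age=i % 90) for i in range(10_000)]
    # No relationship handling or attribute refresh: less safety, more speed.
    session.bulk_save_objects(users)
    session.commit()
```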
Core is ideal for scenarios requiring control without the need to write raw SQL, such as data ingestion, batch jobs, and analytics pipelines.
The Driver is tailored for maximum throughput, particularly in cases involving extensive writes, such as machine learning training sets, bulk loads, and low-latency ingestion services. While it minimizes overhead, it also necessitates manual SQL writing, increasing the risk of errors.
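As a sketch of that trade-off, here is what a COPY-based load might look like with psycopg3; the table and the generated rows are assumptions:

```python
import psycopg

# A generator keeps memory flat while streaming a large batch.
rows = ((f"user_{i}", i % 90) for i in range(1_000_000))

with psycopg.connect("postgresql://localhost/demo") as conn:
    with conn.cursor() as cur:
        # COPY streams rows over a single protocol channel instead of
        # issuing batched INSERT statements, which is why it tops
        # throughput charts.
        with cur.copy("COPY users (name, age) FROM STDIN") as copy:
            for row in rows:
                copy.write_row(row)
```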
3) Don’t Mismatch Abstractions
The ORM isn’t slow. COPY isn’t magic.
Performance challenges arise when data is forced through an abstraction for which it was not designed:
- Using Core with SQLAlchemy ORM objects can lead to slowdowns due to conversion overhead.
- Utilizing ORM with tuples is often awkward and brittle.
- Employing ORM bulk operations in ETL processes can result in wasted overhead.
At times, reverting to a lower level can actually enhance performance.
When to Choose Which?
A useful rule of thumb is as follows:
| Layer | Use it when… |
| --- | --- |
| ORM | You are building an application (correctness and productivity) |
| Core | You are moving or transforming data (balance between safety and speed) |
| Driver | You are pushing performance limits (raw power and full responsibility) |