SQL Server vs. PostgreSQL query optimization: room for improvement?

For years, the PostgreSQL community has focused on smoothing migration paths from Oracle, developing tools that mirror Oracle’s SQL profile and SQL plan baseline functionality: the AQO and sr_plan extensions. These have made PostgreSQL a more appealing option for users transitioning from Oracle, and PostgreSQL has even been observed to outperform Oracle in certain scenarios, particularly automatic re-optimization. Migrations from Oracle to PostgreSQL tend to go smoothly, aided by session variable extensions designed to ease the transition. While PostgreSQL has its share of enterprise-only features, it often integrates popular solutions directly into its core, making it a versatile choice for many users.

However, the migration from SQL Server to PostgreSQL presents a different set of challenges. Users have reported significant query slowdowns during these transitions, with problematic queries arising from databases of varying sizes, from gigabytes to terabytes. In one notable instance, a query that executed in 20 milliseconds on SQL Server took weeks to complete on PostgreSQL due to an inefficient query plan. This discrepancy raises questions about SQL Server’s technological advantages, warranting a closer examination.

Temp tables & parallel execution

During a SQL Server to PostgreSQL migration, a significant slowdown was observed in a simple JOIN with GROUP BY over small tables. The query itself was straightforward:

SELECT sum(t1.x * t2.count) 
FROM t1, t2 
WHERE t1.x3 = t2.x3 AND t1.x4 = t2.x4 
GROUP BY t1.x1, t1.x2, t1.x3, t1.x4;
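The table definitions are not shown in the article; a plausible minimal schema, assuming text-typed columns as described later in the profiling discussion, might look like this:

```sql
-- Hypothetical schema: column names are taken from the query above,
-- text types are an assumption based on the article's later remarks.
CREATE TEMP TABLE t1 (x1 text, x2 text, x3 text, x4 text, x int);
CREATE TEMP TABLE t2 (x3 text, x4 text, count int);
CREATE INDEX t2_idx ON t2 (x3, x4);
```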

The execution plan in PostgreSQL revealed a HashAggregate taking an astonishing 4000 seconds to produce just over 2,000 output rows, while SQL Server completed the same operation in 300 seconds. The JOIN exploded to 1.5 billion intermediate rows, and the GROUP BY alone consumed about an hour of processing time.

To understand the underlying causes, we can compare the execution plans from both databases:

HashAggregate (actual time=4000s, rows=2.1E3)
  Group Key: t1.x1, t1.x2, t1.x3, t1.x4
  -> Nested Loop (actual time=500s, rows=1.5E9)
      -> Seq Scan on t1 (actual time=0.3s, rows=240000)
      -> Index Scan using t2_idx on t2 (actual time=0.3E-4s, rows=11)
Two differences stood out in SQL Server’s plan:

1. SQL Server used a Hash Join instead of a Nested Loop, which by itself yielded only a modest runtime improvement.
2. SQL Server parallelized execution across eight threads, which accounted for most of the speedup.

Profiling the PostgreSQL execution showed excessive time spent on hash calculations and tuple comparisons. The query had to collapse roughly 1.5 billion incoming tuples into 21,000 groups of around 70,000 tuples each, a heavy workload driven by the high number of duplicates. The text data types of the grouping columns made hashing and comparison more expensive still.

Parallelism

SQL Server’s approach of leveraging multithreading to speed up grouping operations is noteworthy. While PostgreSQL does utilize parallel workers, they operate as separate processes, introducing additional startup and communication overhead. By default, PostgreSQL limits the number of worker processes per query to ensure fair CPU time distribution, which can cap single-query performance. Moreover, temporary tables are not visible to parallel workers, which rules out parallel plans for queries that reference them.

To explore the impact of parallel workers in PostgreSQL, the temporary tables can be converted to regular tables, and the planner can be pushed toward parallel plans by adjusting settings such as min_parallel_table_scan_size and max_parallel_workers:

SET max_parallel_workers = 32; 
SET max_parallel_workers_per_gather = 16;
SET parallel_setup_cost = 0.001;
SET parallel_tuple_cost = 0.0001;
SET min_parallel_table_scan_size = 0;

After implementing these changes, PostgreSQL’s performance became comparable to SQL Server’s:

Finalize HashAggregate (actual time=416s)
  Group Key: t1.x1, t1.x2, t1.x3, t1.x4
  -> Gather (actual time=416s)
      Workers Launched: 9
      -> Partial HashAggregate (actual time=416s)
          Group Key: t1.x1, t1.x2, t1.x3, t1.x4
          -> Nested Loop (actual time=68s)
              -> Parallel Seq Scan on t1 (actual time=0.08s)
              -> Index Scan using t2_idx on t2 (actual time=0.04s)

With these adjustments, PostgreSQL’s runtime came close to SQL Server’s, highlighting the potential benefits of tuning parallel execution strategies.

Multi-clause expression selectivity estimation

Another challenge arises with JOINs or filters that involve multiple conditions. PostgreSQL estimates the selectivity of each sub-expression separately and multiplies the results, implicitly assuming the conditions are independent. When the conditions are correlated, this leads to severe underestimation. For instance, consider the following join expression:

t1.x1 = t2.x1 AND t1.x2 = t2.x2 AND t1.x3 = t2.x3 AND t1.x4 = t2.x4;

The planner calculates the total number of rows produced by the JOIN based on these partial estimates. However, this method frequently results in an underestimation of the actual rows produced, particularly when the data distribution is not uniform across the columns.
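The scale of this underestimation can be shown with a back-of-the-envelope calculation. All numbers below (row counts, per-column distinct counts) are assumed for illustration, not figures from the migration described above:

```python
# Sketch of the independence assumption in multi-clause join selectivity.
# Row counts and distinct-value counts are illustrative assumptions.
n1 = n2 = 240_000                  # rows in t1 and t2
ndistinct = [100, 100, 50, 50]     # distinct values per joined column

# Per-clause equi-join selectivity under uniformity is roughly 1/ndistinct;
# assuming independence, the planner multiplies the per-clause values.
combined_sel = 1.0
for nd in ndistinct:
    combined_sel *= 1.0 / nd

estimated = n1 * n2 * combined_sel
print(f"planner estimate: {estimated:,.0f} rows")   # 2,304

# If the columns are perfectly correlated (x2..x4 determined by x1),
# the true combined selectivity is just 1/100:
actual = n1 * n2 / 100
print(f"actual rows:      {actual:,.0f}")           # 576,000,000
print(f"underestimated by {actual / estimated:,.0f}x")
```

Even this toy case is off by five orders of magnitude, which is enough to make the planner pick a Nested Loop where a Hash Join is needed.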

In contrast, SQL Server excels at handling such scenarios by collecting extensive statistics during index creation. This includes histograms for estimating JOIN and WHERE selectivity, allowing for more accurate cardinality estimations. PostgreSQL’s current extended statistics capabilities are limited, primarily focusing on scan filters rather than JOIN cardinality estimation. Fortunately, the PostgreSQL community is actively working on addressing this limitation.
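PostgreSQL’s extended statistics can still be tried on such correlated columns, with the caveat noted above that they currently improve scan-filter estimates rather than join cardinality. A sketch, with an illustrative statistics name:

```sql
-- Extended statistics on the correlated columns of t1 (name illustrative).
-- In current PostgreSQL releases these help estimates for scan filters
-- (WHERE x1 = ... AND x2 = ...), not for join cardinality.
CREATE STATISTICS t1_corr (ndistinct, dependencies) ON x1, x2, x3, x4 FROM t1;
ANALYZE t1;
```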

Postgres Professional has introduced a method to partially compensate for this limitation by representing complex expressions differently. By defining a new composite type, users can leverage standard PostgreSQL tools to compute and store statistics for the expression, thereby improving selectivity estimation without relying solely on extended statistics.
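The article does not show the implementation; one way to apply the composite-type idea, sketched here purely as an assumption, is to wrap the joined columns into a single composite value so that PostgreSQL’s ordinary statistics machinery treats the combination as one column:

```sql
-- Sketch only: type and index names are hypothetical, and this may or
-- may not match Postgres Professional's actual approach.
CREATE TYPE quad AS (a text, b text, c text, d text);

-- An expression index makes ANALYZE collect statistics for the
-- composite expression as a whole.
CREATE INDEX t1_quad_idx ON t1 ((ROW(x1, x2, x3, x4)::quad));
ANALYZE t1;

-- The join condition can then compare the composite values directly:
-- ... WHERE ROW(t1.x1, t1.x2, t1.x3, t1.x4)::quad
--         = ROW(t2.x1, t2.x2, t2.x3, t2.x4)::quad
```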

SQL Server’s advanced internal parameterization and parameter value caching further enhance its optimization capabilities. For example, when joining three large tables, SQL Server efficiently caches subquery results, significantly speeding up execution times compared to PostgreSQL’s approach.

In summary, while PostgreSQL has made significant strides in improving migration experiences and performance, there remain areas where SQL Server’s optimizations provide distinct advantages. The ongoing exploration of these differences not only highlights potential paths for PostgreSQL’s evolution but also serves as a reminder of the importance of continuous improvement in database technologies.
