NeuroAgent

How to Optimize a Slow PostgreSQL Query Without Using UNION

I have a large partitioned PostgreSQL table (approximately 150 GB) with the following schema:

sql
CREATE TABLE table_partition 
(
    id int4,
    "key" text, 
    value_type int4, 
    value jsonb, 
    device_time timestamptz, 
    ts timestamptz
) PARTITION BY RANGE (ts);

CREATE INDEX table_partition_ts_id_index ON table_partition (id, ts);
CREATE INDEX table_partition_ts_id_index_2 ON table_partition (ts DESC, id);
CREATE INDEX table_partition_key_idx ON table_partition (key);
CREATE INDEX table_partition_key_ts_idx ON table_partition (key, ts);
CREATE INDEX table_partition_ts_key_idx ON table_partition (ts, key);

The table is partitioned by date, with partitions ranging from 160K to 66 GB in size.

The Slow Query

My original query is:

sql
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key IN ('string1', 'string2')
ORDER BY csh.ts
LIMIT 10;

This query has an execution plan showing:

  • Execution time: 292,582.395 ms
  • High I/O: 1,104,471 buffer reads
  • Uses Index Scan Backward on table_partition202508_ts_key_idx

The Fast Query (with UNION)

When I add a “magic” UNION with a non-existent key value, the query becomes dramatically faster:

sql
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key IN ('string1', 'string2')

UNION ALL

SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key = 'rubbish'  -- no such key value exists in the DB
ORDER BY csh.ts
LIMIT 10;

This query has an execution plan showing:

  • Execution time: 82.582 ms
  • Much lower I/O: 12,582 buffer reads
  • Uses Parallel Bitmap Heap Scan with Parallel Append

Question

Is there a way to convince PostgreSQL to use the faster execution plan (with the bitmap scan and parallel processing) without resorting to the “magic” UNION trick? What PostgreSQL configuration settings, query hints, or index strategies could help optimize the original query to achieve similar performance?

NeuroAgent

The “magic” UNION trick works because it changes the plan PostgreSQL chooses. Instead of a single-process Index Scan Backward on the (ts, key) index, the planner uses a Parallel Append over Parallel Bitmap Heap Scans, so several worker processes fetch the matching rows at the same time. Execution time drops from about 293 seconds to about 83 milliseconds.

Understanding the Performance Difference

The dramatic performance difference between your two queries stems from how PostgreSQL’s query planner chooses execution strategies based on the query structure. Your original query uses an Index Scan Backward on table_partition202508_ts_key_idx, which walks the index in ts order in a single process and reads 1,104,471 buffers before it has collected ten matching rows.

According to PostgreSQL’s parallel plans documentation, the planner might choose a Parallel Append of regular Index Scan plans where “each individual index scan would have to be executed to completion by a single process, but different scans could be performed at the same time by different processes.”

The key insight is that your “fast” query with UNION ALL creates a scenario where PostgreSQL can:

  1. Use multiple worker processes to scan the table
  2. Employ bitmap operations for more efficient data retrieval
  3. Leverage shared memory structures for parallel processing

As stated in the documentation, “whenever PostgreSQL needs to combine rows from multiple sources into a single result set, it uses an Append or MergeAppend plan node. This commonly happens when implementing UNION ALL or when scanning a partitioned table.”


Why the UNION Trick Enables Parallel Processing

The UNION trick works by steering the query planner onto a parallel execution path. When you add a second branch, even one that matches no rows such as WHERE csh.key = 'rubbish', PostgreSQL plans the query as several separate scans combined by an Append node, and those scans can be executed in parallel.

The PostgreSQL documentation notes that “in a parallel bitmap heap scan, one process is chosen as the leader”: that process scans the index and builds the bitmap, and the table blocks it identifies are then divided among the cooperating processes. In the fast plan, each branch of the UNION ALL becomes a separate scan that the Parallel Append can distribute across worker processes.

The improvement you’re seeing (roughly 3,500x) is far larger than parallelism alone usually delivers; one cited case reports a bitmap scan running “3x faster by using 3 workers whereas overall plan got ~40% faster”. That suggests much of your gain comes from the fast plan reading far fewer buffers (12,582 instead of 1,104,471), with the parallel workers adding to it.
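As a quick sanity check of the figures quoted in the question, the two ratios can be computed directly. This is only arithmetic on the reported timings and buffer counts; the column aliases are illustrative.

```sql
-- Ratios of the slow plan to the fast plan, using the numbers from the question
SELECT round(292582.395 / 82.582)  AS time_ratio,    -- execution time, ms / ms
       round(1104471 / 12582.0, 1) AS buffer_ratio;  -- buffer reads / buffer reads
```

The run time falls by a factor of roughly 3,500, while buffer reads fall by a factor of roughly 88, so the two plans differ in more than just the number of processes doing the work.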

The critical limitation is that “[there is] no such thing as a Parallel Bitmap Index Scan” as noted in one source. However, the heap scan phase can be parallelized, which is what the UNION trick enables.


Alternative Optimization Strategies

1. Planner Controls in Place of Query Hints

PostgreSQL has no built-in query hints, but session-level planner settings can steer plan choice without resorting to UNION tricks:

sql
-- Allow more parallel workers (this permits parallel plans; it does not force them)
SET max_parallel_workers_per_gather = 4;
SET max_parallel_workers = 8;

-- Enable or disable scan methods (these are scan types, not join methods)
SET enable_seqscan = off;
SET enable_indexscan = on;
SET enable_bitmapscan = on;
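
A minimal sketch of how such settings might be tried safely: SET LOCAL confines a change to the current transaction, so an experiment does not leak into the rest of the session. Disabling ordered index scans here is an assumption about what nudges the planner toward the bitmap plan; whether it does depends on your data and statistics, and the enable_* switches are meant for diagnosis rather than permanent use.

```sql
BEGIN;

-- Settings changed with SET LOCAL revert automatically at COMMIT or ROLLBACK
SET LOCAL enable_indexscan = off;      -- discourage the ts-ordered index scan
SET LOCAL enable_indexonlyscan = off;  -- and its index-only variant
SET LOCAL max_parallel_workers_per_gather = 4;

EXPLAIN (ANALYZE, BUFFERS)
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key IN ('string1', 'string2')
ORDER BY csh.ts
LIMIT 10;

ROLLBACK;
```

If the resulting plan is fast, that tells you the bitmap path is viable and the remaining work is to get the planner to prefer it through costs, statistics, or indexes.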

2. Materialized Views

For frequently executed queries, consider creating a materialized view:

sql
CREATE MATERIALIZED VIEW fast_key_lookup AS
SELECT key, ts 
FROM table_partition202508
WHERE key IN ('string1', 'string2');

CREATE INDEX mv_fast_key_lookup_idx ON fast_key_lookup (ts);
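
A materialized view is a snapshot, so it has to be refreshed when the partition changes. A rough sketch of how the view above might be refreshed and queried; the refresh schedule is up to you and is not part of the original answer.

```sql
-- Re-run the view's defining query and replace its contents
-- (a plain REFRESH blocks reads of the view while it runs)
REFRESH MATERIALIZED VIEW fast_key_lookup;

-- The top-10 query then reads only the small, pre-filtered view
SELECT key, ts
FROM fast_key_lookup
ORDER BY ts
LIMIT 10;
```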

3. Query Rewriting for Parallel Processing

An IN list and the equivalent OR conditions are normally planned the same way, so this rewrite may not change the plan on its own, but it is cheap to test:

sql
-- Explicit OR conditions, which the planner can combine with a BitmapOr
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE (csh.key = 'string1' OR csh.key = 'string2')
ORDER BY csh.ts
LIMIT 10;
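
Another rewrite sometimes used for this pattern is to sort by an expression rather than the bare column, so that a ts-ordered index can no longer supply the ordering and the planner has to fetch rows by key first. This is a sketch of a common workaround, not something taken from the plans above; verify with EXPLAIN that it helps in your case.

```sql
-- ts + INTERVAL '0' yields the same value as ts, but as an expression it does not
-- match the ordering of any of the listed indexes, so the planner must filter by
-- key first and then sort the matching rows
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key IN ('string1', 'string2')
ORDER BY csh.ts + INTERVAL '0'
LIMIT 10;
```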

4. Using LATERAL Joins

sql
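-- Fetch the 10 earliest rows per key via the (key, ts) index,
-- then keep the 10 earliest overall to match the original query
SELECT csh.key, csh.ts
FROM (VALUES ('string1'), ('string2')) AS keys(key)
CROSS JOIN LATERAL (
    SELECT t.key, t.ts
    FROM table_partition202508 t
    WHERE t.key = keys.key
    ORDER BY t.ts
    LIMIT 10
) csh
ORDER BY csh.ts
LIMIT 10;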

PostgreSQL Configuration Settings

Several configuration parameters can influence whether PostgreSQL chooses parallel execution plans:

Parallel Query Settings

sql
-- Number of worker processes per gather operation
SET max_parallel_workers_per_gather = 4;

-- Total number of worker processes
SET max_parallel_workers = 8;

-- Minimum table size for parallel scans
SET min_parallel_table_scan_size = '8MB';

-- Minimum index size for parallel scans
SET min_parallel_index_scan_size = '512kB';
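
Besides the session-level settings, the number of workers the planner considers for a given table can be set as a storage parameter. A sketch, assuming four workers suits your hardware; the value is illustrative.

```sql
-- Ask the planner to consider up to 4 workers when scanning this partition in parallel
-- (still capped by max_parallel_workers_per_gather)
ALTER TABLE table_partition202508 SET (parallel_workers = 4);

-- Revert to the default, where the worker count is derived from the table size
ALTER TABLE table_partition202508 RESET (parallel_workers);
```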

Cost Parameters

sql
-- Cost of transferring tuples between processes
SET parallel_tuple_cost = 0.1;  -- Default is 0.1

-- Cost of starting parallel workers
SET parallel_setup_cost = 1000.0;

-- Random page cost (affects bitmap vs. index scan decisions)
SET random_page_cost = 1.1;  -- Default is 4.0; about 1.1 is a common choice for SSD storage

Work Memory Settings

sql
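-- Memory per sort or hash operation; also limits how large a bitmap can grow
-- before a bitmap scan becomes lossy
SET work_mem = '100MB';

-- Memory for maintenance commands such as CREATE INDEX and CLUSTER (not used by queries)
SET maintenance_work_mem = '256MB';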

One of the cited discussions reports that “bumping random_page_cost to 2 produced the following explain”, that is, a different plan. This shows how strongly random_page_cost can influence whether the planner picks a bitmap scan or a plain index scan.
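
Before changing any of these, it can help to see what the server is currently using and where each value came from. A small sketch using the pg_settings view; the list of names simply mirrors the parameters discussed above.

```sql
-- setting = current value, boot_val = built-in default, source = where the value was set
SELECT name, setting, unit, boot_val, source
FROM pg_settings
WHERE name IN (
    'max_parallel_workers_per_gather',
    'max_parallel_workers',
    'min_parallel_table_scan_size',
    'min_parallel_index_scan_size',
    'parallel_tuple_cost',
    'parallel_setup_cost',
    'random_page_cost',
    'work_mem'
)
ORDER BY name;
```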


Index Optimization Strategies

1. Composite Index Optimization

Your current indexes are well-designed, but consider creating additional specialized indexes:

sql
-- For your specific query pattern
CREATE INDEX partitioned_key_ts_idx ON table_partition202508 (key, ts) 
WHERE key IN ('string1', 'string2');

-- Partial index for faster lookups
CREATE INDEX fast_string1_idx ON table_partition202508 (ts) 
WHERE key = 'string1';

CREATE INDEX fast_string2_idx ON table_partition202508 (ts) 
WHERE key = 'string2';
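
On a large partition, a plain CREATE INDEX blocks writes for the duration of the build. A sketch of building one of the partial indexes above without blocking writes; this works on an individual partition but not on the partitioned parent, and it cannot run inside a transaction block.

```sql
-- Build without blocking inserts and updates on the partition (slower than a plain build)
CREATE INDEX CONCURRENTLY IF NOT EXISTS fast_string1_idx
ON table_partition202508 (ts)
WHERE key = 'string1';

-- A failed concurrent build leaves an invalid index behind; check for that
SELECT indexrelid::regclass AS index_name, indisvalid
FROM pg_index
WHERE indrelid = 'table_partition202508'::regclass;
```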

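2. Table Reorganization with CLUSTER

As shown in the research findings, “SET maintenance_work_mem TO '1GB'; CLUSTER foo USING val_index;” can dramatically improve bitmap scan performance by physically ordering the table along an index. Note that CLUSTER takes an ACCESS EXCLUSIVE lock while it rewrites the table, and the ordering is not maintained for rows inserted afterwards:

sql
-- Physically reorder the partition along its (key, ts) index.
-- The index name below is assumed; check the actual name with \d table_partition202508
CLUSTER table_partition202508 USING table_partition202508_key_ts_idx;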
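
Whether reordering the table is likely to help can be gauged from the planner's correlation statistic, which measures how closely the physical row order follows a column's sort order. A sketch using the pg_stats view; values near 1 or -1 mean the table is already well ordered on that column.

```sql
SELECT attname, correlation, n_distinct
FROM pg_stats
WHERE tablename = 'table_partition202508'
  AND attname IN ('key', 'ts');
```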

3. Index Scan Optimization

sql
-- Consider creating a covering index
CREATE INDEX covering_key_ts_idx ON table_partition202508 (key, ts) 
INCLUDE (id);  -- if you need additional columns

Query Rewriting Techniques

1. Using WITH Clauses (CTEs)

sql
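-- Since PostgreSQL 12 a plain CTE is inlined into the outer query, which would leave the plan unchanged;
-- AS MATERIALIZED makes the CTE an optimization fence so the key filter is evaluated first
WITH key_data AS MATERIALIZED (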
    SELECT key, ts
    FROM table_partition202508
    WHERE key IN ('string1', 'string2')
)
SELECT key, ts
FROM key_data
ORDER BY ts
LIMIT 10;

2. Window Function Approach

sql
SELECT key, ts
FROM (
    SELECT key, ts,
           ROW_NUMBER() OVER (PARTITION BY key ORDER BY ts) as rn
    FROM table_partition202508
    WHERE key IN ('string1', 'string2')
) ranked
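WHERE rn <= 10  -- at most 10 candidates per key
ORDER BY ts
LIMIT 10;       -- keep the 10 earliest rows overall, matching the original query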

3. Using EXPLAIN ANALYZE to Test Different Approaches

sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT csh.key, csh.ts
FROM table_partition202508 csh
WHERE csh.key IN ('string1', 'string2')
ORDER BY csh.ts
LIMIT 10;
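
If the estimated row counts in the EXPLAIN output are far from the actual ones, the planner may be working from coarse statistics on the key column. A sketch of raising the per-column statistics target and re-analyzing; the value 1000 is an illustrative assumption, not a recommendation from the sources.

```sql
-- Collect a larger sample and a longer most-common-values list for "key"
ALTER TABLE table_partition202508 ALTER COLUMN "key" SET STATISTICS 1000;

-- Rebuild the statistics so the new target takes effect
ANALYZE table_partition202508;
```

Re-running the EXPLAIN above afterwards shows whether the estimates, and with them the chosen plan, have changed.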

Partitioning Considerations

1. Partition Pruning Optimization

Your table is already partitioned by date, but ensure partition pruning is working effectively:

sql
-- Check which partitions are being used
EXPLAIN (ANALYZE, BUFFERS)
SELECT csh.key, csh.ts
FROM table_partition csh
WHERE csh.key IN ('string1', 'string2')
  AND csh.ts >= '2025-08-01'
  AND csh.ts <  '2025-09-01'  -- half-open range, so all of August 31 is included
ORDER BY csh.ts
LIMIT 10;
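
To see which partitions exist and how large each one is, and therefore what pruning should be skipping, something like the following can be used. It assumes PostgreSQL 12 or later, where pg_partition_tree is available.

```sql
-- List leaf partitions of the parent table, largest first
SELECT relid AS partition_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_partition_tree('table_partition')
WHERE isleaf
ORDER BY pg_total_relation_size(relid) DESC;
```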

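2. Partition-Level Indexing

PostgreSQL has no global indexes: an index created on the partitioned parent is created on every partition automatically. You can also add indexes to individual partitions where they are needed:

sql
-- Partition-specific index (skipped if an index of this name already exists,
-- for example one created through the parent-level (key, ts) index)
CREATE INDEX IF NOT EXISTS table_partition202508_key_ts_idx 
ON table_partition202508 (key, ts);

-- Hash index for equality conditions (PostgreSQL has no bitmap index type)
CREATE INDEX table_partition202508_key_hash_idx 
ON table_partition202508 USING hash (key);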

3. Parallel Partition Scanning

For large tables, ensure parallel processing can work across partitions:

sql
-- Allow parallel workers for queries, including scans across partitions
SET max_parallel_workers_per_gather = 4;
SET max_parallel_workers = 8;
-- Affects only maintenance commands such as CREATE INDEX, not queries
SET max_parallel_maintenance_workers = 4;

Conclusion

The “magic” UNION trick works by forcing PostgreSQL into a parallel execution plan that combines multiple scan operations. However, several alternatives can achieve similar performance without this workaround:

  1. Optimize PostgreSQL configuration by adjusting parallel query parameters, work memory settings, and cost factors to encourage the planner to choose bitmap scans and parallel execution.

  2. Reorder the table physically with the CLUSTER command, which can dramatically improve bitmap scan performance as shown in the research findings.

  3. Consider partial and specialized indexes that can be more efficiently scanned for your specific query patterns.

  4. Rewrite your queries using CTEs, lateral joins, or other constructs that can enable parallel processing without the UNION trick.

  5. Leverage your partitioning scheme more effectively by ensuring partition pruning works and creating appropriate partition-specific indexes.

The key takeaway is that PostgreSQL’s query planner needs the right conditions to choose parallel execution plans. By understanding these conditions and configuring your database appropriately, you can achieve the performance benefits of parallel processing without resorting to query tricks.

Sources

  1. PostgreSQL: Documentation: 18: 15.3. Parallel Plans
  2. PostgreSQL: Documentation: 11: 15.3. Parallel Plans
  3. PostgreSQL: Documentation: 15: 15.3. Parallel Plans
  4. PostgreSQL: Documentation: 12: 15.3. Parallel Plans
  5. PostgreSQL: Documentation: 13: 15.3. Parallel Plans
  6. PostgreSQL: The World’s Most Advanced Open Source Relational Database
  7. PostgreSQL: Re: Parallel bitmap index scan
  8. PostgreSQL Bitmap Heap Scan on index is very slow but Index Only Scan is fast - Stack Overflow
  9. In PostgreSQL, how can I make Bitmap Index Scan parallelized? - Database Administrators Stack Exchange
  10. PostgreSQL: Documentation: 10: 15.3. Parallel Plans
  11. PostgreSQL: Documentation: 8.1: Performance Tips
  12. Parallel Query v2 | EDB
  13. Query performance of index scans slower than parallel seq scan on Postgres - Database Administrators Stack Exchange
  14. PostgreSQL: Documentation: 18: 19.7. Query Planning
  15. PostgreSQL: Optimize Bitmap Heap Scan - Database Administrators Stack Exchange