Search Performance
As a general rule of thumb, the performance of expensive search queries can be greatly improved if they are able to access more parallel Postgres workers and more shared buffer memory.
Parallel Workers
The number of parallel workers depends on the server’s CPU count and certain
Postgres settings in postgresql.conf
.
max_parallel_workers
and max_worker_processes
control how many workers are available to parallel scans.
max_worker_processes
is a global limit for the number of available workers across all connections, and
max_parallel_workers
specifies how many of those workers can be used for parallel scans.
Next, max_parallel_workers_per_gather
must be set. This setting is a limit for the number of parallel workers that a single parallel query can use. The default is 2
.
This setting can be set in postgresql.conf
to apply to all connections, or within a connection to apply to a single
session.
The number of parallel workers should not exceed the server’s CPU count. max_worker_processes
and max_parallel_workers
must be changed inside postgresql.conf
,
and Postgres must be restarted afterward.
Shared Buffers
shared_buffers
controls how much memory is available to the Postgres buffer cache. We recommend allocating no more than 40% of total memory
to shared_buffers
.
The pg_prewarm
extension can be used to load the BM25 index into the buffer cache after Postgres restarts. A higher shared_buffers
value allows more of the index to be
stored in the buffer cache.
Autovacuum
If an index experiences frequent writes, the search performance of some queries like sorting or
aggregates can degrade if VACUUM
has not been recently run. This is because writes can cause parts of Postgres’ visibility map
to go out of date, and VACUUM
updates the visibility map.
To determine if search performance is degraded by lack of VACUUM
, run EXPLAIN ANALYZE
over a query. A Parallel Custom Scan
in the query plan with a large number of Heap Fetches
typically means that VACUUM
should be run.
Postgres can be configured to automatically vacuum a table when a certain number of rows have been updated. Autovacuum settings
can be set globally in postgresql.conf
or for a specific table.
There are several autovacuum settings, but the important ones to note are:
autovacuum_vacuum_scale_factor
triggers an autovacuum if a certain percentage of rows in a table have been updated.autovacuum_vacuum_threshold
triggers an autovacuum if an absolute number of rows have been updated.autovacuum_naptime
ensures that vacuum does not run too frequently.
This means that setting autovacuum_vacuum_scale_factor
to 0
and autovacuum_vacuum_threshold
to 100000
will trigger an autovacuum
for every 100000
row updates. To determine the right autovacuum settings for your table, we recommend examining the table’s write volumes and
adjusting the autovacuum threshold accordingly.
Was this page helpful?