Overview
Basic Usage
ParadeDB supports aggregates through plain SQL functions like COUNT
, SUM
, and AVG
. For instance, the following query tallies the total number of search results for a full text query.
Fast Aggregates
Fast aggregates are an enterprise-only feature. Contact us for access.
The performance of aggregates over very large result sets (millions of rows or more) is a well-known problem in Postgres.
In ParadeDB Enterprise, aggregates over full text search results are drastically accelerated by fast fields. Fast fields are stored in a column-oriented fashion, which makes them much quicker to read and process.
The following steps are necessary to prepare the ParadeDB instance for fast fields.
Configure Parallel Workers
ParadeDB Enterprise uses Postgres parallel workers. By default, Postgres allows two workers per parallel query.
Increasing the number of parallel workers allows parallel queries to use all of the available hardware on the host machine and can deliver significant
speedups. For instance, if the host machine has 16 CPUs, max_worker_processes
, max_parallel_workers
, and max_parallel_workers_per_gather
should be set to 16.
Remember to restart Postgres in order for postgresql.conf
changes to take effect.
Run VACUUM
VACUUM
updates the table’s visibility map,
which speeds up Postgres’ visibility checks. Specifying INDEX_CLEANUP = false
allows VACUUM
to run much
faster over large tables by skipping the cleanup of dead rows in the index.
Make sure to pass the table name, not the index name, to VACUUM
.
If the table experiences frequent updates, we recommend configuring autovacuum.
Run Cold Query
The following query allows Postgres to cache the underlying index files. This is a one-time cold query that will not be fast, but allows subsequent queries to be fast.
Replace mock_items
with your table name, and id
with your key field.
Index Fast Fields
You can check which fields are indexed as fast by running paradedb.schema
.
In order for a query to be accelerated, all of the returned columns must be fast fields. For instance, the following query is accelerated because rating
is a fast
field, and COUNT(*)
does not specify any fields.
You can verify if a query will be accelerated by running EXPLAIN
. Accelerated queries will show either a Custom Scan
or Parallel Index Only Scan
in the query plan.
On the other hand, a query that selects a non-fast field like description
is not accelerated.
If we run EXPLAIN
on this query, we’ll see a regular Index Scan
, which indicates that no fast field optimization occurred.
That’s it! ParadeDB is now ready for fast aggregates.
Was this page helpful?