Index Build
Maintenance Working Memory
The default Postgres maintenance_work_mem
value of 64MB
is quite conservative and can significantly slow down index builds. For large indexes,
we strongly recommend increasing maintenance_work_mem
.
maintenance_work_mem
should not exceed the server’s available memory.
Indexing Memory per Thread
paradedb.create_index_memory_budget
defaults to 1024MB
. It sets the amount of memory to dedicate per indexing thread before the index segment needs to be
written to disk. The value is measured in megabytes. In terms of raw indexing performance, larger is generally better.
If set to 0
, maintenance_work_mem
divided by indexing parallelism will be used.
In addition to improving build times, this setting also affects the number of segments created in the index. This is because, while ParadeDB tries to maintain as many segments as CPUs, a segment that cannot fit into memory will be split into a new segment. As a result, an insufficient paradedb.create_index_memory_budget
can lead to significantly more segments than available CPUs, which degrades search performance. To check if the chosen value value is high enough, you can compare the index’s segment count with the server’s CPU count.
Indexing Progress
ParadeDB can write indexing progress to the Postgres logs, which helps monitor the progress of a long-running CREATE INDEX
statement.
First, Postgres logs must be enabled in postgresql.conf
and Postgres must be restarted afterward.
Next, set paradedb.log_create_index_progress
to true
. This creates Postgres LOG:
entries every 100,000
rows with information on the indexing rate (in rows per second).
The logs will appear in the directory specified in log_directory
.
Indexing Threads
paradedb.create_index_parallelism
sets the number of threads used during CREATE INDEX
. The default is 0
, which
automatically detects the “available parallelism” of the host computer.
Was this page helpful?