Throughput
Several settings can be used to tune the throughput of INSERT
/UPDATE
/COPY
statements to the BM25 index.
Statement Parallelism
paradedb.statement_parallelism
controls the number of indexing threads used during INSERT/UPDATE/COPY
. The default is 0
, which automatically detects the
“available parallelism” of the host computer.
If your typical update patterns are single-row atomic INSERT
s or UPDATE
s, then a value of 1
can prevent unnecessary threads from being spun up. For bulk inserts
and updates, a larger value is better.
Statement Memory Budget
paradedb.statement_memory_budget
defaults to 1024MB
. It sets the amount of memory to dedicate per indexing thread before the index segment needs to be
written to disk. The value is measured in megabytes. In terms of raw indexing performance, larger is generally better.
If set to 0
, maintenance_work_mem
divided by statement parallelism will be used.
If your typical update patterns are single-row atomic INSERT
s or UPDATE
s, then a value of 15MB
can prevent unnecessary memory from being allocated. For bulk inserts
and updates, a larger value is better.
Like paradedb.create_index_memory_budget
, this setting can affect the number of segments in the index.
Was this page helpful?