Several settings can be used to tune the throughput of INSERT/UPDATE/COPY statements to the BM25 index.

Statement Parallelism

paradedb.statement_parallelism controls the number of indexing threads used during INSERT/UPDATE/COPY. The default is 1, which generally ensures that each INSERT/UPDATE/COPY statement creates only one segment.

A value of 0 automatically detects the “available parallelism” of the host computer and uses that as the thread count.

If your typical update patterns are atomic INSERTs or UPDATEs of one or a few rows, a value of 1 avoids creating extra segments that must later be merged. For bulk inserts and updates, a larger value generally gives better throughput.

SET paradedb.statement_parallelism = 1;
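
For a bulk load, the setting can be raised for the session and restored afterwards. The following is a minimal sketch rather than a recommended configuration: the tables `documents` and `documents_staging` are hypothetical, and the value 0 is chosen only to illustrate auto-detection.

```sql
-- Let the extension detect the host's available parallelism for this session.
SET paradedb.statement_parallelism = 0;

-- Hypothetical bulk insert into a table that has a BM25 index.
INSERT INTO documents (id, body)
SELECT id, body FROM documents_staging;

-- Restore the session default so later single-row writes are unaffected.
RESET paradedb.statement_parallelism;
```

`SET` and `RESET` are standard PostgreSQL commands, so the change applies only to the current session.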

Statement Memory Budget

paradedb.statement_memory_budget sets the amount of memory dedicated to each indexing thread before its index segment must be written to disk. The value is measured in megabytes and defaults to 1024 (1024MB). In terms of raw indexing performance, larger is generally better.

If set to 0, the budget is maintenance_work_mem divided by paradedb.statement_parallelism.
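
As a worked illustration of that rule, the sketch below uses assumed values (2GB and 4 threads are picked only to make the arithmetic easy). Under the rule as stated, each indexing thread would receive about 2048MB / 4 = 512MB; any rounding the extension applies has not been verified here.

```sql
-- Assumed values, chosen only for the arithmetic.
SET maintenance_work_mem = '2GB';          -- 2048MB in total
SET paradedb.statement_parallelism = 4;    -- 4 indexing threads

-- 0 means: derive the per-thread budget from maintenance_work_mem.
-- Expected per-thread budget under the stated rule: 2048MB / 4 = 512MB.
SET paradedb.statement_memory_budget = 0;
```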

If your typical update patterns are single-row atomic INSERTs or UPDATEs, a value of 15 (15MB) avoids allocating unnecessary memory. For bulk inserts and updates, a larger value generally gives better throughput.

SET paradedb.statement_memory_budget = 15;
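
For a one-off bulk write, both settings can be scoped to a single transaction with PostgreSQL's `SET LOCAL`, which reverts automatically at commit or rollback. This is a sketch under assumptions: the table names are hypothetical and the values are illustrative. Because the budget is per indexing thread, total indexing memory could reach roughly 4 x 2048MB with these numbers, so size them to the host.

```sql
BEGIN;

-- Illustrative values; SET LOCAL lasts only until the end of this transaction.
SET LOCAL paradedb.statement_parallelism = 4;
SET LOCAL paradedb.statement_memory_budget = 2048;  -- megabytes per thread

-- Hypothetical bulk update on a table that has a BM25 index.
UPDATE documents d
SET body = s.body
FROM documents_staging s
WHERE d.id = s.id;

COMMIT;
```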

Like paradedb.create_index_memory_budget, this setting can affect the number of segments in the index.