INSERT/UPDATE/COPY statements to the BM25 index.
Move Merging to the Background
During every INSERT/UPDATE/COPY/VACUUM, the BM25 index runs a compaction process that looks for opportunities to merge segments together. The goal is to consolidate smaller segments into larger ones, reducing the total number of segments and improving query performance.
Segments become candidates for merging if their combined size meets or exceeds one of several configurable layer thresholds. These thresholds define target segment sizes, such as 10KB, 100KB, 1MB, etc. For each layer, the compactor checks if there are enough smaller segments whose total size adds up to the threshold.
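The threshold check described above can be illustrated with a small sketch. This is a simplified model and not the actual compactor: the helper names `parse_size` and `merge_candidates` are hypothetical, 1KB is assumed to be 1024 bytes, and "smaller segments" is assumed to mean segments strictly below the layer threshold.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '10KB' or '1MB' to bytes (assumes 1KB = 1024 bytes)."""
    units = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
    text = text.strip().upper()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)  # plain byte count, e.g. "0"


def merge_candidates(segment_sizes, layer_sizes):
    """Map each qualifying layer threshold to the segments that could be merged.

    Assumption: a layer qualifies when at least two segments are smaller than
    its threshold and their combined size meets or exceeds that threshold.
    """
    candidates = {}
    for threshold in sorted(layer_sizes):
        smaller = [size for size in segment_sizes if size < threshold]
        if len(smaller) > 1 and sum(smaller) >= threshold:
            candidates[threshold] = smaller
    return candidates


if __name__ == "__main__":
    layers = [parse_size(s) for s in ("10KB", "100KB", "1MB")]
    segments = [parse_size(s) for s in ("4KB", "4KB", "4KB", "60KB")]
    # Only the 10KB layer qualifies: the three 4KB segments total 12KB.
    print(merge_candidates(segments, layers))
```

In this example the 100KB and 1MB layers do not qualify, because all four segments together add up to only 72KB.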
By default, layer sizes 1KB, 10KB, 100KB, and 1MB are merged in the foreground, while layer sizes 10MB, 100MB, 1GB, 10GB, 100GB, and 1TB are merged in the background. layer_sizes configures the foreground layers, while background_layer_sizes configures the background layers.
Setting background_layer_sizes to 0 disables background merging, and setting layer_sizes to 0 disables foreground merging.
Increase Work Memory for Bulk Updates
work_mem controls how much memory to allocate to a single INSERT/UPDATE/COPY statement. Each statement that writes to a BM25 index is required to have at least 15MB of memory. If work_mem is below 15MB, it will be ignored and 15MB will be used.
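The 15MB floor amounts to taking the larger of the two values. A minimal sketch of that arithmetic follows, with sizes expressed in kilobytes; the function name `effective_write_memory_kb` is hypothetical and exists only for illustration.

```python
MIN_WRITE_MEM_KB = 15 * 1024  # the 15MB minimum, expressed in kilobytes


def effective_write_memory_kb(work_mem_kb: int) -> int:
    """Return the memory a BM25 write would use for a given work_mem (in kB).

    Values below the 15MB minimum are ignored in favor of the minimum.
    """
    return max(work_mem_kb, MIN_WRITE_MEM_KB)


if __name__ == "__main__":
    print(effective_write_memory_kb(4 * 1024))   # 4MB is below the floor
    print(effective_write_memory_kb(64 * 1024))  # 64MB is used as-is
```

Under this model, PostgreSQL's default work_mem of 4MB falls below the floor, so raising it only has an effect once it exceeds 15MB.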
If your typical update pattern consists of large bulk updates (not single-row updates), a larger value may be better.
maintenance_work_mem
.
Enable Mutable Segments
The mutable_segment_rows setting enables use of mutable segments, which buffer the rows written to the index in order to amortize the cost of indexing them.
Setting mutable_segment_rows to a value greater than 0 will cause that many rows to be cheaply buffered at write time and indexed at read time instead (i.e. during SELECT statements). This can significantly increase the number of writes that an index can sustain, at the cost of a proportional decrease in read throughput.
The contents of mutable segments are indexed and held in memory at read time, so large mutable_segment_rows values are not advised.
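The write/read trade-off can be pictured with a toy model. This is only a sketch under stated assumptions, not how the index is implemented: the class `MutableSegmentModel` is hypothetical, and the behavior once the buffer is full (rows are indexed at write time) is an assumption made for illustration.

```python
class MutableSegmentModel:
    """Toy model: buffer up to `mutable_segment_rows` rows, index them on read."""

    def __init__(self, mutable_segment_rows: int):
        self.limit = mutable_segment_rows
        self.buffered = []          # rows written but not yet indexed
        self.indexed_on_write = 0   # rows that paid the indexing cost at write time

    def write(self, row) -> None:
        if len(self.buffered) < self.limit:
            self.buffered.append(row)    # cheap path: buffer only
        else:
            self.indexed_on_write += 1   # assumption: overflow is indexed eagerly

    def read_indexing_work(self) -> int:
        """Number of buffered rows a read must index in memory before searching."""
        return len(self.buffered)


if __name__ == "__main__":
    model = MutableSegmentModel(mutable_segment_rows=3)
    for row in range(5):
        model.write(row)
    print(model.read_indexing_work(), model.indexed_on_write)
```

The larger the limit, the more rows each read has to index and hold in memory, which is why the model's read-side work grows with mutable_segment_rows.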