Basic Usage

For large indexes, tuning Postgres’ default memory and parallel worker settings prior to running CREATE INDEX is strongly recommended. Please see index tuning for details.

The following code block creates a BM25 index called search_idx over multiple columns of the mock_items table. All columns that are relevant to the search query, including columns used for sorting and filtering, should be indexed for optimal performance.

CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category, rating, in_stock, created_at, metadata, weight_range)
WITH (key_field='id');

Syntax

CREATE INDEX <index_name> ON <schema_name>.<table_name>
USING bm25 (<columns>)
WITH (key_field='<key_field>');
index_name

The name of the index being created. If unspecified, Postgres will automatically choose a name.

schema_name

The name of the schema that the table belongs to. If unspecified, CURRENT SCHEMA is used.

table_name
required

The name of the table being indexed.

columns
required

A comma-separated list of columns to index, starting with the key field. Text, numeric, datetime, boolean, range, enum, and JSON types can be indexed.

key_field
required

The name of a column in the table that represents a unique identifier for each record. Usually, this is the same column that is the primary key of the table.

Choosing a Key Field

The key_field must have a UNIQUE constraint. A non-unique key field is likely to lead to undefined behavior or incorrect search results.

While the key field can be any unique text, numeric, or datetime value, an integer key field will be the most performant. The Postgres SERIAL type is an easy way to create a unique integer column.

Finally, the key field must be the first column in the target list.

-- Recommended: key_field is the first column in the list
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description)
WITH (key_field = 'id');

-- NOT recommended: key_field is not the first column
CREATE INDEX search_idx ON mock_items
USING bm25 (description, id)
WITH (key_field = 'id');

Partitioned Index

In Postgres, a partitioned index is an index created over a partitioned table. A BM25 index can be created over a partitioned table in the same way as a normal table.

Partial Index

The following code block demonstrates how to pass predicates to CREATE INDEX to construct a partial index. Partial indexes are useful for reducing index size on disk and improving update speeds over non-indexed rows.

CREATE INDEX search_idx ON mock_items
USING bm25 (id, description)
WITH (key_field='id')
WHERE category = 'Electronics' AND rating > 2;

Concurrent Indexing

To create a new index without blocking writes to your table, use the CONCURRENTLY keyword:

CREATE INDEX CONCURRENTLY search_idx_v2 ON mock_items
USING bm25 (id, description, category, rating, in_stock)
WITH (key_field='id');

This is particularly useful when you need to:

  • Reindex with different settings
  • Update an existing index without downtime
  • Change the indexed columns

pg_search can only use a single BM25 index per table - the most recently created one will automatically be used for queries. After creating a new index concurrently and verifying it works as expected, you can safely drop the old index:

DROP INDEX search_idx;

Delete Index

The following command deletes a BM25 index.

DROP INDEX search_idx;