This is a beta feature available in versions 0.22.0 and above.
Postgres allows a maximum of 32 columns in an index definition, but because ParadeDB benefits from pushing filters and ranking signals into the BM25 index this can become a limitation.
To index more than 32 columns in a single BM25 index,
wrap columns in a ROW() expression cast to a composite type. ParadeDB will unpack the composite type and index each
field individually.
Creating a Composite Type
First, define a composite type whose field names and types match the columns you want to index:
CREATE TYPE item_fields AS (description TEXT, category TEXT, rating INTEGER);
Then reference the columns in a ROW() expression cast to the composite type:
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (ROW(description, category, rating)::item_fields))
WITH (key_field='id');
Each field in the composite type is indexed as if it were a standalone column. Queries use the field names
directly with the standard operators:
SELECT description, category FROM mock_items
WHERE description &&& 'running shoes';
Configuring Tokenizers
Fields in the composite type can use tokenizers and
token filters by specifying them as the field type:
CREATE TYPE item_fields AS (
description pdb.simple('stemmer=english'),
category pdb.literal,
in_stock BOOLEAN
);
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (ROW(description, category, in_stock)::item_fields))
WITH (key_field='id');
Constraints
The following are not supported and will produce an error:
- Anonymous ROW expressions:
ROW(a, b) without a type cast is not allowed. Always cast to a named composite type.
- Nested composites: A composite type cannot contain another composite type as a field.
- Duplicate field names: Field names must be unique across all composite types and regular columns in the index.