Basic Usage
You can change how individual fields are tokenized by passing JSON strings to the WITH clause of CREATE INDEX.
For instance, the following statement configures an ngram tokenizer for the description field.
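A minimal sketch of such a statement, assuming a table named mock_items with id and description columns. The index name, the key_field option, and the ngram parameters (min_gram, max_gram, prefix_only) are illustrative assumptions not stated above, and the exact syntax may differ between ParadeDB versions.

```sql
-- Sketch only: the table, index name, and tokenizer parameters are assumed.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description)
WITH (
    key_field = 'id',
    -- text_fields takes a JSON string keyed by column name
    text_fields = '{
        "description": {
            "tokenizer": {"type": "ngram", "min_gram": 2, "max_gram": 3, "prefix_only": false}
        }
    }'
);
```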
Configure Multiple Fields
To configure multiple fields, pass more keys to the JSON string. For instance, the following statement specifies tokenizers for both the description and category fields.
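A sketch of a two-field configuration, again assuming a mock_items table with id, description, and category columns. The choice of a whitespace tokenizer for one field and an ngram tokenizer for the other is only an example, and the tokenizer names and parameters are assumptions to verify against the tokenizer reference.

```sql
-- Sketch only: the table, index name, and tokenizer choices are assumed.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, category)
WITH (
    key_field = 'id',
    -- One top-level JSON key per configured column
    text_fields = '{
        "description": {"tokenizer": {"type": "whitespace"}},
        "category": {"tokenizer": {"type": "ngram", "min_gram": 2, "max_gram": 3, "prefix_only": false}}
    }'
);
```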
All Configuration Options
Text Fields
Options for columns of type VARCHAR, TEXT, UUID, and their corresponding array types should be passed to text_fields.
text_fields accepts the following keys.
fast
See fast fields for when this option should be set to true.
tokenizer
See tokenizers for how to configure the tokenizer.
normalizer
See normalizers for how to configure the normalizer.
JSON Fields
Options for columns of type JSON and JSONB should be passed to json_fields.
json_fields accepts the following keys.
fast
See fast fields for when this option should be set to true.
tokenizer
See tokenizers for how to configure the tokenizer.
normalizer
See normalizers for how to configure the normalizer.
expand_dots
If true, JSON keys containing a . will be expanded. For instance, if expand_dots is true, {"metadata.color": "red"} will be indexed as if it were {"metadata": {"color": "red"}}.
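As a rough illustration of where expand_dots goes, the sketch below enables it for a hypothetical JSONB column named details. The table, column, and index names and the key_field option are assumptions, not taken from the text above.

```sql
-- Sketch only: "details" is a hypothetical JSONB column on an assumed table.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, details)
WITH (
    key_field = 'id',
    -- With expand_dots enabled, a key such as "metadata.color" is treated
    -- as the nested path metadata -> color
    json_fields = '{"details": {"expand_dots": true}}'
);
```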
Advanced Options
In addition to text and JSON, ParadeDB exposes options for numeric, datetime, boolean, range, and enum fields. For most use cases, it is not necessary to change these options.
Numeric Fields
Options for columns of type SMALLINT, INTEGER, BIGINT, OID, REAL, DOUBLE PRECISION, NUMERIC, and their corresponding array types should be passed to numeric_fields.
indexed
Whether the field is indexed. Must be true in order for the field to be tokenized and searchable.
fast
Fast fields support rapid random access. Fields used for aggregation must have fast set to true. Fast fields are also useful for accelerated scoring and filtering.
Boolean Fields
Options for columns of type BOOLEAN and BOOLEAN[] should be passed to boolean_fields.
CREATE INDEX accepts several configuration options for boolean_fields:
indexed
Whether the field is indexed. Must be true in order for the field to be tokenized and searchable.
fast
Fast fields support rapid random access. Fields used for aggregation must have fast set to true. Fast fields are also useful for accelerated scoring and filtering.
Datetime Fields
Options for columns of type DATE, TIMESTAMP, TIMESTAMPTZ, TIME, TIMETZ, and their corresponding array types should be passed to datetime_fields.
CREATE INDEX accepts several configuration options for datetime_fields:
indexed
Whether the field is indexed. Must be true in order for the field to be tokenized and searchable.
fast
Fast fields support rapid random access. Fields used for aggregation must have fast set to true. Fast fields are also useful for accelerated scoring and filtering.
Enumerated Types
Options for custom Postgres enums should be passed to numeric_fields.
Enums should be queried with term queries.
If the ordering of the enum is changed with ADD VALUE ... [ BEFORE | AFTER ], the BM25 index should be dropped and recreated to account for the new enum ordinal values.
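The drop-and-recreate step might look like the following sketch. The enum type color, its values, and the index name search_idx are hypothetical; only the ALTER TYPE and DROP INDEX statements are standard Postgres.

```sql
-- Hypothetical enum type and index names, for illustration only.
-- Inserting a value before an existing one shifts the enum's ordering.
ALTER TYPE color ADD VALUE 'orange' BEFORE 'red';

-- The index was built against the old ordinal values, so drop it...
DROP INDEX search_idx;

-- ...then rerun the same CREATE INDEX statement that originally built it.
```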