After a tokenizer splits text into tokens, token filters apply additional processing to each token. Common examples include stemming, which reduces words to their root form, and ASCII folding, which removes accents. Token filters can be added to any tokenizer besides the literal tokenizer, which by definition must preserve the source text exactly. To add token filters to a tokenizer, append configuration strings to its argument list:
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.simple('stemmer=english', 'ascii_folding=true')))
WITH (key_field='id');
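As a rough illustration of the effect, the following query is a sketch that assumes the mock_items sample table from the quickstart and that query terms pass through the same analysis pipeline as the indexed text. With the English stemmer enabled, variants of a word should match each other because they reduce to the same root.
-- Hypothetical usage: 'running' stems to 'run', so this should also match
-- descriptions containing 'run' or 'runs' (assumes mock_items sample data).
SELECT id, description
FROM mock_items
WHERE description @@@ 'running'
LIMIT 5;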