Remove trailing and leading whitespace from a token
The trim filter removes leading and trailing whitespace from a token, but not whitespace in the middle. If a token consists only of whitespace, it is eliminated entirely.
This filter is useful for tokenizers that don’t already split on whitespace, like the literal normalized tokenizer or certain language-specific tokenizers.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.literal_normalized('trim=true')))
WITH (key_field='id');
To demonstrate this token filter, let’s compare the output of the following two statements:
SELECT
  ' token with whitespace '::pdb.literal_normalized::text[],
  ' token with whitespace '::pdb.literal_normalized('trim=true')::text[];
Expected Response
               text               |           text
----------------------------------+---------------------------
 {" token with whitespace "}      | {"token with whitespace"}
(1 row)
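The whitespace-only case can be checked the same way. The following is a minimal sketch that reuses the cast syntax from the statement above. The input string is an illustrative assumption, and the output is not shown because it has not been verified here. Going by the description of the filter, the trimmed version should contain no token.

```sql
-- Illustrative input: a token made up only of spaces.
-- First column: the literal normalized tokenizer without the trim filter.
-- Second column: the same tokenizer with trim enabled, which is expected
-- (per the description of the filter) to drop the token.
SELECT
  '   '::pdb.literal_normalized::text[],
  '   '::pdb.literal_normalized('trim=true')::text[];
```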