Highlighting is an expensive process and can slow down query times. We recommend passing a LIMIT to any query where paradedb.snippet is called to restrict the number of snippets that need to be generated.

Highlighting is not supported for queries that use fuzziness, like paradedb.fuzzy_term.

By default, highlighted terms are wrapped in <b></b> tags. This can be modified with the start_tag and end_tag arguments.
Basic Usage
paradedb.snippet(<column>) can be added to any query where an @@@ operator is present. The following query generates highlighted snippets against the description field.
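A minimal sketch of such a query is shown here. The table name mock_items and its id column are assumptions made for illustration; substitute any table that has a BM25 index covering description.

```sql
-- Assumed for illustration: a table mock_items(id, description)
-- with a BM25 index that covers the description column.
SELECT id, paradedb.snippet(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;  -- keeps the number of generated snippets small
```

The LIMIT clause follows the recommendation above, since a snippet is generated for every returned row.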
By default, <b></b> encloses each highlighted term in the snippet. This can be configured with the start_tag and end_tag arguments.
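As a sketch, the tags can be swapped for italics by passing both arguments by name. The mock_items table and its id column are again assumed for illustration.

```sql
-- Assumed for illustration: mock_items(id, description) with a BM25 index.
-- start_tag and end_tag replace the default <b> and </b>.
SELECT id, paradedb.snippet(description, start_tag => '<i>', end_tag => '</i>')
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```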
Fragment Size
For every highlighted term, a fragment of size max_num_chars is created containing the term and its surrounding text. A fragment can contain multiple highlighted terms if they are within max_num_chars distance of one another. By default, max_num_chars is set to 150.
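A smaller fragment can be requested by passing max_num_chars by name. The value 15 and the mock_items table below are illustrative assumptions, not recommendations.

```sql
-- Assumed for illustration: mock_items(id, description) with a BM25 index.
-- Each fragment is capped at 15 characters instead of the default 150.
SELECT id, paradedb.snippet(description, max_num_chars => 15)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

With a small max_num_chars, matched terms that sit far apart in the source text are less likely to share a fragment.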
paradedb.snippet uses a two-tiered scoring system to determine which fragment to display:
- Each highlighted term receives a score based on its inverse document frequency. This means that fragments containing rarer terms will score higher.
- If there is a tie, the fragment that appears earlier in the source text will be displayed.
Byte Offsets
paradedb.snippet_positions(<column>) returns the byte offsets in the original text where the snippets would appear. It returns an array of tuples, where the first element of each tuple is the byte index of the first byte of the highlighted region, and the second element is the byte index after the last byte of the region.
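A sketch that selects the snippet and its byte offsets side by side, so the two can be compared row by row. The mock_items table and its id column are assumed for illustration.

```sql
-- Assumed for illustration: mock_items(id, description) with a BM25 index.
-- snippet_positions returns the byte offsets of the highlighted regions;
-- the end offset is exclusive (one past the last highlighted byte).
SELECT id,
       paradedb.snippet(description),
       paradedb.snippet_positions(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

Because these are byte offsets rather than character offsets, text containing multi-byte characters should be sliced on its encoded bytes when the positions are used client-side.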
Expected Response
Snippet Limit and Offset
Both paradedb.snippet and paradedb.snippet_positions accept limit and offset arguments. A limit restricts the number of highlighted terms, while an offset of n skips the first n highlighted terms. This can be useful for paginating through documents that contain large numbers of highlighted terms.
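A sketch that combines both arguments. The mock_items table and the query string 'sleek running shoes' are assumptions chosen for illustration; the original query is not shown here.

```sql
-- Assumed for illustration: mock_items(id, description) with a BM25 index,
-- and a query string that matches several terms in one document.
-- "limit" and "offset" are double-quoted because they are reserved keywords.
SELECT id,
       paradedb.snippet(description, "limit" => 1, "offset" => 1),
       paradedb.snippet_positions(description, "limit" => 1, "offset" => 1)
FROM mock_items
WHERE description @@@ 'sleek running shoes'
LIMIT 5;
```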
Expected Response
The limit and offset arguments must be wrapped in double quotes because they are reserved keywords in Postgres.

sleek is not highlighted because an offset of 1 skips the first highlighted term. Similarly, shoes is not highlighted because of the limit of 1.