Basic Usage

Prerequisite Before performing full text search over a table, it must first be indexed.

The search function returns rows of a table that match a search query. By default, rows are sorted by relevance.

SELECT * FROM <index_name>.search('<query>');
index_name
required

The name of the index.

query
required

The ParadeQL query string. See the documentation below for how to construct this string.

ParadeQL

The query string accepts ParadeQL, a mini query language which can be used to construct more expressive queries.

Specifying Fields

Each query must specify which field to search over. In the following example, we are querying for “keyboard” against the “description” field.

'description:keyboard'

Phrases containing spaces should be wrapped in double quotes.

'description:"plastic keyboard"'

Efficient Filtering

Filters can be applied over numeric fields, which improves query times compared to standard SQL WHERE clauses.

-- Equality
'description:keyboard AND rating:4'

-- Simple range
'description:keyboard AND rating:>4'

-- Inclusive range
'description:keyboard AND rating:[2 TO 5]'

-- Exclusive range
'description:keyboard AND rating:{2 TO 5}'

-- Boolean
'description:keyboard AND is_available:false'

Filters only work over Postgres columns that have been indexed as numeric_fields or boolean_fields.

JSON Fields

Use . to search over text values nested inside JSON. For instance, the following query would search over a field with values like {"metadata": {"color": "white"}}.

'metadata.color:white'

When dealing with JSON arrays, the array elements are “flattened” so that each element can be searched individually. This means that if a JSON array is encountered, each element in the array is treated as a separate value and indexed accordingly. For example, given the following JSON structure:

{
  "metadata": {
    "colors": ["red", "green", "blue"]
  }
}

The JSON array in the colors field is flattened to emit separate terms for each color. This allows for individual search queries like:

'metadata.colors:red'
'metadata.colors:green'
'metadata.colors:blue'

Each of these queries would correctly match the document containing the JSON array.

Searching for integers in a nested JSON structure is not supported. For example:

SELECT * FROM <index_name>.search('metadata.attributes:4', stable_sort => true)

Datetime Fields

Search terms will use the UTC time zone if not specified and need to be in RFC3339 format for the search function.

-- To demonstrate time zones, these query strings are equivalent
'created_at:"2023-05-01T09:12:34Z"'
'created_at:"2023-05-01T04:12:34-05:00"'

Boosting

The ^ character is used to boost the scores of individual fields. Boosting is helpful for tuning the relevance scores of results.

'description:keyboard^2 OR category:electronics^3'

Boolean Operators

AND, OR, and NOT can be used to combine and filter multiple terms.

'description:keyboard OR category:toy'

Use parentheses to group terms and control the order of operations.

'(description:keyboard OR category:toy) AND description:metal'

Slop Operator

The ~ slop operator is used to match phrases separated by words in between. For instance, let’s say there exists a row with description set to “ergonomic metal keyboard.” Because words “ergonomic” and “keyboard” are separated by one word, the following query would find this row.

'"ergonomic keyboard"~1'

The slop operator is distinct from fuzzy search, which is used for typo tolerance.

Set Operator

The set operator is a more CPU-efficient way of combining multiple ORs.

'description:IN [keyboard, toy]'

Special Characters

The special characters + , ^, ```, :, {, }, ", [, ], (, ), ~, !, \\, \*, and SPACEmust be escaped by a\ inside the query term.

Limit and Offset

Specifying a limit and offset is a more efficient way of iterating through search results compared to SQL’s LIMIT and OFFSET options. For optimal query performance, we recommend using the limit_rows and offset_rows options instead of LIMIT and OFFSET.

SELECT *
FROM <index_name>.search(
  '<query>',
  limit_rows => <limit_rows>,
  offset_rows => <offset_rows>
)

Stable Ordering

Search results are always ordered based on their BM25 score, but we can use the stable_sort parameter for control over the order of equally-scored results.

If false, equally-scored results will be ordered based on their insertion order into the index. This is the default, and allows for the fastest possible query times.

If true, equally-scored results will be ordered based on their key_field, but query times will be slower. This is useful for testing or anytime where results need to be deterministic.

SELECT *
FROM <index_name>.search(
  '<query>',
  stable_sort => true
)
index_name
required

The name of the index.

query
required

The query string.

limit_rows
The maximum number of rows to return.
offset_rows

The number of rows to skip before starting to return rows.

stable_sort
default: false

A boolean specifying whether ParadeDB should stabilize the order of equally-scored results, at the cost of performance.