Skip to main content
Match queries are the go-to query type for text search in ParadeDB. There are two types of match queries: match disjunction and match conjunction.

Match Disjunction

Match disjunction uses the ||| operator and means “find all documents that contain one or more of the terms tokenized from this text input.” To understand what this looks like in practice, let’s consider the following query:
SELECT description, rating, category
FROM mock_items
WHERE description ||| 'running shoes';
This query returns:
     description     | rating | category
---------------------+--------+----------
 Sleek running shoes |      5 | Footwear
 White jogging shoes |      3 | Footwear
 Generic shoes       |      4 | Footwear
(3 rows)

How It Works

Let’s look at what the ||| operator does:
  1. Retrieves the tokenizer configuration of the description column. In this example, let’s assume description uses the simple tokenizer.
  2. Tokenizes the query string with the same tokenizer. This means running shoes becomes two tokens: running and shoes.
  3. Finds all rows where description contains any one of the tokens, running or shoes.
This is why all results have either running or shoes tokens in description.

Examples

Let’s consider a few more hypothetical documents to see whether they would be returned by match disjunction. These examples assume that the index uses the default tokenizer and token filters, and that the query is running shoes.
Original TextTokensMatchReasonRelated
Sleek running shoessleek running shoesContains both running and shoes.
Running shoes sleeksleek running shoesContains both running and shoes.Phrase
SLeeK RUNNING ShOeSsleek running shoesContains both running and shoes.Lowercasing
Sleek run shoesleek run shoeContains neither running nor shoes.Stemming
Sleke ruining shoezsleke ruining shoezContains neither running nor shoes.Fuzzy
White jogging shoeswhite jogging shoesContains shoes.Match conjunction

Match Conjunction

Suppose we want to find rows that contain both running and shoes. This is where the &&& match conjunction operator comes in. &&& means “find all documents that contain all terms tokenized from this text input.”
SELECT description, rating, category
FROM mock_items
WHERE description &&& 'running shoes';
This query returns:
     description     | rating | category
---------------------+--------+----------
 Sleek running shoes |      5 | Footwear
(1 row)
Note that White jogging shoes and Generic shoes are no longer returned because they do not have the token running.

How It Works

Match conjunction works exactly like match disjunction, except for one key distinction. Instead of finding documents containing at least one matching token from the query, it finds documents where all tokens from the query are a match.

Examples

Let’s consider a few more hypothetical documents to see whether they would be returned by match conjunction. These examples assume that the index uses the default tokenizer and token filters, and that the query is running shoes.
Original TextTokensMatchReasonRelated
Sleek running shoessleek running shoesContains both running and shoes.
Running shoes sleeksleek running shoesContains both running and shoes.Phrase
SLeeK RUNNING ShOeSsleek running shoesContains both running and shoes.Lowercasing
Sleek run shoesleek run shoeDoes not contain both running and shoes.Stemming
Sleke ruining shoezsleke ruining shoezDoes not contain both running and shoes.Fuzzy
White jogging shoeswhite jogging shoesDoes not contain both running and shoes.Match conjunction
If the query string only contains one token, then ||| and &&& are effectively the same:
-- These two queries produce the same results
SELECT description, rating, category
FROM mock_items
WHERE description ||| 'shoes';

SELECT description, rating, category
FROM mock_items
WHERE description &&& 'shoes';

Using a Custom Tokenizer

By default, the match query automatically tokenizes the query string with the same tokenizer used by the field it’s being searched against. This behavior can be overridden by explicitly casting the query to a different tokenizer.
SELECT description, rating, category
FROM mock_items
WHERE description ||| 'running shoes'::pdb.whitespace;
I