Every INSERT/UPDATE/COPY statement creates a new segment. Each segment has its own inverted index and columnar index, which means that the BM25 index is actually a collection of many inverted/columnar indexes, each of which supports very dense intersection queries that rapidly filter matches.
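As a rough illustration of the segment behavior described above, the sketch below loads the same three rows in two ways. The table mock_items, the index name, and the column list are hypothetical, and the CREATE INDEX ... USING bm25 form with key_field is assumed from recent pg_search releases, so check it against the installed version.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE mock_items (
    id SERIAL PRIMARY KEY,
    description TEXT,
    rating INTEGER
);

-- Assumed pg_search syntax: a BM25 index keyed on the primary key.
CREATE INDEX mock_items_bm25 ON mock_items
USING bm25 (id, description, rating)
WITH (key_field = 'id');

-- Three statements: per the description above, each one creates its own segment.
INSERT INTO mock_items (description, rating) VALUES ('running shoes', 5);
INSERT INTO mock_items (description, rating) VALUES ('leather boots', 4);
INSERT INTO mock_items (description, rating) VALUES ('canvas sneakers', 3);

-- One multi-row statement: the same rows arrive in a single write.
INSERT INTO mock_items (description, rating) VALUES
    ('running shoes', 5),
    ('leather boots', 4),
    ('canvas sneakers', 3);
```

If each statement maps to one segment as described, batching rows into fewer statements should leave fewer segments for a query to visit.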
ParadeDB adds a custom operator, @@@, to Postgres. @@@ means “find me all rows that match the following full-text query.”
ParadeDB takes part in a query only when @@@ is present at least once in the query. If the query does not include @@@, it is executed entirely by native Postgres.
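A minimal sketch of the distinction, assuming a hypothetical table mock_items(id, description, rating) that already has a BM25 index covering description. The table and column names are made up for illustration.

```sql
-- Contains @@@: ParadeDB handles the full-text match on description.
SELECT id, description
FROM mock_items
WHERE description @@@ 'shoes';

-- No @@@ anywhere: executed entirely by native Postgres.
SELECT id, description
FROM mock_items
WHERE rating > 3;
```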
When @@@ is present in a query, ParadeDB will execute the query using a custom scan.
Custom scans are execution nodes set aside by Postgres that allow extensions to run custom logic during a query. They are more powerful and versatile than typical Postgres index scans because they allow the extension to “take over” large parts of the query, including aggregates, WHERE, and even GROUP BY clauses.
From a performance perspective, custom scans significantly speed up queries by pushing down filters, aggregates, and other operations directly into the index, rather than applying them afterward in separate phases.
To understand what kind of scan is used, run EXPLAIN on the query. If EXPLAIN shows a custom scan (or, in rare cases, a BM25 index scan), then that part of the query is going through ParadeDB. Otherwise, the query passes through standard Postgres.
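One way to run this check is sketched below, assuming a hypothetical table mock_items(id, description, rating) with a BM25 index covering description and rating. The exact node labels depend on the installed pg_search version, so no plan output is shown here.

```sql
-- EXPLAIN prints the chosen plan without executing the query.
EXPLAIN
SELECT id, description
FROM mock_items
WHERE description @@@ 'shoes' AND rating > 3;
```

In the resulting plan, a node whose type reads Custom Scan indicates that this part of the query is handled by the extension. Plain Seq Scan or Index Scan nodes on their own indicate standard Postgres execution.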
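For measured timings rather than planner estimates, run EXPLAIN ANALYZE, which executes the query and reports actual run times and row counts.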
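A short sketch of the same check with run-time statistics, assuming a hypothetical table mock_items with a BM25 index covering description. Whether the count itself is pushed into the index depends on the query and the pg_search version, which is what the plan output is meant to reveal.

```sql
-- EXPLAIN ANALYZE executes the query, then reports actual timings and
-- row counts next to the planner's estimates.
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM mock_items
WHERE description @@@ 'shoes';
```

Because EXPLAIN ANALYZE really runs the statement, wrap any data-modifying statement in a transaction and roll it back when only the plan is of interest.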
pg_search is built on pgrx, the library for writing Postgres extensions in Rust, and Tantivy, a Rust-based search library inspired by Lucene.