Supported JOIN Types
ParadeDB supports all PostgreSQL JOINs:INNER JOINLEFT / RIGHT / FULL OUTER JOINCROSS JOINLATERAL- Semi and Anti JOINs
Scoring in JOINs
When usingparadedb.score() or paradedb.snippet() inside JOINs:
- Scores and snippets are computed before the JOIN at the base table level.
- JOIN conditions never change the score, they only determine which rows are combined.
Performance Characteristics
Fast Cases
Queries are efficient when search filters can be applied directly to the underlying tables. In these cases, PostgreSQL can push down the||| operators so that each table does its own filtered index scan before the JOIN runs.
That means:
- Each table only contributes rows that already match the search condition.
- The JOIN operates on much smaller intermediate sets.
a.bio and b.content are filtered independently.
The planner runs efficient index scans on each table and then joins the results.
Slower Cases
Queries become slower when search conditions span multiple tables in a way that prevents PostgreSQL from pushing them down. The most common example is anOR across different tables:
a and b, PostgreSQL cannot apply it until after the join. As a result, both tables must be scanned in full, joined, and only then filtered.
The plan will have this shape:
||| query is in the Join Filter, not in the scan.
Diagnosing Performance
UseEXPLAIN to check the query plan:
Custom Scannodes with large row counts- ParadeDB operators inside JOIN conditions
Tantivy Query: all(full index scan)
Writing Faster JOIN Queries
Replace Cross-Table OR with UNION
If you don’t need scores/snippets and have a simple JOIN, express the OR as a UNION of two separately filtered joins. This lets PostgreSQL push each search predicate down to a Custom Index Scan and avoid a join-time filter.Use CTEs for Complex Queries
Use common table expressions (CTEs) to pre-filter each table with its own search condition, then join the smaller result sets together. If possible, add aLIMIT to each CTE to keep the result sets small.
Roadmap
We really want to remove the need to think about the way to doJOINs in ParadeDB. At the moment we are actively working on:
- A
CustomScan Join APIfor native join handling - Smarter cost estimation for the PostgreSQL planner