Distributed full-text search with Citus and ParadeDB
Citus transforms PostgreSQL into a distributed database with horizontal sharding. ParadeDB is fully compatible with Citus, enabling distributed full-text search across sharded tables.
Here’s a complete example of setting up distributed search with Citus:
Copy
Ask AI
CREATE EXTENSION citus;CREATE EXTENSION pg_search;-- Create a table with a distribution keyCREATE TABLE articles ( id SERIAL, author_id INT NOT NULL, title TEXT, body TEXT, PRIMARY KEY (author_id, id) -- Must include distribution column);-- Distribute the table across shardsSELECT create_distributed_table('articles', 'author_id');-- Create a BM25 index on the distributed tableCREATE INDEX articles_search_idx ON articlesUSING bm25 (id, title, body)WITH (key_field='id');-- Insert some dataINSERT INTO articles (author_id, title, body) VALUES (1, 'PostgreSQL Performance', 'Optimizing PostgreSQL queries for large datasets'), (1, 'Distributed Databases', 'Understanding sharding and replication strategies'), (2, 'Full-Text Search', 'Building search engines with PostgreSQL');-- Search across shardsSELECT id, title FROM articlesWHERE body ||| 'PostgreSQL distributed'ORDER BY id;-- Results:-- id | title-- ----+-------------------------- 1 | PostgreSQL Performance-- 3 | Full-Text Search
ParadeDB search operators work seamlessly with Citus distributed JOINs:
Copy
Ask AI
-- Create and distribute a second tableCREATE TABLE authors ( id INT PRIMARY KEY, name TEXT, bio TEXT);SELECT create_distributed_table('authors', 'id');-- JOIN with search operatorsSELECT a.name, ar.titleFROM authors aJOIN articles ar ON a.id = ar.author_idWHERE ar.body ||| 'PostgreSQL'ORDER BY a.name;-- Results:-- name | title-- -------+-------------------------- Alice | PostgreSQL Performance-- Bob | Full-Text Search
❌ Citus columnar tables — BM25 indexes and other PostgreSQL indexes (like GiST, GIN) cannot be created on Citus columnar tables due to limitations in Citus’s columnar storage implementation. However, you can use regular distributed tables with BM25 indexes alongside columnar tables for analytics.