Key Differences
| Elasticsearch | ParadeDB | |
|---|---|---|
| Query language | JSON DSL | Standard SQL with search operators |
| Data model | Denormalized documents | Normalized relational tables with JOINs |
| Transactions | Per-document atomicity, eventual consistency | Full ACID transactions |
| Index storage | Separate cluster | Inside Postgres (same database) |
| Schema changes | Dynamic mapping or reindex | Defined at index creation; REINDEX to change |
| Updates/deletes | Expensive (reindex internally) | Native Postgres operations |
Migration Tips
- Start with your most common queries. Map your highest-traffic Elasticsearch queries using the full-text search reference.
- Use SQL JOINs instead of denormalization. Elasticsearch requires denormalized documents, but ParadeDB supports full SQL JOINs. You can normalize your schema and simplify your data model.
- Continue to use Postgres tooling. Backups, replication, monitoring, and CI/CD integrate with standard Postgres tools you already use.
Feature Comparison
Query Capabilities
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Full-text search (BM25) | ✅ | ✅ | Match and phrase operators |
| Fuzzy matching | ✅ | ✅ | Max edit distance of 2 via pdb.fuzzy() |
| Phrase matching | ✅ | ✅ | ### operator |
| Phrase prefix | ✅ | ✅ | pdb.phrase_prefix() |
| Regular expressions | ✅ | ✅ | pdb.regex() |
| Wildcard queries | ✅ | ✅ | Via regex |
| Boolean queries | ✅ | ✅ | Via SQL AND/OR/NOT or paradedb.boolean |
| Proximity search | ✅ | ✅ | ## operator |
| More Like This | ✅ | ✅ | paradedb.more_like_this |
| Nested queries | ✅ | ✅ | Via SQL JOINs |
| Parent-child queries | ✅ | ✅ | Via SQL JOINs |
| Geo queries | ✅ | ❌ | Use PostGIS |
| Percolator (reverse search) | ✅ | ❌ | |
| Script-based scoring | ✅ | ❌ | |
| Suggesters (autocomplete) | ✅ | ✅ | Via search_tokenizer (index with ngram, search with unicode) or fuzzy prefix |
Text Analysis
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Custom tokenizers | ✅ | ✅ | 12+ built-in tokenizers |
| Token filters | ✅ | ✅ | 7 filters: lowercase, stemmer, stopwords, ascii_folding, alpha_num_only, trim, token_length |
| Character filters | ✅ | ❌ | |
| Synonyms | ✅ | ⚠️ | Coming soon |
| Different search-time analyzer | ✅ | ✅ | Via search_tokenizer or multiple tokenizers per field |
| Multi-language support | ✅ | ✅ | Chinese (Jieba), Japanese/Korean (Lindera), ICU |
| Stemming | ✅ | ✅ | 19 languages |
| Stopwords | ✅ | ✅ | 29 languages |
| N-gram tokenization | ✅ | ✅ | Configurable min/max gram size |
Aggregations
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Bucket aggregations | ✅ | ✅ | terms, histogram, date_histogram, range, filters |
| Metric aggregations | ✅ | ✅ | avg, sum, min/max, count, stats, percentiles, cardinality, top_hits |
| Pipeline aggregations | ✅ | ✅ | Use SQL window functions (SUM() OVER(), LAG(), etc.) |
| Nested aggregations | ✅ | ✅ | Use pdb.agg() with SQL GROUP BY |
| ES-compatible JSON syntax | — | ✅ | pdb.agg() accepts ES JSON directly |
| SQL GROUP BY | Limited | ✅ | Full SQL aggregation support |
Since
pdb.agg() accepts Elasticsearch-compatible JSON, many of your existing
aggregation queries can be migrated with minimal changes. See the aggregates
documentation.Scoring and Relevance
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| BM25 scoring | ✅ | ✅ | pdb.score() |
| Custom boost | ✅ | ✅ | pdb.boost() type cast |
| Constant score | ✅ | ✅ | pdb.const() |
| Disjunction max | ✅ | ✅ | paradedb.disjunction_max() |
| Function score | ✅ | ❌ | Use boost / const as partial alternatives |
| Script scoring | ✅ | ❌ | |
| Decay functions | ✅ | ❌ | |
| Field value factor | ✅ | ❌ |
Highlighting
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Snippet highlighting | ✅ | ✅ | pdb.snippet() |
| Multiple snippets | ✅ | ✅ | pdb.snippets() |
| Custom tags | ✅ | ✅ | start_tag, end_tag parameters |
| Byte offsets | ❌ | ✅ | pdb.snippet_positions() |
| Fuzzy highlighting | ✅ | ❌ |
Index Management
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Create index | ✅ | ✅ | CREATE INDEX ... USING bm25 |
| Drop index | ✅ | ✅ | DROP INDEX |
| Reindex | ✅ | ✅ | REINDEX |
| Index aliases | ✅ | ✅ | Via Postgres views |
| Index templates | ✅ | ❌ | |
| Dynamic mapping | ✅ | ❌ | Schema defined at index creation; requires REINDEX to change |
| Multi-field index | ✅ | ✅ | All columns included in one index per table |
Data Operations
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| ACID transactions | ❌ | ✅ | Full Postgres ACID compliance |
| Real-time indexing | ⚠️ | ✅ | ES is near-real-time (requires refresh); ParadeDB provides immediate read-after-write |
| JOINs | ❌ | ✅ | Full SQL JOIN support |
| UPDATE / DELETE | ⚠️ | ✅ | ES internally reindexes; Postgres handles natively |
| Bulk insert | ✅ | ✅ | COPY or batch INSERT |
| SQL queries | ❌ | ✅ | Full SQL including subqueries, CTEs, window functions |
Deployment and Operations
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
| Horizontal sharding | ✅ | ⚠️ | Via Citus for distributed workloads |
| Read replicas | ✅ | ✅ | Postgres streaming replication |
| Kubernetes | ✅ | ✅ | CNPG / Helm charts |
| Docker | ✅ | ✅ | Official Docker image |
| Logical replication ingest | ❌ | ✅ | Sync from existing Postgres |
| Cross-cluster search | ✅ | ❌ | |
| Snapshot / restore | ✅ | ✅ | Via Postgres backup tools (pg_dump, WAL archiving) |
| Monitoring | ✅ | ✅ | pg_stat, pganalyze, standard Postgres tools |
Pagination
| Feature | Elasticsearch | ParadeDB | Notes |
|---|---|---|---|
from / size | ✅ | ✅ | SQL LIMIT / OFFSET |
scroll API | ✅ | ❌ | Use SQL cursors (DECLARE / FETCH) instead |
search_after | ✅ | ❌ | Use keyset pagination (WHERE id > last_id ORDER BY id) instead |
| Top K optimization | ✅ | ✅ | paradedb.limit_fetch_multiplier |