Operating Model
When ParadeDB is used as a logical subscriber:- Your application writes to tables on the publisher
- PostgreSQL logical replication applies those row changes to matching tables on ParadeDB
- ParadeDB maintains BM25 indexes locally on the subscriber
- Search and analytics queries run against ParadeDB instead of the primary
Logical replication copies row changes into ParadeDB, but it does not copy
BM25 indexes from the publisher. For the deployment described in this guide,
build the BM25 indexes you plan to query on the ParadeDB subscriber.
Baseline Workflow
1. Wait for the Initial Copy to Finish
Let PostgreSQL finish copying the base table data before you build BM25 indexes. This avoids extra indexing work during the bootstrap phase. On ParadeDB, you can check whether the initial copy is still running with:worker_type = 'table synchronization'. If you want a stricter per-table
check, run:
r
(ready).
2. Build BM25 Indexes on ParadeDB
Once the replicated tables are caught up, create BM25 indexes locally on ParadeDB:INSERT, UPDATE, and DELETE operations will
keep the BM25 index current automatically.
3. Query ParadeDB
Your application can now issue search queries to ParadeDB without adding search indexes to the primary database:Day-2 Operations
Choose Publication and Subscription Boundaries
For large or high-churn production tables, use one publication and one subscription per large table, or group only small related tables together. This gives each subscription its own leader apply worker and replication slot. In normal steady-state replication, PostgreSQL does not parallelize ordinary change application across tables within a single subscription, so one hot table can delay other tables that share that apply worker. A publication per table alone does not provide that isolation unless it also has its own subscription. If you split replication this way, size the replication worker settings for the number of subscriptions you plan to run:- On the publisher, set
max_replication_slotsto at least the number of subscriptions plus reserve for initial table synchronization workers. During bootstrap, each active table synchronization worker can temporarily consume its own replication slot on the publisher. With the defaultmax_sync_workers_per_subscription = 2, leave room for the main subscription plus up to two extra sync slots per bootstrapping subscription, and setmax_wal_sendershigh enough to cover the same plus any physical replicas. - On the subscriber, set
max_active_replication_originsandmax_logical_replication_workersto at least the number of subscriptions plus reserve for table synchronization workers.max_worker_processesmust also be high enough to accommodate those logical replication workers and any other background workers used by the system. max_sync_workers_per_subscriptioncontrols initial-copy parallelism when a subscription is created or refreshed. The default is2, so multi-table publications normally copy at most two tables at a time unless you raise it.
Add New Tables
When you want ParadeDB to index a new table:- Apply the new table DDL on the publisher
- Apply the same DDL on ParadeDB
- Make sure the publication includes the table
- Refresh the subscription
- Build a BM25 index on ParadeDB if the table should be searchable
FOR ALL TABLES, the new table is included automatically. If
it uses FOR TABLES IN SCHEMA ..., new tables in those schemas are included
automatically. If it was created from an explicit table list, add the table
manually. If you do not want the table on ParadeDB, do not include it in the
publication.
Change Indexed Columns
If you add or remove a column that is part of a BM25 index:- Apply the table change on both the publisher and ParadeDB
- Let replication catch up again
- Rebuild the BM25 index on ParadeDB
Roll Out DDL Safely
PostgreSQL logical replication does not replicate schema changes. That means the publisher and ParadeDB must be kept in sync manually. In practice, most teams do this through their existing migration runner or framework tooling, whether that is Rails migrations, Django migrations, Prisma Migrate, or another migration system. For additive changes such asADD COLUMN, the safest rollout is usually:
- Apply the additive DDL on ParadeDB first
- Apply the same DDL on the publisher
- Let replication continue normally
- Rebuild any BM25 indexes whose indexed column list changed
NULL if no default
is defined, until the publisher starts sending that column.
If the new column must be NOT NULL, give it a compatible default on both
sides or use a coordinated maintenance window. Otherwise replicated INSERT
operations can fail before the publisher-side change is in place.
If the change is not additive, such as a column rename, drop, or incompatible
type change, use a short maintenance window, pause writes to the affected
tables if possible, and coordinate both sides explicitly:
Handle Tables Without Primary Keys
PostgreSQL needs a replica identity to replicateUPDATE and DELETE
operations. A primary key is best. Another suitable unique index can also be
used as the replica identity. If a table has no suitable key, you can use the
per-table fallback:
REPLICA IDENTITY FULL is set
per published table and should be treated as a fallback rather than the default
design.
PostgreSQL explicitly warns that subscriber-side row lookup can become very
inefficient under FULL if there is no suitable index to help find matching
rows. In practice, FULL can also increase WAL and replication traffic for
UPDATE and DELETE because the entire row is used as the identity instead of
a smaller key.
Monitor the Publisher
Permanent logical replication is operationally safe only if you watch the publisher, not just the subscriber. The most important signal is how much WAL a logical slot is retaining.retained_walgrowing steadily because the subscriber is not acknowledging WAL quickly enoughinactive_sincebecoming non-NULLfor longer than expectedwal_statusshowing that the slot is under pressure- Filesystem usage on the volume that contains
pg_wal
max_slot_wal_keep_size on the publisher.
This caps how much WAL a slot may retain, but it can also invalidate a lagging
subscriber, so it should be paired with alerting and a reseed plan.
Monitor the Subscriber
Use the subscriber to confirm that apply workers are healthy and that errors are not accumulating:latest_end_time stops advancing or apply_error_count increases, inspect
the subscriber logs immediately.
Troubleshoot Apply Failures
One common cause of apply-worker failures is schema drift between the publisher and subscriber. Two common log patterns for schema drift are:- Inspect the subscriber logs for the first schema-mismatch error, not just the worker restart message
- Compare the affected table definition on the publisher and ParadeDB
- Apply the missing DDL on ParadeDB
- Re-enable or refresh the subscription if needed
- Rebuild any BM25 indexes affected by the schema change
- Inspect the subscriber logs for the first conflict error and note the finish LSN and replication origin if PostgreSQL logged them
- Resolve the underlying issue on the subscriber, such as conflicting local data, missing privileges, or row-level security policy interference
- Resume replication normally once the conflict is removed
- Only if you intentionally want to discard that remote transaction, use
ALTER SUBSCRIPTION ... SKIPwith care
Emergency: WAL Keeps Accumulating on the Publisher
If the logical slot on the publisher is filling disk and ParadeDB cannot catch up quickly enough, the priority is protecting the publisher.- First, fix the subscriber if the issue is simple and recent, such as a schema mismatch or networking issue
- If the publisher is running out of disk and the subscriber can be rebuilt, remove the subscription or drop the logical slot so the publisher can recycle WAL again
- Recreate the subscription and reseed ParadeDB once the publisher is safe
pg_wal growth when you are
intentionally giving up the current replica state, drop the slot on the
publisher:
Common Pitfalls
- Starting with pre-populated subscriber tables while using
copy_data = true - Applying DDL on only one side of the replication link
- Forgetting that new tables must be added to the publication and refreshed on the subscription
- Writing directly to subscribed tables on ParadeDB, which can create conflicts with incoming replicated changes
- Leaving a broken logical slot unattended on the publisher until
pg_walfills disk - Assuming
ALTER SUBSCRIPTION ... DISABLErelieves publisher-side WAL pressure