In order for ParadeDB to run as a logical subscriber, ParadeDB must be using
Postgres 17+.
In production, ParadeDB is commonly deployed as a logical subscriber to your
primary Postgres. Your application continues to write to the source database,
while ParadeDB receives the same row changes and maintains local BM25 indexes
for search and analytics.
This deployment model is useful when:
- Your primary Postgres runs on a managed service such as AWS RDS, Aurora,
Cloud SQL, AlloyDB, or Azure Database for PostgreSQL
- You want search and analytics queries to run away from your OLTP workload
- You want to keep Postgres as the system of record and add ParadeDB as a
dedicated read and search node
Logical replication copies row changes, not schema changes or indexes. The
published tables must already exist on ParadeDB, and any DDL must be applied
on both sides. For ongoing operations, see the Logical Replication
Operational Guide.
ParadeDB supports logical replication from any primary Postgres.
Managed Postgres Providers
Each managed provider has its own prerequisite steps for enabling logical
replication. In every case, the managed database is the publisher and
ParadeDB is the subscriber.
- AWS RDS/Aurora: Follow AWS’
tutorial
- Azure Database for PostgreSQL: Follow Azure’s
tutorial
- Cloud SQL for PostgreSQL: Follow Google’s
tutorial
- AlloyDB for PostgreSQL: Follow Google’s
tutorial
Self-Hosted Postgres
The example below shows a minimal self-hosted setup where Postgres publishes
changes and ParadeDB subscribes to them.
Environment Setup
We’ll use the following environment:
Publisher
- OS: Ubuntu 24.04
- IP: 192.168.0.30
- Database Name:
marketplace
- Replication User:
replicator
- Replication Password:
passw0rd
Subscriber (ParadeDB)
- OS: Ubuntu 24.04
- IP: 192.168.0.31
Ensure that postgresql.conf on the publisher has the following settings:
listen_addresses = 'localhost,192.168.0.30'
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
Leave headroom in max_replication_slots and max_wal_senders for the initial
copy phase, not just the steady-state subscription. During bootstrap,
PostgreSQL can create temporary table synchronization slots and connections on
the publisher. With the default max_sync_workers_per_subscription = 2, plan
for the main subscription plus up to two extra sync workers per actively
bootstrapping subscription.
Then allow the subscriber to connect in pg_hba.conf:
local replication all peer
host replication all 127.0.0.1/32 scram-sha-256
host replication all ::1/128 scram-sha-256
host replication all 192.168.0.0/24 scram-sha-256
Create a replication user:
sudo -u postgres createuser --pwprompt --replication replicator
2. Create the Source Schema on the Publisher
Create a database and a table on the publisher:
sudo -u postgres -H createdb marketplace
CREATE TABLE mock_items (
id SERIAL PRIMARY KEY,
description TEXT,
rating INTEGER CHECK (rating BETWEEN 1 AND 5),
category VARCHAR(255),
in_stock BOOLEAN,
metadata JSONB,
created_at TIMESTAMP,
last_updated_date DATE,
latest_available_time TIME
);
INSERT INTO mock_items (description, category, in_stock, latest_available_time, last_updated_date, metadata, created_at, rating)
VALUES ('Red sports shoes', 'Footwear', true, '12:00:00', '2024-07-10', '{}', '2024-07-10 12:00:00', 1);
Because mock_items has a primary key, it already has a valid replica
identity for INSERT, UPDATE, and DELETE, so no additional replica
identity configuration is needed here.
3. Bootstrap the Schema on ParadeDB
Logical replication does not copy schema definitions, so create the same
database and tables on ParadeDB before you subscribe. A schema-only dump is the
simplest way to do this:
createdb -h 192.168.0.31 -U postgres marketplace
pg_dump --schema-only --no-owner --no-privileges \
-h 192.168.0.30 -U postgres marketplace \
| psql -h 192.168.0.31 -U postgres marketplace
The target tables on ParadeDB should start empty if you are using the default
initial copy behavior of CREATE SUBSCRIPTION.
4. Install and Load pg_search on ParadeDB
Deploy ParadeDB on the subscriber using Docker,
Kubernetes, or as an
extension, then load the extension in the
subscriber database:
CREATE EXTENSION pg_search;
5. Create a Publication on the Publisher
CREATE PUBLICATION marketplace_pub FOR TABLE mock_items;
If you plan to replicate several large or update-heavy tables, avoid putting
all of them into one catch-all subscription by default. Our usual
recommendation is one publication/subscription pair per large hot table, and to
group only smaller related tables together.
In normal steady-state replication, PostgreSQL runs a single apply worker per
enabled subscription, so changes for all tables inside that subscription are
applied through that worker on the subscriber. Separate subscriptions give each
group its own apply worker and replication slot.
6. Create a Subscription on ParadeDB
CREATE SUBSCRIPTION marketplace_sub
CONNECTION 'host=192.168.0.30 port=5432 dbname=marketplace user=replicator password=passw0rd application_name=marketplace_sub'
PUBLICATION marketplace_pub;
By default, PostgreSQL copies existing rows from the publisher and then keeps
streaming new changes. If you do not want the initial copy, create the
subscription with WITH (copy_data = false) and backfill the tables by another
method.
Initial-copy parallelism is controlled by
max_sync_workers_per_subscription, which defaults to 2. If the publication
contains many tables, it is normal to see only two tables copying at once while
the rest wait for those workers to free up.
For steady-state write throughput, do not confuse that initial-copy parallelism
with normal apply throughput. If several high-churn tables share one
subscription, their changes still funnel through the subscription’s main apply
worker unless you are using PostgreSQL’s separate parallel-apply features for
streamed in-progress transactions.
7. Verify Replication
First check that the existing row is present on ParadeDB:
SELECT id, description, category
FROM mock_items
ORDER BY id;
Then insert a new row on the publisher:
INSERT INTO mock_items (description, category, in_stock, latest_available_time, last_updated_date, metadata, created_at, rating)
VALUES ('Blue running shoes', 'Footwear', true, '14:00:00', '2024-07-10', '{}', '2024-07-10 14:00:00', 2);
Now verify that the new row arrives on ParadeDB:
SELECT id, description, category
FROM mock_items
WHERE description = 'Blue running shoes';
At this point, the base table is replicating correctly and you can create BM25
indexes locally on ParadeDB. Continue to the Logical Replication Operational
Guide for BM25
index build timing, monitoring, WAL retention, and troubleshooting.