Overview

ParadeDB Analytics turns Postgres into a fast analytical query engine over external object stores like Amazon S3, file formats like CSV and Parquet, and table formats like Delta and Iceberg. Queries are pushed down to DuckDB, an in-process columnar engine built for analytical workloads.

Getting Started

The following example queries a sample dataset of roughly 3 million NYC taxi trips from January 2024, hosted in a public S3 bucket provided by ParadeDB.

CREATE FOREIGN DATA WRAPPER parquet_wrapper
HANDLER parquet_fdw_handler VALIDATOR parquet_fdw_validator;

-- Create the server (the bucket is public, so no S3 credentials are provided here)
CREATE SERVER parquet_server FOREIGN DATA WRAPPER parquet_wrapper;

-- Create a foreign table; the empty column list triggers automatic schema creation
CREATE FOREIGN TABLE trips ()
SERVER parquet_server
OPTIONS (files 's3://paradedb-benchmarks/yellow_tripdata_2024-01.parquet');

-- Success! Now you can query the remote Parquet file like a regular Postgres table
SELECT COUNT(*) FROM trips;
  count
---------
 2964624
(1 row)

That’s it! To query your own data, please refer to the data import documentation.

If no columns are provided to CREATE FOREIGN TABLE, as in the example above, the appropriate Postgres schema is created automatically.
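
Because the schema is created automatically, it can help to check which columns and types were produced before writing queries. The sketch below assumes the trips foreign table from the example above. The first query uses the standard Postgres information_schema.columns view. The second is an illustrative aggregate, and its column names (passenger_count, trip_distance) are assumptions about this dataset rather than something this page confirms, so swap in names returned by the first query if they differ.

```sql
-- List the columns and types that were created for the foreign table.
-- information_schema.columns is standard Postgres and includes foreign tables.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'trips'
ORDER BY ordinal_position;

-- Illustrative analytical query: trip count and average distance per
-- passenger count. The column names passenger_count and trip_distance are
-- assumed; replace them with names returned by the query above.
SELECT passenger_count,
       COUNT(*) AS num_trips,
       AVG(trip_distance) AS avg_distance
FROM trips
GROUP BY passenger_count
ORDER BY passenger_count;
```

If another schema also contains a table named trips, adding a table_schema filter to the first query keeps the listing unambiguous.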

For Further Assistance

The paradedb.help function opens a GitHub Discussion that the ParadeDB team will respond to.

SELECT paradedb.help(
  subject => $$Something isn't working$$,
  body => $$Issue description$$
);