Overview

Regular Postgres tables, known as heap tables, organize data by row. This layout suits operational data, but it is inefficient for analytical queries, which often scan a large number of rows while reading only a subset of the columns in a table.

ParadeDB introduces special tables called parquet tables. These tables behave like regular Postgres tables but have two primary advantages:

  1. Significantly faster aggregate queries due to a column-oriented layout and query engine
  2. Lower disk usage, because data is stored in the Parquet format

How to Use Parquet Tables

-- USING parquet must be provided
CREATE TABLE movies (name text, rating int) USING parquet;

-- Insert and query data
INSERT INTO movies VALUES ('Star Wars', 9), ('Indiana Jones', 8);
SELECT AVG(rating) FROM movies;

-- Clear the table
TRUNCATE movies;

That’s it! Parquet tables accept standard Postgres queries, so there’s nothing new to learn.

When to Use Parquet Tables

Because column-oriented storage formats are not designed for fast updates, parquet tables are intended for append-only data. UPDATE and DELETE statements are not supported. Data that is frequently updated should be stored in regular Postgres heap tables.
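
One way to work within the append-only restriction is to keep mutable rows in a heap table and append them to a parquet table once they stop changing. The sketch below is illustrative only: the table names orders_live and orders_archive and their columns are hypothetical, and it assumes that INSERT ... SELECT into a parquet table works the same way as the plain INSERT shown earlier.

```sql
-- Hypothetical heap table for rows that are still being updated
CREATE TABLE orders_live (id int, status text, total int);

-- Hypothetical parquet table for rows that will no longer change
CREATE TABLE orders_archive (id int, status text, total int) USING parquet;

-- Updates happen only on the heap table
INSERT INTO orders_live VALUES (1, 'pending', 40), (2, 'pending', 15);
UPDATE orders_live SET status = 'shipped' WHERE id = 1;

-- Append finalized rows to the parquet table (assumed to be supported),
-- then remove them from the heap table
INSERT INTO orders_archive SELECT * FROM orders_live WHERE status = 'shipped';
DELETE FROM orders_live WHERE status = 'shipped';

-- Analytical queries run against the parquet table
SELECT SUM(total) FROM orders_archive;
```

With this split, every UPDATE and DELETE targets the heap table, and the parquet table only ever receives inserts.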

Copying into a Parquet Table

This example demonstrates how to copy data from an existing heap table into a parquet table.

-- Create regular table
CREATE TABLE heap (a int, b int);
INSERT INTO heap VALUES (1, 2);

-- Copy into parquet table
CREATE TABLE copy USING parquet AS SELECT * FROM heap;
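
After copying, a quick sanity check is to compare row counts between the two tables. This is a minimal sketch that reuses the heap and copy tables from the example above; the column aliases are arbitrary.

```sql
-- Both counts should match if the copy completed
SELECT
  (SELECT COUNT(*) FROM heap) AS heap_rows,
  (SELECT COUNT(*) FROM copy) AS parquet_rows;
```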

For Further Assistance

The paradedb.help function opens a GitHub Discussion that the ParadeDB team will respond to.

SELECT paradedb.help(
  subject => $$Something isn't working$$,
  body => $$Issue description$$
);
