We’re a lean team that likes to ship at incredibly high velocity.
Here are the main features we’re currently working on.
In Progress
Analytics
- A custom scan node for aggregates. This will allow “plain SQL” aggregates to go through the same fast execution path as
our aggregate UDFs, further accelerating aggregates like COUNT and SQL clauses like GROUP BY.
Write Throughput
- Background merging. Improves write performance by merging index segments asynchronously without blocking inserts.
- Data model optimizations. Reduces the storage overhead of index segments.
- Pending list. Buffers recent writes before flushing them to the LSM tree.
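
To make the pending list and background merging items above more concrete, here is a toy sketch in Python. It is not ParadeDB's implementation: the class `ToyLsmIndex`, its methods, and the `pending_limit` threshold are all hypothetical names chosen for illustration, and the merge is called explicitly rather than from a real background worker.

```python
from typing import List, Tuple

Entry = Tuple[int, str]


class ToyLsmIndex:
    """Toy model: a pending list in front of immutable sorted segments."""

    def __init__(self, pending_limit: int = 4) -> None:
        self.pending_limit = pending_limit     # hypothetical flush threshold
        self.pending: List[Entry] = []         # recent writes, unsorted
        self.segments: List[List[Entry]] = []  # flushed, sorted segments

    def insert(self, key: int, value: str) -> None:
        # A write only appends to the pending list, so it stays cheap.
        self.pending.append((key, value))
        if len(self.pending) >= self.pending_limit:
            self.flush()

    def flush(self) -> None:
        # Turn the buffered writes into one new sorted segment.
        if self.pending:
            self.segments.append(sorted(self.pending))
            self.pending = []

    def merge_segments(self) -> None:
        # A real system would run this asynchronously, off the insert path.
        # It is called by hand here to keep the sketch deterministic.
        merged = sorted(row for seg in self.segments for row in seg)
        self.segments = [merged] if merged else []

    def scan(self) -> List[Entry]:
        # Readers must see both flushed segments and pending writes.
        rows = [row for seg in self.segments for row in seg]
        return sorted(rows + self.pending)


index = ToyLsmIndex(pending_limit=2)
for k in (5, 3, 9, 1, 7):
    index.insert(k, f"doc{k}")

segments_before_merge = len(index.segments)  # two small segments so far
index.merge_segments()                       # compaction leaves one segment
```

In this sketch inserts never wait on a merge. Small segments accumulate and are compacted later, which is the general trade-off the two roadmap items describe.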
JOIN Improvements
- Scoring and highlighting across JOINs. BM25 score and snippet functions can be used in JOIN queries.
- Smarter JOIN planning for search indexes. Apply index-aware optimizations and cost estimation strategies when multiple BM25-indexed tables are joined.
- Faster JOIN performance through predicate pushdown. Search predicates are selectively pushed down to relevant tables based on indexability and selectivity, improving JOIN query speed.
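
The general idea behind predicate pushdown can be shown with a small Python sketch. This is only an illustration under stated assumptions: the tables, the `hash_join` helper, and the `matches_shoes` predicate are hypothetical, and a plain substring check stands in for a real BM25 search predicate.

```python
from typing import Dict, List

Row = Dict[str, object]


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    # Build a hash table on the right side, then probe it with the left side.
    buckets: Dict[object, List[Row]] = {}
    for r in right:
        buckets.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]


def matches_shoes(row: Row) -> bool:
    # Stand-in for a search predicate on the products table.
    return "shoes" in str(row["description"])


products: List[Row] = [
    {"product_id": 1, "description": "running shoes"},
    {"product_id": 2, "description": "wireless keyboard"},
    {"product_id": 3, "description": "trail running shoes"},
]
orders: List[Row] = [
    {"order_id": 10, "product_id": 1},
    {"order_id": 11, "product_id": 2},
    {"order_id": 12, "product_id": 3},
    {"order_id": 13, "product_id": 2},
]

# Without pushdown: join every row, then filter the joined result.
joined_all = hash_join(orders, products, "product_id")
late_filtered = [row for row in joined_all if matches_shoes(row)]

# With pushdown: filter products first, so the join sees fewer rows.
matching_products = [p for p in products if matches_shoes(p)]
early_filtered = hash_join(orders, matching_products, "product_id")
```

Both plans return the same rows, but the second joins against a smaller input. Choosing which predicates are worth pushing down, based on indexability and selectivity, is the part a planner has to estimate and is not modeled here.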
Improved UX
- More intuitive index configuration. Overhaul the complicated JSON WITH index options.
- More ORM friendly. Overhaul the query builder functions to use actual column references instead of string literals.
- New operators. In addition to the existing @@@ operator, introduce new operators for different query types (e.g. phrase, term, conjunction/disjunction).
Long Term
Managed Cloud
- Today, you can deploy ParadeDB either self-hosted or with ParadeDB BYOC. We are working on a fully managed cloud offering,
with a focus on scalability and supporting distributed workloads.
Deeper Analytics Improvements
- Push Postgres visibility rules into the index. Visibility is currently enforced by a filter applied after the index scan, which adds overhead to large scans.
- Evaluate more industry-standard OLAP tools. A new file format? Query execution library?
Vector Search Improvements
- Postgres (and by extension, ParadeDB) uses pgvector for vector search. Contingent on demand and internal resources, we may
investigate what improvements can be made to the known limitations of pgvector.
We’re Hiring
We’re tackling some of the hardest and (in our opinion) most impactful problems in Postgres. If you want to be a part of it,
please check out our open roles!