New Features 🎉
- Aggregate-on-JOIN support: a full pipeline for running aggregates over join results, including:
  - 3+ table join support and `DISTINCT` aggregates in AggregateScan.
  - `LEFT/RIGHT/FULL JOIN` and `COUNT(DISTINCT)`.
  - `HAVING` clause and per-aggregate `FILTER` clause.
  - `ORDER BY` within `STRING_AGG` and `ARRAY_AGG`.
  - JSON sub-field `GROUP BY` in the DataFusion aggregate path.
  - Date/Timestamp projection and `STDDEV/VARIANCE` aggregates.
  - Extended aggregate TopK detection to `GROUP BY` column and `MIN/MAX` ordering.
  - TopK optimization for aggregate queries via Tantivy and DataFusion.
- New `edge_ngram` tokenizer for search-as-you-type use cases.
- Support for `IS NULL/IS NOT NULL` predicates in Top-K JoinScan.
- Per-field tunable BM25 `k1` and `b` parameters via typmod.
- Explicit control of prefix behavior for fuzzy queries.
- `keep_whitespace` option for Lindera tokenizers.
- Support for the `citext` Postgres type.
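
For readers unfamiliar with edge n-grams, the sketch below shows the general idea behind search-as-you-type tokenization: each word is indexed under its leading prefixes, so a partially typed query matches as an exact term. This is an illustration of the technique only, not the extension's implementation; the function names and the `min_gram`/`max_gram` parameters are hypothetical and may not correspond to the tokenizer's actual options.

```python
def edge_ngrams(token: str, min_gram: int = 2, max_gram: int = 5) -> list[str]:
    """Return the prefixes of `token` whose length lies in [min_gram, max_gram]."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text: str, min_gram: int = 2, max_gram: int = 5) -> set[str]:
    """Lowercase the text, split on whitespace, and collect every word's edge n-grams."""
    terms: set[str] = set()
    for word in text.lower().split():
        terms.update(edge_ngrams(word, min_gram, max_gram))
    return terms


if __name__ == "__main__":
    print(edge_ngrams("search"))                    # ['se', 'sea', 'sear', 'searc']
    print("sea" in index_terms("Search engine"))    # True: a typed prefix matches
    print("arch" in index_terms("Search engine"))   # False: only leading edges are indexed
```

Because only leading prefixes are emitted, the term count per word grows linearly with `max_gram`, which is why this approach is usually preferred over full n-grams for autocomplete-style queries.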
Performance Improvements 🚀
- Lazily set up tokenizers, avoiding unnecessary initialization overhead.
- Conditionally enable scoring in row estimation for faster planning.
- Cache `anyelement_search_opoids` to reduce repeated catalog lookups.
- Cache `score/snippet` function OIDs.
- Cache `segment_id` to `segment_ordinal` mapping during `SearchIndexReader` construction.
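
The caching items above follow a common pattern: resolve a name to an identifier once, then reuse the result instead of repeating the lookup. The sketch below illustrates that pattern in isolation. It is not the extension's code; the catalog contents, the OID values, and the function names are all made up for the example.

```python
from functools import lru_cache

# Hypothetical stand-in for a system catalog; the OID values are invented.
FAKE_CATALOG = {"score": 16401, "snippet": 16402}
catalog_hits = {"count": 0}


def lookup_oid_uncached(name: str) -> int:
    """Simulate an expensive catalog lookup and count how often it runs."""
    catalog_hits["count"] += 1
    return FAKE_CATALOG[name]


@lru_cache(maxsize=None)
def lookup_oid(name: str) -> int:
    """Memoized wrapper: each distinct name reaches the catalog only once."""
    return lookup_oid_uncached(name)


if __name__ == "__main__":
    for _ in range(1000):
        lookup_oid("score")
    print(catalog_hits["count"])  # 1
```

The trade-off with any such cache is invalidation: a cached identifier must be dropped if the underlying object can be recreated under a new identifier.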
Stability Improvements 💪
- Suppress LIMIT pushdown when non-pushable post-filters are present.
- Fix Top K for prepared statements.
- Recognize identity expressions (e.g. `id + 0`) in JoinScan `ORDER BY` matching.
- Re-add meaningful term filter.
- Fix boolean comparison edge cases.
- Allow unaliased, tokenized index fields to be selected by name when the same column is indexed multiple times.
- Support `varchar`, `text[]`, and `citext` parameters in operators with generic plans.
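
To see why LIMIT pushdown has to be suppressed when a post-filter cannot be pushed into the scan, consider the toy model below. It is a sketch of the general reasoning, not of the planner itself; the row data and function names are hypothetical.

```python
def limit_then_filter(rows, post_filter, limit):
    """Unsafe order: LIMIT is pushed below a filter the scan cannot evaluate."""
    return [row for row in rows[:limit] if post_filter(row)]


def filter_then_limit(rows, post_filter, limit):
    """Safe order: apply the post-filter first, then keep at most `limit` rows."""
    return [row for row in rows if post_filter(row)][:limit]


def is_expensive(row):
    """Example post-filter that the scan is assumed unable to push down."""
    return row["price"] > 20


ROWS = [{"id": i, "price": p} for i, p in enumerate([5, 50, 7, 80, 90], start=1)]

if __name__ == "__main__":
    print([r["id"] for r in limit_then_filter(ROWS, is_expensive, 2)])  # [2]
    print([r["id"] for r in filter_then_limit(ROWS, is_expensive, 2)])  # [2, 4]
```

With the limit applied first, the query returns one row even though two qualifying rows exist, so the scan must keep producing rows until the post-filter has let enough of them through.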