0.21.5 - ParadeDB

Stability Improvements 💪

Fixed a rare query hang caused by an upstream Tantivy change
Fixed an issue where pg_dump/pg_restore would not work if a pdb.* tokenizer contained filters

New Features 🎉

Columnar Storage for Literal Normalized Fields

Fields that use the literal normalized tokenizer are now stored in columnar format by default, in addition to the inverted index. This means literal normalized fields can take part in aggregate and Top N queries without a separate literal tokenizer. For example, if we use the literal normalized tokenizer on the description field:

CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.literal_normalized))
WITH (key_field = 'id');

We can ORDER BY description and get a fast Top N query:

SELECT * FROM mock_items
WHERE id @@@ pdb.all()
ORDER BY description
LIMIT 10;

And description can be used with pdb.agg:

SELECT description, pdb.agg('{"value_count": {"field": "id"}}') FROM mock_items
WHERE id @@@ pdb.all()
GROUP BY description
ORDER BY description
LIMIT 10;

This change does not apply to indexes that were created before the upgrade. After upgrading to 0.21.5, you will need to reindex any existing index to use this feature.
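
As a minimal sketch of the reindex step, assuming the index is named search_idx as in the example above, PostgreSQL's standard REINDEX command can rebuild it. The CONCURRENTLY variant is a general PostgreSQL option (version 12 and later) and is shown only as a possibility; check that it suits your deployment before relying on it:

```sql
-- Rebuild the existing BM25 index so that literal normalized fields
-- are also written to columnar storage.
-- Assumption: the index is named search_idx, as in the example above.
REINDEX INDEX search_idx;

-- Possible alternative (standard PostgreSQL 12+ syntax) that avoids
-- blocking writes to the table while the index is rebuilt:
-- REINDEX INDEX CONCURRENTLY search_idx;
```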

The full changelog is available on the GitHub Release.
