Terms - ParadeDB

If a text or JSON field is in the GROUP BY or ORDER BY clause, it must use the literal tokenizer.

A terms aggregation counts the number of occurrences for every unique value in a field. For example, the following query groups the mock_items table by rating, and calculates the number of items for each unique rating.

SELECT rating, pdb.agg('{"value_count": {"field": "id"}}') FROM mock_items
WHERE id @@@ pdb.all()
GROUP BY rating
LIMIT 10;

Expected Response

 rating |       agg
--------+-----------------
      4 | {"value": 16.0}
      5 | {"value": 12.0}
      3 | {"value": 9.0}
      2 | {"value": 3.0}
      1 | {"value": 1.0}
(5 rows)

Ordering by the bucketing field is supported:

SELECT rating, pdb.agg('{"value_count": {"field": "id"}}') FROM mock_items
WHERE id @@@ pdb.all()
GROUP BY rating
ORDER BY rating
LIMIT 10;

Ordering by the aggregate value is not yet supported.

For performance reasons, we strongly recommend adding a LIMIT to the GROUP BY. Terms aggregations without a LIMIT consume more memory and are slower to execute. If a query does not have a limit and more than 65000 unique values are found in a field, an error will be returned.

Documentation