Facets/aggregations are a ParadeDB enterprise feature. Contact us for access.

Histogram

Histogram is a bucket aggregation where buckets are created dynamically based on a specified interval. Each document value is rounded down to its bucket. For example, if you have a price of 18 and an interval of 5, the document will fall into the bucket with the key 15. The formula used for this is: ((val - offset) / interval).floor() * interval + offset.

SELECT search_idx.aggregate('{
  "rating_histogram": {
    "histogram": {"field": "rating", "interval": 1}
  }
}');
field
required

The field to aggregate on.

interval
required

The interval to chunk your data range. Each bucket spans a value range of [0..interval). Must be a positive value.

offset
default: 0

Shift the grid of buckets by the specified offset.

min_doc_count
default: 0

The minimum number of documents in a bucket to be returned.

hard_bounds

Limits the data range to [min, max] closed interval.

extended_bounds

Extends the value range of the buckets.

keyed
default: false

Whether to return the buckets as a hash map.

is_normalized_to_ns
default: false

Whether the values are normalized to ns for date time values.

Date Histogram

Similar to histogram, but can only be used with date types. Currently, only fixed time intervals are supported.

SELECT search_idx.aggregate('{
  "created_at_histogram": {
    "date_histogram": {"field": "rating", "fixed_interval": "1h"}
  }
}');
field
required

The field to aggregate on.

fixed_interval
required

The interval to chunk your data range. Each bucket spans a value range of [0..fixed_interval). Accepted values should end in ms, s, m, h, or d.

offset
default: 0

Shift the grid of buckets by the specified offset.

min_doc_count
default: 0

The minimum number of documents in a bucket to be returned.

hard_bounds

Limits the data range to [min, max] closed interval.

extended_bounds

Extends the value range of the buckets.

keyed
default: false

Whether to return the buckets as a hash map.

Range

Range aggregation allows you to define custom buckets for specific ranges.

SELECT search_idx.aggregate('{
  "ranges": {
    "range": {"field": "rating", "ranges": [
        { "to": 3.0 },
        { "from": 3.0, "to": 7.0 },
        { "from": 7.0, "to": 20.0 },
        { "from": 20.0 }
    ]}
  }
}');
field
required

The field to aggregate on.

ranges
required

A list of ranges to aggregate on.

keyed
default: false

Whether to return the buckets as a hash map.

Terms

Terms aggregation creates a bucket for every unique term and counts the number of occurrences.

SELECT search_idx.aggregate('{
  "rating_terms": {
    "terms": {"field": "rating"}
  }
}');
field
required

The field to aggregate on.

size
default: 10

The number of terms to return.

segment_size
default: 100

The number of terms to fetch from each segment.

show_term_doc_count_error
default: false

Whether to include the document count error.

min_doc_count
default: 1

The minimum number of documents in a term to be returned.

order
The order in which to return the terms.
missing

The value to use for documents missing the field.

Nested Aggregations

Buckets can contain sub-aggregations. For example, creating buckets with the range aggregation and then calculating the average on each bucket:

SELECT bm25_search.aggregate('{
  "range_rating": {
    "range": {
      "field": "rating",
      "ranges": [
        { "from": 1, "to": 3 },
        { "from": 3, "to": 5 }
      ]
    },
    "aggs": {
      "average_in_range": { "avg": { "field": "rating"} }
    }
  }
}');