BM25 scores measure how relevant a score is for a given query. Higher scores indicate higher relevance.

Basic Usage

The paradedb.score(<key_field>) function produces a BM25 score and can be added to any query where any of the ParadeDB operators are present.
SELECT id, paradedb.score(id)
FROM mock_items
WHERE description ||| 'shoes'
ORDER BY paradedb.score(id)
LIMIT 5;
In order for a field to be factored into the BM25 score, it must be present in the BM25 index. For instance, consider this query:
SELECT id, paradedb.score(id)
FROM mock_items
WHERE description ||| 'keyboard' OR rating < 2
ORDER BY paradedb.score(id)
LIMIT 5;
While BM25 scores will be returned as long as description is indexed, including rating in the BM25 index definition will allow results matching rating < 2 to rank higher than those that do not match.

Joined Scores

First, let’s create a second table called orders that can be joined with mock_items:
CALL paradedb.create_bm25_test_table(
  schema_name => 'public',
  table_name => 'orders',
  table_type => 'Orders'
);

ALTER TABLE orders
ADD CONSTRAINT foreign_key_product_id
FOREIGN KEY (product_id)
REFERENCES mock_items(id);

CREATE INDEX orders_idx ON orders
USING bm25 (order_id, product_id, order_quantity, order_total, customer_name)
WITH (key_field = 'order_id');
Next, let’s compute a “combined BM25 score” over a join across both tables.
SELECT o.order_id, o.customer_name, m.description, paradedb.score(o.order_id) + paradedb.score(m.id) as score
FROM orders o
JOIN mock_items m ON o.product_id = m.id
WHERE o.customer_name ||| 'Johnson' AND m.description ||| 'running shoes'
ORDER BY score DESC, o.order_id
LIMIT 5;

Score Refresh

The scores generated by the BM25 index may be influenced by dead rows that have not been cleaned up by the VACUUM process. Running VACUUM on the underlying table will remove all dead rows from the index and ensures that only rows visible to the current transaction are factored into the BM25 score.
VACUUM mock_items;
This can be automated with autovacuum.