In order for a field to be factored into the BM25 score, it must be present in the BM25 index. For instance,
consider this query:
SELECT id, pdb.score(id)
FROM mock_items
WHERE description ||| 'keyboard' OR rating < 2
ORDER BY pdb.score(id) DESC
LIMIT 5;
While BM25 scores will be returned as long as description is indexed, including rating in the BM25 index definition will allow results matching
rating < 2 to rank higher than those that do not match.
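As a rough sketch, an index definition that covers both fields used in this query might look like the following. The index name `search_idx` and the use of `id` as the key field are assumptions for illustration; adjust the column list to match your own schema.

```sql
-- Hypothetical index name; assumes mock_items has id, description, and rating columns.
-- Listing rating alongside description lets the rating < 2 predicate
-- contribute to the BM25 score instead of acting only as a filter.
CREATE INDEX search_idx ON mock_items
USING bm25 (id, description, rating)
WITH (key_field = 'id');
```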
Next, let’s compute a “combined BM25 score” over a join across both tables.
The Django example assumes an Order model with product = models.ForeignKey(MockItem, db_column='product_id', to_field='id', ...).
SELECT o.order_id, o.customer_name, m.description, pdb.score(o.order_id) + pdb.score(m.id) AS score
FROM orders o
JOIN mock_items m ON o.product_id = m.id
WHERE o.customer_name ||| 'Johnson' AND m.description ||| 'running shoes'
ORDER BY score DESC, o.order_id
LIMIT 5;
The scores generated by the BM25 index may be influenced by dead rows that have not yet been cleaned up by the VACUUM process. Running VACUUM on the underlying table removes all dead rows from the index and ensures that only rows visible to the current
transaction are factored into the BM25 score.
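For example, assuming the two tables queried above are the ones backing the BM25 indexes, a manual cleanup could be sketched as:

```sql
-- Standard PostgreSQL VACUUM; the table names are taken from the queries above.
-- Run each statement outside of a transaction block.
VACUUM mock_items;
VACUUM orders;
```

If autovacuum is enabled, this cleanup also happens in the background over time, so a manual VACUUM is mainly useful when scores need to reflect recent updates or deletes right away.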