What is HNSW?
HNSW (Hierarchical Navigable Small World) is an algorithm used for efficient approximate nearest neighbor search in high-dimensional spaces. It constructs a multi-layered graph structure, where each layer is a subset of the previous one, enabling faster navigation through the data points. This approach allows HNSW to achieve state-of-the-art search speed and accuracy, especially for large-scale datasets.
ParadeDB uses the
pgvector HNSW index, which offers a significant performance boost over the original
Creating a HNSW Index
The following command creates an HNSW index over a column:
CREATE INDEX ON <schema_name>.<table_name>
USING hnsw (<column_name> <distance_metric>);
The name of the schema, or namespace, of the table. The schema name only needs
to be provided if the table is not in the
The name of the table being indexed.
The name of the column being indexed.
The distance metric used for measuring similarity between two vectors. Use
vector_l2_ops for L2 distance,
vector_ip_ops for inner product, and
vector_cosine_ops for cosine distance.
The following example demonstrates how to pass options when creating the HNSW index:
CREATE INDEX ON mock_items
USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);
The maximum number of connections per layer. A higher value increases recall but also increases index size and construction time.
A higher value creates a higher quality graph, which increases recall but also construction time.
Deleting a HNSW Index
The following command deletes a HNSW index:
DROP INDEX <index_name>;
The name of the index you wish to delete.
Recreating a HNSW Index
Like a BM25 index, an HNSW index only needs to be recreated if the name of the indexed column changes. To recreate the index, simply delete and create it using the SQL commands above.