# Stemmer

> Reduces words to their root form for a given language

Stemming is the process of reducing words to their root form. In English, for example, "running" and "runs" both reduce to "run".
Stemming can be configured for any tokenizer except the [literal](/documentation/tokenizers/available-tokenizers/literal) tokenizer. Stemmers
in ParadeDB are based on the stemming algorithms published on the official [Snowball website](https://snowballstem.org/).

To set a stemmer, append `stemmer=<language>` to the tokenizer's arguments.

```sql theme={null}
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.simple('stemmer=english')))
WITH (key_field='id');
```
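Once the index is built, a search for one inflected form should also match rows containing other forms of the same word. The query below is a minimal sketch: it assumes `mock_items` is the ParadeDB test table with a `description` column, that the `search_idx` index above already exists, and that the `|||` match operator is available in your ParadeDB version.

```sql
-- Assumption: search_idx (created above) indexes description with stemmer=english.
-- The query text is expected to pass through the same tokenizer and stemmer,
-- so 'run' should match documents that contain 'running' or 'runs'.
SELECT id, description
FROM mock_items
WHERE description ||| 'run'
LIMIT 5;
```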

Valid languages are `arabic`, `czech`, `danish`, `dutch`, `english`, `finnish`, `french`, `german`, `greek`, `hungarian`, `italian`, `norwegian`, `polish`, `portuguese`, `romanian`, `russian`, `spanish`, `swedish`, `tamil`, and `turkish`.

To demonstrate this token filter, let's tokenize the same text without and with the stemmer and compare the output:

```sql theme={null}
SELECT
  'I am running'::pdb.simple::text[],
  'I am running'::pdb.simple('stemmer=english')::text[];
```

```ini Expected Response theme={null}
      text      |    text
----------------+------------
 {i,am,running} | {i,am,run}
(1 row)
```
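Stemming rules are language-specific, so the same cast can be used to check how a given stemmer treats your own text before building an index. The statement below is only a sketch: the input strings are arbitrary examples, and the exact stems depend on the Snowball algorithm for each language.

```sql
-- Inspect the tokens produced by two different stemmers.
-- Both languages appear in the list of valid values above.
SELECT
  'cats are jumping'::pdb.simple('stemmer=english')::text[] AS english_tokens,
  'les chats courent'::pdb.simple('stemmer=french')::text[] AS french_tokens;
```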
