regex_pattern tokenizer tokenizes text using a regular expression. The regular expression can be specified with the pattern parameter.
For instance, the following tokenizer creates tokens only for words starting with the letter h:
- Lazy quantifiers such as
+? - Word boundaries such as
\b
Expected Response