Features

  • Add support for the ICU tokenizer, which splits text into words at the word boundaries defined in Unicode Standard Annex #29 (Unicode Text Segmentation).
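
To give a feel for what splitting on Unicode word boundaries means in practice, here is a minimal sketch. It does not use the tokenizer added in this release. It assumes a JavaScript runtime whose built-in Intl.Segmenter is backed by ICU (as in V8/Node.js), and the helper name tokenizeWords is hypothetical, chosen only for illustration.

```typescript
// Illustrative sketch only: uses the runtime's Intl.Segmenter, not the
// tokenizer shipped in this release. `tokenizeWords` is a hypothetical name.
function tokenizeWords(text: string, locale: string = "en"): string[] {
  // granularity "word" asks for segmentation at word boundaries.
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  const tokens: string[] = [];
  for (const part of segmenter.segment(text)) {
    // Whitespace and punctuation come back as segments too;
    // isWordLike is true only for words and numbers, so keep those.
    if (part.isWordLike) {
      tokens.push(part.segment);
    }
  }
  return tokens;
}

console.log(tokenizeWords("Hello, world! It costs 3.14 dollars."));
// → [ 'Hello', 'world', 'It', 'costs', '3.14', 'dollars' ]
```

Note that "3.14" stays a single token: under the word-boundary rules, a period between two digits does not create a break, whereas a simple split on punctuation would cut the number in two.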

Full Changelog

The full changelog is available here.