The Chinese compatible tokenizer behaves like the simple tokenizer — it lowercases non-CJK characters and splits on any non-alphanumeric character. In addition, it treats each CJK character as its own token.
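The rules above can be illustrated with a minimal Python sketch. This is not ParadeDB's implementation — it is a behavioral approximation, and it assumes only the CJK Unified Ideographs range (U+4E00–U+9FFF) counts as CJK, whereas the real tokenizer may cover additional ranges:

```python
def is_cjk(ch: str) -> bool:
    # Assumption: only the basic CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"

def chinese_compatible_tokenize(text: str) -> list[str]:
    tokens: list[str] = []
    buf: list[str] = []
    for ch in text:
        if is_cjk(ch):
            if buf:                      # flush any pending non-CJK word
                tokens.append("".join(buf))
                buf = []
            tokens.append(ch)            # each CJK character is its own token
        elif ch.isalnum():
            buf.append(ch.lower())       # non-CJK text is lowercased
        else:
            if buf:                      # non-alphanumeric chars split tokens
                tokens.append("".join(buf))
                buf = []
    if buf:
        tokens.append("".join(buf))
    return tokens

print(chinese_compatible_tokenize("Hello 世界!"))  # ['hello', '世', '界']
```

Note how the Latin word is lowercased and kept whole, while each Chinese character becomes a separate token — the behavior that distinguishes this tokenizer from the simple one.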