2026-04-16
- New Claude tokenizer notes
- string of "x" * 80000 -> 80011 tokens
- unlike old one which tokenizes into "xxxx" chunks
- tokenizing each x as its own char?
- repeated "c" gets tokenized as "cc", like old
- token count endpoint allows requests of up to ~30M tokens
- after that you get request to big errors
- string of "x" * 80000 -> 80011 tokens