2026-04-16

New Claude tokenizer notes
- string of "x" * 80000 -> 80011 tokens
  - unlike old one which tokenizes into "xxxx" chunks
  - tokenizing each x as its own char?
  - repeated "c" gets tokenized as "cc", like old
- token count endpoint allows requests of up to ~30M tokens
  - after that you get request to big errors