Why Frontier LLMs Fail at Parsing Japanese Documents (and What Makes Japanese Unique)
Frontier LLMs fail at Japanese documents because Japanese mixes three scripts, omits spaces between words, and is often written vertically, which raises models’ error rates roughly tenfold on vertical text. Here is what makes Japanese documents unique, where the models break, and what actually works.
