State-of-the-art OCR model by Datalab that converts document images to Markdown or HTML. Supports 90+ languages, math, tables, forms, handwriting, and complex layouts.
Model: datalab-to/chandra-ocr-2 (5B params)
ocr_layout: structured output with layout blocks. ocr: plain HTML output.
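
The choice between the two names above can be sketched as a small validation helper. This is only an illustration: the names `ocr_layout` and `ocr` and their descriptions come from the text, while `OcrRequest`, `build_request`, `describe`, and the `image_path` field are hypothetical and not part of any documented Datalab interface. The default of `ocr_layout` is likewise an assumption.

```python
from dataclasses import dataclass

# The two names and their descriptions are taken from the text above.
# Everything else (types, function names, fields) is hypothetical.
MODES = {
    "ocr_layout": "structured output with layout blocks",
    "ocr": "plain HTML output",
}


@dataclass(frozen=True)
class OcrRequest:
    """Hypothetical container pairing an input image with a chosen mode."""
    image_path: str
    mode: str


def build_request(image_path: str, mode: str = "ocr_layout") -> OcrRequest:
    """Validate the mode name and bundle it with the input image path."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODES)}")
    return OcrRequest(image_path=image_path, mode=mode)


def describe(request: OcrRequest) -> str:
    """Return a one-line summary of what the chosen mode produces."""
    return f"{request.image_path}: {MODES[request.mode]}"
```

Rejecting unknown names up front keeps a typo such as `ocr-layout` from reaching whatever service actually runs the model.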