How we evaluate document intelligence quality.
Nebula is evaluated on whether its output can support real LLM and RAG workflows — answering questions, reasoning over tables, interpreting charts, and preserving document hierarchy in Japanese and English business documents.
We measure what enterprises actually care about.
Our evaluation rubric is built around what enterprise teams need from document intelligence — usable output, real document coverage, and Japanese-strong quality.
Real documents, not toy benchmarks
We evaluate on the documents enterprises actually process — IR releases, board decks, statements, government forms — not curated test sets.
Japanese as a first-class workload
Japanese business and legal documents are part of our core evaluation set, not an afterthought tested on a handful of pages.
Output, not surface text
Surface character accuracy is a floor. We evaluate whether the output is actually useful to the LLM and RAG systems consuming it.
Validated before we publish
When a capability shows up on this page as validated, it has been evaluated across representative customer documents — not just a single example.
Five things every Nebula evaluation answers.
Each criterion below is applied across customer-representative documents. Where output falls short, the gap is documented and routed back into the product roadmap.
Downstream LLM answerability
Can a model answer real business questions using the converted Markdown and structured JSON, without going back to the original PDF?
Markdown usability
Are headings, reading order, lists, footnotes, and document hierarchy preserved end-to-end?
Table & chart reasoning
Do tables and chart series support numerical and comparative reasoning after conversion?
Japanese business documents
Does the system handle Japanese legal, financial, and IR materials, including mixed JP/EN pages?
Enterprise document structure
Can slides, reports, statements, forms, and operational files remain useful after conversion?
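The first criterion above, downstream answerability, can be sketched as a tiny scoring harness: pose gold questions against the converted Markdown alone and count correct answers. This is an illustrative sketch, not Nebula's actual evaluation pipeline; `ask_model` is a hypothetical stand-in for whatever LLM the harness queries, and the normalization rule is an assumption.

```python
# Illustrative sketch of a downstream-answerability check.
# `ask_model` is a hypothetical stand-in for the LLM under evaluation;
# Nebula's real harness is not shown here.
import re

def normalize(answer: str) -> str:
    """Lowercase and strip punctuation/whitespace so surface variation
    (e.g. '¥1,200' vs '1200') does not dominate the score."""
    return re.sub(r"[^0-9a-z]+", "", answer.lower())

def answerability_score(qa_pairs, converted_markdown, ask_model) -> float:
    """Fraction of gold questions the model answers correctly using
    ONLY the converted Markdown, never the original PDF."""
    correct = 0
    for question, gold in qa_pairs:
        predicted = ask_model(context=converted_markdown, question=question)
        if normalize(predicted) == normalize(gold):
            correct += 1
    return correct / len(qa_pairs)

# Usage with a trivial stub model that reads the answer off the Markdown:
doc = "# FY2024 Results\n\n| Metric | Value |\n|---|---|\n| Revenue | 1200 |"
stub = lambda context, question: "1200" if "Revenue" in context else ""
print(answerability_score([("What was revenue?", "1200")], doc, stub))  # 1.0
```

The point of the sketch is the constraint, not the scoring: the model sees only the converted output, so any question it can no longer answer localizes a conversion gap.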
Where Nebula has been validated, and where we are still expanding.
Where we have measured a capability against representative customer documents, we say so. Where the evaluation set is still small, we say that too.
Charts
Bar, line, pie, and multi-panel scientific figures returned as structured chart data.
Tables
Multi-level grouped headers, hierarchical row labels, merged cells, numerical fidelity.
Handwritten documents
Cursive English manuscripts and vertical Japanese handwritten manuscripts (自筆原稿), transcribed verbatim.
Forms
Multi-section forms with checkboxes, line numbers, and dependents grids preserved.
Japanese financial documents
Annual reports, IR releases, governance materials with Japanese business vocabulary.
Legal & regulatory PDFs
Long-form Japanese legal text with footnotes, citations, and nested headings preserved in reading order.
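As a concrete illustration of the chart capability above, a converted bar chart might come back as structured series data that supports comparative reasoning directly. The shape below is a hypothetical example of such output; the field names are illustrative assumptions, not Nebula's actual JSON schema.

```python
# Hypothetical shape for structured chart output; field names are
# illustrative assumptions, not Nebula's actual JSON schema.
chart = {
    "type": "bar",
    "title": "Revenue by segment (FY2024)",
    "x_axis": {"label": "Segment"},
    "y_axis": {"label": "Revenue (¥M)"},
    "series": [
        {"name": "FY2023", "points": {"Cloud": 820, "Devices": 410}},
        {"name": "FY2024", "points": {"Cloud": 1040, "Devices": 380}},
    ],
}

# With the series as data rather than pixels, comparative questions
# become straightforward lookups and arithmetic:
fy23 = chart["series"][0]["points"]
fy24 = chart["series"][1]["points"]
growth = {seg: fy24[seg] - fy23[seg] for seg in fy24}
print(growth)  # {'Cloud': 220, 'Devices': -30}
```

This is the distinction the rubric draws between surface text and usable output: the same chart rendered as a flat caption would pass character-level checks yet fail the comparison above.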
Common questions about evaluation.
How does Nebula evaluate document quality?
We evaluate on whether converted documents can support real downstream AI work — answering questions, reasoning over tables, interpreting charts, and preserving document hierarchy. Surface-level character matching is a baseline, not the goal.
Which document types have you validated?
Charts, tables, handwritten documents, forms, Japanese financial materials, and legal/regulatory PDFs are part of our validated set, with representative examples shown on the main Nebula page. We continue to expand the customer-representative corpora as new partners onboard.
How do you handle Japanese documents specifically?
Japanese business and legal documents are a core part of our evaluation rubric. We test mixed JP/EN layouts, bilingual tables, IR materials, and long-form regulatory PDFs. Japanese is a first-class workload, not a translated afterthought.
Can I send documents to be evaluated?
Yes. The fastest way is to try Nebula directly at nebula.ur-ai.net — sign in and run your own documents through it. We are especially interested in feedback on board decks, annual reports, regulatory filings, statements, expense files, and Japanese enterprise materials.
Try Nebula on your own documents.
The fastest way to evaluate Nebula on your own corpus is to try it directly. Sign in, upload a board deck, annual report, regulatory filing, statement, or any Japanese enterprise document, and see the output for yourself.