
NVIDIA Nemotron Parse 1.1

Kateryna Chumachenko, Amala Sanjay Deshmukh, Jarno Seppanen, Ilia Karmanov, Chia-Chih Chen, Lukas Voegtle, Philipp Fischer, Marek Wawrzos, Saeid Motiian, Roman Ageev, Kedi Wu, Alexandre Milesi, Maryam Moosaei, Krzysztof Pawelec, Padmavathy Subramanian, Mehrzad Samadi, Xin Yu, Celina Dear, Sarah Stoddard, Jenna Diamond, Jesse Oliver, Leanna Chraghchian, Patrick Skelly, Tom Balough, Yao Xu, Jane Polak Scowcroft, Daniel Korzekwa, Darragh Hanley, Sandip Bhaskar, Timo Roman, Karan Sapra, Andrew Tao, Bryan Catanzaro
NVIDIA
arXiv:2511.20478 [cs.LG], (25 Nov 2025)

@misc{chumachenko2025nvidianemotronparse11,
   title={NVIDIA Nemotron Parse 1.1},
   author={Kateryna Chumachenko and Amala Sanjay Deshmukh and Jarno Seppanen and Ilia Karmanov and Chia-Chih Chen and Lukas Voegtle and Philipp Fischer and Marek Wawrzos and Saeid Motiian and Roman Ageev and Kedi Wu and Alexandre Milesi and Maryam Moosaei and Krzysztof Pawelec and Padmavathy Subramanian and Mehrzad Samadi and Xin Yu and Celina Dear and Sarah Stoddard and Jenna Diamond and Jesse Oliver and Leanna Chraghchian and Patrick Skelly and Tom Balough and Yao Xu and Jane Polak Scowcroft and Daniel Korzekwa and Darragh Hanley and Sandip Bhaskar and Timo Roman and Karan Sapra and Andrew Tao and Bryan Catanzaro},
   year={2025},
   eprint={2511.20478},
   archivePrefix={arXiv},
   primaryClass={cs.LG},
   url={https://arxiv.org/abs/2511.20478}
}


We introduce Nemotron-Parse-1.1, a lightweight document parsing and OCR model that advances the capabilities of its predecessor, Nemoretriever-Parse-1.0. Nemotron-Parse-1.1 delivers improved capabilities across general OCR, markdown formatting, structured table parsing, and text extraction from pictures, charts, and diagrams. It also supports a longer output sequence length for visually dense documents. As with its predecessor, it extracts bounding boxes of text segments, as well as corresponding semantic classes. Nemotron-Parse-1.1 follows an encoder-decoder architecture with 885M parameters, including a compact 256M-parameter language decoder. It achieves competitive accuracy on public benchmarks, making it a strong lightweight OCR solution. We release the model weights publicly on Hugging Face, as well as an optimized NIM container, along with a subset of the training data as part of the broader Nemotron-VLM-v2 dataset. Additionally, we release Nemotron-Parse-1.1-TC, which operates on a reduced vision token length, offering a 20% speed improvement with minimal quality degradation.
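
To make the output described above more concrete (text segments, each with a bounding box and a semantic class), the following is a minimal sketch of how such results might be represented and assembled into markdown downstream. The `Segment` schema, the class names "Title" and "Text", the normalized coordinate convention, and the naive top-to-bottom ordering are all assumptions made for illustration. They are not the model's actual output format or API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One parsed text segment (hypothetical schema, not the model's real output)."""
    text: str
    label: str  # semantic class; "Title" and "Text" are illustrative names
    bbox: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed normalized to [0, 1]


def reading_order(segments: list[Segment]) -> list[Segment]:
    """Sort top-to-bottom, then left-to-right (a naive single-column heuristic)."""
    return sorted(segments, key=lambda s: (s.bbox[1], s.bbox[0]))


def to_markdown(segments: list[Segment]) -> str:
    """Join segments into markdown, rendering the assumed "Title" class as a heading."""
    lines = []
    for seg in reading_order(segments):
        lines.append(f"# {seg.text}" if seg.label == "Title" else seg.text)
    return "\n\n".join(lines)


# Two segments listed out of order; sorting by bounding box restores page order.
page = [
    Segment("Body paragraph.", "Text", (0.1, 0.30, 0.9, 0.40)),
    Segment("Document Title", "Title", (0.1, 0.05, 0.9, 0.12)),
]
markdown = to_markdown(page)
print(markdown)
```

A real pipeline would need to handle multi-column layouts, tables, and captions, which this single-column heuristic does not attempt.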
