The model is fully differentiable, allowing for fine-tuning and supporting various tasks such as domain adaptation and multilingual fine-tuning. It can handle tables, receipts, forms, multi-column layouts, and math notation, making it a versatile tool for document understanding. The model also predicts bounding boxes for embedded images, enhancing its functionality.
LightOnOCR-2-1B is part of a model family that includes variants for specific tasks, such as base models for fine-tuning and models with image bounding boxes. The model is available for use with transformers and can be deployed using vLLM. It has been trained on a large and high-quality corpus, resulting in improved performance and efficiency. The model's capabilities make it suitable for a wide range of applications.


