# PaddleOCR
**Repository Path**: hunter2x/PaddleOCR
## Basic Information
- **Project Name**: PaddleOCR
- **Description**: Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 1
- **Created**: 2026-04-18
- **Last Updated**: 2026-04-23
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
Global Leading OCR Toolkit & Document AI Engine
English | [简体中文](./readme/README_cn.md) | [繁體中文](./readme/README_tcn.md) | [日本語](./readme/README_ja.md) | [한국어](./readme/README_ko.md) | [Français](./readme/README_fr.md) | [Русский](./readme/README_ru.md) | [Español](./readme/README_es.md) | [العربية](./readme/README_ar.md)
[](https://pepy.tech/projects/paddleocr)
[](https://github.com/PaddlePaddle/PaddleOCR/network/dependents)



[](https://www.paddleocr.com)
[](https://deepwiki.com/PaddlePaddle/PaddleOCR)
[](../LICENSE)
**PaddleOCR converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with industry-leading accuracy. With 70k+ Stars and trusted by top-tier projects like Dify, RAGFlow, and Cherry Studio, PaddleOCR is the bedrock for building intelligent RAG and Agentic applications.**
## 🚀 Key Features
### 📄 Intelligent Document Parsing (LLM-Ready)
> *Transforming messy visuals into structured data for the LLM era.*
* **SOTA Document VLM**: Featuring **PaddleOCR-VL-1.5 (0.9B)**, the industry's leading lightweight vision-language model for document parsing. It excels in parsing complex documents across 5 major "Real-World" challenges: **Warping, Scanning, Screen Photography, Illumination, and Skewed documents**, with structured outputs in **Markdown** and **JSON** formats.
* **Structure-Aware Conversion**: Powered by **PP-StructureV3**, seamlessly convert complex PDFs and images into **Markdown** or **JSON**. Unlike the PaddleOCR-VL series models, it provides more fine-grained coordinate information, including table cell coordinates, text coordinates, and more.
* **Production-Ready Efficiency**: Achieve commercial-grade accuracy with an ultra-small footprint. Outperforms numerous closed-source solutions in public benchmarks while remaining resource-efficient for edge/cloud deployment.
### 🔍 Universal Text Recognition (Scene OCR)
> *The global gold standard for high-speed, multilingual text spotting.*
* **100+ Languages Supported**: Native recognition for a vast global library. Our **PP-OCRv5** single-model solution elegantly handles multilingual mixed documents (Chinese, English, Japanese, Pinyin, etc.).
* **Complex Element Mastery**: Beyond standard text recognition, we support **natural scene text spotting** across a wide range of environments, including IDs, street views, books, and industrial components
* **Performance Leap**: PP-OCRv5 delivers a **13% accuracy boost** over previous versions, maintaining the "Extreme Efficiency" that PaddleOCR is famous for.
| Project Name | Description |
| ------------ | ----------- |
| [Dify](https://github.com/langgenius/dify)

|Production-ready platform for agentic workflow development.|
| [RAGFlow](https://github.com/infiniflow/ragflow)

|RAG engine based on deep document understanding.|
| [pathway](https://github.com/pathwaycom/pathway)

|Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.|
| [MinerU](https://github.com/opendatalab/MinerU)

|Multi-type Document to Markdown Conversion Tool|
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR)

|Free, Open-source, Batch Offline OCR Software.|
| [cherry-studio](https://github.com/CherryHQ/cherry-studio)

|A desktop client that supports for multiple LLM providers.|
| [haystack](https://github.com/deepset-ai/haystack)

|AI orchestration framework to build customizable, production-ready LLM applications.|
| [OmniParser](https://github.com/microsoft/OmniParser)

|OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent.|
| [QAnything](https://github.com/netease-youdao/QAnything)

|Question and Answer based on Anything.|
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|
## 👩👩👧👦 Contributors