RAGFlow

⭐ 82.2k Apache-2.0 Python 0.15.0

RAG engine for deep document understanding, featuring OCR, table extraction, and multi-channel retrieval. Built for handling complex documents.

📋 Info

GitHub Stars: ⭐ 82.2k
License: Apache-2.0
Language: Python
Version: 0.15.0
Updated: 2026-06-01

📖 Overview

RAGFlow is a RAG engine focused on processing complex documents, with roughly 82k GitHub stars. Its core component is DeepDoc, a deep document understanding system that parses tables, images, and multi-column layouts, and uses OCR to read scanned documents. Retrieval runs over multiple channels and the merged results are then re-ranked, which is aimed at improving search accuracy. RAGFlow supports over 20 document formats, offers visual pipeline configuration, and includes built-in knowledge base management and access control. It is well suited to complex documents such as legal contracts, financial reports, and research papers.

✨ Features

  • DeepDoc document understanding: OCR, table recognition, and layout analysis
  • Multi-channel retrieval with re-ranking for higher retrieval accuracy
  • Support for over 20 document formats
  • Visual configuration of the RAG pipeline
  • Knowledge base management with permission control
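
To make "multi-channel retrieval with re-ranking" concrete, here is a minimal, self-contained Python sketch of the general idea. It is not RAGFlow's implementation and does not use RAGFlow's API: the two channels (keyword overlap and term-count cosine similarity), the reciprocal rank fusion step, the toy coverage-based re-ranker, and all names such as `DOCS`, `rrf_fuse`, and `search` are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would index parsed document chunks.
DOCS = {
    "d1": "quarterly financial report with revenue tables",
    "d2": "legal contract covering termination clauses",
    "d3": "research paper on table extraction from scanned documents",
}


def tokenize(text):
    return text.lower().split()


def keyword_channel(query, docs):
    """Channel 1: rank by the number of distinct query terms a document contains."""
    q = set(tokenize(query))
    scores = {d: len(q & set(tokenize(t))) for d, t in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))


def cosine_channel(query, docs):
    """Channel 2: rank by cosine similarity of term counts (a stand-in for embeddings)."""
    qv = Counter(tokenize(query))

    def cos(tv):
        dot = sum(qv[w] * tv[w] for w in qv)
        norm = math.sqrt(sum(v * v for v in qv.values())) * math.sqrt(
            sum(v * v for v in tv.values())
        )
        return dot / norm if norm else 0.0

    scores = {d: cos(Counter(tokenize(t))) for d, t in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))


def rrf_fuse(rankings, k=60):
    """Merge ranked lists with reciprocal rank fusion: score = sum of 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


def rerank(query, candidates, docs):
    """Toy re-ranker: order candidates by the fraction of query terms they cover."""
    q = set(tokenize(query))

    def coverage(d):
        return len(q & set(tokenize(docs[d]))) / len(q)

    return sorted(candidates, key=lambda d: (-coverage(d), d))


def search(query, docs=DOCS, top_k=2):
    """Run both channels, fuse their rankings, then re-rank the top candidates."""
    fused = rrf_fuse([keyword_channel(query, docs), cosine_channel(query, docs)])
    return rerank(query, fused[:top_k], docs)


if __name__ == "__main__":
    print(search("table extraction from scanned documents"))
```

In a production setup the second channel would typically be a dense vector search and the re-ranker a dedicated model; the sketch only shows how independent rankings can be merged before a final re-ranking pass.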


🚀 Quick Start

$ git clone https://github.com/infiniflow/ragflow.git

🔗 Related Tools