OpenSource-Hub

PageIndex

Framework

VectifyAI/PageIndex

A vectorless, reasoning-based RAG system that uses a hierarchical tree index for human-like retrieval.

Overview

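PageIndex is a vectorless, reasoning-based RAG system that builds a hierarchical tree index from long documents. It uses an LLM for context-aware, human-like retrieval in place of traditional vector similarity search. It reaches top-tier accuracy on professional document-analysis benchmarks such as FinanceBench.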
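To make the idea of a hierarchical tree index concrete, here is a minimal sketch in Python. It is not PageIndex's actual API or algorithm: the `Node` class, the `retrieve` function, and the `keyword_judge` scorer are all hypothetical names invented for illustration. In a reasoning-based system an LLM would judge which section is relevant at each level; the toy keyword scorer below only stands in for that step so the example runs without any model or vector store.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One section of a document: a title, a short summary, and child sections."""
    title: str
    summary: str
    children: List["Node"] = field(default_factory=list)


def _tokens(text: str) -> set:
    """Lower-case word tokens of a string."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_judge(query: str, node: Node) -> float:
    """Toy stand-in for an LLM judgment: share of query words found in the node."""
    words = _tokens(query)
    if not words:
        return 0.0
    return len(words & _tokens(f"{node.title} {node.summary}")) / len(words)


def retrieve(root: Node, query: str,
             judge: Callable[[str, Node], float] = keyword_judge) -> List[str]:
    """Walk from the root to a leaf, picking the best-judged child at each level."""
    path = [root.title]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: judge(query, child))
        path.append(node.title)
    return path


# Hypothetical document outline, used only to exercise the sketch.
report = Node("Annual Report", "company annual report", [
    Node("Business Overview", "products, markets and strategy"),
    Node("Financial Statements", "income statement, balance sheet and cash flow", [
        Node("Income Statement", "revenue, expenses and net income"),
        Node("Balance Sheet", "assets, liabilities and equity"),
    ]),
])

print(retrieve(report, "net income revenue"))
# ['Annual Report', 'Financial Statements', 'Income Statement']
```

The point of the design is that retrieval follows the document's own structure from the top down, so no chunking or embedding step is involved. Swapping `keyword_judge` for a function that prompts an LLM with the query and the candidate section summaries would turn the walk into a reasoning-driven one.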

README Preview

# PageIndex: Vectorless, Reasoning-based RAG

Reasoning-based RAG  ◦  No Vector DB or Chunking  ◦  Context-Aware  ◦  Human-like Retrieval

📢 Updates

- 🔥 [**Agentic Vectorless RAG**](https://github.com/VectifyAI/PageIndex/blob/main/examples/agentic_vectorless_rag_demo.py): A simple *agentic, vectorless RAG* [example](#agentic-vectorless-rag-an-example) with self-hosted PageIndex, using the OpenAI Agents SDK.
- [**Scale PageIndex to Millions of Documents**](https://pageindex.ai/blog/pageindex-filesystem): *PageIndex File System* is a file-level tree layer that lets PageIndex reason over an entire corpus, not just a single document, enabling massive-scale document search.
- [PageIndex Chat](https://chat.pageindex.ai): Human-like document analysis agent [platform](https://chat.pageindex.ai) for professional long documents. Also available via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer).
- [PageIndex Framework](https://pageindex.ai/blog/pageindex-intro): Deep dive into PageIndex, an *agentic, in-context tree index* that enables LLMs to perform *reasoning-based, context-aware retrieval* over long documents.

---

# 📑 Introduction to PageIndex

Are you frustrated with vector database retrieval accuracy for long professional documents? Traditional vector-based RAG relies on semantic *similarity* rather than true *relevance*. But **similarity ≠ relevance**: what we truly need in retrieval is **relevance**, and that requires **reasoning**. When working with professional documents that demand domain expertise and multi-step reasoning, similarity search often falls short.

Inspired by AlphaGo, we propose **[PageIndex](https://vectify.ai/pageindex)**, a **vectorless**,