|
30 | 30 | <details open> |
31 | 31 | <summary><h2>📢 Updates</h2></summary> |
32 | 32 |
|
33 | | -- 🔥 [**Agentic Vectorless RAG**](https://github.com/VectifyAI/PageIndex/blob/main/examples/agentic_vectorless_rag_demo.py) — A simple *agentic, vectorless RAG* [example](https://github.com/VectifyAI/PageIndex/blob/main/examples/agentic_vectorless_rag_demo.py) with self-hosted PageIndex, using OpenAI Agents SDK. |
| 33 | +- 🔥 [**Agentic Vectorless RAG**](https://github.com/VectifyAI/PageIndex/blob/main/examples/agentic_vectorless_rag_demo.py) with PageIndex — A simple *agentic, vectorless RAG* [example](https://github.com/VectifyAI/PageIndex/blob/main/examples/agentic_vectorless_rag_demo.py) with self-hosted PageIndex, using OpenAI Agents SDK. |
| 34 | +- [**Scale PageIndex to Millions of Documents**](https://pageindex.ai/blog/pageindex-filesystem) — The *PageIndex File System* is a file-level tree layer that lets PageIndex reason over an entire corpus, not just a single document, enabling massive-scale document search. |
34 | 35 | - [PageIndex Chat](https://chat.pageindex.ai) — Human-like document analysis agent [platform](https://chat.pageindex.ai) for professional long documents. Also available via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer). |
35 | 36 | - [PageIndex Framework](https://pageindex.ai/blog/pageindex-intro) — Deep dive into PageIndex: an *agentic, in-context tree index* that enables LLMs to perform *reasoning-based, human-like retrieval* over long documents. |
36 | 37 |
|
@@ -75,8 +76,8 @@ To learn more, please see a detailed introduction to the [PageIndex framework](h |
75 | 76 | The PageIndex service is available as a ChatGPT-style [chat platform](https://chat.pageindex.ai), or can be integrated via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer). |
76 | 77 |
|
77 | 78 | ### 🛠️ Deployment Options |
78 | | -- Self-host — run locally with this open-source repo. |
79 | | -- Cloud Service — try instantly with our [Chat Platform](https://chat.pageindex.ai/), or integrate via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer). |
| 79 | +- Self-host — run locally with this open-source repo (using standard PDF parsing). |
| 80 | +- Cloud Service — production-grade pipeline with enhanced OCR, tree building, and retrieval for best results. Try instantly with our [Chat Platform](https://chat.pageindex.ai/), or integrate via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer). |
80 | 81 | - _Enterprise_ — private or on-prem deployment. [Contact us](https://ii2abc2jejf.typeform.com/to/tK3AXl8T) or [book a demo](https://calendly.com/pageindex/meet) for more details. |
81 | 82 |
|
82 | 83 | ### 🧪 Quick Hands-on |
@@ -135,12 +136,14 @@ Below is an example PageIndex tree structure. Also see more example [documents]( |
135 | 136 | ... |
136 | 137 | ``` |
137 | 138 |
|
138 | | -You can generate the PageIndex tree structure with this open-source repo, or use our [API](https://pageindex.ai/developer). |
| 139 | +You can generate the PageIndex tree structure with this open-source repo, or use our [API](https://pageindex.ai/developer) for higher-quality results powered by our enhanced OCR and tree-building pipeline. |
139 | 140 |
|
140 | 141 | --- |
141 | 142 |
|
142 | 143 | # ⚙️ Package Usage |
143 | 144 |
|
| 145 | +> **Note:** This open-source package uses standard Python PDF parsing. For complex PDFs, our [Cloud Service](https://pageindex.ai/developer) provides significantly better results with enhanced OCR, tree building, and retrieval. It is available via [MCP](https://pageindex.ai/developer) or [API](https://pageindex.ai/developer). |
| 146 | +
|
144 | 147 | You can follow these steps to generate a PageIndex tree from a PDF document. |
145 | 148 |
|
146 | 149 | ### 1. Install dependencies |
|