RAG Without Vectors: How PageIndex Retrieves by Reasoning
These articles are AI-generated summaries. Please check the original sources for full details.
PageIndex is a retrieval-augmented generation system that eliminates vector embeddings in favor of a hierarchical table-of-contents tree index. This system enables LLMs to reason over document structures to pinpoint relevant sections with high precision before reading full text.
Why This Matters
Traditional vector-based RAG pipelines often fail on complex professional documents because semantic similarity is a weak proxy for relevance in domains like finance or law. In these contexts, relevant information is frequently spread across multiple sections, so finding it requires structural navigation that chunk-based embeddings cannot provide. PageIndex addresses this by having an LLM navigate the document's hierarchy, so retrieval is grounded in structural reasoning rather than proximity in embedding space. The stated benefits are less retrieval noise and higher accuracy on specialized benchmarks such as FinanceBench.
Key Insights
- Hierarchical Tree Indexing: PageIndex constructs a tree of sections and subsections, storing titles and summaries in nodes to preserve the author’s original document structure (PageIndex, 2026).
- Reasoning-Driven Retrieval: The system uses advanced LLMs like GPT-5.4 to scan node summaries and decide which sections to explore before loading full text, mimicking human expert behavior (PageIndex, 2026).
- Benchmark Excellence: PageIndex demonstrated high retrieval accuracy on FinanceBench, outperforming traditional vector similarity models in domains requiring deep understanding (PageIndex, 2026).
- Cross-Cutting Query Resolution: By reasoning over the document tree, PageIndex can identify and stitch together multiple relevant nodes that sit far apart in the text (PageIndex, 2026).
- Vectorless Infrastructure: The system eliminates the need for embedding generation and vector database management, relying instead on structured exploration and LLM reasoning (PageIndex, 2026).
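The tree described in the points above can be pictured with a small, self-contained sketch. It is not PageIndex's implementation: the `Node` schema, the `outline`, `find_nodes`, and `stitch` helpers, and the sample filing are all hypothetical, and the node selection that an LLM would perform is hard-coded. It only illustrates how a summary-only outline can be shown to a model first, and how the full text of widely separated sections can be joined afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of the document tree (hypothetical schema, not PageIndex's)."""
    node_id: str
    title: str
    summary: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def outline(node: Node, depth: int = 0) -> list[str]:
    """Render ids, titles, and summaries only; full text stays out of the prompt."""
    lines = [f"{'  ' * depth}[{node.node_id}] {node.title}: {node.summary}"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def find_nodes(node: Node, wanted: set[str]) -> list[Node]:
    """Collect the selected nodes in document order (depth-first)."""
    hits = [node] if node.node_id in wanted else []
    for child in node.children:
        hits.extend(find_nodes(child, wanted))
    return hits


def stitch(root: Node, wanted: set[str]) -> str:
    """Join the full text of the selected sections into one context string."""
    return "\n\n".join(f"{n.title}\n{n.text}" for n in find_nodes(root, wanted))


# Hand-written sample tree, for illustration only.
doc = Node("0000", "Annual Report", "Full filing", children=[
    Node("0001", "Risk Factors", "Liquidity and market risks", "Debt matures in 2027."),
    Node("0002", "Financial Statements", "Balance sheet and notes", children=[
        Node("0003", "Note 7: Debt", "Maturity schedule of borrowings", "Term loan: $400M."),
    ]),
])

print("\n".join(outline(doc)))

# An LLM would choose these ids after reading the outline; they are hard-coded here.
context = stitch(doc, {"0001", "0003"})
print(context)
```

Keeping `text` out of `outline` is the point of the design: the model reasons over a compact map of the document and only the chosen sections are loaded in full.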
Working Examples
Initialization of the PageIndex client and the asynchronous LLM helper function.
from pageindex import PageIndexClient
import pageindex.utils as utils
import os
from getpass import getpass
# Prompt for the keys instead of hard-coding them.
PAGEINDEX_API_KEY = getpass('Enter PageIndex API Key: ')
pi_client = PageIndexClient(api_key=PAGEINDEX_API_KEY)
import openai
OPENAI_API_KEY = getpass('Enter OpenAI API Key: ')
async def call_llm(prompt, model="gpt-5.4", temperature=0):
    # Send a single-turn prompt and return the reply text.
    client = openai.AsyncOpenAI(api_key=OPENAI_API_KEY)
    response = await client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature
    )
    return response.choices[0].message.content.strip()
Building the hierarchical document tree from a PDF source.
import time
pdf_url = "https://arxiv.org/pdf/1706.03762.pdf"
doc_id = pi_client.submit_document(pdf_url)["doc_id"]
# Poll until PageIndex has finished processing the document.
while not pi_client.is_retrieval_ready(doc_id):
    time.sleep(5)
# Request node summaries along with the tree structure.
tree = pi_client.get_tree(doc_id, node_summary=True)["result"]
utils.print_tree(tree)
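Once the tree exists, the reasoning-driven retrieval step could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not PageIndex's own code. It assumes each node is a dict with `node_id`, `title`, `summary`, `text`, and nested `nodes` keys. The prompt wording, the `node_list` reply format, and the helper names `strip_text`, `build_search_prompt`, and `parse_node_list` are hypothetical. The model reply is a hand-written stand-in so the block runs without an API key.

```python
import json


def strip_text(nodes: list[dict]) -> list[dict]:
    """Copy the tree without the bulky 'text' field (node keys are assumed)."""
    slim = []
    for node in nodes:
        copy = {k: v for k, v in node.items() if k not in ("text", "nodes")}
        if node.get("nodes"):
            copy["nodes"] = strip_text(node["nodes"])
        slim.append(copy)
    return slim


def build_search_prompt(query: str, nodes: list[dict]) -> str:
    """Ask the model which node ids are worth reading in full (illustrative wording)."""
    return (
        "You are given a question and a document's tree of sections.\n"
        "Pick every node likely to contain the answer.\n\n"
        f"Question: {query}\n\n"
        f"Document tree:\n{json.dumps(strip_text(nodes), indent=2)}\n\n"
        'Reply with JSON only: {"thinking": "<reasoning>", "node_list": ["<node_id>"]}'
    )


def parse_node_list(reply: str) -> list[str]:
    """Extract node ids from the reply, tolerating prose around the JSON object."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    ids = data.get("node_list", [])
    return [str(i) for i in ids] if isinstance(ids, list) else []


# Hand-written sample in the assumed shape; a real tree would come from get_tree.
sample_tree = [{
    "node_id": "0001", "title": "3 Model Architecture",
    "summary": "Encoder and decoder stacks.", "text": "...",
    "nodes": [{
        "node_id": "0002", "title": "3.2 Attention",
        "summary": "Scaled dot-product and multi-head attention.", "text": "...",
    }],
}]

prompt = build_search_prompt("How does multi-head attention work?", sample_tree)
# Stand-in for: reply = await call_llm(prompt)
reply = '{"thinking": "Section 3.2 covers attention.", "node_list": ["0002"]}'
print(parse_node_list(reply))
```

The returned ids would then be used to load the full text of just those sections before the final answer is generated.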
Practical Applications
- Financial Analysis: Extracting specific complexity metrics and trade-offs from SEC filings where data is spread across various disclosures. Pitfall: Vector-only search might miss quantitative tables if the query is purely conceptual and lacks direct semantic overlap.
- Legal and Academic Research: Navigating structured contracts or research papers to synthesize a mechanism that is described across several sections, such as tracing multi-head attention through the Transformer paper. Pitfall: Standard chunking can break the contextual link between a mechanism's definition and the sections that evaluate it.