Vector RAG worked, while full-document RAG failed

Vector RAG splits a long document into small pieces, then searches a vector database for the pieces most similar to the question. Vector-less RAG instead asks a model such as Gemini 2.5 Flash to read the whole document and build a tree-shaped reasoning structure from the full text. The test used a summarized version of the full One Piece story.
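
The vector RAG steps described above (split, embed, search by similarity) can be sketched in a few lines of Python. This is a minimal illustration, not the setup used in the test. The function names, the sentence-based chunking, and the sample text are all assumptions made for the example. The bag-of-words "embedding" stands in for a learned embedding model, and a plain list stands in for a vector database.

```python
import math
import re
from collections import Counter

def chunk_text(text):
    """Split a document into sentence-sized chunks (a simplifying assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def embed(text):
    """Toy embedding: word counts. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Hypothetical sample text standing in for the story summary.
document = (
    "Luffy sets sail to become king of the pirates. "
    "Zoro is a swordsman who fights with three swords. "
    "Nami is the navigator of the crew and draws sea charts."
)
chunks = chunk_text(document)
best = retrieve("Who is the navigator of the crew?", chunks, k=1)
print(best[0])
```

Only the retrieved chunks, not the whole document, would then be passed to the model along with the question.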

The vector RAG system produced the expected answer, but the vector-less RAG system failed while building the reasoning tree. Reading the whole document may look like the more powerful approach, but it can break when the input is long or when the model has to build a complex structure from it.
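
To make the tree-building step more concrete, here is a hedged sketch of one way a reasoning tree could be represented and checked. The source does not say how the tree was stored or what exactly failed, so the JSON format, the field names (title, summary, children), and the parse_tree helper are all assumptions. The sketch only shows why this step is fragile: if the model's output is cut off or malformed, the whole tree cannot be built.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of a hypothetical reasoning tree."""
    title: str
    summary: str
    children: list[Node] = field(default_factory=list)

def _build(data) -> Node:
    """Recursively turn a parsed JSON object into a Node, validating each level."""
    if not isinstance(data, dict) or "title" not in data or "summary" not in data:
        raise ValueError("node is missing 'title' or 'summary'")
    children = [_build(child) for child in data.get("children", [])]
    return Node(data["title"], data["summary"], children)

def parse_tree(raw: str) -> Node:
    """Parse model output (assumed to be JSON) into a tree; raise ValueError if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    return _build(data)

# Hypothetical model outputs: one complete, one truncated mid-string.
good = (
    '{"title": "Story", "summary": "Whole plot", '
    '"children": [{"title": "Arc 1", "summary": "Opening"}]}'
)
truncated = good[:40]

tree = parse_tree(good)
print(tree.children[0].title)

try:
    parse_tree(truncated)
except ValueError as err:
    print("tree build failed:", err)
```

A vector RAG pipeline has no equivalent step, since it never asks the model to produce a structure covering the whole document.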

Key points

  • Vector RAG breaks documents into chunks and retrieves only the most relevant parts.
  • Vector-less RAG asks the model to read the whole document and create a reasoning tree.
  • The test used a summarized version of the full One Piece story.
  • Vector RAG worked correctly in the test.
  • Vector-less RAG failed during the reasoning tree step.

Quick term guide

vector RAG
A RAG method that converts text into numeric vectors that capture its meaning, then retrieves the content most similar to the question.
vector database
A storage system that saves text as numeric vectors so that content with similar meaning can be found quickly. Commonly used as memory for AI systems.
Gemini 2.5 Flash
A Google AI model designed for fast responses.
reasoning
The ability of the AI to think through complex steps to find a solution.
reasoning tree
A branching structure that tries to organize how an answer is built step by step.
AI agents
AI tools that can carry out a series of steps toward a goal, rather than answering only once.
retrieval
The step where a system finds the most relevant text for a question.
agent workflow
A set of steps an AI follows automatically to complete a series of tasks in order.