RAG (Retrieval-Augmented Generation)
Definition
A technique that augments LLM generation by retrieving relevant context from a vector store before generating a response. For local LLMs with limited context windows, RAG makes large codebases workable: the code is embedded as vectors and kept in a vector store, and only the pieces most similar to the current query are placed in the prompt. The model can then work from the meaning of the relevant code without exceeding its context limit.
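The retrieve-then-generate flow in the definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `VectorStore` class, the `embed` and `build_prompt` helpers, and the sample code chunks are all hypothetical, and the "embedding" is a plain word-count vector standing in for a learned embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts.
    A real system would call a learned embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory store holding (vector, chunk) pairs."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(store, question, k=2):
    """Retrieve first, then assemble the prompt that would be sent to the LLM."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("def parse_config(path): load yaml config from path")
store.add("def send_email(to, body): send an email over smtp")
store.add("def retry(fn, times): call fn again on failure")

question = "how is the yaml config loaded"
prompt = build_prompt(store, question, k=1)
print(prompt)
```

Only the chunk that matches the question reaches the prompt; the other two stay in the store. Swapping `embed` for a real embedding model and `VectorStore` for a persistent vector database would leave the overall flow unchanged.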
Examples in the Wild
- Example 1: Storing millions of lines of code as vectors to enable semantic search
- Example 2: Retrieving relevant code snippets before generating fixes
- Example 3: Loading large repositories without blowing up the context window
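For Example 3, one way to keep retrieved code inside the context window is to pack ranked chunks greedily under a token budget. The sketch below assumes the chunks arrive already sorted by relevance; `count_tokens` and `pack_context` are hypothetical helpers, and whitespace-separated words stand in for real tokenizer counts.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def pack_context(ranked_chunks, budget):
    """Greedily keep chunks, in rank order, that still fit in the budget."""
    selected, used = [], 0
    for chunk in ranked_chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            continue  # skip chunks that would overflow; try smaller ones
        selected.append(chunk)
        used += cost
    return selected, used


# Chunks are assumed to be ordered from most to least relevant.
ranked = [
    "def load(path): return open(path).read()",
    "class Repo: walks every file in the tree and indexes it",
    "def save(path, data): ...",
]

selected, used = pack_context(ranked, budget=10)
print(selected, used)
```

Skipping an oversized chunk rather than stopping at it is a design choice: it fills the window more fully but can admit a less relevant chunk ahead of a more relevant one.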