Retrieval-Augmented Generation (RAG)
Category: infrastructure
An architectural pattern that injects verified external database facts directly into an LLM prompt context before generation.
RAG detaches an AI agent's knowledge from its static weight training. When a query hits your cluster, a vector search pulls relevant ground-truth document chunks from ClickHouse and prepends them to the system prompt. This keeps responses anchored to real-time data, eliminating hallucinations without the massive compute cost of continuous model fine-tuning.
Common Examples
- We deployed a dense RAG pipeline to ensure our automated claims assistant only references active building code parameters stored in our local database.
- Implementing a RAG workflow dropped our model hallucination rate to near-zero by preventing the agent from guessing historical transaction details.