number headings: first-level 1, max 6, 1.1
icon: LiNotebook
Title: Overview - Plan of Attack
The basic or "vanilla" RAG, also known as Naive RAG, exhibits several limitations, particularly when applied to complex use cases or used to build production-ready applications.
As we saw in the previous topic, building a RAG prototype is relatively easy – investing around 20% of the effort yields an application with 80% performance. However, achieving a further 20% performance improvement requires the remaining 80% of the effort.
Below are some key reasons why Naive RAG may not always deliver the most effective and optimized outcomes.
To overcome these limitations of Naive RAG, two aspects are essential:
A RAG system is only as good as the relevance and quality of the documents it retrieves. Fortunately, an emerging set of techniques can be employed to design and improve RAG systems.
Improving RAG is not just a matter of incremental updates, such as installing a newer Python package or calling a function out-of-the-box. Many of the improvements involve a comprehensive rethinking of the system's architecture and processes.
We can group the various improvements under 3 major categories:
You might also be interested in the GovTech playbook included in 6. Further Readings - WOG RAG Playbook, which reports the results of experimenting with different techniques on two specific use cases. This playbook can serve as a general reference point for starting your own experiments, particularly for the techniques that showed the greatest improvement in the accuracy and capability of the RAG pipeline.
Evaluating RAG systems is essential for benchmarking the overall performance of their output.
To evaluate RAG, we can use metrics such as:
These metrics provide a structured way to assess the quality of the generated answers and the relevance of the information retrieved by the system.
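To make the idea of a retrieval metric concrete, here is a minimal, illustrative sketch of a context-precision-style score: the fraction of retrieved chunks judged relevant to the ground-truth answer. This is not RAGAS itself; the keyword-overlap relevance check and the example texts are simplified stand-ins for the LLM-based judgments that real evaluation frameworks use.

```python
# Toy "context precision" metric (illustrative only, not RAGAS):
# scores what fraction of the retrieved chunks are relevant to the
# ground-truth answer, using a naive word-overlap relevance test.

def is_relevant(chunk: str, ground_truth: str, threshold: float = 0.3) -> bool:
    """Naive relevance check: word overlap between chunk and ground truth."""
    chunk_words = set(chunk.lower().split())
    truth_words = set(ground_truth.lower().split())
    if not truth_words:
        return False
    return len(chunk_words & truth_words) / len(truth_words) >= threshold

def context_precision(retrieved_chunks: list[str], ground_truth: str) -> float:
    """Fraction of retrieved chunks judged relevant to the ground truth."""
    if not retrieved_chunks:
        return 0.0
    relevant = sum(is_relevant(c, ground_truth) for c in retrieved_chunks)
    return relevant / len(retrieved_chunks)

# Hypothetical example: one relevant chunk, one irrelevant chunk.
chunks = [
    "Singapore's public holidays in 2024 include New Year's Day.",
    "The recipe calls for two cups of flour.",
]
score = context_precision(chunks, "New Year's Day is a public holiday in Singapore")
print(score)  # 1 of 2 chunks overlaps the ground truth -> 0.5
```

In practice, frameworks like RAGAS replace the naive overlap test with LLM-based judgments, but the shape of the computation – scoring each retrieved chunk, then aggregating – is the same.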
Enter RAGAS, a framework specifically designed for this purpose.
We will go into the details of RAG evaluation in 5. RAG Evaluation.