Title: Improving Post-Retrieval Processes
- Re-ranking is the process of reordering the retrieved context chunks in the final prompt according to their relevance scores.
- This matters because researchers have found that models perform better when the most relevant context is positioned at the start of the prompt.
- **The technique consists of two very different steps:** first, a fast base retriever fetches a candidate set of documents; second, a reranking model scores each candidate against the query and reorders them.
- Note that for every new query, the similarity between the query and each retrieved document needs to be recalculated, which adds latency to retrieval.
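The two steps above can be sketched with a toy example. Here the relevance score is plain word overlap, a deliberately crude stand-in for the cross-encoder model a real reranker would use; `rerank` and the sample documents are illustrative, not from the original note.

```python
def score(query: str, doc: str) -> float:
    # Toy relevance score: fraction of query words that appear in the doc.
    # A production reranker would use a cross-encoder model instead.
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words)

def rerank(query: str, candidates: list[str], top_n: int = 3) -> list[str]:
    # Step 2: score every candidate against the query and reorder,
    # keeping only the top_n most relevant chunks for the final prompt.
    return sorted(candidates, key=lambda d: score(query, d), reverse=True)[:top_n]

docs = [
    "LLMs hallucinate when training data lacks grounding.",
    "The weather today is sunny.",
    "Retrieval grounds LLMs and reduces hallucination.",
]
print(rerank("why do LLMs hallucinate", docs, top_n=2))
```

Because `score` runs once per (query, document) pair, the cost of this second step grows linearly with the number of candidates, which is why it is applied only to a small retrieved set rather than the whole corpus.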
```python
import os

from langchain_cohere import CohereRerank
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever

os.environ["COHERE_API_KEY"] = "YOUR API KEY FROM COHERE"

# Re-rank the base retriever's candidates and keep the 3 most relevant
compressor = CohereRerank(top_n=3)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=naive_retriever,
)
```
Let's see a comparison between a Naive Retriever (e.g., ranking by distance between embeddings) and a Reranking Retriever.
We can also use the `ContextualCompressionRetriever` from the LangChain library to improve the quality of retrieved documents by compressing them:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import OpenAI

# Use an LLM to extract only the passages relevant to the query
llm = OpenAI(temperature=0)
compressor = LLMChainExtractor.from_llm(llm)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.invoke(
    "Why do LLMs hallucinate?"
)
pretty_print_docs(compressed_docs)
```
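To build intuition for what the extractor does, here is a self-contained sketch where keyword overlap stands in for the LLM: only sentences related to the query survive compression. The `compress_doc` function and sample document are illustrative assumptions, not part of `LLMChainExtractor`.

```python
def compress_doc(query: str, doc: str) -> str:
    # Toy stand-in for LLMChainExtractor: keep only the sentences that
    # share a keyword with the query. The real extractor prompts an LLM
    # to pull out the relevant passages instead.
    q_words = {w.strip("?.,!").lower() for w in query.split()}
    sentences = [s.strip() for s in doc.split(".") if s.strip()]
    kept = [
        s for s in sentences
        if q_words & {w.strip("?.,!").lower() for w in s.split()}
    ]
    return ". ".join(kept) + ("." if kept else "")

doc = (
    "LLMs sometimes hallucinate facts. "
    "The office cafeteria serves pasta on Fridays. "
    "Retrieval helps because LLMs hallucinate less with grounded context."
)
print(compress_doc("Why do LLMs hallucinate?", doc))
```

The off-topic cafeteria sentence is dropped, so the compressed document spends the prompt's token budget only on content related to the question.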
LangChain Documentation: Contextual compression
We can also compress the prompt itself with **LLMLingua**, which removes low-information tokens to fit a target token budget:

```python
# Install the package
!pip install llmlingua

from llmlingua import PromptCompressor

llm_lingua = PromptCompressor()

# Compress the prompt down to a budget of roughly 200 tokens
compressed_prompt = llm_lingua.compress_prompt(
    prompt, instruction="", question="", target_token=200
)
```
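For intuition only, the token-budget contract can be sketched with a naive truncating compressor. This stand-in simply keeps the first `target_token` whitespace tokens; LLMLingua honors the same budget but instead uses a small language model to decide which low-information tokens to drop while preserving meaning. `naive_compress` is a hypothetical helper, not part of LLMLingua.

```python
def naive_compress(prompt: str, target_token: int) -> str:
    # Naive stand-in: keep only the first `target_token` whitespace-separated
    # tokens. LLMLingua keeps the budget too, but chooses which tokens to
    # drop based on how much information each one carries.
    tokens = prompt.split()
    return " ".join(tokens[:target_token])

long_prompt = "answer the question using only the provided context " * 50
short_prompt = naive_compress(long_prompt, target_token=20)
print(len(short_prompt.split()))
```

Either way, the payoff is the same: a shorter prompt means lower inference cost and latency, at the risk of discarding useful context if the budget is too tight.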
This note is not intended to exhaustively cover all techniques or methods available for improving Retrieval-Augmented Generation (RAG) processes.