tgoop.com/awesomedeeplearning/233
Retrieval-Augmented Generation for Large Language Models: A Survey
This paper is a must-read.
It covers everything you need to know about the RAG framework and its limitations. It also lists different state-of-the-art techniques to boost its performance in retrieval, augmentation, and generation.
The ultimate goal behind these techniques is to make this framework ready for scalability and production use, especially for use cases and industries where answer quality matters *a lot*.
These are the key ideas the paper discusses to make your RAG more efficient:
- 🗃️ Enhance the quality of indexed data by removing duplicate/redundant information and adding mechanisms to refresh outdated documents
- 🛠️ Optimize index structure by determining the right chunk size through quantitative evaluation
- 🏷️ Add metadata (e.g. date, chapters, or subsection) to the indexed documents to incorporate filtering functionalities that enhance efficiency and relevance
- ↔️ Align the input query with the documents by indexing the chunks of data by the questions they answer
- 🔍 Mixed retrieval: combine different search techniques like keyword-based and semantic search
- 🔄 ReRank: sort the retrieved documents to maximize diversity and optimize the similarity with a "template answer"
- 🗜️ Prompt compression: remove irrelevant context
- 💡 HyDE: generate a hypothetical answer to the input question and use it (with the query) to improve the search
- ✒️ Query rewrite and expansion to reformulate the user’s intent and remove ambiguity
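
To make the mixed retrieval idea concrete, here is a minimal sketch of one common way to merge a keyword-based ranking with a semantic ranking: reciprocal rank fusion (RRF). RRF is my choice for illustration and is not necessarily the method the paper recommends. The function name, the document ids, and the constant k=60 are all assumptions, and the two input lists stand in for the output of whatever keyword and vector search engines you actually use.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document gets 1 / (k + rank) from every list it appears in;
    documents ranked highly by several retrievers end up on top.
    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from two separate retrievers.
keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from a BM25-style search
semantic_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. from an embedding search

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Because RRF only looks at ranks, it avoids having to normalize keyword scores and embedding similarities onto a common scale, which is why it is a convenient starting point before trying a learned reranker.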
Link: https://arxiv.org/abs/2312.10997
BY GenAi, Deep Learning and Computer Vision
