RAG Evaluator (Python)
This document describes the Python script for evaluating the Retrieval-Augmented Generation (RAG) pipeline using the Ragas library.
Overview
This script evaluates the performance of the RAG pipeline against a predefined dataset, using the Ragas library to compute metrics such as:
- Faithfulness: is the generated answer grounded in the retrieved context?
- Answer relevancy: does the answer directly address the question?
- Context recall: does the retrieved context cover the ground-truth answer?
- Context precision: are the relevant context chunks ranked near the top of the retrieval results?
Key Functionalities:
- Loading an evaluation dataset.
- Preparing the dataset for Ragas.
- Running the Ragas evaluation.
- Printing the evaluation results.
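The steps above can be sketched as follows. This is a minimal outline, not the actual script: the file name eval_dataset.json and the helper names (prepare_ragas_records, main) are illustrative assumptions.

```python
import json

# Columns the classic Ragas evaluation schema expects for each sample.
REQUIRED_KEYS = ("question", "answer", "contexts", "ground_truth")


def prepare_ragas_records(rows):
    """Turn a list of per-sample dicts into the column-oriented dict
    that datasets.Dataset.from_dict() accepts."""
    for row in rows:
        missing = [k for k in REQUIRED_KEYS if k not in row]
        if missing:
            raise ValueError(f"sample is missing keys: {missing}")
    return {key: [row[key] for row in rows] for key in REQUIRED_KEYS}


def main(path="eval_dataset.json"):
    # Heavy imports are kept local so the helper above stays dependency-free.
    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import (
        answer_relevancy,
        context_precision,
        context_recall,
        faithfulness,
    )

    with open(path, encoding="utf-8") as f:
        rows = json.load(f)
    dataset = Dataset.from_dict(prepare_ragas_records(rows))

    result = evaluate(
        dataset,
        metrics=[faithfulness, answer_relevancy, context_recall, context_precision],
    )
    # Ragas results convert to a pandas DataFrame for display.
    print(result.to_pandas())


# main() would be invoked from the command line; it needs a Gemini API key
# configured for the metric LLM calls to succeed.
```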
Technology Stack:
- Ragas: The core library for RAG evaluation.
- Datasets: The Hugging Face datasets library, for loading and preparing the evaluation dataset.
- Pandas: For displaying the evaluation results.
- google-generativeai: For using Gemini as the LLM for Ragas metrics.
- langchain-google-genai: For integrating Gemini with Ragas.
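One plausible way these last two pieces fit together is to wrap a Gemini chat model from langchain-google-genai so Ragas can use it as the judge LLM for its metrics. The model names, the GOOGLE_API_KEY variable, and the function name below are assumptions; check rag_evaluator.py for the actual configuration.

```python
import os


def build_gemini_evaluator(model_name="gemini-1.5-flash"):
    """Wrap a Gemini chat model and embedding model so Ragas metrics
    can use them. Returns a (llm, embeddings) pair."""
    # langchain-google-genai reads the key from this environment variable.
    if "GOOGLE_API_KEY" not in os.environ:
        raise RuntimeError("Set GOOGLE_API_KEY before running the evaluation.")

    # Imported lazily: these packages are only needed when evaluation runs.
    from langchain_google_genai import (
        ChatGoogleGenerativeAI,
        GoogleGenerativeAIEmbeddings,
    )
    from ragas.embeddings import LangchainEmbeddingsWrapper
    from ragas.llms import LangchainLLMWrapper

    llm = LangchainLLMWrapper(ChatGoogleGenerativeAI(model=model_name))
    embeddings = LangchainEmbeddingsWrapper(
        GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    )
    return llm, embeddings
```

The returned pair would then be passed to the evaluation call, e.g. evaluate(dataset, metrics=..., llm=llm, embeddings=embeddings), so every metric scores with Gemini instead of the default OpenAI models.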
How to Run
To run the RAG evaluator, execute the following command from the python-services/rag-evaluator directory:
python rag_evaluator.py
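If the dependencies are not yet installed, a typical first run might look like the following; the package list is taken from the stack above, and the API-key step is an assumption about how Gemini credentials are supplied:

```shell
cd python-services/rag-evaluator
pip install ragas datasets pandas google-generativeai langchain-google-genai
export GOOGLE_API_KEY="your-api-key"   # placeholder; Ragas metric calls to Gemini need this
python rag_evaluator.py
```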