RAG Evaluator (Python)
This document describes the Python script for evaluating the Retrieval-Augmented Generation (RAG) pipeline using the Ragas library.
Overview
This script evaluates the performance of the RAG pipeline against a predefined dataset, using the Ragas library to compute metrics such as:
- Faithfulness: is the generated answer grounded in the retrieved context?
- Answer relevancy: does the answer directly address the question?
- Context recall: does the retrieved context cover the ground-truth answer?
- Context precision: are the relevant context chunks ranked near the top of the retrieval results?
Key Functionalities:
- Loading an evaluation dataset.
- Preparing the dataset for Ragas.
- Running the Ragas evaluation.
- Printing the evaluation results.
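The steps above can be sketched as follows. This is a minimal outline, not the actual script: the file name eval_dataset.json and the helper names (prepare_ragas_records, main) are illustrative assumptions.

```python
import json

# Columns the classic Ragas evaluation schema expects for each sample.
REQUIRED_KEYS = ("question", "answer", "contexts", "ground_truth")


def prepare_ragas_records(rows):
    """Turn a list of per-sample dicts into the column-oriented dict
    that datasets.Dataset.from_dict() accepts."""
    for row in rows:
        missing = [k for k in REQUIRED_KEYS if k not in row]
        if missing:
            raise ValueError(f"sample is missing keys: {missing}")
    return {key: [row[key] for row in rows] for key in REQUIRED_KEYS}


def main(path="eval_dataset.json"):
    # Heavy imports are kept local so the helper above stays dependency-free.
    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import (
        answer_relevancy,
        context_precision,
        context_recall,
        faithfulness,
    )

    with open(path, encoding="utf-8") as f:
        rows = json.load(f)
    dataset = Dataset.from_dict(prepare_ragas_records(rows))

    result = evaluate(
        dataset,
        metrics=[faithfulness, answer_relevancy, context_recall, context_precision],
    )
    # Ragas results convert to a pandas DataFrame for display.
    print(result.to_pandas())


# main() would be invoked from the command line; it needs a Gemini API key
# configured for the metric LLM calls to succeed.
```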
Technology Stack:
- Ragas: The core library for RAG evaluation.
- Datasets: The Hugging Face datasets library, for loading and preparing the evaluation dataset.
- Pandas: For displaying the evaluation results.
- google-generativeai: For using Gemini as the LLM for Ragas metrics.
- langchain-google-genai: For integrating Gemini with Ragas.
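One plausible way these last two pieces fit together is to wrap a Gemini chat model from langchain-google-genai so Ragas can use it as the judge LLM for its metrics. The model names, the GOOGLE_API_KEY variable, and the function name below are assumptions; check rag_evaluator.py for the actual configuration.

```python
import os


def build_gemini_evaluator(model_name="gemini-1.5-flash"):
    """Wrap a Gemini chat model and embedding model so Ragas metrics
    can use them. Returns a (llm, embeddings) pair."""
    # langchain-google-genai reads the key from this environment variable.
    if "GOOGLE_API_KEY" not in os.environ:
        raise RuntimeError("Set GOOGLE_API_KEY before running the evaluation.")

    # Imported lazily: these packages are only needed when evaluation runs.
    from langchain_google_genai import (
        ChatGoogleGenerativeAI,
        GoogleGenerativeAIEmbeddings,
    )
    from ragas.embeddings import LangchainEmbeddingsWrapper
    from ragas.llms import LangchainLLMWrapper

    llm = LangchainLLMWrapper(ChatGoogleGenerativeAI(model=model_name))
    embeddings = LangchainEmbeddingsWrapper(
        GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    )
    return llm, embeddings
```

The returned pair would then be passed to the evaluation call, e.g. evaluate(dataset, metrics=..., llm=llm, embeddings=embeddings), so every metric scores with Gemini instead of the default OpenAI models.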
How to Run
To run the RAG evaluator, execute the following command from the python-services/rag-evaluator directory:
python rag_evaluator.py
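If the dependencies are not yet installed, a typical first run might look like the following; the package list is taken from the stack above, and the API-key step is an assumption about how Gemini credentials are supplied:

```shell
cd python-services/rag-evaluator
pip install ragas datasets pandas google-generativeai langchain-google-genai
export GOOGLE_API_KEY="your-api-key"   # placeholder; Ragas metric calls to Gemini need this
python rag_evaluator.py
```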