RAG Evaluator (Python)

This document describes the Python script for evaluating the Retrieval-Augmented Generation (RAG) pipeline using the Ragas library.

Overview

This script evaluates the performance of the RAG pipeline against a predefined dataset. It uses the Ragas library to calculate four metrics: faithfulness, answer relevancy, context recall, and context precision.

Key Functionalities:

  • Loading an evaluation dataset.
  • Preparing the dataset for Ragas.
  • Running the Ragas evaluation.
  • Printing the evaluation results.
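The "preparing the dataset" step amounts to getting the evaluation data into the column layout that Ragas' core metrics read. The sketch below assumes the evaluation set is a list of row-wise records; the field names (question, answer, contexts, ground_truth) are the columns those metrics expect, while the sample record itself and the helper name `to_ragas_columns` are illustrative, not taken from the actual script.

```python
# Illustrative record shape; the real evaluation set lives elsewhere.
SAMPLE_RECORDS = [
    {
        "question": "What does the RAG pipeline retrieve?",
        "answer": "It retrieves document chunks relevant to the query.",
        "contexts": ["The pipeline retrieves relevant document chunks."],
        "ground_truth": "Relevant document chunks.",
    },
]

def to_ragas_columns(records):
    """Pivot row-wise records into the column-oriented mapping that
    datasets.Dataset.from_dict accepts."""
    columns = {"question": [], "answer": [], "contexts": [], "ground_truth": []}
    for record in records:
        for key in columns:
            columns[key].append(record[key])
    return columns

columns = to_ragas_columns(SAMPLE_RECORDS)
print(sorted(columns))           # -> ['answer', 'contexts', 'ground_truth', 'question']
print(len(columns["question"]))  # one entry per evaluated sample -> 1
```

The resulting mapping can then be passed to `datasets.Dataset.from_dict` before handing it to the evaluation step.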

Technology Stack:

  • Ragas: The core library for RAG evaluation.
  • Datasets: For loading and preparing the evaluation dataset.
  • Pandas: For displaying the evaluation results.
  • google-generativeai: For using Gemini as the LLM for Ragas metrics.
  • langchain-google-genai: For integrating Gemini with Ragas.
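The pieces above can be wired together roughly as follows. This is a hedged sketch, not the script's actual code: the model names ("gemini-1.5-flash", "models/embedding-001") and the exact metric list are assumptions, and the function is only defined here rather than called, since running it requires the listed packages, a GOOGLE_API_KEY, and network access.

```python
def run_ragas_evaluation(eval_dataset):
    """Evaluate a datasets.Dataset with the four metrics named above,
    using Gemini (via langchain-google-genai) as the judge LLM."""
    # Imports are deferred so the sketch stays inert without the packages.
    from langchain_google_genai import (
        ChatGoogleGenerativeAI,
        GoogleGenerativeAIEmbeddings,
    )
    from ragas import evaluate
    from ragas.metrics import (
        answer_relevancy,
        context_precision,
        context_recall,
        faithfulness,
    )

    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")  # assumed model name
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

    result = evaluate(
        eval_dataset,
        metrics=[faithfulness, answer_relevancy, context_recall, context_precision],
        llm=llm,
        embeddings=embeddings,
    )
    # Pandas is used for display: one row per sample, one column per metric.
    return result.to_pandas()

print(callable(run_ragas_evaluation))  # -> True
```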

How to Run

To run the RAG evaluator, execute the following command from the python-services/rag-evaluator directory:

python rag_evaluator.py