Ragas Eval Service (Python)

This document describes the Flask application that serves as a microservice for evaluating the Retrieval-Augmented Generation (RAG) pipeline with the Ragas library.

Overview

This Flask application provides an API to trigger the evaluation of the RAG pipeline. It uses a predefined dataset and the Ragas library to calculate metrics such as faithfulness, answer relevancy, context recall, and context precision.

Key Functionalities Exposed:

Receiving a request to evaluate the RAG pipeline.
Loading an evaluation dataset.
Preparing the dataset for Ragas.
Running the Ragas evaluation.
Returning the evaluation results.
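
The steps listed above can be sketched as a small pipeline. This is a minimal, framework-free illustration, not the service's actual code: the helper names (`load_dataset`, `prepare_for_ragas`, `run_pipeline`, `placeholder_evaluate`), the sample row, and the column names are all assumptions. The column names follow a layout commonly used with Ragas, but the expected names vary between Ragas versions. The real Ragas call is deliberately replaced by an injectable function, so no metric values are invented here.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the predefined evaluation dataset.
EVAL_SAMPLES: List[dict] = [
    {
        "question": "What does the service evaluate?",
        "answer": "It evaluates the RAG pipeline.",
        "contexts": ["The service evaluates the RAG pipeline using Ragas."],
        "ground_truth": "The RAG pipeline.",
    },
]

# Assumed column names; check the Ragas version in use before relying on them.
REQUIRED_COLUMNS = ("question", "answer", "contexts", "ground_truth")


def load_dataset() -> List[dict]:
    """Load the evaluation dataset (here, a copy of the in-memory samples)."""
    return [dict(sample) for sample in EVAL_SAMPLES]


def prepare_for_ragas(samples: List[dict]) -> Dict[str, list]:
    """Validate the rows and convert them to a column-oriented mapping."""
    for index, sample in enumerate(samples):
        missing = [col for col in REQUIRED_COLUMNS if col not in sample]
        if missing:
            raise ValueError(f"sample {index} is missing columns: {missing}")
    return {col: [sample[col] for sample in samples] for col in REQUIRED_COLUMNS}


def run_pipeline(
    evaluate_fn: Callable[[Dict[str, list]], Dict[str, float]],
) -> Dict[str, float]:
    """Load, prepare, evaluate, and return JSON-friendly results."""
    columns = prepare_for_ragas(load_dataset())
    scores = evaluate_fn(columns)
    # Cast to plain floats so the result can be serialized to JSON.
    return {name: float(value) for name, value in scores.items()}


def placeholder_evaluate(columns: Dict[str, list]) -> Dict[str, float]:
    """Stand-in for the Ragas evaluation; reports only how many rows it saw."""
    return {"samples_evaluated": len(columns["question"])}
```

In the real service, the function passed to `run_pipeline` would wrap the Ragas evaluation with the Gemini-backed metrics, and the Flask view would return its result as JSON.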

Technology Stack:

Flask: For creating the web server and API endpoints.
Ragas: The core library for RAG evaluation.
google-generativeai: For using Gemini as the LLM for Ragas metrics.

API Endpoints

`POST /evaluate-rag`

Triggers the evaluation of the RAG pipeline.

Responses:

200 OK: A JSON object with the evaluation results.
500 Internal Server Error: If the evaluation fails.
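
A caller might handle the two documented responses roughly as follows. This is a sketch using only the Python standard library. The function name `trigger_evaluation`, the base URL, and the timeout are assumptions. The source does not document a request body or the shape of the error payload, so the request is sent empty and the error body is passed through as raw text.

```python
import json
import urllib.error
import urllib.request


def trigger_evaluation(base_url: str, timeout: float = 300.0) -> dict:
    """POST to /evaluate-rag and return the parsed JSON results.

    Raises RuntimeError when the service answers 500 Internal Server Error.
    Any other HTTP error is re-raised unchanged.
    """
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/evaluate-rag",
        data=b"",  # empty body: no request payload is documented
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # 200 OK: the body is a JSON object with the evaluation results.
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 500:
            # 500: the evaluation failed; surface whatever the service sent.
            detail = err.read().decode("utf-8", errors="replace")
            raise RuntimeError(f"RAG evaluation failed: {detail}") from err
        raise
```

For a service running locally, a call might look like `trigger_evaluation("http://localhost:5000")`, where the host and port are placeholders. The generous default timeout reflects that an LLM-backed evaluation can take a while; tune it to the dataset size.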
