Nougat Processor Service (Python)
This document describes the FastAPI application serving as a microservice for converting PDF documents to Markdown using Nougat OCR.
Overview
The application exposes an HTTP API that runs Nougat on an uploaded PDF file and returns the extracted Markdown text. Other services in the ChainAlign ecosystem can call it whenever they need text extracted from PDF documents.
Key Functionalities Exposed:
- Receiving a PDF file.
- Processing the PDF file with Nougat.
- Returning the extracted Markdown text.
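The three steps above can be sketched without the web layer. This is a minimal sketch, not the service's actual code: it assumes Nougat is invoked through its command-line interface (`nougat <pdf> -o <output_dir>`), which writes a `.mmd` file named after the input into the output directory. The function names, the `input.pdf` file name, and the injectable `runner` parameter are hypothetical and exist only for illustration; the real service may call Nougat's Python API instead.

```python
import subprocess
import tempfile
from pathlib import Path


def build_nougat_command(pdf_path: Path, out_dir: Path) -> list[str]:
    """Build the Nougat CLI call: nougat <pdf> -o <output_dir>."""
    return ["nougat", str(pdf_path), "-o", str(out_dir)]


def pdf_bytes_to_markdown(pdf_bytes: bytes, runner=subprocess.run) -> str:
    """Write the PDF to a temp dir, run Nougat on it, return the Markdown.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    function can be exercised without Nougat installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Step 1: persist the received bytes so the CLI can read them.
        pdf_path = Path(tmp) / "input.pdf"
        pdf_path.write_bytes(pdf_bytes)
        out_dir = Path(tmp) / "out"
        out_dir.mkdir()

        # Step 2: run Nougat. check=True raises CalledProcessError on a
        # non-zero exit code, which a caller could map to a 500 response.
        runner(build_nougat_command(pdf_path, out_dir), check=True)

        # Step 3: read back the Markdown written as <input stem>.mmd.
        return (out_dir / "input.mmd").read_text(encoding="utf-8")
```

Keeping the command construction in its own function makes the Nougat invocation easy to inspect and to swap out if the service uses a different integration.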
Technology Stack:
- FastAPI: For creating the web server and API endpoints.
- Nougat-OCR: The core library for PDF processing.
API Endpoints
POST /process-pdf
Processes a PDF file and returns the extracted Markdown text.
Request Body:
A PDF file.
Responses:
- 200 OK: A JSON object with the extracted Markdown text.
- 400 Bad Request: If no file is uploaded.
- 500 Internal Server Error: If the Nougat processing fails.
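A client call to this endpoint might look like the sketch below, which uses only the Python standard library. Several details are assumptions, since the document does not state them: that the PDF is sent as a multipart/form-data upload, that the form field is named `file`, and that the service listens at `http://localhost:8000`. The helper names are hypothetical as well.

```python
import json
import urllib.request
import uuid


def build_upload_request(url: str, pdf_bytes: bytes, field: str = "file",
                         filename: str = "document.pdf") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one PDF file.

    The field name "file" is an assumption, not taken from the service.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def process_pdf(url: str, pdf_bytes: bytes) -> dict:
    """Send the PDF and return the decoded JSON body of a 200 response.

    urlopen raises urllib.error.HTTPError for 400 and 500 responses.
    """
    with urllib.request.urlopen(build_upload_request(url, pdf_bytes)) as resp:
        return json.load(resp)
```

With the service running, `process_pdf("http://localhost:8000/process-pdf", open("paper.pdf", "rb").read())` would return the JSON object described under 200 OK. Callers can catch `urllib.error.HTTPError` and inspect its `code` attribute to tell a missing upload (400) from a Nougat failure (500).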