Cognee Microservice (Python)
This document describes the Flask application serving as a microservice wrapper for the Cognee library, enabling interaction with ChainAlign's knowledge graph and language model capabilities.
Overview
This Flask application exposes a set of REST API endpoints to abstract Cognee's core functionalities, allowing other services in the ChainAlign ecosystem to interact with the knowledge graph and language model capabilities without needing to directly integrate with the Cognee library.
Key Functionalities Exposed:
- Adding new content to the knowledge base.
- Triggering the 'cognify' process to analyze content and build the graph.
- Searching the knowledge graph with natural language queries.
- Executing a full Extract-Cognify-Load (ECL) pipeline for batch processing.
Technology Stack:
- Flask: For creating the web server and API endpoints.
- Cognee: The core library for knowledge graph and RAG functionalities.
- PostgreSQL: Used as the backend for both the graph database and the vector store.
- Google Gemini: The default Large Language Model (LLM) for text generation and analysis.
- asyncio: Used to run Cognee's asynchronous methods within Flask's synchronous routes.
Logging Configuration
The service uses Python's standard logging module. The log level is configurable via the LOG_LEVEL environment variable (defaulting to INFO).
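A minimal sketch of that setup (the logger name and log format are illustrative, not taken from the service's code):

import logging
import os

# Read the desired level from the environment, defaulting to INFO.
log_level_name = os.getenv("LOG_LEVEL", "INFO").upper()

logging.basicConfig(
    level=getattr(logging, log_level_name, logging.INFO),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("cognee_microservice")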
Cognee Initialization
The service relies on several environment variables for Cognee configuration. These must be set in the deployment environment:
- COGNEE_GRAPH_DB_URL: Connection string for the PostgreSQL graph database.
- COGNEE_VECTOR_DB_URL: Connection string for the PostgreSQL vector database.
- COGNEE_LLM_PROVIDER: The LLM provider (e.g., "google").
- COGNEE_LLM_MODEL: The specific LLM model (e.g., "gemini-pro").
- GEMINI_API_KEY: API key for Google Gemini.
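How the service maps these variables onto Cognee's configuration is not shown here; the snippet below is only a sketch of a fail-fast startup check over the same names (the REQUIRED_VARS list and the check itself are assumptions, not the service's actual code):

import os

# Variables listed above; the service is assumed to require all of them.
REQUIRED_VARS = [
    "COGNEE_GRAPH_DB_URL",
    "COGNEE_VECTOR_DB_URL",
    "COGNEE_LLM_PROVIDER",
    "COGNEE_LLM_MODEL",
    "GEMINI_API_KEY",
]

missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")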
Helper Functions
run_async(func)
A decorator that allows running an asynchronous function within a synchronous Flask route. It creates a new asyncio event loop to run the async function to completion.
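A sketch of such a decorator, following the description above (a fresh event loop per call, closed once the coroutine completes):

import asyncio
import functools

def run_async(func):
    """Run an async function to completion from a synchronous Flask route."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # A dedicated event loop avoids clashing with any loop that may
        # already be attached to the current thread.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(func(*args, **kwargs))
        finally:
            loop.close()
    return wrapper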
API Endpoints
POST /cognee/add
Adds a single piece of content to the Cognee knowledge base for later processing.
Request Body:
{
  "content": "The text content to be added.",
  "metadata": { "source": "document_name.pdf" }
}
Responses:
- 200 OK: { "status": "success", "message": "Content added successfully." }
- 400 Bad Request: { "error": "Missing 'content' field in request body." }
- 500 Internal Server Error: { "error": "Internal server error during content addition." }
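For illustration, the handler behind this endpoint could look roughly like the sketch below. It reuses the run_async decorator from the helper section and assumes cognee.add accepts the raw text; metadata handling is omitted and the exact Cognee signature may differ between versions:

import cognee
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/cognee/add", methods=["POST"])
@run_async
async def add_content():
    payload = request.get_json(silent=True) or {}
    content = payload.get("content")
    if not content:
        return jsonify({"error": "Missing 'content' field in request body."}), 400
    try:
        # Assumption: cognee.add ingests raw text for later cognification.
        await cognee.add(content)
        return jsonify({"status": "success", "message": "Content added successfully."}), 200
    except Exception:
        app.logger.exception("Content addition failed")
        return jsonify({"error": "Internal server error during content addition."}), 500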
POST /cognee/cognify
Triggers the 'cognify' process on all content that has been added but not yet processed: entities, relationships, and other structured data are extracted to build the knowledge graph. The endpoint currently runs cognify over everything pending; future enhancements may allow specifying datasets or content IDs to process.
Responses:
- 200 OK: { "status": "success", "message": "Cognify process initiated" }
- 500 Internal Server Error: { "error": "Internal server error during cognify process." }
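No request body is required, so a caller only needs an empty POST; for example, with the requests library (the base URL is a placeholder):

import requests

BASE_URL = "http://localhost:5000"  # placeholder; use the deployed service URL

resp = requests.post(f"{BASE_URL}/cognee/cognify")
resp.raise_for_status()
print(resp.json())  # e.g. {"status": "success", "message": "Cognify process initiated"}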
POST /cognee/search
Performs a semantic search over the cognified knowledge graph.
Request Body:
{
  "query": "What is the impact of supply chain disruptions on revenue?"
}
Responses:
- 200 OK: { "status": "success", "results": [...] }
- 400 Bad Request: { "error": "Missing 'query' field in request body." }
- 500 Internal Server Error: { "error": "Internal server error during search." }
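From a consumer's point of view, a search call could look like this (again with a placeholder base URL):

import requests

BASE_URL = "http://localhost:5000"  # placeholder; use the deployed service URL

resp = requests.post(
    f"{BASE_URL}/cognee/search",
    json={"query": "What is the impact of supply chain disruptions on revenue?"},
)
if resp.ok:
    for item in resp.json()["results"]:
        print(item)
else:
    print(resp.status_code, resp.json().get("error"))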
POST /cognee/ecl-pipeline
Executes a full Extract-Cognify-Load (ECL) pipeline for a batch of documents. This is a convenience endpoint that chains the 'add' and 'cognify' steps.
Request Body:
{
  "documents": [
    { "content": "First document content.", "metadata": { "source": "doc1.txt" } },
    { "content": "Second document content.", "metadata": { "source": "doc2.txt" } }
  ]
}
Responses:
- 200 OK: { "status": "success", "message": "ECL pipeline executed successfully." }
- 400 Bad Request: { "error": "Missing or invalid 'documents' field." }
- 500 Internal Server Error: { "error": "Internal server error during ECL pipeline execution." }
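Continuing the handler sketch from /cognee/add above, the add-then-cognify chaining could be implemented roughly as follows (cognee.add and cognee.cognify are assumed to be the underlying async calls; per-document metadata handling is omitted):

@app.route("/cognee/ecl-pipeline", methods=["POST"])
@run_async
async def ecl_pipeline():
    payload = request.get_json(silent=True) or {}
    documents = payload.get("documents")
    if not documents or not isinstance(documents, list) or not all(
        isinstance(d, dict) and d.get("content") for d in documents
    ):
        return jsonify({"error": "Missing or invalid 'documents' field."}), 400
    try:
        # Add every document first, then run a single cognify pass over the batch.
        for doc in documents:
            await cognee.add(doc["content"])
        await cognee.cognify()
        return jsonify({"status": "success", "message": "ECL pipeline executed successfully."}), 200
    except Exception:
        app.logger.exception("ECL pipeline failed")
        return jsonify({"error": "Internal server error during ECL pipeline execution."}), 500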