Supabase Sidecar Embedding Engine
This document describes the Supabase Sidecar Embedding Engine, a cost-efficient, autonomous document embedding system.
Overview
The Supabase Sidecar Embedding Engine is a production-ready embedding pipeline built natively within the Supabase ecosystem. It uses a "database-as-orchestrator" pattern with a sidecar architecture to process document embeddings at scale without requiring expensive, dedicated infrastructure.
Key Features:
- Sidecar Architecture: Embeddings are stored in a separate table from the source documents, preventing table bloat and maintaining query performance.
- Database-Native Orchestration: Uses PostgreSQL extensions (pg_cron, pg_net, pgmq) to manage the entire embedding workflow.
- Autonomous Processing: A self-healing, auto-scaling system that processes embeddings continuously without human intervention.
Technology Stack:
- Supabase: The core platform, providing the PostgreSQL database, Edge Functions, and other backend services.
- PostgreSQL: The database, with the pgvector, pgmq, pg_cron, and pg_net extensions.
- Deno: The runtime for the Edge Functions.
Edge Functions
process-embedding-queue
This Edge Function is the heart of the autonomous embedding system. It processes documents from the queue, generates embeddings using Supabase AI (gte-small model), and stores them in the sidecar table.
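The read, embed, store, delete cycle described above can be sketched as follows. This is a minimal illustration, not the project's actual function: the Deps interface, the payload fields (document_id, content), and the processBatch name are all hypothetical. In a real Edge Function, embed would presumably wrap a Supabase AI session for gte-small, and the queue calls would go to pgmq; here they are injected so the control flow can be read and tested on its own.

```typescript
// Subset of the fields pgmq returns for a read message.
interface QueueMessage {
  msg_id: number;
  read_ct: number;
  // Hypothetical payload shape; the real one is defined by the project.
  message: { document_id: number; content: string };
}

// Hypothetical dependencies, injected so the loop stays free of I/O details.
interface Deps {
  readBatch(qty: number): Promise<QueueMessage[]>;
  embed(text: string): Promise<number[]>;
  storeEmbedding(documentId: number, embedding: number[]): Promise<void>;
  deleteMessage(msgId: number): Promise<void>;
}

interface BatchResult {
  processed: number;
  failed: number;
}

async function processBatch(deps: Deps, qty: number): Promise<BatchResult> {
  const messages = await deps.readBatch(qty);
  let processed = 0;
  let failed = 0;

  for (const msg of messages) {
    try {
      const embedding = await deps.embed(msg.message.content);
      await deps.storeEmbedding(msg.message.document_id, embedding);
      // Delete only after the sidecar row is written, so a crash
      // between the two steps cannot lose the document.
      await deps.deleteMessage(msg.msg_id);
      processed++;
    } catch {
      // Leave the message in the queue. With pgmq it becomes visible
      // again once its visibility timeout expires and is retried.
      failed++;
    }
  }

  return { processed, failed };
}
```

Deleting the message last is what makes the loop retry-safe in this sketch: a failure at any earlier step leaves the message in place for a later run to pick up.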
Database Schema
The database schema is defined in the supabase/migrations directory. It includes tables for source documents and document embeddings, as well as the pgmq queue.
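To make the sidecar relationship concrete, here is a rough sketch of how the two tables might look from application code. The interface and column names (SourceDocument, DocumentEmbedding, document_id) are assumptions for illustration; the authoritative definitions are the migration files. The one fixed fact used here is that gte-small produces 384-dimensional vectors.

```typescript
// Hypothetical row shapes; the real column names live in supabase/migrations.
interface SourceDocument {
  id: number;
  content: string;
}

interface DocumentEmbedding {
  document_id: number; // points back at SourceDocument.id
  embedding: number[]; // a pgvector column on the database side
}

// gte-small emits 384-dimensional embeddings.
const EMBEDDING_DIMENSIONS = 384;

// Reject vectors of the wrong size or with non-finite components
// before they are written to the sidecar table.
function isValidEmbedding(v: number[]): boolean {
  return v.length === EMBEDDING_DIMENSIONS && v.every((x) => Number.isFinite(x));
}

// Documents with no sidecar row are the ones still waiting to be embedded.
function findUnembedded(
  docs: SourceDocument[],
  embeddings: DocumentEmbedding[],
): SourceDocument[] {
  const embedded = new Set(embeddings.map((e) => e.document_id));
  return docs.filter((d) => !embedded.has(d.id));
}
```

Because the vectors sit in their own table, the source table keeps its original row width, and "which documents still need embeddings" reduces to an anti-join like findUnembedded.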