Supabase Sidecar Embedding Engine

This document describes the Supabase Sidecar Embedding Engine, a cost-efficient, autonomous document embedding system.

Overview

The Supabase Sidecar Embedding Engine is a production-ready embedding pipeline built natively within the Supabase ecosystem. It uses a "database-as-orchestrator" pattern with a sidecar architecture to process document embeddings at scale without requiring expensive, dedicated infrastructure.

Key Features:

  • Sidecar Architecture: Embeddings are stored in a separate table from the source documents, preventing table bloat and maintaining query performance.
  • Database-Native Orchestration: Uses PostgreSQL extensions (pg_cron for scheduling, pg_net for HTTP requests issued from the database, and pgmq for message queuing) to manage the entire embedding workflow.
  • Autonomous Processing: A self-healing, auto-scaling system that processes embeddings continuously without human intervention.
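
To make the sidecar idea concrete, here is a minimal TypeScript sketch. It models source documents and embeddings as two separate collections and finds the documents that still lack a sidecar row. The row shapes (`SourceDocument`, `DocumentEmbedding`) and the helper `findUnembedded` are hypothetical names chosen for illustration; the real columns are defined in the project's migrations and may differ.

```typescript
// Hypothetical row shapes; the actual columns live in supabase/migrations.
interface SourceDocument {
  id: string;
  content: string;
}

interface DocumentEmbedding {
  documentId: string;  // references SourceDocument.id
  embedding: number[]; // a pgvector column in a real schema
}

// Return the documents that have no sidecar row yet, i.e. the ones
// that would still need to be queued for embedding.
function findUnembedded(
  documents: SourceDocument[],
  embeddings: DocumentEmbedding[],
): SourceDocument[] {
  const embeddedIds = new Set(embeddings.map((e) => e.documentId));
  return documents.filter((doc) => !embeddedIds.has(doc.id));
}
```

Because the vectors live in their own collection, the source rows stay small, and the set of pending work is simply the documents without a matching sidecar entry.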

Technology Stack:

  • Supabase: The core platform, providing the PostgreSQL database, Edge Functions, and other backend services.
  • PostgreSQL: The database, with the pgvector, pgmq, pg_cron, and pg_net extensions.
  • Deno: The runtime for the Edge Functions.

Edge Functions

process-embedding-queue

This Edge Function drives the autonomous embedding system: it reads documents from the queue, generates embeddings with Supabase AI (the gte-small model), and stores them in the sidecar table.
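
The read, embed, store cycle described above might be sketched as follows. This is not the function's actual source. The queue, the embedding model, and the sidecar table are hidden behind a hypothetical `EmbeddingDeps` interface, so the control flow can be read (and tested) without a Supabase project. The message fields and all method names are assumptions.

```typescript
// Hypothetical message shape; the real queue payload may differ.
interface EmbeddingJob {
  msgId: number;      // queue message id
  documentId: string; // id of the source document row
  content: string;    // text to embed
}

// Hypothetical dependencies. In a real Edge Function these would wrap
// the pgmq queue, the gte-small model, and the sidecar table.
interface EmbeddingDeps {
  readBatch(limit: number): Promise<EmbeddingJob[]>;
  embed(text: string): Promise<number[]>;
  saveEmbedding(documentId: string, embedding: number[]): Promise<void>;
  ack(msgId: number): Promise<void>;
}

interface BatchSummary {
  processed: number;
  failed: number;
}

async function processBatch(
  deps: EmbeddingDeps,
  limit = 10,
): Promise<BatchSummary> {
  const jobs = await deps.readBatch(limit);
  let processed = 0;
  let failed = 0;
  for (const job of jobs) {
    try {
      const embedding = await deps.embed(job.content);
      await deps.saveEmbedding(job.documentId, embedding);
      // Acknowledge only after the sidecar row has been written.
      await deps.ack(job.msgId);
      processed++;
    } catch (_err) {
      // Leave the message unacknowledged so a later run can retry it.
      failed++;
    }
  }
  return { processed, failed };
}
```

Acknowledging a message only after the write succeeds is one plausible way to get the self-healing behavior mentioned in the overview. With pgmq, a message that is read but never deleted becomes visible again once its visibility timeout expires, so a failed job is picked up by a later run.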

Database Schema

The database schema is defined in the supabase/migrations directory. It includes tables for source documents and document embeddings, as well as the pgmq queue.
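
As an illustration of what writing to the embeddings table involves, the sketch below validates a vector and serializes it into pgvector's text format (`[v1,v2,...]`) before insertion. The helper name `toVectorLiteral` is hypothetical, and the project's own migrations may handle this differently. The default of 384 dimensions reflects the output size of the gte-small model.

```typescript
// gte-small produces 384-dimensional vectors.
const GTE_SMALL_DIMENSIONS = 384;

// Validate an embedding and serialize it in pgvector's text input
// format, e.g. "[0.1,0.2,0.3]". Hypothetical helper for illustration.
function toVectorLiteral(
  embedding: number[],
  dimensions: number = GTE_SMALL_DIMENSIONS,
): string {
  if (embedding.length !== dimensions) {
    throw new Error(
      `expected ${dimensions} dimensions, got ${embedding.length}`,
    );
  }
  // pgvector rejects NaN and infinite components, so fail early.
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}
```

Checking the dimension on the application side gives a clearer error than a failed insert, since a pgvector column declared with a fixed size rejects vectors of any other length.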