
Functional Specification Document: LLM-Agnostic Sanitization Layer with Provider Gateway

Version: 1.0
Date: October 30, 2025
Milestone: M53 (Post-Demo Enhancement)
Status: 📋 FSD - Ready for Implementation Planning


Executive Summary

The LLM-Agnostic Sanitization Layer transforms ChainAlign's AI Compliance & Trust system from a tightly coupled, single-provider integration (Gemini) into a flexible, modular architecture that supports any provider (Claude, LLaMA, Groq, future models) with zero changes to core business logic.

This is achieved through the LLM Provider Gateway (Adapter Pattern) and Standardized Tokenization, ensuring ChainAlign can:

  • 🔄 Switch providers without code changes
  • 💰 Choose the most cost-effective model per task
  • ⚡ Adapt to rapid LLM model evolution
  • 🛡️ Maintain consistent audit trails and compliance

The Problem: Current Tightly-Coupled Architecture

Current State

AIGateway.js
↓ (hardcoded Gemini logic)
RedactionEngine (Python)
↓ (Gemini-specific API calls)
Gemini API

Issues:

  • ❌ Switching to Claude requires changes in AIGateway.js
  • ❌ Model upgrades (Gemini 1.0 → 2.0) require code changes
  • ❌ Adding a new provider requires deep knowledge of core logic
  • ❌ Tokenization tied to Gemini's tokenizer
  • ❌ Cost calculations only work for Gemini pricing

The Solution: LLM Provider Gateway Architecture

Proposed Architecture

AIGateway.js (Business Logic)
↓ (PromptObject: {prompt, tokens, model})
LLMProviderGateway (Router)
├── GeminiAdapter
├── AnthropicAdapter
├── LlamaAdapter
├── GroqAdapter
└── [Future Providers]
↓ (StandardizedResponse: {text, tokens_used, cost})
RedactionEngine (Business Logic)

Key Principle: Core business logic only knows about standardized interfaces, never provider-specific details.
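
For illustration, a minimal usage sketch (assuming the gateway and types specified in the component sections below; the literal IDs and values are placeholders): the caller builds a provider-neutral PromptObject and receives a standardized LLMResponse, so switching providers is a configuration change only.

// Minimal sketch - the caller never touches a provider SDK
import LLMProviderGateway from './services/llm/LLMProviderGateway.js';

const promptObject = {
  prompt: 'Summarize the attached contract.',
  provider: 'gemini', // swap to 'anthropic' and nothing else changes
  modelId: 'gemini-default', // logical id, mapped to a concrete model inside the adapter
  promptTokens: 42,
  estimatedTotalTokens: 42,
  userId: 'user-123',
  tenantId: 'tenant-456',
  requestId: 'req-789',
  timestamp: new Date(),
};

const response = await LLMProviderGateway.executePrompt(promptObject);
console.log(response.text, response.totalTokens, response.estimatedCostUSD);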


Component Specifications

1. Standard Prompt Object

File: backend/src/types/PromptObject.ts

interface PromptObject {
  // Content
  prompt: string;
  context?: string;
  systemMessage?: string;

  // Metadata
  modelId: string; // e.g., "gpt-4", "claude-3-sonnet", "llama-2-70b"
  provider: string; // e.g., "openai", "anthropic", "llama"

  // Token Management
  promptTokens: number; // Pre-calculated token count
  estimatedTotalTokens: number;

  // Configuration
  temperature?: number;
  maxTokens?: number;
  topP?: number;

  // Audit Trail
  userId: string;
  tenantId: string;
  requestId: string;
  timestamp: Date;
}

2. Standardized LLM Response

File: backend/src/types/LLMResponse.ts

interface LLMResponse {
  // Output
  text: string;
  rawResponse: Record<string, any>; // For debugging

  // Token Accounting
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;

  // Cost Tracking
  estimatedCostUSD: number;
  costBreakdown: {
    inputCost: number;
    outputCost: number;
  };

  // Provider Info
  provider: string;
  modelUsed: string;

  // Metadata
  latencyMs: number;
  success: boolean;
  error?: string;
  retryCount: number;

  // Audit
  requestId: string;
  timestamp: Date;
}

3. LLM Provider Gateway (Router)

File: backend/src/services/llm/LLMProviderGateway.js

import GeminiAdapter from './adapters/GeminiAdapter.js';
import AnthropicAdapter from './adapters/AnthropicAdapter.js';
import LlamaAdapter from './adapters/LlamaAdapter.js';
import GroqAdapter from './adapters/GroqAdapter.js';

class LLMProviderGateway {
  constructor() {
    this.adapters = {
      gemini: new GeminiAdapter(),
      anthropic: new AnthropicAdapter(),
      llama: new LlamaAdapter(),
      groq: new GroqAdapter(),
    };
  }

  /**
   * Main entry point: Convert PromptObject to LLMResponse
   * @param {PromptObject} promptObject
   * @returns {Promise<LLMResponse>}
   */
  async executePrompt(promptObject) {
    const adapter = this.getAdapter(promptObject.provider);

    if (!adapter) {
      throw new Error(`Unknown provider: ${promptObject.provider}`);
    }

    try {
      const startTime = Date.now();

      // Route to the appropriate adapter
      const rawResponse = await adapter.call(promptObject);

      // Standardize the response
      const standardizedResponse = adapter.standardizeResponse(rawResponse, promptObject);

      // Add latency tracking
      standardizedResponse.latencyMs = Date.now() - startTime;

      return standardizedResponse;
    } catch (error) {
      return this.handleError(error, promptObject);
    }
  }

  getAdapter(provider) {
    return this.adapters[provider.toLowerCase()];
  }

  handleError(error, promptObject) {
    // Standardized error handling across all providers
    return {
      text: null,
      success: false,
      error: error.message,
      provider: promptObject.provider,
      modelUsed: promptObject.modelId,
      requestId: promptObject.requestId,
      timestamp: new Date(),
    };
  }
}

export default new LLMProviderGateway();

4. Base Adapter Interface

File: backend/src/services/llm/adapters/BaseAdapter.js

/**
 * All provider adapters must extend this interface.
 * Ensures consistent behavior across all providers.
 */
class BaseAdapter {
  /**
   * Send prompt to the provider's API
   * @param {PromptObject} promptObject
   * @returns {Promise<Object>} Provider's native response
   */
  async call(promptObject) {
    throw new Error('Subclass must implement call()');
  }

  /**
   * Convert provider-specific response to StandardizedResponse
   * @param {Object} rawResponse Provider's native response
   * @param {PromptObject} promptObject Original request
   * @returns {LLMResponse}
   */
  standardizeResponse(rawResponse, promptObject) {
    throw new Error('Subclass must implement standardizeResponse()');
  }

  /**
   * Count tokens using the provider's tokenizer
   * @param {string} text
   * @returns {Promise<number>} Token count
   */
  async countTokens(text) {
    throw new Error('Subclass must implement countTokens()');
  }

  /**
   * Get a model's limits (context window, max output, etc.)
   * @param {string} modelId
   * @returns {Object}
   */
  getModelLimits(modelId) {
    throw new Error('Subclass must implement getModelLimits()');
  }

  /**
   * Calculate cost for a request
   * @param {number} inputTokens
   * @param {number} outputTokens
   * @param {string} modelId
   * @returns {number} Cost in USD
   */
  calculateCost(inputTokens, outputTokens, modelId) {
    throw new Error('Subclass must implement calculateCost()');
  }
}

export default BaseAdapter;

5. Gemini Adapter (Example Implementation)

File: backend/src/services/llm/adapters/GeminiAdapter.js

import BaseAdapter from './BaseAdapter.js';
import { GoogleGenerativeAI, HarmCategory, HarmBlockThreshold } from '@google/generative-ai';

class GeminiAdapter extends BaseAdapter {
  constructor() {
    super();
    this.client = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
    this.modelMap = {
      'gemini-default': 'gemini-2.0-pro', // ← Model upgrade isolated here
      'gemini-fast': 'gemini-2.0-flash',
    };
  }

  async call(promptObject) {
    const modelId = this.mapModelId(promptObject.modelId);
    const model = this.client.getGenerativeModel({ model: modelId });

    const response = await model.generateContent({
      contents: [
        {
          role: 'user',
          parts: [{ text: promptObject.prompt }],
        },
      ],
      safetySettings: [
        {
          category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
          threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
      ],
      generationConfig: {
        temperature: promptObject.temperature || 0.7,
        topP: promptObject.topP || 0.9,
        maxOutputTokens: promptObject.maxTokens || 2048,
      },
    });

    return {
      text: response.response.text(),
      usage: {
        promptTokens: response.response.usageMetadata.promptTokenCount,
        completionTokens: response.response.usageMetadata.candidatesTokenCount,
      },
      rawResponse: response,
    };
  }

  standardizeResponse(rawResponse, promptObject) {
    const usage = rawResponse.usage;
    const totalTokens = usage.promptTokens + usage.completionTokens;
    const limits = this.getModelLimits(promptObject.modelId);
    const cost = this.calculateCost(
      usage.promptTokens,
      usage.completionTokens,
      promptObject.modelId
    );

    return {
      text: rawResponse.text,
      rawResponse: rawResponse.rawResponse,
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      totalTokens: totalTokens,
      estimatedCostUSD: cost,
      costBreakdown: {
        // Derived from getModelLimits() so per-model pricing lives in one place
        inputCost: (usage.promptTokens / 1000000) * limits.costPerMillion.input,
        outputCost: (usage.completionTokens / 1000000) * limits.costPerMillion.output,
      },
      provider: 'gemini',
      modelUsed: this.mapModelId(promptObject.modelId),
      success: true,
      requestId: promptObject.requestId,
      timestamp: new Date(),
    };
  }

  async countTokens(text) {
    const model = this.client.getGenerativeModel({ model: 'gemini-2.0-pro' });
    const response = await model.countTokens(text);
    return response.totalTokens;
  }

  getModelLimits(modelId) {
    const limits = {
      'gemini-2.0-pro': {
        contextWindow: 1000000,
        maxOutputTokens: 16384,
        costPerMillion: { input: 0.075, output: 0.3 }, // USD per 1M tokens
      },
      'gemini-2.0-flash': {
        contextWindow: 1000000,
        maxOutputTokens: 16384,
        costPerMillion: { input: 0.0075, output: 0.03 },
      },
    };
    return limits[this.mapModelId(modelId)];
  }

  calculateCost(inputTokens, outputTokens, modelId) {
    const mapped = this.mapModelId(modelId);
    const limits = this.getModelLimits(mapped);
    const inputCost = (inputTokens / 1000000) * limits.costPerMillion.input;
    const outputCost = (outputTokens / 1000000) * limits.costPerMillion.output;
    return inputCost + outputCost;
  }

  mapModelId(logicalId) {
    return this.modelMap[logicalId] || logicalId;
  }
}

export default GeminiAdapter;

6. Anthropic Adapter (New Provider Example)

File: backend/src/services/llm/adapters/AnthropicAdapter.js

import BaseAdapter from './BaseAdapter.js';
import Anthropic from '@anthropic-ai/sdk';

class AnthropicAdapter extends BaseAdapter {
  constructor() {
    super();
    this.client = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY,
    });
  }

  async call(promptObject) {
    const message = await this.client.messages.create({
      model: promptObject.modelId,
      max_tokens: promptObject.maxTokens || 1024,
      messages: [
        {
          role: 'user',
          content: promptObject.prompt,
        },
      ],
    });

    return {
      text: message.content[0].text,
      usage: {
        promptTokens: message.usage.input_tokens,
        completionTokens: message.usage.output_tokens,
      },
      rawResponse: message,
    };
  }

  standardizeResponse(rawResponse, promptObject) {
    const usage = rawResponse.usage;
    const totalTokens = usage.promptTokens + usage.completionTokens;
    const limits = this.getModelLimits(promptObject.modelId);
    const cost = this.calculateCost(
      usage.promptTokens,
      usage.completionTokens,
      promptObject.modelId
    );

    return {
      text: rawResponse.text,
      rawResponse: rawResponse.rawResponse,
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      totalTokens: totalTokens,
      estimatedCostUSD: cost,
      costBreakdown: {
        // Derived from getModelLimits() so per-model pricing lives in one place
        inputCost: (usage.promptTokens / 1000000) * limits.costPerMillion.input,
        outputCost: (usage.completionTokens / 1000000) * limits.costPerMillion.output,
      },
      provider: 'anthropic',
      modelUsed: promptObject.modelId,
      success: true,
      requestId: promptObject.requestId,
      timestamp: new Date(),
    };
  }

  async countTokens(text) {
    const response = await this.client.messages.countTokens({
      model: 'claude-3-5-sonnet-20241022',
      messages: [
        {
          role: 'user',
          content: text,
        },
      ],
    });
    return response.input_tokens;
  }

  getModelLimits(modelId) {
    const limits = {
      'claude-3-5-sonnet-20241022': {
        contextWindow: 200000,
        maxOutputTokens: 4096,
        costPerMillion: { input: 3, output: 15 },
      },
      'claude-3-opus-20240229': {
        contextWindow: 200000,
        maxOutputTokens: 4096,
        costPerMillion: { input: 15, output: 75 },
      },
    };
    return limits[modelId];
  }

  calculateCost(inputTokens, outputTokens, modelId) {
    const limits = this.getModelLimits(modelId);
    const inputCost = (inputTokens / 1000000) * limits.costPerMillion.input;
    const outputCost = (outputTokens / 1000000) * limits.costPerMillion.output;
    return inputCost + outputCost;
  }
}

export default AnthropicAdapter;

7. Tokenization Service (LLM-Agnostic)

File: backend/src/services/llm/TokenizationService.js

import LLMProviderGateway from './LLMProviderGateway.js';

class TokenizationService {
  /**
   * Count tokens for ANY provider
   * @param {string} text
   * @param {string} provider
   * @param {string} modelId
   * @returns {Promise<number>}
   */
  async countTokens(text, provider, modelId) {
    const adapter = this.getAdapter(provider);
    return adapter.countTokens(text);
  }

  /**
   * Pre-flight check before sending to the LLM
   * @param {PromptObject} promptObject
   * @returns {Promise<{valid: boolean, reason?: string}>}
   */
  async validatePromptSize(promptObject) {
    const adapter = this.getAdapter(promptObject.provider);
    const limits = adapter.getModelLimits(promptObject.modelId);

    if (promptObject.estimatedTotalTokens > limits.contextWindow) {
      return {
        valid: false,
        reason: `Prompt (${promptObject.estimatedTotalTokens} tokens) exceeds ${promptObject.modelId} context window (${limits.contextWindow} tokens)`,
      };
    }

    return { valid: true };
  }

  getAdapter(provider) {
    // Delegates to LLMProviderGateway
    return LLMProviderGateway.getAdapter(provider);
  }
}

export default new TokenizationService();

8. AIGateway Integration (Business Logic - NO Provider Knowledge)

File: backend/src/services/AIGateway.js (Updated)

import LLMProviderGateway from './llm/LLMProviderGateway.js';
import TokenizationService from './llm/TokenizationService.js';
import { v4 as uuidv4 } from 'uuid';

class AIGateway {
  /**
   * Execute with cost and token checks - PROVIDER AGNOSTIC
   * Note: this.costThreshold and this.auditLog are assumed to be configured on this service (not shown here).
   */
  async executeWithSanitization(userPrompt, context, options = {}) {
    const {
      modelId = process.env.DEFAULT_LLM_MODEL,
      provider = this.detectProvider(modelId),
      userId,
      tenantId,
    } = options;

    // Step 1: Tokenize
    const promptTokens = await TokenizationService.countTokens(
      userPrompt,
      provider,
      modelId
    );
    const contextTokens = await TokenizationService.countTokens(
      context,
      provider,
      modelId
    );
    const totalTokens = promptTokens + contextTokens;

    // Step 2: Validate
    const validation = await TokenizationService.validatePromptSize({
      provider,
      modelId,
      estimatedTotalTokens: totalTokens,
    });

    if (!validation.valid) {
      throw new Error(validation.reason);
    }

    // Step 3: Check cost
    const adapter = LLMProviderGateway.getAdapter(provider);
    const estimatedCost = adapter.calculateCost(totalTokens, 1000, modelId); // Rough estimate

    if (estimatedCost > this.costThreshold) {
      throw new Error(
        `Estimated cost ($${estimatedCost}) exceeds threshold ($${this.costThreshold})`
      );
    }

    // Step 4: Build prompt object (no provider-specific logic)
    const promptObject = {
      prompt: userPrompt,
      context,
      modelId,
      provider,
      promptTokens,
      estimatedTotalTokens: totalTokens,
      userId,
      tenantId,
      requestId: uuidv4(),
      timestamp: new Date(),
    };

    // Step 5: Execute (delegates to gateway)
    const response = await LLMProviderGateway.executePrompt(promptObject);

    // Step 6: Log audit trail
    await this.auditLog.record({
      eventType: 'LLM_CALL',
      userId,
      tenantId,
      provider: response.provider,
      modelUsed: response.modelUsed,
      promptTokens: response.promptTokens,
      completionTokens: response.completionTokens,
      estimatedCost: response.estimatedCostUSD,
      success: response.success,
      latencyMs: response.latencyMs,
    });

    return response;
  }

  detectProvider(modelId) {
    if (modelId.startsWith('gpt-')) return 'openai';
    if (modelId.startsWith('claude-')) return 'anthropic';
    if (modelId.includes('llama')) return 'llama';
    if (modelId.includes('groq')) return 'groq';
    return process.env.DEFAULT_LLM_PROVIDER || 'gemini';
  }
}

export default new AIGateway();

Handling Change & Evolution

Scenario 1: Google Releases Gemini 3.0

Before (Tightly Coupled): Modify AIGateway.js and update all hardcoded model names.

After (Gateway Pattern):

// In GeminiAdapter.js - only ONE place to change
this.modelMap = {
  'gemini-default': 'gemini-3.0-pro', // ← Changed
  'gemini-fast': 'gemini-3.0-flash', // ← Changed
};

// Update pricing in getModelLimits()
'gemini-3.0-pro': {
  contextWindow: 2000000,
  costPerMillion: { input: 0.05, output: 0.2 }, // ← Changed (USD per 1M tokens)
},

// ✅ ZERO changes needed elsewhere!

Scenario 2: Switch to Claude for Finance Tasks

Before (Tightly Coupled): Rewrite all LLM calling logic in AIGateway.js.

After (Gateway Pattern):

// Configuration change only
const response = await AIGateway.executeWithSanitization(
  prompt,
  context,
  {
    modelId: 'claude-3-opus-20240229', // ← Just change this
    provider: 'anthropic', // ← Just change this
  }
);

// ✅ Business logic remains unchanged!

Scenario 3: Add LLaMA for On-Premise Deployment

Before (Tightly Coupled): Rewrite AIGateway.js to support the new provider.

After (Gateway Pattern):

// Create LlamaAdapter.js extending BaseAdapter
// Register in LLMProviderGateway
this.adapters.llama = new LlamaAdapter();

// ✅ Immediately available to entire platform!
// ✅ AIGateway knows nothing about it!
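
A skeleton of such an adapter might look like the following. This is a sketch only: the on-premise inference endpoint, its request/response field names, and the LLAMA_API_URL variable are illustrative assumptions, not part of this specification.

import BaseAdapter from './BaseAdapter.js';

class LlamaAdapter extends BaseAdapter {
  async call(promptObject) {
    // Hypothetical on-premise inference endpoint; adjust to the actual server's API
    const res = await fetch(`${process.env.LLAMA_API_URL}/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: promptObject.modelId,
        prompt: promptObject.prompt,
        max_tokens: promptObject.maxTokens || 1024,
        temperature: promptObject.temperature || 0.7,
      }),
    });
    return res.json();
  }

  standardizeResponse(rawResponse, promptObject) {
    // Map the server's native fields (assumed names) to the LLMResponse shape
    return {
      text: rawResponse.output,
      rawResponse,
      promptTokens: rawResponse.prompt_tokens,
      completionTokens: rawResponse.completion_tokens,
      totalTokens: rawResponse.prompt_tokens + rawResponse.completion_tokens,
      estimatedCostUSD: 0, // on-premise: no per-token provider charge
      costBreakdown: { inputCost: 0, outputCost: 0 },
      provider: 'llama',
      modelUsed: promptObject.modelId,
      success: true,
      requestId: promptObject.requestId,
      timestamp: new Date(),
    };
  }

  // countTokens(), getModelLimits(), and calculateCost() follow the same BaseAdapter contract
}

export default LlamaAdapter;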

Python Redaction Engine (Mirrored Pattern)

File: python-services/redaction-engine-service/llm_gateway.py

from abc import ABC, abstractmethod
from enum import Enum

class LLMProvider(str, Enum):
    GEMINI = "gemini"
    ANTHROPIC = "anthropic"
    LLAMA = "llama"

class BaseLLMAdapter(ABC):
    @abstractmethod
    async def call(self, prompt_object: dict) -> dict:
        pass

    @abstractmethod
    def standardize_response(self, raw_response: dict, prompt_object: dict) -> dict:
        pass

class GeminiRedactionAdapter(BaseLLMAdapter):
    async def call(self, prompt_object: dict) -> dict:
        # Gemini-specific implementation
        pass

    def standardize_response(self, raw_response: dict, prompt_object: dict) -> dict:
        # Map Gemini's native response to the standardized shape
        pass

class AnthropicRedactionAdapter(BaseLLMAdapter):
    async def call(self, prompt_object: dict) -> dict:
        # Anthropic-specific implementation
        pass

    def standardize_response(self, raw_response: dict, prompt_object: dict) -> dict:
        # Map Anthropic's native response to the standardized shape
        pass

class LLMGateway:
    def __init__(self):
        self.adapters = {
            LLMProvider.GEMINI: GeminiRedactionAdapter(),
            LLMProvider.ANTHROPIC: AnthropicRedactionAdapter(),
        }

    async def execute_redaction(self, prompt_object: dict) -> dict:
        # Convert the plain string from the prompt object into the enum key
        adapter = self.adapters[LLMProvider(prompt_object['provider'])]
        raw_response = await adapter.call(prompt_object)
        return adapter.standardize_response(raw_response, prompt_object)

Compliance & Audit Trail

Audit Log Schema

CREATE TABLE llm_audit_logs (
    id UUID PRIMARY KEY,
    event_type VARCHAR(50),
    user_id UUID NOT NULL REFERENCES users(user_id),
    tenant_id UUID NOT NULL REFERENCES tenants(tenant_id),

    -- LLM Details
    provider VARCHAR(50),    -- gemini, anthropic, llama, etc.
    model_used VARCHAR(100), -- gpt-4, claude-3-opus, etc.

    -- Token Accounting
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    total_tokens INTEGER,

    -- Cost Tracking
    estimated_cost_usd DECIMAL(10, 6),
    cost_breakdown JSONB, -- { input_cost, output_cost }

    -- Performance
    latency_ms INTEGER,

    -- Status
    success BOOLEAN,
    error_message TEXT,
    retry_count INTEGER DEFAULT 0,

    -- Timestamps
    created_at TIMESTAMP DEFAULT NOW(),

    -- Redaction Details (if applicable)
    redacted_field_count INTEGER,
    redaction_confidence DECIMAL(3, 2)
);
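
To make the mapping concrete, here is a minimal sketch of how an audit logger might persist a standardized LLMResponse into this table. It assumes node-postgres (pg); the pool wiring and function name are illustrative, not part of this specification.

import pg from 'pg';
import { randomUUID } from 'crypto';

const pool = new pg.Pool(); // connection settings come from the environment

// Persist one standardized LLMResponse as an audit row
export async function recordLLMCall({ userId, tenantId }, response) {
  await pool.query(
    `INSERT INTO llm_audit_logs
       (id, event_type, user_id, tenant_id, provider, model_used,
        prompt_tokens, completion_tokens, total_tokens,
        estimated_cost_usd, cost_breakdown, latency_ms, success, error_message, retry_count)
     VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)`,
    [
      randomUUID(),
      'LLM_CALL',
      userId,
      tenantId,
      response.provider,
      response.modelUsed,
      response.promptTokens,
      response.completionTokens,
      response.totalTokens,
      response.estimatedCostUSD,
      JSON.stringify(response.costBreakdown),
      response.latencyMs,
      response.success,
      response.error || null,
      response.retryCount || 0,
    ]
  );
}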

Testing Strategy

Unit Tests

// tests/unit/GeminiAdapter.test.js
describe('GeminiAdapter', () => {
  it('standardizes Gemini response to LLMResponse format', () => {
    // Mock Gemini response
    // Call standardizeResponse()
    // Assert LLMResponse schema
  });

  it('calculates cost correctly for Gemini models', () => {
    const cost = adapter.calculateCost(1000, 500, 'gemini-2.0-pro');
    expect(cost).toBeCloseTo(0.000225, 6); // (1000 * 0.075 + 500 * 0.3) / 1,000,000
  });
});

// tests/unit/AnthropicAdapter.test.js
describe('AnthropicAdapter', () => {
  it('standardizes Anthropic response to LLMResponse format', () => {
    // Same test structure - BOTH adapters tested identically
  });
});

// tests/unit/LLMProviderGateway.test.js
describe('LLMProviderGateway', () => {
  it('routes to correct adapter based on provider', async () => {
    const response = await gateway.executePrompt({
      provider: 'gemini',
      modelId: 'gemini-2.0-pro',
      prompt: 'test',
    });
    expect(response.provider).toBe('gemini');
  });

  it('produces standardized response for ANY provider', async () => {
    const geminiResponse = await gateway.executePrompt({ provider: 'gemini' });
    const anthropicResponse = await gateway.executePrompt({ provider: 'anthropic' });

    // Both should have the same schema
    expect(geminiResponse).toHaveProperty('totalTokens');
    expect(anthropicResponse).toHaveProperty('totalTokens');
  });
});

Integration Tests

describe('AIGateway (Provider Agnostic)', () => {
  it('works with ANY provider transparently', async () => {
    // Test with Gemini
    const geminiResult = await gateway.executeWithSanitization(
      'Redact: SSN-123-45-6789',
      '',
      { provider: 'gemini', modelId: 'gemini-2.0-pro' }
    );

    // Test with Anthropic - SAME LOGIC
    const anthropicResult = await gateway.executeWithSanitization(
      'Redact: SSN-123-45-6789',
      '',
      { provider: 'anthropic', modelId: 'claude-3-opus-20240229' }
    );

    // Both should work identically
    expect(geminiResult.success).toBe(true);
    expect(anthropicResult.success).toBe(true);

    // Both should have redacted output
    expect(geminiResult.text).not.toContain('123-45-6789');
    expect(anthropicResult.text).not.toContain('123-45-6789');
  });
});

Implementation Phases

Phase 1: Foundation (2-3 weeks)

  • Define TypeScript types (PromptObject, LLMResponse)
  • Implement BaseAdapter interface
  • Implement LLMProviderGateway router
  • Migrate existing Gemini logic to GeminiAdapter

Phase 2: Provider Expansion (2-3 weeks)

  • Implement AnthropicAdapter
  • Implement LlamaAdapter
  • Implement GroqAdapter
  • Add provider detection logic

Phase 3: Tokenization & Cost (1-2 weeks)

  • Implement TokenizationService
  • Integrate with each adapter's tokenizer
  • Add cost calculation to adapters
  • Update audit logging

Phase 4: Testing & Hardening (2 weeks)

  • Comprehensive unit tests for each adapter
  • Integration tests (provider-agnostic)
  • Performance benchmarks
  • Load testing

Success Metrics

  • ✅ Zero changes to AIGateway.js when adding a new provider
  • ✅ Model upgrades handled in adapter only
  • ✅ Cost calculations accurate within 2% for all providers
  • ✅ Token counts validated against provider APIs
  • ✅ Audit logs capture 100% of LLM interactions
  • ✅ Response time <200ms for token counting
  • ✅ Support for ≥4 major LLM providers

Risks & Mitigation

  • Risk: Provider API changes break an adapter. Mitigation: Version adapters and maintain a historical compatibility layer.
  • Risk: Token count discrepancies. Mitigation: Validate against provider APIs quarterly and add reconciliation logs.
  • Risk: Cost calculation errors. Mitigation: Audit against actual billing monthly; alert on variance >5%.
  • Risk: Adapter complexity grows. Mitigation: Strict interface contracts and code reviews for new adapters.
  • Risk: Performance degradation. Mitigation: Benchmark each adapter and cache tokenization results.

Dependencies

  • @google/generative-ai (Gemini)
  • @anthropic-ai/sdk (Anthropic)
  • transformers (for LLaMA tokenization)
  • js-tiktoken (OpenAI tokenization fallback)
  • uuid (request IDs)
  • PostgreSQL (audit logs)

Future Enhancements

  1. Provider Load Balancing - Route to cheapest provider for each task type
  2. Fallback Chain - If Gemini fails, auto-retry with Anthropic (see the sketch below)
  3. Model Selection AI - ML model to choose best provider per request
  4. Cost Optimization Engine - Suggest provider swaps based on usage patterns
  5. Real-time Pricing Updates - Auto-sync pricing from provider APIs
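
As an illustration of the Fallback Chain idea (item 2), a sketch of how it could sit on top of the existing gateway; the provider order and retry policy here are illustrative assumptions, not a committed design.

import LLMProviderGateway from './llm/LLMProviderGateway.js';

// Try providers in order until one succeeds (sketch only)
// A real implementation would also map modelId to each provider's equivalent model
async function executeWithFallback(promptObject, providers = ['gemini', 'anthropic']) {
  let lastError;
  for (const provider of providers) {
    const response = await LLMProviderGateway.executePrompt({ ...promptObject, provider });
    if (response.success) {
      return response;
    }
    lastError = response.error;
  }
  throw new Error(`All providers failed: ${lastError}`);
}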

Conclusion

The LLM-Agnostic Sanitization Layer with Provider Gateway transforms ChainAlign from a Gemini-dependent system into a flexible, modular platform that embraces the rapid evolution of LLM technology.

By centralizing provider-specific logic into adapters and maintaining strict interface contracts, ChainAlign ensures:

  • 🔄 Easy provider switching
  • 💰 Cost optimization
  • ⚡ Rapid model adoption
  • 🛡️ Consistent compliance

Status: Ready for post-demo implementation
Estimated Timeline: 6-8 weeks (Phases 1-4)
Team: 2-3 backend engineers


Document Version: 1.0
Last Updated: October 30, 2025
Owner: Architecture Team
Status: 📋 FSD - Ready for Implementation Planning