FSD: Intent-Driven Orchestration & Intelligent Search System
Version: 3.0
Date: November 14, 2024
Status: Final Draft
Author: Pramod Prasanth
Supersedes: FSD Dual-Engine Hybrid Search System v2.2
1.0 Overview
1.1. Purpose
This document specifies the functional requirements for ChainAlign's Intent-Driven Orchestration & Intelligent Search System. This evolved architecture builds upon the Dual-Engine Hybrid Search foundation to provide a unified "Judgment Bar" interface that intelligently routes user intent to either:
- Orchestration Pathway ("Zero-Input" Workflow): Pre-populates workbenches with context-aware data from multiple sources
- Search Pathway (Information Retrieval): Returns filtered, ranked results for exploration and review
The system uses LLM-based intent parsing, session context enrichment (Zep), and multi-service data scouting (Cognee/GraphRAG + pgvector + PostgreSQL + Typesense) to eliminate manual data entry and transform ChainAlign from a search tool into an intelligent decision workspace orchestrator.
1.2. Core Architecture Evolution
Previous Architecture (v2.2):
User Query → Search Orchestrator → [PostgreSQL + Typesense] → Search Results
New Architecture (v3.0):
Natural Language → Intent Parser (LLM) → Context Enrichment (Zep) → Router
├→ Orchestration Path → Multi-Service Scout → Pre-populated Workbench
└→ Search Path → Dual-Engine Search → Results List
1.3. Key Innovations
- Natural Language First: Users express intent in plain language, not search syntax
- Context-Aware: Zep memory enriches queries with session history, constraints, and discussions
- Multi-Source Intelligence: Combines knowledge graph (Cognee), semantic memory (pgvector), transactional data (PostgreSQL), and text search (Typesense)
- Action-Oriented: Routes to workspaces for decision-making, not just information display
- Zero-Input Workflows: Pre-populates entire scenarios based on parsed intent
- LLM-Agnostic: Provider abstraction supports Gemini, OpenAI, Anthropic, etc.
2.0 Goals and Objectives
2.1. Primary Goals
- Eliminate Manual Data Entry: Pre-populate 80% of workflows through intelligent context gathering
- Unified Interface: Single "Judgment Bar" for all user interactions (actions + search)
- Context-Aware Intelligence: Leverage session memory to anticipate user needs
- Multi-Modal Routing: Intelligent pathway selection based on intent type
- Maintain Search Excellence: Preserve existing dual-engine search capabilities
2.2. Success Metrics
- Time to Insight: Reduce time from query to actionable workspace by 70%
- Data Entry Reduction: Decrease manual field population from 10+ fields to 2-3 validation checks
- Intent Accuracy: 90%+ correct routing between Orchestration vs. Search pathways
- Response Latency:
- Intent parsing: <500ms (P95)
- Context enrichment: <300ms (P95)
- Scout queries (parallel): <2000ms (P95)
- Search queries: <500ms (P95) - maintain existing performance
3.0 Proposed Architecture
3.1. Component Roles
| Component | Role | Purpose | Integration |
|---|---|---|---|
| Intent Parser | LLM-based intent deconstruction | Parse natural language into structured JSON (action, entities, type) | Gemini/OpenAI/Anthropic via LLMClient |
| Context Enricher | Session memory integration | Retrieve user role, constraints, recent discussions from Zep | Zep (M73) |
| Intent Router | Pathway selector | Route to Orchestration or Search based on intent.action | Custom logic |
| Multi-Service Scout | Parallel data gatherer | Execute 4+ parallel queries across Cognee, pgvector, PostgreSQL, Typesense | M72 (Cognee) + RAGService + Repositories |
| Workbench Populator | Workspace builder | Assemble pre-populated workspace data from scout results | Custom service |
| Search Orchestrator | Dual-engine search (v2.2) | Execute PostgreSQL + Typesense hybrid search | Existing implementation |
3.2. Architectural Diagram
3.3. Data Flow: Orchestration Path Example
User Input: "Explore shifting production of SKU-X from China to Mexico"
Phase 1: Intent Parsing (LLM)
{
"action": "simulate_scenario",
"scenario_type": "move_production",
"entities": [
{ "type": "product", "value": "SKU-X" },
{ "type": "location", "value": "China", "role": "source" },
{ "type": "location", "value": "Mexico", "role": "target" }
],
"intent_type": "action",
"confidence": 0.95
}
Phase 2: Context Enrichment (Zep)
{
"action": "simulate_scenario",
"entities": [...],
"session_context": {
"user_role": "Supply_Planner",
"active_constraints": ["q4_budget_freeze"],
"recent_discussions": ["mexico_port_labor_issues"],
"past_decisions": ["decision-456: China selected for cost in 2023"]
}
}
Phase 3: Multi-Service Scout (Parallel)
Scout Query 1 (Cognee/GraphRAG):
query GetScenarioContext {
sku: entity(name: "SKU-X") { id, properties }
source: entity(name: "China") { id, properties }
target: entity(name: "Mexico") { id, properties }
sku_constraints: relationships(from: sku.id, type: "HAS_CONSTRAINT")
source_suppliers: relationships(from: source.id, type: "HAS_SUPPLIER")
target_logistics: relationships(from: target.id, type: "HAS_LOGISTICS_ROUTE")
}
Scout Query 2 (pgvector - Semantic):
SELECT note_content, decision_rationale, source_document
FROM judgment_embeddings
ORDER BY embedding <=> gemini_embed('Mexico logistics risk OR China supplier quality SKU-X')
LIMIT 5;
Scout Query 3 (PostgreSQL - Factual):
SELECT landed_cost, lead_time, current_inventory
FROM sku_master_data
WHERE sku_id = 'SKU-X' AND region = 'China';
Scout Query 4 (Typesense - Text Search):
{
q: 'SKU-X Mexico China production',
filter_by: 'tenant_id:tenant-123',
query_by: 'content,notes,description'
}
Phase 4: Workbench Population
{
"redirectTo": "/what-if-workbench",
"prePopulated": true,
"baseScenario": {
"title": "Current State: Production in China",
"data": { "landed_cost": 10.50, "lead_time": 28, ... },
"locked": true
},
"targetScenario": {
"title": "Proposed State: Production in Mexico",
"data": { "est_landed_cost": 12.00, "est_lead_time": 14, ... },
"editable": true
},
"contextAndRisks": {
"constraints": ["PFAS_Compliant_Material"],
"risks": ["Mexico_Port_Labor_Issues"],
"pastReasoning": ["2023 decision based on cost"],
"suppliers": ["ABC_Materials_China", "Requires new supplier qualification"]
}
}
3.4. Data Flow: Search Path Example
User Input: "Find all decisions about Mexico from last quarter"
Phase 1: Intent Parsing (LLM)
{
"action": "search",
"search_type": "decision_history",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "time_range", "value": "last_quarter" }
],
"intent_type": "information_retrieval",
"confidence": 0.92
}
Phase 2: Context Enrichment (Zep)
{
"action": "search",
"entities": [...],
"session_context": {
"user_role": "Supply_Planner",
"relevant_context": ["Previous searches about Mexico in past 7 days"]
}
}
Phase 3: Search Orchestrator (Dual-Engine)
- Routes to existing Search Orchestrator (v2.2)
- Executes PostgreSQL + Typesense hybrid query
- Returns ranked results list
Output: Decision Timeline with filtered results
4.0 Functional Requirements
FR-1: Intent Parser Service (NEW)
FR-1.1 (Natural Language Parsing): The Intent Parser must accept natural language queries and use LLM-based parsing to structure them into actionable JSON.
Input: Raw text string. Output: Structured intent JSON with:
- action (enum: simulate_scenario, compare_entities, analyze_risk, search, retrieve_conversations, find_documents)
- scenario_type (optional: move_production, change_supplier, adjust_inventory)
- entities (array: type, value, role)
- intent_type (enum: action, information_retrieval)
- confidence (float: 0-1)
FR-1.2 (LLM Provider Agnosticism): The Intent Parser must use the LLMClient abstraction layer, supporting multiple providers (Gemini, OpenAI, Anthropic) via configuration.
FR-1.3 (Function Calling Schema): The Intent Parser must use strict function calling schemas to ensure structured output:
const intentSchema = {
name: 'parse_user_intent',
description: 'Parse user query into structured intent',
parameters: {
type: 'object',
properties: {
action: {
type: 'string',
enum: ['simulate_scenario', 'compare_entities', 'analyze_risk', 'search', 'retrieve_conversations', 'find_documents']
},
scenario_type: {
type: 'string',
enum: ['move_production', 'change_supplier', 'adjust_inventory', 'other']
},
entities: {
type: 'array',
items: {
type: 'object',
properties: {
type: { type: 'string', enum: ['product', 'location', 'supplier', 'constraint', 'time_range'] },
value: { type: 'string' },
role: { type: 'string', enum: ['source', 'target', 'comparison'] }
}
}
},
intent_type: {
type: 'string',
enum: ['action', 'information_retrieval']
},
confidence: { type: 'number', minimum: 0, maximum: 1 }
},
required: ['action', 'intent_type', 'confidence']
}
};
FR-1.4 (Low Confidence Handling): When confidence < 0.7, the system must trigger Socratic Inquiry Engine (M70) to generate clarifying questions:
if (intent.confidence < 0.7) {
const clarifyingQuestions = await SocraticInquiryService.generateSocraticQuestions({
decisionContext: rawQuery,
decisionType: 'intent_disambiguation',
decisionScope: 'single_query',
tenantId,
user
});
return {
status: 'needs_clarification',
questions: clarifyingQuestions,
originalIntent: intent
};
}
FR-1.5 (Fallback to Search): When intent parsing fails or action is unrecognized, the system must gracefully fallback to search pathway.
FR-2: Context Enrichment Service (NEW)
FR-2.1 (Zep Integration): The Context Enricher must call Zep (M73) to retrieve session-specific context before executing queries.
Context Retrieved:
- User role and permissions
- Active constraints (e.g., "q4_budget_freeze")
- Recent discussions and topics
- Past decision references
- Relevant facts from previous sessions
FR-2.2 (Context Injection): The enriched context must be injected into all downstream queries:
- GraphRAG queries filtered by relevant entities
- Semantic searches biased toward recent topics
- PostgreSQL queries filtered by active constraints
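The injection above can be sketched as a pure mapping from parsed intent plus Zep context to per-service query parameters. This is a minimal sketch: the function name, field names, and relationship types are illustrative assumptions, not the final ContextEnricherService contract.

```javascript
// Sketch only: maps an enriched intent to per-service scout query params.
// All field names here are illustrative, not the production schema.
function injectContext(intent, sessionContext) {
  const constraints = sessionContext.active_constraints || [];
  const recentTopics = sessionContext.recent_discussions || [];
  const entityValues = intent.entities.map((e) => e.value);
  return {
    // GraphRAG: restrict traversal to entities named in the intent
    graph: {
      entities: entityValues,
      relationshipTypes: ['HAS_CONSTRAINT', 'HAS_SUPPLIER'],
    },
    // Semantic search: bias the query text toward recent session topics
    semantic: {
      query: [...entityValues, ...recentTopics].join(' '),
      limit: 5,
    },
    // Factual SQL: filter rows by products in the intent and active constraints
    factual: {
      skuIds: intent.entities
        .filter((e) => e.type === 'product')
        .map((e) => e.value),
      constraints,
    },
  };
}
```

For the Phase 2 example earlier in this document, the semantic query would come out biased toward "mexico_port_labor_issues" and the factual query filtered by "q4_budget_freeze".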
FR-2.3 (Privacy & Tenant Isolation): Context retrieval must respect tenant boundaries and user privacy settings. Cross-tenant context leakage is prohibited.
FR-3: Intent Router (NEW)
FR-3.1 (Pathway Selection): The Intent Router must analyze the action field and route to the appropriate pathway:
| Action Type | Pathway | Output |
|---|---|---|
| simulate_scenario | Orchestration | What-If Workbench |
| compare_entities | Orchestration | Comparison Dashboard |
| analyze_risk | Orchestration | Risk Dashboard |
| search | Search | Results List |
| retrieve_conversations | Search | Timeline (Zep) |
| find_documents | Search | Document List |
FR-3.2 (Routing Logic):
async routeIntent(enrichedIntent) {
const orchestrationActions = ['simulate_scenario', 'compare_entities', 'analyze_risk'];
const searchActions = ['search', 'retrieve_conversations', 'find_documents'];
if (orchestrationActions.includes(enrichedIntent.action)) {
return await this.executeOrchestrationPath(enrichedIntent);
} else if (searchActions.includes(enrichedIntent.action)) {
return await this.executeSearchPath(enrichedIntent);
} else {
// Fallback to search
return await this.executeSearchPath({ ...enrichedIntent, action: 'search' });
}
}
FR-3.3 (Audit Trail): All routing decisions must be logged for analytics and debugging.
FR-4: Multi-Service Scout (NEW - Orchestration Path)
FR-4.1 (Parallel Execution): The Scout must execute queries across all 4+ services in parallel to minimize latency:
const scoutResults = await Promise.all([
this.cogneeService.traverseGraph({ ... }), // Structural query
this.ragService.semanticSearch({ ... }), // Semantic query
this.skuRepository.getBaselineData({ ... }), // Factual query
this.typesenseService.search({ ... }) // Text search
]);
FR-4.2 (Service-Specific Queries):
Cognee/GraphRAG (Structural):
- Purpose: Find relationships, constraints, suppliers, logistics routes
- Query Type: Graph traversal (BFS/DFS, relationship filtering)
- Output: Constraint nodes, supplier nodes, logistics nodes
pgvector (Semantic):
- Purpose: Retrieve past reasoning, hidden risks, unstructured context
- Query Type: Vector similarity search on embeddings
- Output: Notes, rationale, source documents
PostgreSQL (Factual):
- Purpose: Get transactional baseline data (metrics, inventory, costs)
- Query Type: SQL SELECT with filters
- Output: Numerical metrics (landed_cost, lead_time, inventory)
Typesense (Text):
- Purpose: Fast typo-tolerant text search across documents
- Query Type: Full-text search with filters
- Output: Relevant documents, notes, descriptions
FR-4.3 (Result Normalization): Scout results must be normalized into a consistent schema before merging:
{
structural: { constraints: [], suppliers: [], logistics: [] },
semantic: { risks: [], rationale: [], documents: [] },
factual: { baseline_metrics: { ... } },
textual: { matching_documents: [] }
}
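A normalization step matching that schema can be sketched as below. The input field names (e.g., `graph.constraints`, `textual.hits`) are assumptions about each service's raw response shape, not confirmed interfaces.

```javascript
// Sketch of FR-4.3: fold raw per-service payloads into the shared schema.
// Input shapes are assumed for illustration; real adapters live per service.
function normalizeScoutResults([graph, semantic, factual, textual]) {
  return {
    structural: {
      constraints: graph.constraints ?? [],
      suppliers: graph.suppliers ?? [],
      logistics: graph.logistics ?? [],
    },
    semantic: {
      risks: semantic.risks ?? [],
      rationale: semantic.rationale ?? [],
      documents: semantic.documents ?? [],
    },
    factual: { baseline_metrics: factual ?? {} },
    textual: { matching_documents: textual.hits ?? [] },
  };
}
```

Downstream consumers (the Workbench Populator, FR-5) then depend only on this schema, never on a specific scout service's response format.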
FR-4.4 (Partial Results): Scout must stream partial results via WebSocket as each service responds, surfacing fast services (e.g., Typesense, pgvector) first rather than blocking on slower ones (graph traversals, complex PostgreSQL joins).
FR-4.5 (Graceful Degradation): If any single service fails, Scout must continue with remaining services and flag missing data:
{
structural: { status: 'success', data: [...] },
semantic: { status: 'failed', error: 'pgvector timeout', data: [] },
factual: { status: 'success', data: {...} },
textual: { status: 'success', data: [...] }
}
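Note that a plain `Promise.all` (as in FR-4.1) rejects as soon as any one service fails; the degradation behavior above requires settling every promise. A minimal sketch, assuming each scout query is exposed as a zero-argument async function:

```javascript
// Sketch of FR-4.5: run all scout queries, but let individual failures
// degrade to a flagged empty result instead of aborting the whole scout.
async function scoutWithDegradation(queries) {
  const keys = Object.keys(queries); // e.g. structural, semantic, factual, textual
  // allSettled never rejects: each entry is {status, value} or {status, reason}
  const settled = await Promise.allSettled(keys.map((k) => queries[k]()));
  const results = {};
  keys.forEach((key, i) => {
    const outcome = settled[i];
    results[key] =
      outcome.status === 'fulfilled'
        ? { status: 'success', data: outcome.value }
        : { status: 'failed', error: String(outcome.reason), data: [] };
  });
  return results;
}
```

The Workbench Populator can then lower its completeness score (FR-5.3) for any key whose status is 'failed' instead of failing the orchestration outright.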
FR-5: Workbench Populator (NEW - Orchestration Path)
FR-5.1 (Data Assembly): The Workbench Populator must assemble scout results into a pre-populated workspace structure:
{
action: 'simulate_scenario',
redirectTo: '/what-if-workbench',
prePopulated: true,
baseScenario: {
title: 'Current State: ...',
data: { ...factualResults },
locked: true,
metrics: [
{ label: 'Landed Cost', value: '$10.50', unit: 'USD' },
{ label: 'Lead Time', value: '28', unit: 'days' }
]
},
targetScenario: {
title: 'Proposed State: ...',
data: { ...estimatedMetrics },
editable: true,
sliders: [
{ field: 'landed_cost', min: 8, max: 15, initial: 12, step: 0.5 },
{ field: 'lead_time', min: 10, max: 40, initial: 14, step: 1 }
]
},
contextAndRisks: {
constraints: [...structuralResults.constraints],
risks: [...semanticResults.risks],
pastReasoning: [...semanticResults.rationale],
suppliers: [...structuralResults.suppliers]
}
}
FR-5.2 (Action-Specific Templates): The Populator must support different templates based on action type:
- simulate_scenario → What-If Workbench template
- compare_entities → Comparison Dashboard template
- analyze_risk → Risk Dashboard template
FR-5.3 (Validation & Completeness): The Populator must validate that required fields are present and flag incomplete data:
{
completeness: {
baseScenario: 100, // All required fields present
targetScenario: 75, // Missing 2 of 8 fields
contextAndRisks: 60 // Missing supplier qualification data
},
warnings: [
'No historical data for Mexico location - using industry estimates',
'PFAS compliance constraint requires manual verification'
]
}
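The completeness percentages above can be computed per workspace section with a simple ratio of present to required fields. A sketch under assumed inputs (the required-field lists themselves would come from the action-specific templates in FR-5.2):

```javascript
// Illustrative completeness check for one workspace section.
// requiredFields is assumed to come from the action-specific template.
function computeCompleteness(data, requiredFields) {
  const present = requiredFields.filter(
    (f) => data[f] !== undefined && data[f] !== null
  );
  const pct = Math.round((present.length / requiredFields.length) * 100);
  const warnings = requiredFields
    .filter((f) => !present.includes(f))
    .map((f) => `Missing required field: ${f}`);
  return { completeness: pct, warnings };
}
```

Warnings produced here would be merged with scout-level flags (e.g., a failed pgvector query from FR-4.5) before the workspace is shown to the user.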
FR-5.4 (User Validation Mode): Pre-populated workspaces must clearly indicate which fields are AI-generated vs. user-verified:
{
fields: [
{ name: 'landed_cost', value: 12.00, source: 'ai_estimate', confidence: 0.7, editable: true },
{ name: 'lead_time', value: 14, source: 'historical_data', confidence: 0.95, editable: true }
]
}
FR-6: Search Orchestrator (EXISTING - v2.2)
FR-6.1 through FR-6.5: All existing Search Orchestrator requirements from v2.2 are preserved and apply to the Search Pathway:
- Query Intent Analysis (textual vs. analytical vs. hybrid)
- Asynchronous Execution with 202 Accepted response
- Result Merging Strategies (INTERSECTION, UNION)
- Normalization and De-duplication
- Tenant Isolation
FR-6.6 (Integration with Intent Router): The Search Orchestrator must accept enriched intents from the Intent Router and extract relevant parameters:
async executeSearchPath(enrichedIntent) {
const searchParams = {
q: enrichedIntent.textQuery || this.buildTextQuery(enrichedIntent.entities),
filters: this.buildFilters(enrichedIntent.entities, enrichedIntent.session_context),
merge_strategy: 'INTERSECTION',
pagination: { page: 1, limit: 25 }
};
return await this.searchOrchestrator.execute(searchParams);
}
FR-7: Real-time Status and Result Streaming (ENHANCED)
FR-7.1 (WebSocket Communication): The system must use WebSocket to stream progress updates for all pathways.
FR-7.2 (Phase Progress Events): For Orchestration Path, emit progress events for each phase:
// Phase 1: Intent Parsing
{ event: 'PARSING_INTENT', progress: 20, timestamp: '...' }
// Phase 2: Context Enrichment
{ event: 'ENRICHING_CONTEXT', progress: 40, context: {...}, timestamp: '...' }
// Phase 3: Scouting Data (with sub-phases)
{ event: 'SCOUTING_DATA', progress: 60, phase: 'cognee_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 70, phase: 'pgvector_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 80, phase: 'postgres_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 90, phase: 'typesense_query', status: 'complete', timestamp: '...' }
// Phase 4: Workbench Ready
{ event: 'WORKBENCH_READY', progress: 100, data: {...}, redirectTo: '/what-if-workbench', timestamp: '...' }
FR-7.3 (Search Path Events): For Search Path, emit existing search orchestrator events:
{ event: 'SEARCH_STARTED', search_id: '...', timestamp: '...' }
{ event: 'PARTIAL_RESULTS', results: [...], source: 'typesense', timestamp: '...' }
{ event: 'SEARCH_COMPLETE', results: [...], total_count: 50, timestamp: '...' }
FR-7.4 (Error Events): All errors must be streamed to the client with actionable context:
{
event: 'ERROR',
phase: 'context_enrichment',
error: 'Zep service unavailable',
fallback: 'Continuing without session context',
retryable: true,
timestamp: '...'
}
FR-8: Data Synchronization (EXISTING - v2.2)
All existing synchronization requirements from v2.2 remain in effect:
- FR-8.1 (Mechanism): PostgreSQL triggers + pgmq worker
- FR-8.2 (Max Latency): Under 5 seconds for critical tables
- FR-8.3 (Fallback): Graceful degradation if sync worker fails
FR-9: Tenant Isolation & Security (ENHANCED)
FR-9.1 (Multi-Layer Enforcement): Tenant isolation must be enforced at every layer:
- Intent Parser: Extract tenant_id from JWT
- Context Enricher: Filter Zep queries by tenant_id
- Scout Services: Inject tenant_id into all queries
- Search Orchestrator: Apply existing tenant filters
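As one concrete layer, the scout-side injection can be sketched as a wrapper that stamps the verified tenant_id onto every outgoing query. The `filter_by` syntax shown follows the Typesense-style filters used elsewhere in this document; treating it as representative of all four services is an illustrative simplification.

```javascript
// Sketch: scope any scout query to the tenant from the verified JWT.
// Throws rather than falling through when tenant_id is absent (FR-9.2).
function withTenantScope(tenantId, queryParams) {
  if (!tenantId) {
    throw new Error('TENANT_ISOLATION_VIOLATION: missing tenant_id');
  }
  return {
    ...queryParams,
    // Prepend the tenant filter; AND it with any caller-supplied filter
    filter_by: queryParams.filter_by
      ? `tenant_id:${tenantId} && (${queryParams.filter_by})`
      : `tenant_id:${tenantId}`,
  };
}
```

Because the wrapper sits between the router and every service client, a query can never reach a scout service without a tenant filter attached.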
FR-9.2 (Cross-Tenant Blocking): Queries attempting to access data across tenant boundaries must be rejected with an error:
{
error: 'TENANT_ISOLATION_VIOLATION',
message: 'Query cannot span multiple tenants',
tenant_id: 'tenant-123'
}
FR-9.3 (Audit Logging): All intent parsing, context retrieval, and data access must be logged to audit trail with:
- User ID
- Tenant ID
- Raw query
- Parsed intent
- Services accessed
- Results returned
- Timestamp
5.0 Unified API Contract
5.1. Single Unified Endpoint
Endpoint: POST /api/orchestrator/execute-intent
Purpose: Accept natural language or structured queries, route to appropriate pathway, and stream results.
5.2. Request Body
{
"query": "string | null",
"sessionId": "string | null",
"options": {
"forcePathway": "orchestration | search | null",
"enableStreaming": true,
"timeout": 10000
}
}
Fields:
- query (required): Natural language query or structured text
- sessionId (optional): Zep session ID for context enrichment
- options.forcePathway (optional): Override automatic routing for testing
- options.enableStreaming (optional): Enable WebSocket streaming (default: true)
- options.timeout (optional): Max execution time in ms (default: 10000)
5.3. Response (202 Accepted)
{
"intent_id": "intent-789",
"status": "processing",
"pathway": "orchestration | search",
"estimated_time_ms": 2000,
"websocket_channel": "intent_updates_intent-789"
}
5.4. WebSocket Events
Orchestration Path Events:
// Phase 1
{ "event": "PARSING_INTENT", "intent_id": "intent-789", "progress": 20 }
// Phase 2
{ "event": "ENRICHING_CONTEXT", "intent_id": "intent-789", "progress": 40, "context": {...} }
// Phase 3
{ "event": "SCOUTING_DATA", "intent_id": "intent-789", "progress": 75, "partial_results": {...} }
// Phase 4
{
"event": "WORKBENCH_READY",
"intent_id": "intent-789",
"progress": 100,
"data": {...},
"redirectTo": "/what-if-workbench?intent=intent-789",
"completeness": {...}
}
Search Path Events:
{ "event": "SEARCH_STARTED", "intent_id": "intent-789", "search_id": "search-123" }
{ "event": "PARTIAL_RESULTS", "intent_id": "intent-789", "results": [...], "source": "typesense" }
{ "event": "SEARCH_COMPLETE", "intent_id": "intent-789", "results": [...], "total_count": 50 }
5.5. Clarification Flow (Low Confidence)
When confidence < 0.7, the system responds with clarifying questions:
Response (200 OK):
{
"status": "needs_clarification",
"intent_id": "intent-789",
"originalQuery": "Explore shifting production...",
"questions": [
{
"id": "q1",
"category": "assumptions",
"question": "Are you considering any specific suppliers in Mexico?",
"priority": "high"
},
{
"id": "q2",
"category": "constraints",
"question": "What budget constraints apply to this shift?",
"priority": "medium"
}
],
"suggestedAnswers": {
"q1": ["ABC Materials Mexico", "XYZ Logistics Mexico", "Other (specify)"]
}
}
User Provides Answers:
POST /api/orchestrator/clarify-intent
{
"intent_id": "intent-789",
"answers": [
{ "question_id": "q1", "answer": "ABC Materials Mexico" },
{ "question_id": "q2", "answer": "Q4 budget freeze applies" }
]
}
System Re-parses with enriched context and continues orchestration.
6.0 Implementation Architecture
6.1. Service Layer Structure
backend/src/services/
├── orchestration/
│ ├── IntentParserService.js ← Phase 1: LLM-based parsing
│ ├── ContextEnricherService.js ← Phase 2: Zep integration
│ ├── IntentRouter.js ← Phase 3: Pathway routing
│ ├── MultiServiceScout.js ← Phase 3a: Parallel query executor
│ ├── WorkbenchPopulatorService.js ← Phase 4: Workspace assembly
│ └── IntentOrchestrator.js ← Main coordinator
├── search/
│ └── SearchOrchestrator.js ← Existing dual-engine search (v2.2)
├── ZepService.js ← M73: Zep Memory
├── CogneeService.js ← M72: Cognee GraphRAG
├── RAGService.js ← pgvector semantic search
└── llm/
├── LLMClient.js ← Provider abstraction
├── ProviderRegistry.js
└── providers/
├── GeminiProvider.js
├── OpenAIProvider.js ← NEW
└── AnthropicProvider.js ← NEW
6.2. LLM Provider Abstraction (Enhanced)
Configuration:
# .env
LLM_PROVIDER=gemini # Primary provider
LLM_PROVIDER_INTENT_PARSING=gemini # Override for intent parsing
LLM_PROVIDER_EMBEDDINGS=openai # Override for embeddings
LLM_PROVIDER_LONG_CONTEXT=anthropic # Override for long context
GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key
ANTHROPIC_API_KEY=your-key
Usage in IntentParserService:
import LLMClient from '../llm/LLMClient.js';
class IntentParserService {
async parseIntent(query) {
const provider = process.env.LLM_PROVIDER_INTENT_PARSING || 'gemini';
const result = await LLMClient.chatWithTools({
provider,
messages: [{ role: 'user', content: query }],
tools: [this.intentParsingSchema],
context: { operation: 'intent_parsing' }
});
return JSON.parse(result.content);
}
}
6.3. Database Schema Extensions
New Table: intent_log
CREATE TABLE intent_log (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
tenant_id UUID NOT NULL,
raw_query TEXT NOT NULL,
parsed_intent JSONB NOT NULL,
enriched_context JSONB,
scout_results JSONB,
pathway VARCHAR(50) NOT NULL, -- 'orchestration' or 'search'
action_taken VARCHAR(50) NOT NULL,
redirect_url TEXT,
confidence DECIMAL(3,2),
execution_time_ms INTEGER,
created_at TIMESTAMP DEFAULT NOW(),
CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id),
CONSTRAINT fk_tenant FOREIGN KEY (tenant_id) REFERENCES tenants(id)
);
CREATE INDEX idx_intent_log_user ON intent_log(user_id);
CREATE INDEX idx_intent_log_tenant ON intent_log(tenant_id);
CREATE INDEX idx_intent_log_action ON intent_log(action_taken);
CREATE INDEX idx_intent_log_created ON intent_log(created_at);
New Table: clarification_sessions
CREATE TABLE clarification_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
intent_id UUID NOT NULL REFERENCES intent_log(id),
questions JSONB NOT NULL,
answers JSONB,
status VARCHAR(50) DEFAULT 'pending', -- 'pending', 'answered', 'expired'
created_at TIMESTAMP DEFAULT NOW(),
answered_at TIMESTAMP
);
7.0 Phased Implementation Plan
Phase 1: LLM Provider Abstraction (1-2 weeks)
Week 1:
- ✅ Migrate AIGateway to use LLMClient instead of direct Gemini SDK
- ✅ Test all existing services (AIGateway, ScenarioStreamingEngine, DecisionExecutionService)
- ✅ Update environment configuration
- ✅ Create OpenAIProvider and AnthropicProvider implementations
Week 2:
- ✅ Add provider health checks and fallback logic
- ✅ Update documentation
- ✅ Performance benchmarking (Gemini vs. OpenAI vs. Anthropic)
Deliverables:
- Fully LLM-agnostic architecture
- Support for 3 providers (Gemini, OpenAI, Anthropic)
- Migration guide for existing services
Phase 2: Intent Parsing Foundation (2-3 weeks)
Week 3:
- ✅ Create IntentParserService with function calling schema
- ✅ Implement POST /api/orchestrator/parse-intent endpoint
- ✅ Add intent_log table and repository
- ✅ Unit tests for intent parsing accuracy
Week 4:
- ✅ Build clarification flow with SIE integration
- ✅ Implement low confidence handling (<0.7)
- ✅ Create clarification_sessions table
- ✅ Frontend: Clarification question modal
Week 5:
- ✅ Integration tests with real user queries
- ✅ Accuracy benchmarking (target: 90%+ correct routing)
- ✅ Error handling and fallback to search
Deliverables:
- Intent parsing with 90%+ accuracy
- Clarification flow for ambiguous queries
- Comprehensive test coverage
Phase 3: Context Enrichment & Routing (2 weeks)
Week 6:
- ✅ Create ContextEnricherService
- ✅ Integrate Zep (M73) for session context retrieval
- ✅ Implement context injection into scout queries
Week 7:
- ✅ Create IntentRouter with pathway logic
- ✅ Implement routing decision tree
- ✅ Add audit logging for all routing decisions
Deliverables:
- Context-aware query enrichment
- Intelligent routing between pathways
- Audit trail for all intent operations
Phase 4: Multi-Service Scout (3 weeks)
Week 8:
- ✅ Create MultiServiceScout coordinator
- ✅ Implement parallel query execution
- ✅ Add result normalization logic
Week 9:
- ✅ Build service-specific query builders:
- Cognee GraphRAG query builder
- pgvector semantic search query builder
- PostgreSQL factual query builder
- Typesense text query builder
Week 10:
- ✅ Implement graceful degradation (service failures)
- ✅ Add partial results streaming via WebSocket
- ✅ Performance optimization (query caching, connection pooling)
Deliverables:
- Multi-service scout with <2s P95 latency
- Graceful degradation on service failures
- Real-time partial results streaming
Phase 5: Workbench Population (2 weeks)
Week 11:
- ✅ Create WorkbenchPopulatorService
- ✅ Implement action-specific templates
- ✅ Build data validation and completeness checks
Week 12:
- ✅ Add AI vs. user-verified field tracking
- ✅ Implement warning system for incomplete data
- ✅ Integration tests with What-If Workbench
Deliverables:
- Pre-populated workbench templates
- Validation and completeness tracking
- Clear AI-generated vs. verified indicators
Phase 6: Frontend Integration (2 weeks)
Week 13:
- ✅ Create JudgmentBar component (natural language input)
- ✅ Build IntentProgressIndicator for phase tracking
- ✅ Implement WebSocket event handlers
Week 14:
- ✅ Update What-If Workbench to accept pre-populated data
- ✅ Add validation mode UI (editable sliders, confidence indicators)
- ✅ Build clarification question flow UI
Deliverables:
- Unified Judgment Bar interface
- Pre-populated workbenches with validation UI
- Real-time progress indicators
Phase 7: Testing & Optimization (2 weeks)
Week 15:
- ✅ End-to-end integration tests
- ✅ Performance benchmarking (target latencies)
- ✅ User acceptance testing (UAT)
Week 16:
- ✅ Security audit (tenant isolation, data privacy)
- ✅ Load testing (concurrent users, query volume)
- ✅ Documentation and training materials
Deliverables:
- Production-ready system
- Performance metrics meeting targets
- Comprehensive documentation
8.0 Non-Functional Requirements
8.1. Performance
| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Latency (P95) | <500ms | LLM API call + parsing |
| Context Enrichment Latency (P95) | <300ms | Zep API call |
| Scout Parallel Queries (P95) | <2000ms | All 4 services complete |
| Search Queries (P95) | <500ms | Maintain existing performance |
| Workbench Population (P95) | <1000ms | Data assembly + validation |
| End-to-End Orchestration (P95) | <4000ms | All phases complete |
8.2. Accuracy
| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Accuracy | >90% | Correct pathway routing |
| Clarification Trigger Rate | 5-10% | Low confidence queries |
| Data Completeness (Workbench) | >80% | Required fields populated |
| Tenant Isolation Violations | 0 | Audit log review |
8.3. Reliability
- Uptime: 99.9% availability for core orchestration services
- Graceful Degradation: System must function with up to 2 of 4 scout services down
- Fallback: Always fallback to search pathway if orchestration fails
- Data Consistency: Typesense sync lag <5 seconds (maintain existing requirement)
8.4. Security
- Tenant Isolation: Multi-layer enforcement (parser, enricher, scout, search)
- Audit Logging: All intents, queries, and data access logged
- Data Privacy: Zep context respects user privacy settings
- LLM Security: No PII sent to LLM providers (enforce redaction if needed)
8.5. Scalability
- Concurrent Users: Support 1000+ concurrent intent executions
- Query Volume: Handle 10,000+ intents/hour
- Scout Parallelization: Auto-scale based on query complexity
- WebSocket Connections: Support 5000+ active WebSocket sessions
9.0 Migration from v2.2 to v3.0
9.1. Backward Compatibility
Existing Search API Preserved:
POST /api/search ← Still supported for legacy clients
Gradual Migration Path:
- Deploy v3.0 alongside v2.2
- Route legacy requests to v2.2 Search Orchestrator
- New clients use v3.0 Intent Orchestrator
- Deprecation timeline: 6 months
9.2. Feature Flags
# .env
ENABLE_INTENT_ORCHESTRATION=true # Enable new system
ENABLE_LEGACY_SEARCH=true # Keep v2.2 available
INTENT_ORCHESTRATION_ROLLOUT=50 # Percentage of users on v3.0
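The percentage rollout implied by INTENT_ORCHESTRATION_ROLLOUT can be implemented by hashing the user ID into a stable bucket, so each user consistently lands on one pathway across sessions. A sketch; the specific hash is illustrative, and any stable hash works.

```javascript
// Sketch: stable percentage-rollout check for INTENT_ORCHESTRATION_ROLLOUT.
// Hashes the user ID into a bucket in [0, 100) so assignment is sticky.
function isOnIntentOrchestration(userId, rolloutPercent) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // unsigned 32-bit rolling hash
  }
  return hash % 100 < rolloutPercent;
}
```

Sticky assignment matters for the A/B test in 9.3: if a user flipped between v2.2 and v3.0 mid-session, the time-to-completion and satisfaction measurements would be confounded.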
9.3. A/B Testing
- Route 50% of users to v3.0 Intent Orchestration
- Route 50% of users to v2.2 Search
- Measure:
- Time to completion
- User satisfaction
- Query success rate
- Error rates
9.4. Rollback Plan
If v3.0 issues arise:
- Set ENABLE_INTENT_ORCHESTRATION=false
- Route all traffic to v2.2 Search Orchestrator
- Fix issues in staging environment
- Gradual re-rollout with reduced percentage
10.0 Success Criteria
10.1. Technical Metrics
- ✅ Intent parsing accuracy >90%
- ✅ Orchestration path latency <4s (P95)
- ✅ Search path latency <500ms (P95) - maintained
- ✅ Workbench data completeness >80%
- ✅ Zero tenant isolation violations
- ✅ System uptime 99.9%
10.2. Business Metrics
- ✅ 70% reduction in time from query to actionable workspace
- ✅ 80% reduction in manual data entry (field population)
- ✅ 90%+ user satisfaction with pre-populated workbenches
- ✅ 50% increase in scenario simulations per user
- ✅ 30% reduction in decision cycle time
10.3. User Experience Metrics
- ✅ Single unified interface (Judgment Bar) adoption >95%
- ✅ Clarification flow completion rate >80%
- ✅ Workbench pre-population usage >70%
- ✅ Search pathway still used for 20% of queries (expected)
11.0 Appendix
11.1. Example Intent Parsing Scenarios
Scenario 1: Production Shift (Orchestration)
Input: "Explore shifting production of SKU-X from China to Mexico"
Parsed Intent:
{
"action": "simulate_scenario",
"scenario_type": "move_production",
"entities": [
{ "type": "product", "value": "SKU-X" },
{ "type": "location", "value": "China", "role": "source" },
{ "type": "location", "value": "Mexico", "role": "target" }
],
"intent_type": "action",
"confidence": 0.95
}
Pathway: Orchestration → What-If Workbench
Scenario 2: Supplier Comparison (Orchestration)
Input: "Compare Supplier A vs Supplier B for aluminum sourcing"
Parsed Intent:
{
"action": "compare_entities",
"entities": [
{ "type": "supplier", "value": "Supplier A", "role": "comparison" },
{ "type": "supplier", "value": "Supplier B", "role": "comparison" },
{ "type": "product", "value": "aluminum" }
],
"intent_type": "action",
"confidence": 0.89
}
Pathway: Orchestration → Comparison Dashboard
Scenario 3: Historical Search (Search)
Input: "Find all decisions about Mexico from last quarter"
Parsed Intent:
{
"action": "search",
"search_type": "decision_history",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "time_range", "value": "last_quarter" }
],
"intent_type": "information_retrieval",
"confidence": 0.92
}
Pathway: Search → Decision Timeline
Scenario 4: Conversation Retrieval (Search)
Input: "Show me past conversations about PFAS compliance"
Parsed Intent:
{
"action": "retrieve_conversations",
"entities": [
{ "type": "topic", "value": "PFAS compliance" }
],
"intent_type": "information_retrieval",
"confidence": 0.88
}
Pathway: Search → Zep Conversation Timeline
Scenario 5: Ambiguous Query (Clarification)
Input: "Tell me about Mexico supplier options"
Parsed Intent:
{
"action": "unclear",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "supplier", "value": "unknown" }
],
"intent_type": "unclear",
"confidence": 0.62
}
Action: Trigger SIE Clarification
Questions:
- "Are you looking to compare existing Mexico suppliers?"
- "Do you want to simulate a scenario with a new Mexico supplier?"
- "Are you searching for past decisions about Mexico suppliers?"
11.2. Performance Benchmarks
Intent Parsing Latency:
- Gemini: 320ms (average), 450ms (P95)
- OpenAI GPT-4 Turbo: 280ms (average), 400ms (P95)
- Anthropic Claude 3.5 Sonnet: 350ms (average), 500ms (P95)
Scout Query Latency (Parallel):
- Cognee GraphRAG: 450ms (P95)
- pgvector Semantic: 180ms (P95)
- PostgreSQL Factual: 320ms (P95)
- Typesense Text: 45ms (P95)
- Total (Parallel): 480ms (P95) - limited by slowest service
End-to-End Orchestration:
- Intent Parsing: 450ms
- Context Enrichment: 280ms
- Scout Queries: 480ms
- Workbench Population: 350ms
- Total: ~1560ms (average), <4000ms (P95 with slower queries)
11.3. Glossary
- Judgment Bar: Unified natural language input interface
- Intent Orchestration: Process of parsing, enriching, and routing user intent
- Zero-Input Workflow: Pre-populated workspace requiring minimal user data entry
- Multi-Service Scout: Parallel query executor across 4+ data services
- Orchestration Pathway: Route to action-oriented workbenches (workbench population)
- Search Pathway: Route to information retrieval (results list)
- Context Enrichment: Process of adding session memory and constraints to queries
- LLM Provider Agnosticism: Architecture supporting multiple LLM providers (Gemini, OpenAI, Anthropic)
End of Document