
FSD: Intent-Driven Orchestration & Intelligent Search System

Version: 3.0
Date: November 14, 2024
Status: Final Draft
Author: Pramod Prasanth
Supersedes: FSD Dual-Engine Hybrid Search System v2.2

1.0 Overview

1.1. Purpose

This document specifies the functional requirements for ChainAlign's Intent-Driven Orchestration & Intelligent Search System. This evolved architecture builds upon the Dual-Engine Hybrid Search foundation to provide a unified "Judgment Bar" interface that intelligently routes user intent to either:

  1. Orchestration Pathway ("Zero-Input" Workflow): Pre-populates workbenches with context-aware data from multiple sources
  2. Search Pathway (Information Retrieval): Returns filtered, ranked results for exploration and review

The system uses LLM-based intent parsing, session context enrichment (Zep), and multi-service data scouting (Cognee/GraphRAG + pgvector + PostgreSQL + Typesense) to eliminate manual data entry and transform ChainAlign from a search tool into an intelligent decision workspace orchestrator.

1.2. Core Architecture Evolution

Previous Architecture (v2.2):

User Query → Search Orchestrator → [PostgreSQL + Typesense] → Search Results

New Architecture (v3.0):

Natural Language → Intent Parser (LLM) → Context Enrichment (Zep) → Router
├→ Orchestration Path → Multi-Service Scout → Pre-populated Workbench
└→ Search Path → Dual-Engine Search → Results List

1.3. Key Innovations

  1. Natural Language First: Users express intent in plain language, not search syntax
  2. Context-Aware: Zep memory enriches queries with session history, constraints, and discussions
  3. Multi-Source Intelligence: Combines knowledge graph (Cognee), semantic memory (pgvector), transactional data (PostgreSQL), and text search (Typesense)
  4. Action-Oriented: Routes to workspaces for decision-making, not just information display
  5. Zero-Input Workflows: Pre-populates entire scenarios based on parsed intent
  6. LLM-Agnostic: Provider abstraction supports Gemini, OpenAI, Anthropic, etc.

2.0 Goals and Objectives

2.1. Primary Goals

  • Eliminate Manual Data Entry: Pre-populate 80% of workflows through intelligent context gathering
  • Unified Interface: Single "Judgment Bar" for all user interactions (actions + search)
  • Context-Aware Intelligence: Leverage session memory to anticipate user needs
  • Multi-Modal Routing: Intelligent pathway selection based on intent type
  • Maintain Search Excellence: Preserve existing dual-engine search capabilities

2.2. Success Metrics

  • Time to Insight: Reduce time from query to actionable workspace by 70%
  • Data Entry Reduction: Decrease manual field population from 10+ fields to 2-3 validation checks
  • Intent Accuracy: 90%+ correct routing between Orchestration vs. Search pathways
  • Response Latency:
    • Intent parsing: <500ms (P95)
    • Context enrichment: <300ms (P95)
    • Scout queries (parallel): <2000ms (P95)
    • Search queries: <500ms (P95) - maintain existing performance

3.0 Proposed Architecture

3.1. Component Roles

| Component | Role | Purpose | Integration |
|---|---|---|---|
| Intent Parser | LLM-based intent deconstruction | Parse natural language into structured JSON (action, entities, type) | Gemini/OpenAI/Anthropic via LLMClient |
| Context Enricher | Session memory integration | Retrieve user role, constraints, recent discussions from Zep | Zep (M73) |
| Intent Router | Pathway selector | Route to Orchestration or Search based on intent.action | Custom logic |
| Multi-Service Scout | Parallel data gatherer | Execute 4+ parallel queries across Cognee, pgvector, PostgreSQL, Typesense | M72 (Cognee) + RAGService + Repositories |
| Workbench Populator | Workspace builder | Assemble pre-populated workspace data from scout results | Custom service |
| Search Orchestrator | Dual-engine search (v2.2) | Execute PostgreSQL + Typesense hybrid search | Existing implementation |

3.2. Architectural Diagram

3.3. Data Flow: Orchestration Path Example

User Input: "Explore shifting production of SKU-X from China to Mexico"

Phase 1: Intent Parsing (LLM)

{
  "action": "simulate_scenario",
  "scenario_type": "move_production",
  "entities": [
    { "type": "product", "value": "SKU-X" },
    { "type": "location", "value": "China", "role": "source" },
    { "type": "location", "value": "Mexico", "role": "target" }
  ],
  "intent_type": "action",
  "confidence": 0.95
}

Phase 2: Context Enrichment (Zep)

{
  "action": "simulate_scenario",
  "entities": [...],
  "session_context": {
    "user_role": "Supply_Planner",
    "active_constraints": ["q4_budget_freeze"],
    "recent_discussions": ["mexico_port_labor_issues"],
    "past_decisions": ["decision-456: China selected for cost in 2023"]
  }
}

Phase 3: Multi-Service Scout (Parallel)

Scout Query 1 (Cognee/GraphRAG):

query GetScenarioContext {
  sku: entity(name: "SKU-X") { id, properties }
  source: entity(name: "China") { id, properties }
  target: entity(name: "Mexico") { id, properties }
  sku_constraints: relationships(from: sku.id, type: "HAS_CONSTRAINT")
  source_suppliers: relationships(from: source.id, type: "HAS_SUPPLIER")
  target_logistics: relationships(from: target.id, type: "HAS_LOGISTICS_ROUTE")
}

Scout Query 2 (pgvector - Semantic):

SELECT note_content, decision_rationale, source_document
FROM judgment_embeddings
ORDER BY embedding <=> gemini_embed('Mexico logistics risk OR China supplier quality SKU-X')
LIMIT 5;

Scout Query 3 (PostgreSQL - Factual):

SELECT landed_cost, lead_time, current_inventory
FROM sku_master_data
WHERE sku_id = 'SKU-X' AND region = 'China';

Scout Query 4 (Typesense - Text Search):

{
  q: 'SKU-X Mexico China production',
  filter_by: 'tenant_id:tenant-123',
  query_by: 'content,notes,description'
}

Phase 4: Workbench Population

{
  "redirectTo": "/what-if-workbench",
  "prePopulated": true,
  "baseScenario": {
    "title": "Current State: Production in China",
    "data": { "landed_cost": 10.50, "lead_time": 28, ... },
    "locked": true
  },
  "targetScenario": {
    "title": "Proposed State: Production in Mexico",
    "data": { "est_landed_cost": 12.00, "est_lead_time": 14, ... },
    "editable": true
  },
  "contextAndRisks": {
    "constraints": ["PFAS_Compliant_Material"],
    "risks": ["Mexico_Port_Labor_Issues"],
    "pastReasoning": ["2023 decision based on cost"],
    "suppliers": ["ABC_Materials_China", "Requires new supplier qualification"]
  }
}

3.4. Data Flow: Search Path Example

User Input: "Find all decisions about Mexico from last quarter"

Phase 1: Intent Parsing (LLM)

{
  "action": "search",
  "search_type": "decision_history",
  "entities": [
    { "type": "location", "value": "Mexico" },
    { "type": "time_range", "value": "last_quarter" }
  ],
  "intent_type": "information_retrieval",
  "confidence": 0.92
}

Phase 2: Context Enrichment (Zep)

{
  "action": "search",
  "entities": [...],
  "session_context": {
    "user_role": "Supply_Planner",
    "relevant_context": ["Previous searches about Mexico in past 7 days"]
  }
}

Phase 3: Search Orchestrator (Dual-Engine)

  • Routes to existing Search Orchestrator (v2.2)
  • Executes PostgreSQL + Typesense hybrid query
  • Returns ranked results list

Output: Decision Timeline with filtered results

4.0 Functional Requirements

FR-1: Intent Parser Service (NEW)

FR-1.1 (Natural Language Parsing): The Intent Parser must accept natural language queries and use LLM-based parsing to structure them into actionable JSON.

Input: Raw text string
Output: Structured intent JSON with:

  • action (enum: simulate_scenario, compare_entities, analyze_risk, search, retrieve_conversations, find_documents)
  • scenario_type (optional: move_production, change_supplier, adjust_inventory)
  • entities (array: type, value, role)
  • intent_type (enum: action, information_retrieval)
  • confidence (float: 0-1)

FR-1.2 (LLM Provider Agnosticism): The Intent Parser must use the LLMClient abstraction layer, supporting multiple providers (Gemini, OpenAI, Anthropic) via configuration.

FR-1.3 (Function Calling Schema): The Intent Parser must use strict function calling schemas to ensure structured output:

const intentSchema = {
  name: 'parse_user_intent',
  description: 'Parse user query into structured intent',
  parameters: {
    type: 'object',
    properties: {
      action: {
        type: 'string',
        enum: ['simulate_scenario', 'compare_entities', 'analyze_risk', 'search', 'retrieve_conversations', 'find_documents']
      },
      scenario_type: {
        type: 'string',
        enum: ['move_production', 'change_supplier', 'adjust_inventory', 'other']
      },
      entities: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            type: { type: 'string', enum: ['product', 'location', 'supplier', 'constraint', 'time_range'] },
            value: { type: 'string' },
            role: { type: 'string', enum: ['source', 'target', 'comparison'] }
          }
        }
      },
      intent_type: {
        type: 'string',
        enum: ['action', 'information_retrieval']
      },
      confidence: { type: 'number', minimum: 0, maximum: 1 }
    },
    required: ['action', 'intent_type', 'confidence']
  }
};

FR-1.4 (Low Confidence Handling): When confidence < 0.7, the system must trigger Socratic Inquiry Engine (M70) to generate clarifying questions:

if (intent.confidence < 0.7) {
  const clarifyingQuestions = await SocraticInquiryService.generateSocraticQuestions({
    decisionContext: rawQuery,
    decisionType: 'intent_disambiguation',
    decisionScope: 'single_query',
    tenantId,
    user
  });

  return {
    status: 'needs_clarification',
    questions: clarifyingQuestions,
    originalIntent: intent
  };
}

FR-1.5 (Fallback to Search): When intent parsing fails or action is unrecognized, the system must gracefully fallback to search pathway.


FR-2: Context Enrichment Service (NEW)

FR-2.1 (Zep Integration): The Context Enricher must call Zep (M73) to retrieve session-specific context before executing queries.

Context Retrieved:

  • User role and permissions
  • Active constraints (e.g., "q4_budget_freeze")
  • Recent discussions and topics
  • Past decision references
  • Relevant facts from previous sessions

FR-2.2 (Context Injection): The enriched context must be injected into all downstream queries:

  • GraphRAG queries filtered by relevant entities
  • Semantic searches biased toward recent topics
  • PostgreSQL queries filtered by active constraints
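
The injection step can be sketched as a pure transform over the scout parameters. This is a minimal illustration, not the shipped API: the parameter shape (`graph`, `semantic`, `factual`) and field names are assumptions chosen to mirror the context fields listed above.

```javascript
// Sketch of context injection (assumed shapes; the scout parameter
// names are illustrative, not the actual service contract).
function injectContext(scoutParams, sessionContext) {
  return {
    ...scoutParams,
    // GraphRAG: restrict traversal toward entities the session has touched
    graph: {
      ...scoutParams.graph,
      focusEntities: [
        ...(scoutParams.graph?.focusEntities ?? []),
        ...(sessionContext.recent_discussions ?? [])
      ]
    },
    // Semantic search: bias the query text toward recent topics
    semantic: {
      ...scoutParams.semantic,
      query: [scoutParams.semantic?.query, ...(sessionContext.recent_discussions ?? [])]
        .filter(Boolean)
        .join(' ')
    },
    // SQL: carry active constraints along as additional filters
    factual: {
      ...scoutParams.factual,
      constraints: sessionContext.active_constraints ?? []
    }
  };
}
```

Because the transform is side-effect free, the same enriched context can be applied uniformly to every downstream query builder.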

FR-2.3 (Privacy & Tenant Isolation): Context retrieval must respect tenant boundaries and user privacy settings. Cross-tenant context leakage is prohibited.


FR-3: Intent Router (NEW)

FR-3.1 (Pathway Selection): The Intent Router must analyze the action field and route to the appropriate pathway:

| Action Type | Pathway | Output |
|---|---|---|
| simulate_scenario | Orchestration | What-If Workbench |
| compare_entities | Orchestration | Comparison Dashboard |
| analyze_risk | Orchestration | Risk Dashboard |
| search | Search | Results List |
| retrieve_conversations | Search | Timeline (Zep) |
| find_documents | Search | Document List |

FR-3.2 (Routing Logic):

async routeIntent(enrichedIntent) {
  const orchestrationActions = ['simulate_scenario', 'compare_entities', 'analyze_risk'];
  const searchActions = ['search', 'retrieve_conversations', 'find_documents'];

  if (orchestrationActions.includes(enrichedIntent.action)) {
    return await this.executeOrchestrationPath(enrichedIntent);
  } else if (searchActions.includes(enrichedIntent.action)) {
    return await this.executeSearchPath(enrichedIntent);
  } else {
    // Unrecognized action: fall back to the search pathway (FR-1.5)
    return await this.executeSearchPath({ ...enrichedIntent, action: 'search' });
  }
}

FR-3.3 (Audit Trail): All routing decisions must be logged for analytics and debugging.


FR-4: Multi-Service Scout (NEW - Orchestration Path)

FR-4.1 (Parallel Execution): The Scout must execute queries across all 4+ services in parallel to minimize latency:

// Promise.allSettled (not Promise.all) so a single failing service cannot
// reject the whole scout -- required for graceful degradation (FR-4.5)
const scoutResults = await Promise.allSettled([
  this.cogneeService.traverseGraph({ ... }),    // Structural query
  this.ragService.semanticSearch({ ... }),      // Semantic query
  this.skuRepository.getBaselineData({ ... }),  // Factual query
  this.typesenseService.search({ ... })         // Text search
]);

FR-4.2 (Service-Specific Queries):

Cognee/GraphRAG (Structural):

  • Purpose: Find relationships, constraints, suppliers, logistics routes
  • Query Type: Graph traversal (BFS/DFS, relationship filtering)
  • Output: Constraint nodes, supplier nodes, logistics nodes

pgvector (Semantic):

  • Purpose: Retrieve past reasoning, hidden risks, unstructured context
  • Query Type: Vector similarity search on embeddings
  • Output: Notes, rationale, source documents

PostgreSQL (Factual):

  • Purpose: Get transactional baseline data (metrics, inventory, costs)
  • Query Type: SQL SELECT with filters
  • Output: Numerical metrics (landed_cost, lead_time, inventory)

Typesense (Text):

  • Purpose: Fast typo-tolerant text search across documents
  • Query Type: Full-text search with filters
  • Output: Relevant documents, notes, descriptions

FR-4.3 (Result Normalization): Scout results must be normalized into a consistent schema before merging:

{
  structural: { constraints: [], suppliers: [], logistics: [] },
  semantic: { risks: [], rationale: [], documents: [] },
  factual: { baseline_metrics: { ... } },
  textual: { matching_documents: [] }
}
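
A minimal normalization sketch follows, mapping the four settled scout promises (in the fixed order structural, semantic, factual, textual) onto the schema above. The field names inside each mapper (`constraints`, `risks`, `hits`, etc.) are assumptions about each service's payload, not documented contracts.

```javascript
// Normalize Promise.allSettled results into the consistent scout schema.
// A rejected promise contributes empty data rather than failing the merge.
function normalizeScoutResults(settled) {
  const [structural, semantic, factual, textual] = settled.map(r =>
    r.status === 'fulfilled' ? r.value : null
  );
  return {
    structural: {
      constraints: structural?.constraints ?? [],
      suppliers: structural?.suppliers ?? [],
      logistics: structural?.logistics ?? []
    },
    semantic: {
      risks: semantic?.risks ?? [],
      rationale: semantic?.rationale ?? [],
      documents: semantic?.documents ?? []
    },
    factual: { baseline_metrics: factual ?? {} },
    textual: { matching_documents: textual?.hits ?? [] }
  };
}
```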

FR-4.4 (Partial Results): Scout must stream partial results via WebSocket as each service responds, prioritizing fast services (Typesense, Cognee) over slow services (complex PostgreSQL joins).

FR-4.5 (Graceful Degradation): If any single service fails, Scout must continue with remaining services and flag missing data:

{
  structural: { status: 'success', data: [...] },
  semantic: { status: 'failed', error: 'pgvector timeout', data: [] },
  factual: { status: 'success', data: {...} },
  textual: { status: 'success', data: [...] }
}

FR-5: Workbench Populator (NEW - Orchestration Path)

FR-5.1 (Data Assembly): The Workbench Populator must assemble scout results into a pre-populated workspace structure:

{
  action: 'simulate_scenario',
  redirectTo: '/what-if-workbench',
  prePopulated: true,
  baseScenario: {
    title: 'Current State: ...',
    data: { ...factualResults },
    locked: true,
    metrics: [
      { label: 'Landed Cost', value: '$10.50', unit: 'USD' },
      { label: 'Lead Time', value: '28', unit: 'days' }
    ]
  },
  targetScenario: {
    title: 'Proposed State: ...',
    data: { ...estimatedMetrics },
    editable: true,
    sliders: [
      { field: 'landed_cost', min: 8, max: 15, initial: 12, step: 0.5 },
      { field: 'lead_time', min: 10, max: 40, initial: 14, step: 1 }
    ]
  },
  contextAndRisks: {
    constraints: [...structuralResults.constraints],
    risks: [...semanticResults.risks],
    pastReasoning: [...semanticResults.rationale],
    suppliers: [...structuralResults.suppliers]
  }
}

FR-5.2 (Action-Specific Templates): The Populator must support different templates based on action type:

  • simulate_scenario → What-If Workbench template
  • compare_entities → Comparison Dashboard template
  • analyze_risk → Risk Dashboard template

FR-5.3 (Validation & Completeness): The Populator must validate that required fields are present and flag incomplete data:

{
  completeness: {
    baseScenario: 100,    // All required fields present
    targetScenario: 75,   // Missing 2 of 8 fields
    contextAndRisks: 60   // Missing supplier qualification data
  },
  warnings: [
    'No historical data for Mexico location - using industry estimates',
    'PFAS compliance constraint requires manual verification'
  ]
}
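
The per-section percentages above can be computed as the share of required fields that are present and non-null. A minimal sketch, assuming the required-field list per template is supplied by the action-specific templates of FR-5.2:

```javascript
// Completeness = percentage of required fields present and non-null.
// The requiredFields list is template-specific (an assumption here).
function computeCompleteness(data, requiredFields) {
  if (requiredFields.length === 0) return 100;
  const present = requiredFields.filter(
    f => data[f] !== undefined && data[f] !== null
  ).length;
  return Math.round((present / requiredFields.length) * 100);
}
```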

FR-5.4 (User Validation Mode): Pre-populated workspaces must clearly indicate which fields are AI-generated vs. user-verified:

{
  fields: [
    { name: 'landed_cost', value: 12.00, source: 'ai_estimate', confidence: 0.7, editable: true },
    { name: 'lead_time', value: 14, source: 'historical_data', confidence: 0.95, editable: true }
  ]
}

FR-6: Search Orchestrator (EXISTING - v2.2)

FR-6.1 through FR-6.5: All existing Search Orchestrator requirements from v2.2 are preserved and apply to the Search Pathway:

  • Query Intent Analysis (textual vs. analytical vs. hybrid)
  • Asynchronous Execution with 202 Accepted response
  • Result Merging Strategies (INTERSECTION, UNION)
  • Normalization and De-duplication
  • Tenant Isolation

FR-6.6 (Integration with Intent Router): The Search Orchestrator must accept enriched intents from the Intent Router and extract relevant parameters:

async executeSearchPath(enrichedIntent) {
  const searchParams = {
    q: enrichedIntent.textQuery || this.buildTextQuery(enrichedIntent.entities),
    filters: this.buildFilters(enrichedIntent.entities, enrichedIntent.session_context),
    merge_strategy: 'INTERSECTION',
    pagination: { page: 1, limit: 25 }
  };

  return await this.searchOrchestrator.execute(searchParams);
}
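
The `buildTextQuery` and `buildFilters` helpers referenced above are not specified elsewhere in this document; one plausible sketch, operating on the FR-1.1 entity shape, is below. The split between free-text types and filter types is an assumption.

```javascript
// Hypothetical helpers for FR-6.6: turn parsed entities into a text query
// plus structured filters. Entities follow the FR-1.1 { type, value, role } shape.
function buildTextQuery(entities) {
  // Free-text portion: entity values, excluding structured-only types
  return entities
    .filter(e => e.type !== 'time_range')
    .map(e => e.value)
    .join(' ');
}

function buildFilters(entities, sessionContext = {}) {
  const filters = {};
  for (const e of entities) {
    if (e.type === 'time_range') filters.time_range = e.value;
    if (e.type === 'location') filters.location = e.value;
  }
  // Session context can narrow results further (e.g. by role)
  if (sessionContext.user_role) filters.user_role = sessionContext.user_role;
  return filters;
}
```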

FR-7: Real-time Status and Result Streaming (ENHANCED)

FR-7.1 (WebSocket Communication): The system must use WebSocket to stream progress updates for all pathways.

FR-7.2 (Phase Progress Events): For Orchestration Path, emit progress events for each phase:

// Phase 1: Intent Parsing
{ event: 'PARSING_INTENT', progress: 20, timestamp: '...' }

// Phase 2: Context Enrichment
{ event: 'ENRICHING_CONTEXT', progress: 40, context: {...}, timestamp: '...' }

// Phase 3: Scouting Data (with sub-phases)
{ event: 'SCOUTING_DATA', progress: 60, phase: 'cognee_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 70, phase: 'pgvector_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 80, phase: 'postgres_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 90, phase: 'typesense_query', status: 'complete', timestamp: '...' }

// Phase 4: Workbench Ready
{ event: 'WORKBENCH_READY', progress: 100, data: {...}, redirectTo: '/what-if-workbench', timestamp: '...' }

FR-7.3 (Search Path Events): For Search Path, emit existing search orchestrator events:

{ event: 'SEARCH_STARTED', search_id: '...', timestamp: '...' }
{ event: 'PARTIAL_RESULTS', results: [...], source: 'typesense', timestamp: '...' }
{ event: 'SEARCH_COMPLETE', results: [...], total_count: 50, timestamp: '...' }

FR-7.4 (Error Events): All errors must be streamed to the client with actionable context:

{
  event: 'ERROR',
  phase: 'context_enrichment',
  error: 'Zep service unavailable',
  fallback: 'Continuing without session context',
  retryable: true,
  timestamp: '...'
}

FR-8: Data Synchronization (EXISTING - v2.2)

All existing synchronization requirements from v2.2 remain in effect:

  • FR-8.1 (Mechanism): PostgreSQL triggers + pgmq worker
  • FR-8.2 (Max Latency): Under 5 seconds for critical tables
  • FR-8.3 (Fallback): Graceful degradation if sync worker fails

FR-9: Tenant Isolation & Security (ENHANCED)

FR-9.1 (Multi-Layer Enforcement): Tenant isolation must be enforced at every layer:

  1. Intent Parser: Extract tenant_id from JWT
  2. Context Enricher: Filter Zep queries by tenant_id
  3. Scout Services: Inject tenant_id into all queries
  4. Search Orchestrator: Apply existing tenant filters

FR-9.2 (Cross-Tenant Blocking): Queries attempting to access multi-tenant data must be rejected with error:

{
  error: 'TENANT_ISOLATION_VIOLATION',
  message: 'Query cannot span multiple tenants',
  tenant_id: 'tenant-123'
}
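
A guard for this rule can run before any scout or search query executes. The sketch below is illustrative: the helper name and the `query.tenantIds` field are assumptions, while the error code and message match the contract above.

```javascript
// Reject any query whose tenant scope differs from the tenant in the JWT.
// Throws the TENANT_ISOLATION_VIOLATION error shape defined in FR-9.2.
function assertSingleTenant(query, jwtTenantId) {
  const tenantIds = new Set([jwtTenantId, ...(query.tenantIds ?? [])]);
  if (tenantIds.size > 1) {
    const err = new Error('Query cannot span multiple tenants');
    err.code = 'TENANT_ISOLATION_VIOLATION';
    err.tenant_id = jwtTenantId;
    throw err;
  }
  return jwtTenantId;
}
```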

FR-9.3 (Audit Logging): All intent parsing, context retrieval, and data access must be logged to audit trail with:

  • User ID
  • Tenant ID
  • Raw query
  • Parsed intent
  • Services accessed
  • Results returned
  • Timestamp

5.0 Unified API Contract

5.1. Single Unified Endpoint

Endpoint: POST /api/orchestrator/execute-intent

Purpose: Accept natural language or structured queries, route to appropriate pathway, and stream results.

5.2. Request Body

{
  "query": "string",
  "sessionId": "string | null",
  "options": {
    "forcePathway": "orchestration | search | null",
    "enableStreaming": true,
    "timeout": 10000
  }
}

Fields:

  • query (required): Natural language query or structured text
  • sessionId (optional): Zep session ID for context enrichment
  • options.forcePathway (optional): Override automatic routing for testing
  • options.enableStreaming (optional): Enable WebSocket streaming (default: true)
  • options.timeout (optional): Max execution time in ms (default: 10000)
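
The field rules above imply a validation-and-defaulting step before routing. A sketch under those rules follows; the function name and error shape are illustrative, while field names and defaults match the contract.

```javascript
// Validate an execute-intent request body and apply the documented defaults.
function validateIntentRequest(body) {
  if (typeof body.query !== 'string' || body.query.trim() === '') {
    return { ok: false, error: 'query is required' };
  }
  const options = body.options ?? {};
  if (options.forcePathway && !['orchestration', 'search'].includes(options.forcePathway)) {
    return { ok: false, error: 'forcePathway must be "orchestration" or "search"' };
  }
  return {
    ok: true,
    request: {
      query: body.query.trim(),
      sessionId: body.sessionId ?? null,
      options: {
        forcePathway: options.forcePathway ?? null,
        enableStreaming: options.enableStreaming ?? true, // default: true
        timeout: options.timeout ?? 10000                 // default: 10000 ms
      }
    }
  };
}
```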

5.3. Response (202 Accepted)

{
  "intent_id": "intent-789",
  "status": "processing",
  "pathway": "orchestration | search",
  "estimated_time_ms": 2000,
  "websocket_channel": "intent_updates_intent-789"
}

5.4. WebSocket Events

Orchestration Path Events:

// Phase 1
{ "event": "PARSING_INTENT", "intent_id": "intent-789", "progress": 20 }

// Phase 2
{ "event": "ENRICHING_CONTEXT", "intent_id": "intent-789", "progress": 40, "context": {...} }

// Phase 3
{ "event": "SCOUTING_DATA", "intent_id": "intent-789", "progress": 75, "partial_results": {...} }

// Phase 4
{
  "event": "WORKBENCH_READY",
  "intent_id": "intent-789",
  "progress": 100,
  "data": {...},
  "redirectTo": "/what-if-workbench?intent=intent-789",
  "completeness": {...}
}

Search Path Events:

{ "event": "SEARCH_STARTED", "intent_id": "intent-789", "search_id": "search-123" }
{ "event": "PARTIAL_RESULTS", "intent_id": "intent-789", "results": [...], "source": "typesense" }
{ "event": "SEARCH_COMPLETE", "intent_id": "intent-789", "results": [...], "total_count": 50 }

5.5. Clarification Flow (Low Confidence)

When confidence < 0.7, system responds with clarifying questions:

Response (200 OK):

{
  "status": "needs_clarification",
  "intent_id": "intent-789",
  "originalQuery": "Explore shifting production...",
  "questions": [
    {
      "id": "q1",
      "category": "assumptions",
      "question": "Are you considering any specific suppliers in Mexico?",
      "priority": "high"
    },
    {
      "id": "q2",
      "category": "constraints",
      "question": "What budget constraints apply to this shift?",
      "priority": "medium"
    }
  ],
  "suggestedAnswers": {
    "q1": ["ABC Materials Mexico", "XYZ Logistics Mexico", "Other (specify)"]
  }
}

User Provides Answers:

POST /api/orchestrator/clarify-intent

{
  "intent_id": "intent-789",
  "answers": [
    { "question_id": "q1", "answer": "ABC Materials Mexico" },
    { "question_id": "q2", "answer": "Q4 budget freeze applies" }
  ]
}

System Re-parses with enriched context and continues orchestration.

6.0 Implementation Architecture

6.1. Service Layer Structure

backend/src/services/
├── orchestration/
│   ├── IntentParserService.js       ← Phase 1: LLM-based parsing
│   ├── ContextEnricherService.js    ← Phase 2: Zep integration
│   ├── IntentRouter.js              ← Phase 3: Pathway routing
│   ├── MultiServiceScout.js         ← Phase 3a: Parallel query executor
│   ├── WorkbenchPopulatorService.js ← Phase 4: Workspace assembly
│   └── IntentOrchestrator.js        ← Main coordinator
├── search/
│   └── SearchOrchestrator.js        ← Existing dual-engine search (v2.2)
├── ZepService.js                    ← M73: Zep Memory
├── CogneeService.js                 ← M72: Cognee GraphRAG
├── RAGService.js                    ← pgvector semantic search
└── llm/
    ├── LLMClient.js                 ← Provider abstraction
    ├── ProviderRegistry.js
    └── providers/
        ├── GeminiProvider.js
        ├── OpenAIProvider.js        ← NEW
        └── AnthropicProvider.js     ← NEW

6.2. LLM Provider Abstraction (Enhanced)

Configuration:

# .env
LLM_PROVIDER=gemini # Primary provider
LLM_PROVIDER_INTENT_PARSING=gemini # Override for intent parsing
LLM_PROVIDER_EMBEDDINGS=openai # Override for embeddings
LLM_PROVIDER_LONG_CONTEXT=anthropic # Override for long context

GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key
ANTHROPIC_API_KEY=your-key

Usage in IntentParserService:

import LLMClient from '../llm/LLMClient.js';

class IntentParserService {
  async parseIntent(query) {
    const provider = process.env.LLM_PROVIDER_INTENT_PARSING || 'gemini';

    const result = await LLMClient.chatWithTools({
      provider,
      messages: [{ role: 'user', content: query }],
      tools: [this.intentParsingSchema],
      context: { operation: 'intent_parsing' }
    });

    return JSON.parse(result.content);
  }
}

6.3. Database Schema Extensions

New Table: intent_log

CREATE TABLE intent_log (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL,
  tenant_id UUID NOT NULL,
  raw_query TEXT NOT NULL,
  parsed_intent JSONB NOT NULL,
  enriched_context JSONB,
  scout_results JSONB,
  pathway VARCHAR(50) NOT NULL, -- 'orchestration' or 'search'
  action_taken VARCHAR(50) NOT NULL,
  redirect_url TEXT,
  confidence DECIMAL(3,2),
  execution_time_ms INTEGER,
  created_at TIMESTAMP DEFAULT NOW(),

  CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id),
  CONSTRAINT fk_tenant FOREIGN KEY (tenant_id) REFERENCES tenants(id)
);

CREATE INDEX idx_intent_log_user ON intent_log(user_id);
CREATE INDEX idx_intent_log_tenant ON intent_log(tenant_id);
CREATE INDEX idx_intent_log_action ON intent_log(action_taken);
CREATE INDEX idx_intent_log_created ON intent_log(created_at);

New Table: clarification_sessions

CREATE TABLE clarification_sessions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  intent_id UUID NOT NULL REFERENCES intent_log(id),
  questions JSONB NOT NULL,
  answers JSONB,
  status VARCHAR(50) DEFAULT 'pending', -- 'pending', 'answered', 'expired'
  created_at TIMESTAMP DEFAULT NOW(),
  answered_at TIMESTAMP
);

7.0 Phased Implementation Plan

Phase 1: LLM Provider Abstraction (1-2 weeks)

Week 1:

  • ✅ Migrate AIGateway to use LLMClient instead of direct Gemini SDK
  • ✅ Test all existing services (AIGateway, ScenarioStreamingEngine, DecisionExecutionService)
  • ✅ Update environment configuration
  • ✅ Create OpenAIProvider and AnthropicProvider implementations

Week 2:

  • ✅ Add provider health checks and fallback logic
  • ✅ Update documentation
  • ✅ Performance benchmarking (Gemini vs. OpenAI vs. Anthropic)

Deliverables:

  • Fully LLM-agnostic architecture
  • Support for 3 providers (Gemini, OpenAI, Anthropic)
  • Migration guide for existing services

Phase 2: Intent Parsing Foundation (2-3 weeks)

Week 3:

  • ✅ Create IntentParserService with function calling schema
  • ✅ Implement POST /api/orchestrator/parse-intent endpoint
  • ✅ Add intent_log table and repository
  • ✅ Unit tests for intent parsing accuracy

Week 4:

  • ✅ Build clarification flow with SIE integration
  • ✅ Implement low confidence handling (<0.7)
  • ✅ Create clarification_sessions table
  • ✅ Frontend: Clarification question modal

Week 5:

  • ✅ Integration tests with real user queries
  • ✅ Accuracy benchmarking (target: 90%+ correct routing)
  • ✅ Error handling and fallback to search

Deliverables:

  • Intent parsing with 90%+ accuracy
  • Clarification flow for ambiguous queries
  • Comprehensive test coverage

Phase 3: Context Enrichment & Routing (2 weeks)

Week 6:

  • ✅ Create ContextEnricherService
  • ✅ Integrate Zep (M73) for session context retrieval
  • ✅ Implement context injection into scout queries

Week 7:

  • ✅ Create IntentRouter with pathway logic
  • ✅ Implement routing decision tree
  • ✅ Add audit logging for all routing decisions

Deliverables:

  • Context-aware query enrichment
  • Intelligent routing between pathways
  • Audit trail for all intent operations

Phase 4: Multi-Service Scout (3 weeks)

Week 8:

  • ✅ Create MultiServiceScout coordinator
  • ✅ Implement parallel query execution
  • ✅ Add result normalization logic

Week 9:

  • ✅ Build service-specific query builders:
    • Cognee GraphRAG query builder
    • pgvector semantic search query builder
    • PostgreSQL factual query builder
    • Typesense text query builder

Week 10:

  • ✅ Implement graceful degradation (service failures)
  • ✅ Add partial results streaming via WebSocket
  • ✅ Performance optimization (query caching, connection pooling)

Deliverables:

  • Multi-service scout with <2s P95 latency
  • Graceful degradation on service failures
  • Real-time partial results streaming

Phase 5: Workbench Population (2 weeks)

Week 11:

  • ✅ Create WorkbenchPopulatorService
  • ✅ Implement action-specific templates
  • ✅ Build data validation and completeness checks

Week 12:

  • ✅ Add AI vs. user-verified field tracking
  • ✅ Implement warning system for incomplete data
  • ✅ Integration tests with What-If Workbench

Deliverables:

  • Pre-populated workbench templates
  • Validation and completeness tracking
  • Clear AI-generated vs. verified indicators

Phase 6: Frontend Integration (2 weeks)

Week 13:

  • ✅ Create JudgmentBar component (natural language input)
  • ✅ Build IntentProgressIndicator for phase tracking
  • ✅ Implement WebSocket event handlers

Week 14:

  • ✅ Update What-If Workbench to accept pre-populated data
  • ✅ Add validation mode UI (editable sliders, confidence indicators)
  • ✅ Build clarification question flow UI

Deliverables:

  • Unified Judgment Bar interface
  • Pre-populated workbenches with validation UI
  • Real-time progress indicators

Phase 7: Testing & Optimization (2 weeks)

Week 15:

  • ✅ End-to-end integration tests
  • ✅ Performance benchmarking (target latencies)
  • ✅ User acceptance testing (UAT)

Week 16:

  • ✅ Security audit (tenant isolation, data privacy)
  • ✅ Load testing (concurrent users, query volume)
  • ✅ Documentation and training materials

Deliverables:

  • Production-ready system
  • Performance metrics meeting targets
  • Comprehensive documentation

8.0 Non-Functional Requirements

8.1. Performance

| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Latency (P95) | <500ms | LLM API call + parsing |
| Context Enrichment Latency (P95) | <300ms | Zep API call |
| Scout Parallel Queries (P95) | <2000ms | All 4 services complete |
| Search Queries (P95) | <500ms | Maintain existing performance |
| Workbench Population (P95) | <1000ms | Data assembly + validation |
| End-to-End Orchestration (P95) | <4000ms | All phases complete |

8.2. Accuracy

| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Accuracy | >90% | Correct pathway routing |
| Clarification Trigger Rate | 5-10% | Low confidence queries |
| Data Completeness (Workbench) | >80% | Required fields populated |
| Tenant Isolation Violations | 0 | Audit log review |

8.3. Reliability

  • Uptime: 99.9% availability for core orchestration services
  • Graceful Degradation: System must function with up to 2 of 4 scout services down
  • Fallback: Always fallback to search pathway if orchestration fails
  • Data Consistency: Typesense sync lag <5 seconds (maintain existing requirement)

8.4. Security

  • Tenant Isolation: Multi-layer enforcement (parser, enricher, scout, search)
  • Audit Logging: All intents, queries, and data access logged
  • Data Privacy: Zep context respects user privacy settings
  • LLM Security: No PII sent to LLM providers (enforce redaction if needed)
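
The redaction requirement can be enforced with a pre-flight filter on any text bound for an LLM provider. The sketch below is a minimal illustration only: the two patterns (email, phone-like digit runs) are examples, and production redaction would need a vetted PII detector rather than ad-hoc regexes.

```javascript
// Minimal PII redaction sketch for outbound LLM prompts (assumed patterns;
// not an exhaustive detector).
function redactPII(text) {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, '[REDACTED_EMAIL]')
    .replace(/\+?\d[\d\s().-]{8,}\d/g, '[REDACTED_PHONE]');
}
```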

8.5. Scalability

  • Concurrent Users: Support 1000+ concurrent intent executions
  • Query Volume: Handle 10,000+ intents/hour
  • Scout Parallelization: Auto-scale based on query complexity
  • WebSocket Connections: Support 5000+ active WebSocket sessions

9.0 Migration from v2.2 to v3.0

9.1. Backward Compatibility

Existing Search API Preserved:

POST /api/search  ← Still supported for legacy clients

Gradual Migration Path:

  1. Deploy v3.0 alongside v2.2
  2. Route legacy requests to v2.2 Search Orchestrator
  3. New clients use v3.0 Intent Orchestrator
  4. Deprecation timeline: 6 months

9.2. Feature Flags

# .env
ENABLE_INTENT_ORCHESTRATION=true # Enable new system
ENABLE_LEGACY_SEARCH=true # Keep v2.2 available
INTENT_ORCHESTRATION_ROLLOUT=50 # Percentage of users on v3.0

9.3. A/B Testing

  • Route 50% of users to v3.0 Intent Orchestration
  • Route 50% of users to v2.2 Search
  • Measure:
    • Time to completion
    • User satisfaction
    • Query success rate
    • Error rates

9.4. Rollback Plan

If v3.0 issues arise:

  1. Set ENABLE_INTENT_ORCHESTRATION=false
  2. Route all traffic to v2.2 Search Orchestrator
  3. Fix issues in staging environment
  4. Gradual re-rollout with reduced percentage

10.0 Success Criteria

10.1. Technical Metrics

  • ✅ Intent parsing accuracy >90%
  • ✅ Orchestration path latency <4s (P95)
  • ✅ Search path latency <500ms (P95) - maintained
  • ✅ Workbench data completeness >80%
  • ✅ Zero tenant isolation violations
  • ✅ System uptime 99.9%

10.2. Business Metrics

  • ✅ 70% reduction in time from query to actionable workspace
  • ✅ 80% reduction in manual data entry (field population)
  • ✅ 90%+ user satisfaction with pre-populated workbenches
  • ✅ 50% increase in scenario simulations per user
  • ✅ 30% reduction in decision cycle time

10.3. User Experience Metrics

  • ✅ Single unified interface (Judgment Bar) adoption >95%
  • ✅ Clarification flow completion rate >80%
  • ✅ Workbench pre-population usage >70%
  • ✅ Search pathway still used for 20% of queries (expected)

11.0 Appendix

11.1. Example Intent Parsing Scenarios

Scenario 1: Production Shift (Orchestration)

Input: "Explore shifting production of SKU-X from China to Mexico"
Parsed Intent:
{
  "action": "simulate_scenario",
  "scenario_type": "move_production",
  "entities": [
    { "type": "product", "value": "SKU-X" },
    { "type": "location", "value": "China", "role": "source" },
    { "type": "location", "value": "Mexico", "role": "target" }
  ],
  "intent_type": "action",
  "confidence": 0.95
}
Pathway: Orchestration → What-If Workbench

Scenario 2: Supplier Comparison (Orchestration)

Input: "Compare Supplier A vs Supplier B for aluminum sourcing"
Parsed Intent:
{
  "action": "compare_entities",
  "entities": [
    { "type": "supplier", "value": "Supplier A", "role": "comparison" },
    { "type": "supplier", "value": "Supplier B", "role": "comparison" },
    { "type": "product", "value": "aluminum" }
  ],
  "intent_type": "action",
  "confidence": 0.89
}
Pathway: Orchestration → Comparison Dashboard

Scenario 3: Historical Search (Search)

Input: "Find all decisions about Mexico from last quarter"
Parsed Intent:
{
  "action": "search",
  "search_type": "decision_history",
  "entities": [
    { "type": "location", "value": "Mexico" },
    { "type": "time_range", "value": "last_quarter" }
  ],
  "intent_type": "information_retrieval",
  "confidence": 0.92
}
Pathway: Search → Decision Timeline

Scenario 4: Conversation Retrieval (Search)

Input: "Show me past conversations about PFAS compliance"
Parsed Intent:
{
  "action": "retrieve_conversations",
  "entities": [
    { "type": "topic", "value": "PFAS compliance" }
  ],
  "intent_type": "information_retrieval",
  "confidence": 0.88
}
Pathway: Search → Zep Conversation Timeline

Scenario 5: Ambiguous Query (Clarification)

Input: "Tell me about Mexico supplier options"
Parsed Intent:
{
  "action": "unclear",
  "entities": [
    { "type": "location", "value": "Mexico" },
    { "type": "supplier", "value": "unknown" }
  ],
  "intent_type": "unclear",
  "confidence": 0.62
}
Action: Trigger SIE Clarification
Questions:
- "Are you looking to compare existing Mexico suppliers?"
- "Do you want to simulate a scenario with a new Mexico supplier?"
- "Are you searching for past decisions about Mexico suppliers?"

11.2. Performance Benchmarks

Intent Parsing Latency:

  • Gemini: 320ms (average), 450ms (P95)
  • OpenAI GPT-4 Turbo: 280ms (average), 400ms (P95)
  • Anthropic Claude 3.5 Sonnet: 350ms (average), 500ms (P95)

Scout Query Latency (Parallel):

  • Cognee GraphRAG: 450ms (P95)
  • pgvector Semantic: 180ms (P95)
  • PostgreSQL Factual: 320ms (P95)
  • Typesense Text: 45ms (P95)
  • Total (Parallel): 480ms (P95) - bounded by the slowest service (Cognee, 450ms) plus coordination overhead

End-to-End Orchestration:

  • Intent Parsing: 450ms
  • Context Enrichment: 280ms
  • Scout Queries: 480ms
  • Workbench Population: 350ms
  • Total: ~1560ms (average), <4000ms (P95 with slower queries)

11.3. Glossary

  • Judgment Bar: Unified natural language input interface
  • Intent Orchestration: Process of parsing, enriching, and routing user intent
  • Zero-Input Workflow: Pre-populated workspace requiring minimal user data entry
  • Multi-Service Scout: Parallel query executor across 4+ data services
  • Orchestration Pathway: Route to action-oriented workbenches (workbench population)
  • Search Pathway: Route to information retrieval (results list)
  • Context Enrichment: Process of adding session memory and constraints to queries
  • LLM Provider Agnosticism: Architecture supporting multiple LLM providers (Gemini, OpenAI, Anthropic)

End of Document