FSD: Intent-Driven Orchestration & Intelligent Search System
Version: 3.0
Date: November 14, 2024
Status: Final Draft
Author: Pramod Prasanth
Supersedes: FSD Dual-Engine Hybrid Search System v2.2
1.0 Overview
1.1. Purpose
This document specifies the functional requirements for ChainAlign's Intent-Driven Orchestration & Intelligent Search System. This evolved architecture builds upon the Dual-Engine Hybrid Search foundation to provide a unified "Judgment Bar" interface that intelligently routes user intent to either:
- Orchestration Pathway ("Zero-Input" Workflow): Pre-populates workbenches with context-aware data from multiple sources
- Search Pathway (Information Retrieval): Returns filtered, ranked results for exploration and review
The system uses LLM-based intent parsing, session context enrichment (Zep), and multi-service data scouting (Cognee/GraphRAG + pgvector + PostgreSQL + Typesense) to eliminate manual data entry and transform ChainAlign from a search tool into an intelligent decision workspace orchestrator.
1.2. Core Architecture Evolution
Previous Architecture (v2.2):
User Query → Search Orchestrator → [PostgreSQL + Typesense] → Search Results
New Architecture (v3.0):
Natural Language → Intent Parser (LLM) → Context Enrichment (Zep) → Router
├→ Orchestration Path → Multi-Service Scout → Pre-populated Workbench
└→ Search Path → Dual-Engine Search → Results List
1.3. Key Innovations
- Natural Language First: Users express intent in plain language, not search syntax
- Context-Aware: Zep memory enriches queries with session history, constraints, and discussions
- Multi-Source Intelligence: Combines knowledge graph (Cognee), semantic memory (pgvector), transactional data (PostgreSQL), and text search (Typesense)
- Action-Oriented: Routes to workspaces for decision-making, not just information display
- Zero-Input Workflows: Pre-populates entire scenarios based on parsed intent
- LLM-Agnostic: Provider abstraction supports Gemini, OpenAI, Anthropic, etc.
2.0 Goals and Objectives
2.1. Primary Goals
- Eliminate Manual Data Entry: Pre-populate 80% of workflows through intelligent context gathering
- Unified Interface: Single "Judgment Bar" for all user interactions (actions + search)
- Context-Aware Intelligence: Leverage session memory to anticipate user needs
- Multi-Modal Routing: Intelligent pathway selection based on intent type
- Maintain Search Excellence: Preserve existing dual-engine search capabilities
2.2. Success Metrics
- Time to Insight: Reduce time from query to actionable workspace by 70%
- Data Entry Reduction: Decrease manual field population from 10+ fields to 2-3 validation checks
- Intent Accuracy: 90%+ correct routing between Orchestration vs. Search pathways
- Response Latency:
- Intent parsing: <500ms (P95)
- Context enrichment: <300ms (P95)
- Scout queries (parallel): <2000ms (P95)
- Search queries: <500ms (P95) - maintain existing performance
3.0 Proposed Architecture
3.1. Component Roles
| Component | Role | Purpose | Integration |
|---|---|---|---|
| Intent Parser | LLM-based intent deconstruction | Parse natural language into structured JSON (action, entities, type) | Gemini/OpenAI/Anthropic via LLMClient |
| Context Enricher | Session memory integration | Retrieve user role, constraints, recent discussions from Zep | Zep (M73) |
| Intent Router | Pathway selector | Route to Orchestration or Search based on intent.action | Custom logic |
| Multi-Service Scout | Parallel data gatherer | Execute 4+ parallel queries across Cognee, pgvector, PostgreSQL, Typesense | M72 (Cognee) + RAGService + Repositories |
| Workbench Populator | Workspace builder | Assemble pre-populated workspace data from scout results | Custom service |
| Search Orchestrator | Dual-engine search (v2.2) | Execute PostgreSQL + Typesense hybrid search | Existing implementation |
3.2. Architectural Diagram
3.3. Data Flow: Orchestration Path Example
User Input: "Explore shifting production of SKU-X from China to Mexico"
Phase 1: Intent Parsing (LLM)
{
"action": "simulate_scenario",
"scenario_type": "move_production",
"entities": [
{ "type": "product", "value": "SKU-X" },
{ "type": "location", "value": "China", "role": "source" },
{ "type": "location", "value": "Mexico", "role": "target" }
],
"intent_type": "action",
"confidence": 0.95
}
Phase 2: Context Enrichment (Zep)
{
"action": "simulate_scenario",
"entities": [...],
"session_context": {
"user_role": "Supply_Planner",
"active_constraints": ["q4_budget_freeze"],
"recent_discussions": ["mexico_port_labor_issues"],
"past_decisions": ["decision-456: China selected for cost in 2023"]
}
}
Phase 3: Multi-Service Scout (Parallel)
Scout Query 1 (Cognee/GraphRAG):
query GetScenarioContext {
sku: entity(name: "SKU-X") { id, properties }
source: entity(name: "China") { id, properties }
target: entity(name: "Mexico") { id, properties }
sku_constraints: relationships(from: sku.id, type: "HAS_CONSTRAINT")
source_suppliers: relationships(from: source.id, type: "HAS_SUPPLIER")
target_logistics: relationships(from: target.id, type: "HAS_LOGISTICS_ROUTE")
}
Scout Query 2 (pgvector - Semantic):
SELECT note_content, decision_rationale, source_document
FROM judgment_embeddings
ORDER BY embedding <=> gemini_embed('Mexico logistics risk OR China supplier quality SKU-X')
LIMIT 5;
Scout Query 3 (PostgreSQL - Factual):
SELECT landed_cost, lead_time, current_inventory
FROM sku_master_data
WHERE sku_id = 'SKU-X' AND region = 'China';
Scout Query 4 (Typesense - Text Search):
{
q: 'SKU-X Mexico China production',
filter_by: 'tenant_id:tenant-123',
query_by: 'content,notes,description'
}
Phase 4: Workbench Population
{
"redirectTo": "/what-if-workbench",
"prePopulated": true,
"baseScenario": {
"title": "Current State: Production in China",
"data": { "landed_cost": 10.50, "lead_time": 28, ... },
"locked": true
},
"targetScenario": {
"title": "Proposed State: Production in Mexico",
"data": { "est_landed_cost": 12.00, "est_lead_time": 14, ... },
"editable": true
},
"contextAndRisks": {
"constraints": ["PFAS_Compliant_Material"],
"risks": ["Mexico_Port_Labor_Issues"],
"pastReasoning": ["2023 decision based on cost"],
"suppliers": ["ABC_Materials_China", "Requires new supplier qualification"]
}
}
3.4. Data Flow: Search Path Example
User Input: "Find all decisions about Mexico from last quarter"
Phase 1: Intent Parsing (LLM)
{
"action": "search",
"search_type": "decision_history",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "time_range", "value": "last_quarter" }
],
"intent_type": "information_retrieval",
"confidence": 0.92
}
Phase 2: Context Enrichment (Zep)
{
"action": "search",
"entities": [...],
"session_context": {
"user_role": "Supply_Planner",
"relevant_context": ["Previous searches about Mexico in past 7 days"]
}
}
Phase 3: Search Orchestrator (Dual-Engine)
- Routes to existing Search Orchestrator (v2.2)
- Executes PostgreSQL + Typesense hybrid query
- Returns ranked results list
Output: Decision Timeline with filtered results
4.0 Functional Requirements
FR-1: Intent Parser Service (NEW)
FR-1.1 (Natural Language Parsing): The Intent Parser must accept natural language queries and use LLM-based parsing to structure them into actionable JSON.
Input: Raw text string. Output: Structured intent JSON with:
- action (enum: simulate_scenario, compare_entities, analyze_risk, search, retrieve_conversations, find_documents)
- scenario_type (optional: move_production, change_supplier, adjust_inventory)
- entities (array: type, value, role)
- intent_type (enum: action, information_retrieval)
- confidence (float: 0-1)
FR-1.2 (LLM Provider Agnosticism): The Intent Parser must use the LLMClient abstraction layer, supporting multiple providers (Gemini, OpenAI, Anthropic) via configuration.
FR-1.3 (Function Calling Schema): The Intent Parser must use strict function calling schemas to ensure structured output:
const intentSchema = {
name: 'parse_user_intent',
description: 'Parse user query into structured intent',
parameters: {
type: 'object',
properties: {
action: {
type: 'string',
enum: ['simulate_scenario', 'compare_entities', 'analyze_risk', 'search', 'retrieve_conversations', 'find_documents']
},
scenario_type: {
type: 'string',
enum: ['move_production', 'change_supplier', 'adjust_inventory', 'other']
},
entities: {
type: 'array',
items: {
type: 'object',
properties: {
type: { type: 'string', enum: ['product', 'location', 'supplier', 'constraint', 'time_range'] },
value: { type: 'string' },
role: { type: 'string', enum: ['source', 'target', 'comparison'] }
}
}
},
intent_type: {
type: 'string',
enum: ['action', 'information_retrieval']
},
confidence: { type: 'number', minimum: 0, maximum: 1 }
},
required: ['action', 'intent_type', 'confidence']
}
};
FR-1.4 (Low Confidence Handling): When confidence < 0.7, the system must trigger Socratic Inquiry Engine (M70) to generate clarifying questions:
if (intent.confidence < 0.7) {
const clarifyingQuestions = await SocraticInquiryService.generateSocraticQuestions({
decisionContext: rawQuery,
decisionType: 'intent_disambiguation',
decisionScope: 'single_query',
tenantId,
user
});
return {
status: 'needs_clarification',
questions: clarifyingQuestions,
originalIntent: intent
};
}
FR-1.5 (Fallback to Search): When intent parsing fails or action is unrecognized, the system must gracefully fallback to search pathway.
FR-2: Context Enrichment Service (NEW)
FR-2.1 (Zep Integration): The Context Enricher must call Zep (M73) to retrieve session-specific context before executing queries.
Context Retrieved:
- User role and permissions
- Active constraints (e.g., "q4_budget_freeze")
- Recent discussions and topics
- Past decision references
- Relevant facts from previous sessions
FR-2.2 (Context Injection): The enriched context must be injected into all downstream queries:
- GraphRAG queries filtered by relevant entities
- Semantic searches biased toward recent topics
- PostgreSQL queries filtered by active constraints
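The injection above can be sketched as a pure mapping from parsed intent plus Zep context to per-service query parameters. This is a minimal sketch: the function name, field names, and relationship types are illustrative assumptions, not the final ContextEnricherService contract.

```javascript
// Sketch only: maps an enriched intent to per-service scout query params.
// All field names here are illustrative, not the production schema.
function injectContext(intent, sessionContext) {
  const constraints = sessionContext.active_constraints || [];
  const recentTopics = sessionContext.recent_discussions || [];
  const entityValues = intent.entities.map((e) => e.value);
  return {
    // GraphRAG: restrict traversal to entities named in the intent
    graph: {
      entities: entityValues,
      relationshipTypes: ['HAS_CONSTRAINT', 'HAS_SUPPLIER'],
    },
    // Semantic search: bias the query text toward recent session topics
    semantic: {
      query: [...entityValues, ...recentTopics].join(' '),
      limit: 5,
    },
    // Factual SQL: filter rows by products in the intent and active constraints
    factual: {
      skuIds: intent.entities
        .filter((e) => e.type === 'product')
        .map((e) => e.value),
      constraints,
    },
  };
}
```

For the Phase 2 example earlier in this document, the semantic query would come out biased toward "mexico_port_labor_issues" and the factual query filtered by "q4_budget_freeze".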
FR-2.3 (Privacy & Tenant Isolation): Context retrieval must respect tenant boundaries and user privacy settings. Cross-tenant context leakage is prohibited.
FR-3: Intent Router (NEW)
FR-3.1 (Pathway Selection): The Intent Router must analyze the action field and route to the appropriate pathway:
| Action Type | Pathway | Output |
|---|---|---|
| simulate_scenario | Orchestration | What-If Workbench |
| compare_entities | Orchestration | Comparison Dashboard |
| analyze_risk | Orchestration | Risk Dashboard |
| search | Search | Results List |
| retrieve_conversations | Search | Timeline (Zep) |
| find_documents | Search | Document List |
FR-3.2 (Routing Logic):
async routeIntent(enrichedIntent) {
const orchestrationActions = ['simulate_scenario', 'compare_entities', 'analyze_risk'];
const searchActions = ['search', 'retrieve_conversations', 'find_documents'];
if (orchestrationActions.includes(enrichedIntent.action)) {
return await this.executeOrchestrationPath(enrichedIntent);
} else if (searchActions.includes(enrichedIntent.action)) {
return await this.executeSearchPath(enrichedIntent);
} else {
// Fallback to search
return await this.executeSearchPath({ ...enrichedIntent, action: 'search' });
}
}
FR-3.3 (Audit Trail): All routing decisions must be logged for analytics and debugging.
FR-4: Multi-Service Scout (NEW - Orchestration Path)
FR-4.1 (Parallel Execution): The Scout must execute queries across all 4+ services in parallel to minimize latency:
const scoutResults = await Promise.all([
this.cogneeService.traverseGraph({ ... }), // Structural query
this.ragService.semanticSearch({ ... }), // Semantic query
this.skuRepository.getBaselineData({ ... }), // Factual query
this.typesenseService.search({ ... }) // Text search
]);
FR-4.2 (Service-Specific Queries):
Cognee/GraphRAG (Structural):
- Purpose: Find relationships, constraints, suppliers, logistics routes
- Query Type: Graph traversal (BFS/DFS, relationship filtering)
- Output: Constraint nodes, supplier nodes, logistics nodes
pgvector (Semantic):
- Purpose: Retrieve past reasoning, hidden risks, unstructured context
- Query Type: Vector similarity search on embeddings
- Output: Notes, rationale, source documents
PostgreSQL (Factual):
- Purpose: Get transactional baseline data (metrics, inventory, costs)
- Query Type: SQL SELECT with filters
- Output: Numerical metrics (landed_cost, lead_time, inventory)
Typesense (Text):
- Purpose: Fast typo-tolerant text search across documents
- Query Type: Full-text search with filters
- Output: Relevant documents, notes, descriptions
FR-4.3 (Result Normalization): Scout results must be normalized into a consistent schema before merging:
{
structural: { constraints: [], suppliers: [], logistics: [] },
semantic: { risks: [], rationale: [], documents: [] },
factual: { baseline_metrics: { ... } },
textual: { matching_documents: [] }
}
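A normalization step matching that schema can be sketched as below. The input field names (e.g., `graph.constraints`, `textual.hits`) are assumptions about each service's raw response shape, not confirmed interfaces.

```javascript
// Sketch of FR-4.3: fold raw per-service payloads into the shared schema.
// Input shapes are assumed for illustration; real adapters live per service.
function normalizeScoutResults([graph, semantic, factual, textual]) {
  return {
    structural: {
      constraints: graph.constraints ?? [],
      suppliers: graph.suppliers ?? [],
      logistics: graph.logistics ?? [],
    },
    semantic: {
      risks: semantic.risks ?? [],
      rationale: semantic.rationale ?? [],
      documents: semantic.documents ?? [],
    },
    factual: { baseline_metrics: factual ?? {} },
    textual: { matching_documents: textual.hits ?? [] },
  };
}
```

Downstream consumers (the Workbench Populator, FR-5) then depend only on this schema, never on a specific scout service's response format.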
FR-4.4 (Partial Results): Scout must stream partial results via WebSocket as each service responds, surfacing fast services (e.g., Typesense, pgvector) first rather than blocking on slower ones (graph traversals, complex PostgreSQL joins).
FR-4.5 (Graceful Degradation): If any single service fails, Scout must continue with remaining services and flag missing data:
{
structural: { status: 'success', data: [...] },
semantic: { status: 'failed', error: 'pgvector timeout', data: [] },
factual: { status: 'success', data: {...} },
textual: { status: 'success', data: [...] }
}
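Note that a plain `Promise.all` (as in FR-4.1) rejects as soon as any one service fails; the degradation behavior above requires settling every promise. A minimal sketch, assuming each scout query is exposed as a zero-argument async function:

```javascript
// Sketch of FR-4.5: run all scout queries, but let individual failures
// degrade to a flagged empty result instead of aborting the whole scout.
async function scoutWithDegradation(queries) {
  const keys = Object.keys(queries); // e.g. structural, semantic, factual, textual
  // allSettled never rejects: each entry is {status, value} or {status, reason}
  const settled = await Promise.allSettled(keys.map((k) => queries[k]()));
  const results = {};
  keys.forEach((key, i) => {
    const outcome = settled[i];
    results[key] =
      outcome.status === 'fulfilled'
        ? { status: 'success', data: outcome.value }
        : { status: 'failed', error: String(outcome.reason), data: [] };
  });
  return results;
}
```

The Workbench Populator can then lower its completeness score (FR-5.3) for any key whose status is 'failed' instead of failing the orchestration outright.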
FR-5: Workbench Populator (NEW - Orchestration Path)
FR-5.1 (Data Assembly): The Workbench Populator must assemble scout results into a pre-populated workspace structure:
{
action: 'simulate_scenario',
redirectTo: '/what-if-workbench',
prePopulated: true,
baseScenario: {
title: 'Current State: ...',
data: { ...factualResults },
locked: true,
metrics: [
{ label: 'Landed Cost', value: '$10.50', unit: 'USD' },
{ label: 'Lead Time', value: '28', unit: 'days' }
]
},
targetScenario: {
title: 'Proposed State: ...',
data: { ...estimatedMetrics },
editable: true,
sliders: [
{ field: 'landed_cost', min: 8, max: 15, initial: 12, step: 0.5 },
{ field: 'lead_time', min: 10, max: 40, initial: 14, step: 1 }
]
},
contextAndRisks: {
constraints: [...structuralResults.constraints],
risks: [...semanticResults.risks],
pastReasoning: [...semanticResults.rationale],
suppliers: [...structuralResults.suppliers]
}
}
FR-5.2 (Action-Specific Templates): The Populator must support different templates based on action type:
- simulate_scenario → What-If Workbench template
- compare_entities → Comparison Dashboard template
- analyze_risk → Risk Dashboard template
FR-5.3 (Validation & Completeness): The Populator must validate that required fields are present and flag incomplete data:
{
completeness: {
baseScenario: 100, // All required fields present
targetScenario: 75, // Missing 2 of 8 fields
contextAndRisks: 60 // Missing supplier qualification data
},
warnings: [
'No historical data for Mexico location - using industry estimates',
'PFAS compliance constraint requires manual verification'
]
}
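The completeness percentages above can be computed per workspace section with a simple ratio of present to required fields. A sketch under assumed inputs (the required-field lists themselves would come from the action-specific templates in FR-5.2):

```javascript
// Illustrative completeness check for one workspace section.
// requiredFields is assumed to come from the action-specific template.
function computeCompleteness(data, requiredFields) {
  const present = requiredFields.filter(
    (f) => data[f] !== undefined && data[f] !== null
  );
  const pct = Math.round((present.length / requiredFields.length) * 100);
  const warnings = requiredFields
    .filter((f) => !present.includes(f))
    .map((f) => `Missing required field: ${f}`);
  return { completeness: pct, warnings };
}
```

Warnings produced here would be merged with scout-level flags (e.g., a failed pgvector query from FR-4.5) before the workspace is shown to the user.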
FR-5.4 (User Validation Mode): Pre-populated workspaces must clearly indicate which fields are AI-generated vs. user-verified:
{
fields: [
{ name: 'landed_cost', value: 12.00, source: 'ai_estimate', confidence: 0.7, editable: true },
{ name: 'lead_time', value: 14, source: 'historical_data', confidence: 0.95, editable: true }
]
}
FR-6: Search Orchestrator (EXISTING - v2.2)
FR-6.1 through FR-6.5: All existing Search Orchestrator requirements from v2.2 are preserved and apply to the Search Pathway:
- Query Intent Analysis (textual vs. analytical vs. hybrid)
- Asynchronous Execution with 202 Accepted response
- Result Merging Strategies (INTERSECTION, UNION)
- Normalization and De-duplication
- Tenant Isolation
FR-6.6 (Integration with Intent Router): The Search Orchestrator must accept enriched intents from the Intent Router and extract relevant parameters:
async executeSearchPath(enrichedIntent) {
const searchParams = {
q: enrichedIntent.textQuery || this.buildTextQuery(enrichedIntent.entities),
filters: this.buildFilters(enrichedIntent.entities, enrichedIntent.session_context),
merge_strategy: 'INTERSECTION',
pagination: { page: 1, limit: 25 }
};
return await this.searchOrchestrator.execute(searchParams);
}
FR-7: Real-time Status and Result Streaming (ENHANCED)
FR-7.1 (WebSocket Communication): The system must use WebSocket to stream progress updates for all pathways.
FR-7.2 (Phase Progress Events): For Orchestration Path, emit progress events for each phase:
// Phase 1: Intent Parsing
{ event: 'PARSING_INTENT', progress: 20, timestamp: '...' }
// Phase 2: Context Enrichment
{ event: 'ENRICHING_CONTEXT', progress: 40, context: {...}, timestamp: '...' }
// Phase 3: Scouting Data (with sub-phases)
{ event: 'SCOUTING_DATA', progress: 60, phase: 'cognee_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 70, phase: 'pgvector_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 80, phase: 'postgres_query', status: 'complete', timestamp: '...' }
{ event: 'SCOUTING_DATA', progress: 90, phase: 'typesense_query', status: 'complete', timestamp: '...' }
// Phase 4: Workbench Ready
{ event: 'WORKBENCH_READY', progress: 100, data: {...}, redirectTo: '/what-if-workbench', timestamp: '...' }
FR-7.3 (Search Path Events): For Search Path, emit existing search orchestrator events:
{ event: 'SEARCH_STARTED', search_id: '...', timestamp: '...' }
{ event: 'PARTIAL_RESULTS', results: [...], source: 'typesense', timestamp: '...' }
{ event: 'SEARCH_COMPLETE', results: [...], total_count: 50, timestamp: '...' }
FR-7.4 (Error Events): All errors must be streamed to the client with actionable context:
{
event: 'ERROR',
phase: 'context_enrichment',
error: 'Zep service unavailable',
fallback: 'Continuing without session context',
retryable: true,
timestamp: '...'
}
FR-8: Data Synchronization (EXISTING - v2.2)
All existing synchronization requirements from v2.2 remain in effect:
- FR-8.1 (Mechanism): PostgreSQL triggers + pgmq worker
- FR-8.2 (Max Latency): Under 5 seconds for critical tables
- FR-8.3 (Fallback): Graceful degradation if sync worker fails
FR-9: Tenant Isolation & Security (ENHANCED)
FR-9.1 (Multi-Layer Enforcement): Tenant isolation must be enforced at every layer:
- Intent Parser: Extract tenant_id from JWT
- Context Enricher: Filter Zep queries by tenant_id
- Scout Services: Inject tenant_id into all queries
- Search Orchestrator: Apply existing tenant filters
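As one concrete layer, the scout-side injection can be sketched as a wrapper that stamps the verified tenant_id onto every outgoing query. The `filter_by` syntax shown follows the Typesense-style filters used elsewhere in this document; treating it as representative of all four services is an illustrative simplification.

```javascript
// Sketch: scope any scout query to the tenant from the verified JWT.
// Throws rather than falling through when tenant_id is absent (FR-9.2).
function withTenantScope(tenantId, queryParams) {
  if (!tenantId) {
    throw new Error('TENANT_ISOLATION_VIOLATION: missing tenant_id');
  }
  return {
    ...queryParams,
    // Prepend the tenant filter; AND it with any caller-supplied filter
    filter_by: queryParams.filter_by
      ? `tenant_id:${tenantId} && (${queryParams.filter_by})`
      : `tenant_id:${tenantId}`,
  };
}
```

Because the wrapper sits between the router and every service client, a query can never reach a scout service without a tenant filter attached.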
FR-9.2 (Cross-Tenant Blocking): Queries attempting to access data across tenant boundaries must be rejected with an error:
{
error: 'TENANT_ISOLATION_VIOLATION',
message: 'Query cannot span multiple tenants',
tenant_id: 'tenant-123'
}
FR-9.3 (Audit Logging): All intent parsing, context retrieval, and data access must be logged to audit trail with:
- User ID
- Tenant ID
- Raw query
- Parsed intent
- Services accessed
- Results returned
- Timestamp
5.0 Unified API Contract
5.1. Single Unified Endpoint
Endpoint: POST /api/orchestrator/execute-intent
Purpose: Accept natural language or structured queries, route to appropriate pathway, and stream results.
5.2. Request Body
{
"query": "string | null",
"sessionId": "string | null",
"options": {
"forcePathway": "orchestration | search | null",
"enableStreaming": true,
"timeout": 10000
}
}
Fields:
- query (required): Natural language query or structured text
- sessionId (optional): Zep session ID for context enrichment
- options.forcePathway (optional): Override automatic routing for testing
- options.enableStreaming (optional): Enable WebSocket streaming (default: true)
- options.timeout (optional): Max execution time in ms (default: 10000)
5.3. Response (202 Accepted)
{
"intent_id": "intent-789",
"status": "processing",
"pathway": "orchestration | search",
"estimated_time_ms": 2000,
"websocket_channel": "intent_updates_intent-789"
}
5.4. WebSocket Events
Orchestration Path Events:
// Phase 1
{ "event": "PARSING_INTENT", "intent_id": "intent-789", "progress": 20 }
// Phase 2
{ "event": "ENRICHING_CONTEXT", "intent_id": "intent-789", "progress": 40, "context": {...} }
// Phase 3
{ "event": "SCOUTING_DATA", "intent_id": "intent-789", "progress": 75, "partial_results": {...} }
// Phase 4
{
"event": "WORKBENCH_READY",
"intent_id": "intent-789",
"progress": 100,
"data": {...},
"redirectTo": "/what-if-workbench?intent=intent-789",
"completeness": {...}
}
Search Path Events:
{ "event": "SEARCH_STARTED", "intent_id": "intent-789", "search_id": "search-123" }
{ "event": "PARTIAL_RESULTS", "intent_id": "intent-789", "results": [...], "source": "typesense" }
{ "event": "SEARCH_COMPLETE", "intent_id": "intent-789", "results": [...], "total_count": 50 }
5.5. Clarification Flow (Low Confidence)
When confidence < 0.7, the system responds with clarifying questions:
Response (200 OK):
{
"status": "needs_clarification",
"intent_id": "intent-789",
"originalQuery": "Explore shifting production...",
"questions": [
{
"id": "q1",
"category": "assumptions",
"question": "Are you considering any specific suppliers in Mexico?",
"priority": "high"
},
{
"id": "q2",
"category": "constraints",
"question": "What budget constraints apply to this shift?",
"priority": "medium"
}
],
"suggestedAnswers": {
"q1": ["ABC Materials Mexico", "XYZ Logistics Mexico", "Other (specify)"]
}
}
User Provides Answers:
POST /api/orchestrator/clarify-intent
{
"intent_id": "intent-789",
"answers": [
{ "question_id": "q1", "answer": "ABC Materials Mexico" },
{ "question_id": "q2", "answer": "Q4 budget freeze applies" }
]
}
System Re-parses with enriched context and continues orchestration.
6.0 Implementation Architecture
6.1. Service Layer Structure
backend/src/services/
├── orchestration/
│ ├── IntentParserService.js ← Phase 1: LLM-based parsing
│ ├── ContextEnricherService.js ← Phase 2: Zep integration
│ ├── IntentRouter.js ← Phase 3: Pathway routing
│ ├── MultiServiceScout.js ← Phase 3a: Parallel query executor
│ ├── WorkbenchPopulatorService.js ← Phase 4: Workspace assembly
│ └── IntentOrchestrator.js ← Main coordinator
├── search/
│ └── SearchOrchestrator.js ← Existing dual-engine search (v2.2)
├── ZepService.js ← M73: Zep Memory
├── CogneeService.js ← M72: Cognee GraphRAG
├── RAGService.js ← pgvector semantic search
└── llm/
├── LLMClient.js ← Provider abstraction
├── ProviderRegistry.js
└── providers/
├── GeminiProvider.js
├── OpenAIProvider.js ← NEW
└── AnthropicProvider.js ← NEW
6.2. LLM Provider Abstraction (Enhanced)
Configuration:
# .env
LLM_PROVIDER=gemini # Primary provider
LLM_PROVIDER_INTENT_PARSING=gemini # Override for intent parsing
LLM_PROVIDER_EMBEDDINGS=openai # Override for embeddings
LLM_PROVIDER_LONG_CONTEXT=anthropic # Override for long context
GEMINI_API_KEY=your-key
OPENAI_API_KEY=your-key
ANTHROPIC_API_KEY=your-key
Usage in IntentParserService:
import LLMClient from '../llm/LLMClient.js';
class IntentParserService {
async parseIntent(query) {
const provider = process.env.LLM_PROVIDER_INTENT_PARSING || 'gemini';
const result = await LLMClient.chatWithTools({
provider,
messages: [{ role: 'user', content: query }],
tools: [this.intentParsingSchema],
context: { operation: 'intent_parsing' }
});
return JSON.parse(result.content);
}
}
6.3. Database Schema Extensions
New Table: intent_log
CREATE TABLE intent_log (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
tenant_id UUID NOT NULL,
raw_query TEXT NOT NULL,
parsed_intent JSONB NOT NULL,
enriched_context JSONB,
scout_results JSONB,
pathway VARCHAR(50) NOT NULL, -- 'orchestration' or 'search'
action_taken VARCHAR(50) NOT NULL,
redirect_url TEXT,
confidence DECIMAL(3,2),
execution_time_ms INTEGER,
created_at TIMESTAMP DEFAULT NOW(),
CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id),
CONSTRAINT fk_tenant FOREIGN KEY (tenant_id) REFERENCES tenants(id)
);
CREATE INDEX idx_intent_log_user ON intent_log(user_id);
CREATE INDEX idx_intent_log_tenant ON intent_log(tenant_id);
CREATE INDEX idx_intent_log_action ON intent_log(action_taken);
CREATE INDEX idx_intent_log_created ON intent_log(created_at);
New Table: clarification_sessions
CREATE TABLE clarification_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
intent_id UUID NOT NULL REFERENCES intent_log(id),
questions JSONB NOT NULL,
answers JSONB,
status VARCHAR(50) DEFAULT 'pending', -- 'pending', 'answered', 'expired'
created_at TIMESTAMP DEFAULT NOW(),
answered_at TIMESTAMP
);
7.0 Phased Implementation Plan
Phase 1: LLM Provider Abstraction (1-2 weeks)
Week 1:
- ✅ Migrate AIGateway to use LLMClient instead of direct Gemini SDK
- ✅ Test all existing services (AIGateway, ScenarioStreamingEngine, DecisionExecutionService)
- ✅ Update environment configuration
- ✅ Create OpenAIProvider and AnthropicProvider implementations
Week 2:
- ✅ Add provider health checks and fallback logic
- ✅ Update documentation
- ✅ Performance benchmarking (Gemini vs. OpenAI vs. Anthropic)
Deliverables:
- Fully LLM-agnostic architecture
- Support for 3 providers (Gemini, OpenAI, Anthropic)
- Migration guide for existing services
Phase 2: Intent Parsing Foundation (2-3 weeks)
Week 3:
- ✅ Create IntentParserService with function calling schema
- ✅ Implement POST /api/orchestrator/parse-intent endpoint
- ✅ Add intent_log table and repository
- ✅ Unit tests for intent parsing accuracy
Week 4:
- ✅ Build clarification flow with SIE integration
- ✅ Implement low confidence handling (<0.7)
- ✅ Create clarification_sessions table
- ✅ Frontend: Clarification question modal
Week 5:
- ✅ Integration tests with real user queries
- ✅ Accuracy benchmarking (target: 90%+ correct routing)
- ✅ Error handling and fallback to search
Deliverables:
- Intent parsing with 90%+ accuracy
- Clarification flow for ambiguous queries
- Comprehensive test coverage
Phase 3: Context Enrichment & Routing (2 weeks)
Week 6:
- ✅ Create ContextEnricherService
- ✅ Integrate Zep (M73) for session context retrieval
- ✅ Implement context injection into scout queries
Week 7:
- ✅ Create IntentRouter with pathway logic
- ✅ Implement routing decision tree
- ✅ Add audit logging for all routing decisions
Deliverables:
- Context-aware query enrichment
- Intelligent routing between pathways
- Audit trail for all intent operations
Phase 4: Multi-Service Scout (3 weeks)
Week 8:
- ✅ Create MultiServiceScout coordinator
- ✅ Implement parallel query execution
- ✅ Add result normalization logic
Week 9:
- ✅ Build service-specific query builders:
- Cognee GraphRAG query builder
- pgvector semantic search query builder
- PostgreSQL factual query builder
- Typesense text query builder
Week 10:
- ✅ Implement graceful degradation (service failures)
- ✅ Add partial results streaming via WebSocket
- ✅ Performance optimization (query caching, connection pooling)
Deliverables:
- Multi-service scout with <2s P95 latency
- Graceful degradation on service failures
- Real-time partial results streaming
Phase 5: Workbench Population (2 weeks)
Week 11:
- ✅ Create WorkbenchPopulatorService
- ✅ Implement action-specific templates
- ✅ Build data validation and completeness checks
Week 12:
- ✅ Add AI vs. user-verified field tracking
- ✅ Implement warning system for incomplete data
- ✅ Integration tests with What-If Workbench
Deliverables:
- Pre-populated workbench templates
- Validation and completeness tracking
- Clear AI-generated vs. verified indicators
Phase 6: Frontend Integration (2 weeks)
Week 13:
- ✅ Create JudgmentBar component (natural language input)
- ✅ Build IntentProgressIndicator for phase tracking
- ✅ Implement WebSocket event handlers
Week 14:
- ✅ Update What-If Workbench to accept pre-populated data
- ✅ Add validation mode UI (editable sliders, confidence indicators)
- ✅ Build clarification question flow UI
Deliverables:
- Unified Judgment Bar interface
- Pre-populated workbenches with validation UI
- Real-time progress indicators
Phase 7: Testing & Optimization (2 weeks)
Week 15:
- ✅ End-to-end integration tests
- ✅ Performance benchmarking (target latencies)
- ✅ User acceptance testing (UAT)
Week 16:
- ✅ Security audit (tenant isolation, data privacy)
- ✅ Load testing (concurrent users, query volume)
- ✅ Documentation and training materials
Deliverables:
- Production-ready system
- Performance metrics meeting targets
- Comprehensive documentation
8.0 Non-Functional Requirements
8.1. Performance
| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Latency (P95) | <500ms | LLM API call + parsing |
| Context Enrichment Latency (P95) | <300ms | Zep API call |
| Scout Parallel Queries (P95) | <2000ms | All 4 services complete |
| Search Queries (P95) | <500ms | Maintain existing performance |
| Workbench Population (P95) | <1000ms | Data assembly + validation |
| End-to-End Orchestration (P95) | <4000ms | All phases complete |
8.2. Accuracy
| Metric | Target | Measurement |
|---|---|---|
| Intent Parsing Accuracy | >90% | Correct pathway routing |
| Clarification Trigger Rate | 5-10% | Low confidence queries |
| Data Completeness (Workbench) | >80% | Required fields populated |
| Tenant Isolation Violations | 0 | Audit log review |
8.3. Reliability
- Uptime: 99.9% availability for core orchestration services
- Graceful Degradation: System must function with up to 2 of 4 scout services down
- Fallback: Always fallback to search pathway if orchestration fails
- Data Consistency: Typesense sync lag <5 seconds (maintain existing requirement)
8.4. Security
- Tenant Isolation: Multi-layer enforcement (parser, enricher, scout, search)
- Audit Logging: All intents, queries, and data access logged
- Data Privacy: Zep context respects user privacy settings
- LLM Security: No PII sent to LLM providers (enforce redaction if needed)
8.5. Scalability
- Concurrent Users: Support 1000+ concurrent intent executions
- Query Volume: Handle 10,000+ intents/hour
- Scout Parallelization: Auto-scale based on query complexity
- WebSocket Connections: Support 5000+ active WebSocket sessions
9.0 Migration from v2.2 to v3.0
9.1. Backward Compatibility
Existing Search API Preserved:
POST /api/search ← Still supported for legacy clients
Gradual Migration Path:
- Deploy v3.0 alongside v2.2
- Route legacy requests to v2.2 Search Orchestrator
- New clients use v3.0 Intent Orchestrator
- Deprecation timeline: 6 months
9.2. Feature Flags
# .env
ENABLE_INTENT_ORCHESTRATION=true # Enable new system
ENABLE_LEGACY_SEARCH=true # Keep v2.2 available
INTENT_ORCHESTRATION_ROLLOUT=50 # Percentage of users on v3.0
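The percentage rollout implied by INTENT_ORCHESTRATION_ROLLOUT can be implemented by hashing the user ID into a stable bucket, so each user consistently lands on one pathway across sessions. A sketch; the specific hash is illustrative, and any stable hash works.

```javascript
// Sketch: stable percentage-rollout check for INTENT_ORCHESTRATION_ROLLOUT.
// Hashes the user ID into a bucket in [0, 100) so assignment is sticky.
function isOnIntentOrchestration(userId, rolloutPercent) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // unsigned 32-bit rolling hash
  }
  return hash % 100 < rolloutPercent;
}
```

Sticky assignment matters for the A/B test in 9.3: if a user flipped between v2.2 and v3.0 mid-session, the time-to-completion and satisfaction measurements would be confounded.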
9.3. A/B Testing
- Route 50% of users to v3.0 Intent Orchestration
- Route 50% of users to v2.2 Search
- Measure:
- Time to completion
- User satisfaction
- Query success rate
- Error rates
9.4. Rollback Plan
If v3.0 issues arise:
- Set ENABLE_INTENT_ORCHESTRATION=false
- Route all traffic to v2.2 Search Orchestrator
- Fix issues in staging environment
- Gradual re-rollout with reduced percentage
10.0 Success Criteria
10.1. Technical Metrics
- ✅ Intent parsing accuracy >90%
- ✅ Orchestration path latency <4s (P95)
- ✅ Search path latency <500ms (P95) - maintained
- ✅ Workbench data completeness >80%
- ✅ Zero tenant isolation violations
- ✅ System uptime 99.9%
10.2. Business Metrics
- ✅ 70% reduction in time from query to actionable workspace
- ✅ 80% reduction in manual data entry (field population)
- ✅ 90%+ user satisfaction with pre-populated workbenches
- ✅ 50% increase in scenario simulations per user
- ✅ 30% reduction in decision cycle time
10.3. User Experience Metrics
- ✅ Single unified interface (Judgment Bar) adoption >95%
- ✅ Clarification flow completion rate >80%
- ✅ Workbench pre-population usage >70%
- ✅ Search pathway still used for 20% of queries (expected)
11.0 Appendix
11.1. Example Intent Parsing Scenarios
Scenario 1: Production Shift (Orchestration)
Input: "Explore shifting production of SKU-X from China to Mexico"
Parsed Intent:
{
"action": "simulate_scenario",
"scenario_type": "move_production",
"entities": [
{ "type": "product", "value": "SKU-X" },
{ "type": "location", "value": "China", "role": "source" },
{ "type": "location", "value": "Mexico", "role": "target" }
],
"intent_type": "action",
"confidence": 0.95
}
Pathway: Orchestration → What-If Workbench
Scenario 2: Supplier Comparison (Orchestration)
Input: "Compare Supplier A vs Supplier B for aluminum sourcing"
Parsed Intent:
{
"action": "compare_entities",
"entities": [
{ "type": "supplier", "value": "Supplier A", "role": "comparison" },
{ "type": "supplier", "value": "Supplier B", "role": "comparison" },
{ "type": "product", "value": "aluminum" }
],
"intent_type": "action",
"confidence": 0.89
}
Pathway: Orchestration → Comparison Dashboard
Scenario 3: Historical Search (Search)
Input: "Find all decisions about Mexico from last quarter"
Parsed Intent:
{
"action": "search",
"search_type": "decision_history",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "time_range", "value": "last_quarter" }
],
"intent_type": "information_retrieval",
"confidence": 0.92
}
Pathway: Search → Decision Timeline
Scenario 4: Conversation Retrieval (Search)
Input: "Show me past conversations about PFAS compliance"
Parsed Intent:
{
"action": "retrieve_conversations",
"entities": [
{ "type": "topic", "value": "PFAS compliance" }
],
"intent_type": "information_retrieval",
"confidence": 0.88
}
Pathway: Search → Zep Conversation Timeline
Scenario 5: Ambiguous Query (Clarification)
Input: "Tell me about Mexico supplier options"
Parsed Intent:
{
"action": "unclear",
"entities": [
{ "type": "location", "value": "Mexico" },
{ "type": "supplier", "value": "unknown" }
],
"intent_type": "unclear",
"confidence": 0.62
}
Action: Trigger SIE Clarification
Questions:
- "Are you looking to compare existing Mexico suppliers?"
- "Do you want to simulate a scenario with a new Mexico supplier?"
- "Are you searching for past decisions about Mexico suppliers?"
11.2. Performance Benchmarks
Intent Parsing Latency:
- Gemini: 320ms (average), 450ms (P95)
- OpenAI GPT-4 Turbo: 280ms (average), 400ms (P95)
- Anthropic Claude 3.5 Sonnet: 350ms (average), 500ms (P95)
Scout Query Latency (Parallel):
- Cognee GraphRAG: 450ms (P95)
- pgvector Semantic: 180ms (P95)
- PostgreSQL Factual: 320ms (P95)
- Typesense Text: 45ms (P95)
- Total (Parallel): 480ms (P95) - limited by slowest service
End-to-End Orchestration:
- Intent Parsing: 450ms
- Context Enrichment: 280ms
- Scout Queries: 480ms
- Workbench Population: 350ms
- Total: ~1560ms (average), <4000ms (P95 with slower queries)
11.3. Glossary
- Judgment Bar: Unified natural language input interface
- Intent Orchestration: Process of parsing, enriching, and routing user intent
- Zero-Input Workflow: Pre-populated workspace requiring minimal user data entry
- Multi-Service Scout: Parallel query executor across 4+ data services
- Orchestration Pathway: Route to action-oriented workbenches (workbench population)
- Search Pathway: Route to information retrieval (results list)
- Context Enrichment: Process of adding session memory and constraints to queries
- LLM Provider Agnosticism: Architecture supporting multiple LLM providers (Gemini, OpenAI, Anthropic)
End of Document