Zep Memory Activation (M73)
1. Executive Summary
| Field | Description |
|---|---|
| Feature Name | Zep Memory Activation (Complete) |
| Milestone | M73 |
| Current Status | 40% → Target: 100% |
| Architectural Goal | Activate long-term memory for AI-powered decision support using Zep. Enable ChainAlign to remember user preferences, reasoning patterns, past decisions, and contextual facts across sessions. |
| System Impact | Transforms AI from stateless question-answering to context-aware strategic partner that learns user's decision-making style, preferred trade-offs, and organizational norms over time. |
| Target Users | All users (indirect), AI systems (direct consumer) |
| MVP Success Metric | AI can recall user's past decisions, preferred reasoning patterns, and organizational context in new Socratic dialogues. Memory persists across sessions. |
2. Strategic Context
Why Zep for Long-Term Memory?
| Capability | Without Zep (Current) | With Zep (Target) |
|---|---|---|
| Session Memory | ❌ Lost after conversation ends | ✅ Persistent across sessions |
| Fact Extraction | ❌ No automatic fact extraction | ✅ Auto-extracts key facts from conversations |
| User Preferences | ❌ User must re-state every time | ✅ Remembers preferences ("I always prioritize long-term margin") |
| Reasoning Patterns | ❌ No learning from past decisions | ✅ Learns which trade-offs user typically accepts |
| Organizational Context | ❌ Generic AI responses | ✅ Tailored to company's strategic priorities |
| Proactive Suggestions | ❌ Reactive only | ✅ "Based on your past decisions, you might want to consider..." |
Zep Use Cases in ChainAlign:
- Socratic Inquiry Engine: Personalize questions based on user's past reasoning patterns
- Decision History Feed: "You made a similar decision in Q2 2024, here's what you considered"
- Proactive Insights: "Your current decision conflicts with your stated preference for long-term margin"
- Adaptive Learning: Track which Socratic questions led to decision changes (feedback loop to M70)
3. Current State (40% Complete)
What's Already Built:
✅ ZepClient initialized in zepService.js
✅ ZEP_API_URL environment variable configured
✅ Basic client connection established
What's Missing (60% Gap):
❌ Session Management: No create/update/retrieve session logic
❌ Memory Storage: Not storing conversation messages in Zep
❌ Fact Extraction: Not using Zep's fact extraction (e.g., "User prefers suppliers with 99% uptime")
❌ Memory Retrieval: AI calls don't retrieve relevant memories from Zep
❌ User Profiles: No user-specific memory sessions
❌ Tenant Isolation: No tenant_id scoping for multi-tenancy
❌ Integration with AIManager: AIGateway/AIManager doesn't use Zep context
4. Zep Architecture Overview
Zep provides:
- Sessions: Containers for conversations (one per user or decision)
- Messages: Individual user/assistant messages within a session
- Memory: Auto-extracted facts and summaries from sessions
- Search: Semantic search across all sessions and facts
ChainAlign's Zep Structure:
ZepSession (per user + decision context)
├── session_id: "user-{userId}-decision-{decisionId}"
├── metadata: { tenant_id, user_id, decision_id, decision_type }
├── messages: [
│ { role: "user", content: "I'm concerned about supplier risk..." },
│ { role: "assistant", content: "What assumptions are we making?" },
│ { role: "user", content: "I always prioritize long-term margin over short-term cost" }
│ ]
└── facts: [
{ fact: "User prioritizes long-term margin", extracted_at: "2024-11-10" }
]
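The naming scheme above can be captured in a small helper so that every caller derives the same session ID and metadata keys. This is a minimal sketch: `buildSessionId` and `buildSessionMetadata` are hypothetical names, not part of zepService.js or the Zep SDK.

```javascript
// Hypothetical helpers (illustrative names, not part of the Zep SDK).

// Derive the session ID from the scheme "user-{userId}-decision-{decisionId}".
function buildSessionId(userId, decisionId) {
  if (userId == null || userId === "" || decisionId == null || decisionId === "") {
    throw new Error("userId and decisionId are required to build a session ID");
  }
  return `user-${userId}-decision-${decisionId}`;
}

// Build the metadata object stored alongside the session.
// tenant_id is mandatory because every later query filters on it.
function buildSessionMetadata({ tenantId, userId, decisionId, decisionType }) {
  if (tenantId == null || tenantId === "") {
    throw new Error("tenantId is required for tenant isolation");
  }
  return {
    tenant_id: tenantId,
    user_id: userId,
    decision_id: decisionId,
    decision_type: decisionType,
  };
}
```

Centralizing the ID format keeps getOrCreateSession() and captureHumanJudgment() from drifting apart if the scheme ever changes.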
5. Functional Requirements
FR-1: Session Management
| ID | Requirement | Details |
|---|---|---|
| FR-1.1 | Create Session | When a user starts a new decision, create a Zep session with session_id = "user-{userId}-decision-{decisionId}" and metadata: tenant_id, user_id, decision_id, decision_type |
| FR-1.2 | Retrieve Existing Session | If decision already has a session, retrieve it (don't duplicate) |
| FR-1.3 | Update Session Metadata | When decision status changes (Draft → In Review → Approved), update session metadata |
| FR-1.4 | Close Session | When decision is finalized, mark session as closed (metadata: status: "closed") |
| FR-1.5 | List User Sessions | Retrieve all sessions for a given user (for "past decisions" view) |
FR-2: Memory Storage
| ID | Requirement | Details |
|---|---|---|
| FR-2.1 | Store User Messages | When user answers Socratic questions, store each response as a message in Zep |
| FR-2.2 | Store AI Messages | Store Socratic questions generated by AI as assistant messages |
| FR-2.3 | Store Decision Outcomes | When decision outcome is recorded, add final message: "Decision outcome: Success. Lessons learned: ..." |
| FR-2.4 | Message Metadata | Each message includes: timestamp, message_id, role (user/assistant), content |
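The message shape in FR-2.4 and the outcome message in FR-2.3 could be produced by a pair of small builders. This is a sketch under assumptions: the function names are hypothetical, the caller supplies `messageId`, and storing the outcome under the `assistant` role is an assumption because the requirements do not name a role for it.

```javascript
// Hypothetical builders; FR-2.4 only allows the roles "user" and "assistant".
const ALLOWED_ROLES = new Set(["user", "assistant"]);

// Build one message with the metadata fields listed in FR-2.4.
function buildMessage({ role, content, messageId, now = new Date() }) {
  if (!ALLOWED_ROLES.has(role)) {
    throw new Error(`Unsupported role: ${role}`);
  }
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("content must be a non-empty string");
  }
  return {
    role,
    content,
    metadata: { message_id: messageId, timestamp: now.toISOString() },
  };
}

// Build the final outcome message described in FR-2.3.
// Assumption: the outcome is stored as an assistant message.
function buildOutcomeMessage({ outcome, lessons, messageId, now }) {
  return buildMessage({
    role: "assistant",
    content: `Decision outcome: ${outcome}. Lessons learned: ${lessons}`,
    messageId,
    now,
  });
}
```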
FR-3: Fact Extraction
| ID | Requirement | Details |
|---|---|---|
| FR-3.1 | Automatic Fact Extraction | Zep auto-extracts facts from user messages (e.g., "I always prioritize long-term margin" → fact: "User prioritizes long-term margin") |
| FR-3.2 | User Preference Facts | Extract: Preferred trade-offs, risk tolerance, strategic priorities |
| FR-3.3 | Organizational Context Facts | Extract: Company policies, supplier preferences, regulatory constraints |
| FR-3.4 | Decision Pattern Facts | Extract: Common decision types, recurring challenges |
| FR-3.5 | Fact Retrieval | API to retrieve all facts for a user or session |
FR-4: Memory Retrieval for AI Context
| ID | Requirement | Details |
|---|---|---|
| FR-4.1 | Retrieve User Memory | Before generating Socratic questions, retrieve relevant facts from user's past sessions |
| FR-4.2 | Retrieve Similar Decisions | Search Zep for past decisions with similar context (semantic search across sessions) |
| FR-4.3 | Inject Memory into Prompts | Add retrieved facts to LLM prompt context: "USER PREFERENCES: {facts from Zep}" "PAST SIMILAR DECISIONS: {similar sessions}" |
| FR-4.4 | Memory-Aware Question Generation | Socratic questions should reference user's past patterns: "In Q2 2024, you prioritized long-term margin. Does that still apply here?" |
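The prompt injection in FR-4.3 can be sketched as a pure function that turns retrieved facts and similar decisions into the two labeled blocks. `buildMemoryContext` is a hypothetical name, and the input shapes (`{ fact }` and `{ content }`) are assumptions based on the examples in this document, not a confirmed Zep response format.

```javascript
// Hypothetical prompt-context builder for FR-4.3.
// Assumes facts look like { fact: "..." } and similar decisions like { content: "..." }.
function buildMemoryContext(facts, similarDecisions, { maxFacts = 10, maxSnippetLength = 200 } = {}) {
  const sections = [];

  // Cap the number of facts so the prompt stays bounded.
  const factLines = facts.slice(0, maxFacts).map((f) => `- ${f.fact}`);
  if (factLines.length > 0) {
    sections.push(`USER PREFERENCES:\n${factLines.join("\n")}`);
  }

  // Truncate long snippets and mark the truncation with "...".
  const decisionLines = similarDecisions.map((d) => {
    const text = String(d.content || "");
    const snippet = text.length > maxSnippetLength ? `${text.slice(0, maxSnippetLength)}...` : text;
    return `- ${snippet}`;
  });
  if (decisionLines.length > 0) {
    sections.push(`PAST SIMILAR DECISIONS:\n${decisionLines.join("\n")}`);
  }

  // An empty string means there is no memory to inject.
  return sections.join("\n\n");
}
```

Omitting empty sections avoids sending the LLM a header with nothing under it, which matters for new users who have no stored facts yet.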
FR-5: Adaptive Learning (Feedback Loop)
| ID | Requirement | Details |
|---|---|---|
| FR-5.1 | Track Question Impact | When user marks a Socratic question as "highly impactful," store in Zep as fact |
| FR-5.2 | Track Decision Changes | If user changes mind after answering a question, store pattern: "Question type X led to decision change" |
| FR-5.3 | Learn Trade-Off Preferences | Over time, extract which trade-offs user consistently accepts/rejects |
| FR-5.4 | Proactive Alerts | If current decision conflicts with past preferences, alert user: "This decision sacrifices long-term margin, which you typically prioritize" |
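One way to approach FR-5.3 is a simple tally over recorded trade-off decisions. The sketch below is illustrative only: the event shape (`{ tradeOff, accepted }`), the function name, and the thresholds are all assumptions, and the actual mechanism (for example, relying on Zep's own fact extraction) may differ.

```javascript
// Hypothetical tally for FR-5.3. Each event is assumed to look like
// { tradeOff: "long-term margin over short-term cost", accepted: true }.
function summarizeTradeOffs(events, { minCount = 3, minRatio = 0.8 } = {}) {
  const tallies = new Map();
  for (const { tradeOff, accepted } of events) {
    const tally = tallies.get(tradeOff) || { accepted: 0, total: 0 };
    tally.total += 1;
    if (accepted) tally.accepted += 1;
    tallies.set(tradeOff, tally);
  }

  const patterns = [];
  for (const [tradeOff, { accepted, total }] of tallies) {
    if (total < minCount) continue; // too few observations to call it a pattern
    const acceptRatio = accepted / total;
    if (acceptRatio >= minRatio) {
      patterns.push({ tradeOff, tendency: "accepts", ratio: acceptRatio });
    } else if (1 - acceptRatio >= minRatio) {
      patterns.push({ tradeOff, tendency: "rejects", ratio: 1 - acceptRatio });
    }
  }
  return patterns;
}
```

A pattern with tendency "accepts" could then feed the proactive alert in FR-5.4 when a new decision rejects the same trade-off.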
FR-6: Tenant Isolation & Security
| ID | Requirement | Details |
|---|---|---|
| FR-6.1 | Tenant-Scoped Sessions | All sessions include tenant_id in metadata |
| FR-6.2 | Tenant-Scoped Search | Memory retrieval always filters by tenant_id (prevent cross-tenant leakage) |
| FR-6.3 | User-Scoped Memory | Users only access their own memory (RBAC enforcement) |
| FR-6.4 | Admin Access | Tenant admins can view aggregated facts (not raw messages) for compliance audits |
6. Technical Architecture
6.1. Zep Service Enhancements
Enhance zepService.js
// backend/src/services/zepService.js
import { ZepClient } from "@getzep/zep-js";
import * as appLogger from './appLogger.js';

const ZEP_API_URL = process.env.ZEP_API_URL || "http://localhost:8000";

class ZepService {
  constructor() {
    this.client = new ZepClient(ZEP_API_URL);
    appLogger.info(`ZepService initialized with API URL: ${ZEP_API_URL}`);
  }

  /**
   * Create or retrieve a session for a user + decision.
   */
  async getOrCreateSession(userId, decisionId, tenantId, metadata = {}) {
    const sessionId = `user-${userId}-decision-${decisionId}`;
    try {
      // Try to retrieve existing session
      const session = await this.client.memory.getSession(sessionId);
      appLogger.info(`Retrieved existing Zep session: ${sessionId}`);
      return session;
    } catch (error) {
      // Session doesn't exist, create it
      if (error.status === 404) {
        const newSession = await this.client.memory.addSession({
          session_id: sessionId,
          metadata: {
            tenant_id: tenantId,
            user_id: userId,
            decision_id: decisionId,
            status: "active",
            ...metadata
          }
        });
        appLogger.info(`Created new Zep session: ${sessionId}`);
        return newSession;
      }
      throw error;
    }
  }

  /**
   * Add a message to a session (user or assistant).
   */
  async addMessage(sessionId, role, content, metadata = {}) {
    await this.client.memory.addMemory(sessionId, {
      messages: [{
        role,
        content,
        metadata: {
          timestamp: new Date().toISOString(),
          ...metadata
        }
      }]
    });
    appLogger.info(`Added ${role} message to Zep session ${sessionId}`);
  }

  /**
   * Retrieve all messages from a session.
   */
  async getSessionMessages(sessionId) {
    const memory = await this.client.memory.getMemory(sessionId);
    return memory.messages || [];
  }

  /**
   * Retrieve extracted facts from a session.
   */
  async getSessionFacts(sessionId) {
    const memory = await this.client.memory.getMemory(sessionId);
    return memory.facts || [];
  }

  /**
   * Search across all user sessions for relevant context.
   */
  async searchUserMemory(userId, tenantId, query, limit = 5) {
    const results = await this.client.memory.searchSessions(query, {
      metadata: {
        user_id: userId,
        tenant_id: tenantId
      },
      limit
    });
    return results;
  }

  /**
   * Retrieve all facts for a user (across all sessions).
   */
  async getUserFacts(userId, tenantId) {
    // Search all sessions for this user
    const sessions = await this.client.memory.listSessions({
      metadata: {
        user_id: userId,
        tenant_id: tenantId
      }
    });
    // Aggregate facts from all sessions
    const allFacts = [];
    for (const session of sessions) {
      const facts = await this.getSessionFacts(session.session_id);
      allFacts.push(...facts);
    }
    return allFacts;
  }

  /**
   * Update session metadata (e.g., when decision status changes).
   */
  async updateSessionMetadata(sessionId, metadata) {
    await this.client.memory.updateSession(sessionId, { metadata });
    appLogger.info(`Updated metadata for Zep session ${sessionId}`);
  }

  /**
   * Close a session (when decision is finalized).
   */
  async closeSession(sessionId) {
    await this.updateSessionMetadata(sessionId, { status: "closed" });
    appLogger.info(`Closed Zep session ${sessionId}`);
  }
}

export default new ZepService();
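Because getUserFacts() concatenates facts from every session, the same preference may appear several times once a user has repeated it across decisions. A deduplication pass could run before the facts reach the prompt. This is a sketch: `dedupeFacts` is a hypothetical helper, and it assumes each fact has the `{ fact: "..." }` shape shown in section 4.

```javascript
// Hypothetical helper; assumes each fact looks like { fact: "..." } as in section 4.
function dedupeFacts(facts) {
  const seen = new Set();
  const unique = [];
  for (const item of facts) {
    // Skip malformed entries rather than failing the whole prompt build.
    if (!item || typeof item.fact !== "string") continue;
    // Normalize case and whitespace so trivially different copies collapse.
    const key = item.fact.trim().toLowerCase().replace(/\s+/g, " ");
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    unique.push(item); // keep the first occurrence unchanged
  }
  return unique;
}
```

This only removes textual duplicates. Facts that say the same thing in different words would need semantic comparison, which is outside this sketch.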
6.2. Integration with AIManager/AIGateway
Enhance AIManager.js to use Zep context
// In AIManager.callLLM() or AIGateway.callLLM()
async function callLLM({ prompt, tenantId, user, queryContext, sessionId, decisionId }) {
  let enrichedPrompt = prompt;

  // Retrieve Zep memory when the caller supplies a user plus a session or decision.
  // Facts and search are scoped by user and tenant, so a user is required.
  if (user && (sessionId || decisionId)) {
    try {
      // Get user facts from Zep
      const userFacts = await ZepService.getUserFacts(user.id, tenantId);
      // Get similar past decisions
      const similarDecisions = await ZepService.searchUserMemory(user.id, tenantId, prompt, 3);
      // Enrich prompt with Zep context
      const zepContext = `
USER PREFERENCES (from past decisions):
${userFacts.map(f => `- ${f.fact}`).join('\n')}
SIMILAR PAST DECISIONS:
${similarDecisions.map(d => `- ${d.content.substring(0, 200)}...`).join('\n')}
`;
      enrichedPrompt = `${zepContext}\n\n${prompt}`;
      appLogger.info(`Enriched prompt with Zep context for user ${user.id}`);
    } catch (error) {
      appLogger.warn(`Failed to retrieve Zep context: ${error.message}`);
      // Continue without Zep context (graceful degradation)
    }
  }

  // Call LLM with enriched prompt
  const response = await geminiModel.generateContent(enrichedPrompt);
  return response.response.text();
}
6.3. Integration with Socratic Inquiry Engine
Enhance SocraticInquiryService.js
// In SocraticInquiryService.generateQuestions()
async generateQuestions(decisionContext, tenantId, userId, decisionId) {
  // Create or retrieve Zep session
  const session = await ZepService.getOrCreateSession(userId, decisionId, tenantId, {
    decision_type: decisionContext.decisionType
  });

  // Retrieve GraphRAG context (from M72)
  const strategicContext = await CogneeService.retrieveStrategicContext();
  const historicalContext = await CogneeService.retrieveHistoricalFailures(decisionContext.decisionType);
  const constraintsContext = await CogneeService.retrieveConstraints(decisionContext.scope);

  // Retrieve Zep memory (user preferences and past patterns)
  const userFacts = await ZepService.getUserFacts(userId, tenantId);
  const similarDecisions = await ZepService.searchUserMemory(userId, tenantId, decisionContext.description, 2);

  // Build prompt with all context (GraphRAG + Zep)
  const fullPrompt = buildSocraticPrompt({
    decisionContext,
    strategicContext,
    historicalContext,
    constraintsContext,
    userPreferences: userFacts,
    pastSimilarDecisions: similarDecisions
  });

  // Generate questions via LLM
  const rawResponse = await AIManager.callLLM({
    prompt: fullPrompt,
    tenantId,
    user: { id: userId },
    decisionId,
    queryContext: 'socratic_question_generation'
  });

  // callLLM() returns raw text. Assumption: the prompt instructs the model to
  // reply with a JSON array of { text, category } objects.
  const questions = JSON.parse(rawResponse);

  // Store questions in Zep as assistant messages
  for (const q of questions) {
    await ZepService.addMessage(session.session_id, 'assistant', q.text, { category: q.category });
  }
  return questions;
}
Enhance captureHumanJudgment()
async captureHumanJudgment(questionId, userResponse, userId, decisionId, tenantId) {
  // Store in database (CAB)
  const judgment = await HumanJudgmentRepository.create({
    question_id: questionId,
    user_id: userId,
    decision_id: decisionId,
    response_text: userResponse,
    tenant_id: tenantId
  });

  // Store in Zep as user message
  const sessionId = `user-${userId}-decision-${decisionId}`;
  await ZepService.addMessage(sessionId, 'user', userResponse, {
    question_id: questionId
  });

  // Zep will auto-extract facts from userResponse (e.g., preferences, priorities)
  return judgment;
}
7. Performance Requirements
| Metric | Target | Rationale |
|---|---|---|
| Session Creation | < 500ms | Fast session initialization |
| Message Storage | < 200ms | Real-time message logging |
| Fact Retrieval | < 1 second | Memory-enriched prompts must not delay AI responses |
| Memory Search | < 2 seconds | Acceptable for similarity search across sessions |
| Session List | < 1 second | For user's past decisions view |
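The latency targets above pair naturally with the graceful-degradation approach in section 6.2: if a Zep call exceeds its budget, the AI call can proceed without memory. The sketch below shows one way to enforce a budget with `Promise.race`. `withTimeout` is a hypothetical helper, and the fallback values are for the caller to choose.

```javascript
// Hypothetical helper: resolve with `fallback` if `promise` takes longer than `ms`.
// Rejections from `promise` still propagate, so callers keep their try/catch.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

For example, a caller could wrap fact retrieval as `withTimeout(ZepService.getUserFacts(userId, tenantId), 1000, [])` so that a slow lookup yields an empty fact list rather than delaying the response. The underlying request is not cancelled, only ignored.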
8. Security & Compliance
8.1. Tenant Isolation
Critical: All Zep operations must filter by tenant_id to prevent data leakage.
Enforcement:
// In ZepService.searchUserMemory()
async searchUserMemory(userId, tenantId, query, limit = 5) {
  const results = await this.client.memory.searchSessions(query, {
    metadata: {
      user_id: userId,
      tenant_id: tenantId // ALWAYS filter by tenant_id
    },
    limit
  });
  return results;
}
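As a defense-in-depth measure, the tenant filter could be built in one place and the results re-checked after the query returns. Both helpers below are hypothetical, and the assumption that each result carries `metadata.tenant_id` would need to be confirmed against the actual Zep response shape.

```javascript
// Hypothetical guard: refuse to build a query filter without a tenant ID.
function buildTenantScopedFilter({ userId, tenantId }) {
  if (tenantId == null || tenantId === "") {
    throw new Error("tenantId is required for every Zep query");
  }
  const filter = { tenant_id: tenantId };
  if (userId != null) filter.user_id = userId;
  return filter;
}

// Hypothetical post-filter: drop any result that does not belong to the tenant.
// Assumes each result exposes metadata.tenant_id.
function dropForeignTenantResults(results, tenantId) {
  return results.filter(
    (r) => r != null && r.metadata != null && r.metadata.tenant_id === tenantId
  );
}
```

Failing closed on a missing tenant ID turns a silent cross-tenant query into an immediate error that tests can catch.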
8.2. RBAC for Memory Access
| Role | Access Level |
|---|---|
| User | Can view only their own sessions and facts |
| Planner | Can view their own sessions, plus aggregated facts for their domain (no raw messages) |
| Executive | Can view aggregated facts across all users (for trend analysis), no raw messages |
| Admin | Can view aggregated facts for compliance audits, cannot view raw user messages without consent |
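The table above could be encoded as a small policy lookup. This is a sketch, not the enforcement mechanism: the function names and scope labels are hypothetical, and it assumes every role may read raw messages from their own sessions, which the table states only for User and Planner.

```javascript
// Hypothetical encoding of the RBAC table.
// Scope of aggregated facts each role may view.
const AGGREGATED_SCOPE = new Map([
  ["user", "none"],
  ["planner", "domain"],
  ["executive", "tenant"],
  ["admin", "tenant"],
]);

function aggregatedFactScope(role) {
  if (!AGGREGATED_SCOPE.has(role)) throw new Error(`Unknown role: ${role}`);
  return AGGREGATED_SCOPE.get(role);
}

// Raw messages: owners can read their own (assumption for executive and admin);
// only admins can read another user's, and only with that user's consent.
function canViewRawMessages(role, { isOwner = false, hasConsent = false } = {}) {
  if (!AGGREGATED_SCOPE.has(role)) throw new Error(`Unknown role: ${role}`);
  if (isOwner) return true;
  return role === "admin" && hasConsent;
}
```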
8.3. Data Retention
- Zep sessions retained for 7 years (regulatory compliance)
- Facts retained indefinitely (proprietary learning asset)
- Raw messages can be deleted after 7 years (with user consent)
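The retention rules above reduce to a date comparison plus a consent check. The sketch below is a hypothetical helper that measures the 7-year window from session creation in UTC. Whether retention starts at creation or at session close is not specified here and is an assumption.

```javascript
// Hypothetical retention check; assumes the 7-year window starts at session creation.
const RETENTION_YEARS = 7;

// Date on which the retention period ends for a session created at `createdAt`.
function retentionEndsAt(createdAt) {
  const end = new Date(createdAt.getTime());
  end.setUTCFullYear(end.getUTCFullYear() + RETENTION_YEARS);
  return end;
}

// Raw messages may be deleted only after the retention period AND with user consent.
function canDeleteRawMessages(createdAt, { now = new Date(), hasUserConsent = false } = {}) {
  return hasUserConsent && now.getTime() >= retentionEndsAt(createdAt).getTime();
}
```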
9. Testing Requirements
9.1. Unit Tests (80% coverage)
- ZepService.getOrCreateSession() creates a session if none exists
- ZepService.addMessage() stores the message correctly
- ZepService.getUserFacts() aggregates facts across sessions
- ZepService.searchUserMemory() filters by tenant_id
9.2. Integration Tests
- E2E: Create decision → Generate Socratic questions → Capture judgment → Verify stored in Zep → Retrieve facts → Verify facts extracted
- E2E: Create second decision → Generate questions → Verify questions reference facts from first decision
- Tenant isolation: User A cannot retrieve User B's memory (different tenant)
9.3. Performance Tests
- Load test: 100 concurrent session creations (target: < 500ms each)
- Memory retrieval with 1,000 facts (target: < 1s)
10. Migration Strategy
10.1. Backfill Historical Decisions
Phase 1: Identify Backfill Candidates (Week 1)
- Query decisions table for completed decisions with human_judgments
- Identify users with ≥ 3 decisions (enough to extract patterns)
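Once the query has returned rows, candidate selection amounts to grouping by user and counting distinct decisions. A minimal sketch, assuming each row has `user_id` and `decision_id` fields (`findBackfillCandidates` is a hypothetical name):

```javascript
// Hypothetical selector for Phase 1; assumes rows look like { user_id, decision_id }.
function findBackfillCandidates(rows, minDecisions = 3) {
  // Group distinct decision IDs per user (a join on human_judgments can repeat rows).
  const decisionsByUser = new Map();
  for (const { user_id, decision_id } of rows) {
    if (!decisionsByUser.has(user_id)) decisionsByUser.set(user_id, new Set());
    decisionsByUser.get(user_id).add(decision_id);
  }
  // Keep only users with enough decisions to extract patterns from.
  return [...decisionsByUser]
    .filter(([, decisions]) => decisions.size >= minDecisions)
    .map(([userId, decisions]) => ({ userId, decisionIds: [...decisions] }));
}
```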
Phase 2: Create Zep Sessions (Week 2)
- For each historical decision:
- Create Zep session with metadata
- Add Socratic questions as assistant messages
- Add human judgments as user messages
- Mark session as closed
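The per-decision steps above can be sketched as a pure transformation from stored questions and judgments into an ordered message list. The field names `question_id` and `response_text` follow captureHumanJudgment() in section 6.3, while `q.id`, `q.text`, and the function name are assumptions.

```javascript
// Hypothetical Phase 2 transformer: pair each Socratic question with its answers.
// Assumes questions look like { id, text } and judgments like { question_id, response_text }.
function buildBackfillMessages(questions, judgments) {
  // Index judgments by the question they answer.
  const answersByQuestion = new Map();
  for (const j of judgments) {
    if (!answersByQuestion.has(j.question_id)) answersByQuestion.set(j.question_id, []);
    answersByQuestion.get(j.question_id).push(j);
  }

  // Emit each question as an assistant message, followed by its user responses.
  const messages = [];
  for (const q of questions) {
    messages.push({ role: "assistant", content: q.text, metadata: { question_id: q.id } });
    for (const j of answersByQuestion.get(q.id) || []) {
      messages.push({ role: "user", content: j.response_text, metadata: { question_id: q.id } });
    }
  }
  return messages;
}
```

Keeping question and answer adjacent preserves the conversational order that fact extraction sees in live sessions.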
Phase 3: Fact Extraction (Week 3)
- Zep auto-extracts facts from backfilled messages
- Manually review and validate extracted facts for top 10 users
10.2. Dual-Mode Operation
- Legacy Mode: AI calls without Zep context (current behavior)
- Zep-Enabled Mode: AI calls enriched with Zep memory
- Feature flag: ENABLE_ZEP_MEMORY (default: false)
10.3. Rollout Plan
Week 1: Deploy enhanced ZepService, no user-facing changes
Week 2: Enable Zep memory storage for new decisions (passive mode)
Week 3: Enable Zep context retrieval for 10% of AI calls (A/B test)
Week 4: Enable for 100% of AI calls, backfill historical decisions
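The feature flag and the staged percentages could be combined in a single gate. The sketch below buckets by user ID with an FNV-1a-style hash so that a given user gets the same experience on every call. Bucketing per user rather than per call is an assumption about how the A/B test would be run, and `isZepEnabledFor` is a hypothetical name.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units; used only for stable bucketing.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hypothetical gate: the flag must be on AND the user must fall inside the rollout bucket.
// Pass process.env as `env` in production; tests can pass a plain object.
function isZepEnabledFor(userId, { env = {}, rolloutPercent = 100 } = {}) {
  if (env.ENABLE_ZEP_MEMORY !== "true") return false; // default: disabled
  const bucket = fnv1a(String(userId)) % 100; // stable value in 0..99
  return bucket < rolloutPercent;
}
```

In Week 3 the caller would pass `rolloutPercent: 10`, and in Week 4 `rolloutPercent: 100`.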
11. Success Criteria
M73 is complete when:
✅ Zep sessions created for all new decisions
✅ User messages and AI responses stored in Zep
✅ Facts auto-extracted from conversations
✅ AI prompts enriched with Zep context (user preferences, past decisions)
✅ Socratic questions reference user's past patterns
✅ Tenant isolation verified (no cross-tenant leakage)
✅ Unit test coverage ≥ 80%
✅ Integration tests passing
✅ Performance targets met (< 1s fact retrieval)
✅ At least 50 historical decisions backfilled
12. Dependencies
| Dependency | Status | Required For |
|---|---|---|
| Zep Client Initialization | ✅ 40% Complete | Foundation |
| M70 (Socratic Inquiry Engine) | 🔄 In Progress | Primary consumer of Zep memory |
| M71 (Decision History Feed) | 🔄 In Progress | Displays past sessions from Zep |
| M72 (Cognee GraphRAG) | 🔄 In Progress | Complementary context source (GraphRAG + Zep = full context) |
13. Future Enhancements (Post-M73)
Phase 2 Features:
- Memory Dashboards: User can see their extracted facts and preferences
- Fact Editing: User can manually add/edit facts ("I prefer suppliers with 99% uptime")
- Shared Organizational Memory: Tenant-wide facts (not user-specific)
- Memory Export: Download Zep memory as JSON for portability
- Cross-Decision Pattern Detection: "You've made 5 decisions prioritizing cost over quality in Q4 2024"
Document Status: Complete FSD
Last Updated: 2025-11-14
Milestone: M73 - Zep Memory Activation (40% → 100%)
Estimated Implementation: 2 weeks (1 backend developer)