Skip to main content

ChainAlign Demo Tenant Seeding Plan

Status: Draft Last Updated: 2025-10-27 Owner: CloudAlign Demo Preparation


Executive Summary

This document outlines the comprehensive plan for seeding demonstration data across four distinct ChainAlign tenant scenarios. Each scenario targets a specific business process and industry vertical:

  1. Discrete Manufacturing (Electronics/Tech) - TechElectronics Tenant
  2. Process Manufacturing (Pharma/APIs) - PharmaCo Tenant
  3. Retail Newsvendor (Multi-location retail) - MegaRetail Tenant (with VN2 dataset)
  4. Financial/Agricultural (Commodity Supply) - AgroScience Tenant

Key Principles:

  • Data is seeded once and never regenerated via script
  • All data follows daily granularity (required for ML training)
  • Firebase email/password auth enables multiple demo users without Google accounts
  • Each tenant is completely isolated with its own admin user
  • All data represents realistic, production-like scenarios

Database Audit Summary

Current State (as of 2025-10-27)

AspectStatus
Tenants Created8 total (3 system, 2 dev, 3 demo-candidate)
Demo-ready TenantsMegaRetail (f9b46ebc-97f7-4fd4-818c-f45a6deeb587)
Test TenantTest Tenant (f47ac10b-58cc-4372-a567-0e02b2c3d479) - 1 location, 1 product, 0 users
VN2 Dataset Files✓ Available at docs/reference-docs/VN2-competition-datasets/
VN2 CSV FilesMaster.csv, Sales.csv (157 weeks), In Stock.csv
User Accounts0 in any demo tenant (all need creation)
Actuals Data0 records in any demo tenant
Sales History0 records in any demo tenant

Key Files & Scripts

  • VN2 Import Script: backend/scripts/import_vn2_dataset.js - Imports 67 stores, 297 products, 599 store-product combinations, 157 weeks of sales
  • VN2 Migration: backend/migrations/20251016000007_create_vn2_simulation_tables.cjs
  • Policy Tables: backend/migrations/20251027000001-3_create_policy_*_table.cjs (events, routing, compliance)

Demo Scenario 1: Discrete Manufacturing (TechElectronics)

Scenario Overview

Electronics manufacturing company sourcing components from multiple suppliers across Asia and Mexico. Faces tariff volatility, supply chain disruptions, and seasonal demand spikes for consumer electronics.

Tenant Configuration

Company Name: TechElectronics
Tenant ID: (to be created)
Industry: Electronics Manufacturing
Key Products: Semiconductors, PCBs, Power supplies
Locations: 4 (US manufacturing plants + regional warehouse)
Time Horizon: 5 years (2020-2025) daily data
Demo Focus: Tariff impact on component sourcing

Demo Data Requirements

1. Locations (4 stores)

  • US Manufacturing Plant 1 (Houston, TX) - Receiving/Assembly
  • US Manufacturing Plant 2 (Phoenix, AZ) - Testing/QA
  • Regional Warehouse (Chicago, IL) - Distribution hub
  • Supplier Hub (Singapore) - Transhipment point

Data: Location configurations, capacities, operational hours

2. Products (10 SKUs)

- SEMI-001: Microcontroller Unit (MCU) - sourced from Taiwan/Singapore
- SEMI-002: Power Management IC - sourced from Japan
- PCB-001: FR-4 PCB Assembly - sourced from China/Vietnam
- SUPPLY-001: AC/DC Power Supply - sourced from Mexico/China
- SUPPLY-002: Switching Power Supply - sourced from Malaysia
- CONNECTOR-001: High-density Connector - sourced from Japan
- CRYSTAL-001: Oscillator Crystal - sourced from China
- RESISTOR-001: Precision Resistor Kit - sourced from Taiwan
- CAPACITOR-001: High-cap Capacitor - sourced from Vietnam
- SENSOR-001: Sensor Module - sourced from South Korea

3. Actuals Data (5 years daily = ~1,825 records per SKU)

  • Date Range: 2020-01-01 to 2025-10-27 (daily granularity)
  • Metrics per SKU per day:
    • demand_actual: Daily demand in units (realistic ranges per product)
    • shipments_actual: Incoming shipments
    • production_actual: Production output
    • inventory_actual: End-of-day inventory

Sample Pattern (SEMI-001):

  • Baseline: 500 units/day
  • Seasonality: ±40% variation (summer surge for consumer demand)
  • Disruptions:
    • April 2021: Taiwan lockdown → 20-day zero supply
    • May 2022: Shanghai lockdown → 40-day 50% reduction
    • March 2023: Tariff shock → demand spike 300 units/day for 30 days

4. Policy Events (6-8 major events)

1. US-China Trade Tensions (Jan 2020)
- Type: EXTERNAL (TARIFF_CHANGE)
- Impact: 15% on PCBs, semiconductors
- Affected: CN products

2. COVID-19 Supply Shock (Mar 2020)
- Type: EXTERNAL (SANCTIONS/SUPPLY_DISRUPTION)
- Impact: 60% reduction for 8 weeks
- Affected: All suppliers

3. Taiwan Drought Impact (Apr 2021)
- Type: EXTERNAL (SUPPLY_DISRUPTION)
- Impact: SEMI-001, SEMI-002 production stops
- Duration: 20 days

4. Shanghai Lockdown (Apr-May 2022)
- Type: EXTERNAL (SUPPLY_DISRUPTION)
- Impact: 50% reduction PCB production
- Affected: PCB-001, FR-4 suppliers

5. CHIPS Act Incentives (Aug 2022)
- Type: INTERNAL (STRATEGIC_DECISION)
- Decision: Nearshore 40% of US production to Mexico
- Impact: Shift from Asia to Mexico sourcing

6. Uyghur Forced Labor Prevention (Jan 2023)
- Type: REGULATORY (COMPLIANCE_REQUIREMENT)
- Impact: Audit all supply chain
- Affected: All China suppliers

7. US Tariff on Semiconductors (Mar 2023)
- Type: EXTERNAL (TARIFF_CHANGE)
- Impact: 25% on SEMI products from Taiwan/Singapore
- Affected: SEMI-001, SEMI-002

8. Supply Chain Diversification Initiative (Jun 2024)
- Type: INTERNAL (STRATEGIC_DECISION)
- Decision: Add Vietnam supplier for PCBs
- Impact: 30% shift from China to Vietnam

5. Compliance Records

  • RoHS compliance tracking (all products)
  • Conflict minerals tracking (semiconductors)
  • Supply chain audit records

Demo Scenario 2: Process Manufacturing (PharmaCo)

Scenario Overview

Pharmaceutical API (Active Pharmaceutical Ingredient) manufacturer producing critical acetaminophen, ibuprofen, and specialty APIs for US and EU markets. Highly regulated with strict compliance requirements.

Tenant Configuration

Company Name: PharmaCo
Tenant ID: (to be created)
Industry: Pharmaceutical Manufacturing
Key Products: APIs (Acetaminophen, Ibuprofen, Specialty compounds)
Locations: 3 (Manufacturing facilities + quality control lab)
Time Horizon: 5 years (2020-2025) daily data
Demo Focus: Compliance tracking, regulatory policy impact

Demo Data Requirements

1. Locations (3 facilities)

  • Primary Manufacturing (New Jersey) - Production facility
  • Quality Assurance (New Jersey) - Testing/QC
  • Finished Goods Warehouse (New Jersey) - Pharma-grade storage

Specs: Temperature-controlled, hygiene protocols, capacity tracking

2. Products (5 APIs)

- API-ACETAMINOPHEN: Acetaminophen API
- Annual volume: 500 tons
- Suppliers: China (70%), India (30%)
- Regulatory: FDA approved, USP grade

- API-IBUPROFEN: Ibuprofen API
- Annual volume: 300 tons
- Suppliers: China (100%)
- Regulatory: FDA approved, EP grade

- API-SPECIALTY-001: Custom API for rare disease
- Annual volume: 50 tons
- Suppliers: Europe (80%), India (20%)
- Regulatory: FDA approved, custom synthesis

- API-EXCIPIENT-001: Excipient bulk chemical
- Annual volume: 1000 tons
- Suppliers: Multiple
- Regulatory: USP/EP grade

- API-METABOLITE-001: Metabolite compound
- Annual volume: 10 tons
- Suppliers: Contract manufacturers
- Regulatory: Research/development

3. Actuals Data (5 years daily = ~1,825 records per API)

  • Date Range: 2020-01-01 to 2025-10-27
  • Metrics per API per day:
    • demand_actual: API units (kg/tons)
    • production_actual: Manufacturing output
    • inventory_actual: Pharma-grade storage inventory
    • in_stock: Binary (sufficient stock for regulatory compliance)

Realistic Pattern (API-ACETAMINOPHEN):

  • Baseline: 1.37 tons/day (500 tons/year)
  • Seasonality: ±30% (winter flu season surge)
  • Disruptions:
    • Feb 2021: India supplier lockdown → 30-day 50% shortage
    • Oct 2022: China export restrictions → Regulatory compliance risk
    • Jan 2024: Supply realignment → Ramp up India supplier to 50%

4. Policy Events (8-10 regulatory events)

1. FDA GMP Inspection (Mar 2021)
- Type: REGULATORY (COMPLIANCE_AUDIT)
- Impact: Quality certification review
- Result: Pass, update standard operating procedures

2. Indian Supplier Lockdown (Feb 2021)
- Type: EXTERNAL (SUPPLY_DISRUPTION)
- Impact: 50% API-ACETAMINOPHEN supply loss
- Duration: 30 days

3. China Vitamin Export Restrictions (Oct 2022)
- Type: EXTERNAL (EXPORT_CONTROL)
- Impact: Supply chain disruption alert for acetaminophen feedstock
- Affected: China suppliers

4. Uyghur Forced Labor Prevention Act (UFLPA) (Jan 2023)
- Type: REGULATORY (COMPLIANCE_REQUIREMENT)
- Impact: Must audit all suppliers, exclude XINJIANG sources
- Affected: Some Chinese suppliers

5. FDA EUA Authorizations for New APIs (May 2023)
- Type: REGULATORY (NEW_REQUIREMENT)
- Impact: Fast-track approval for specialty APIs
- Result: API-SPECIALTY-001 approved for US market

6. EU CPTPP Trade Agreement (Jul 2023)
- Type: EXTERNAL (FTA_SIGNED)
- Impact: Tariff reduction 5% on EU-sourced APIs
- Affected: European suppliers

7. Supply Chain Resilience Mandate (Jan 2024)
- Type: INTERNAL (STRATEGIC_DECISION)
- Decision: Diversify supplier base to India (50% target)
- Impact: Ramp up India supplier agreements

8. Quality Standard Update (ISO 13485:2024) (Mar 2024)
- Type: REGULATORY (STANDARD_UPDATE)
- Impact: Implement new quality metrics
- Affected: All products, all locations

9. Carbon Neutrality Commitment (Jun 2024)
- Type: INTERNAL (SUSTAINABILITY_GOAL)
- Decision: Track carbon footprint by supplier
- Impact: Prefer green-certified suppliers

10. USFDA Batch Recall Procedure Update (Aug 2024)
- Type: REGULATORY (PROCEDURE_UPDATE)
- Impact: Faster recall procedures required
- Compliance: Enhanced testing protocols

5. Compliance Records

  • FDA Registration: All facilities + products
  • EU QUALIFIED: Select products for European market
  • ISO 13485:2024: Quality management certification
  • Supply Chain Audits: Annual supplier audits
  • Batch Testing: Test results for quality verification

Demo Scenario 3: Retail Newsvendor (MegaRetail)

Scenario Overview

Multi-location retail chain with 67 stores across regions, selling diverse products (297 SKUs) across different categories. Uses VN2 Newsvendor Challenge dataset for realistic weekly optimization problem.

Tenant Configuration

Company Name: MegaRetail
Tenant ID: f9b46ebc-97f7-4fd4-818c-f45a6deeb587 (EXISTING)
Industry: Retail
Key Products: 297 SKUs across categories (clothing, electronics, home goods)
Locations: 67 stores
Time Horizon: 157 weeks (2021-04-12 to 2024-04-08) daily aggregation
Demo Focus: Newsvendor optimization, inventory balancing

Demo Data Source: VN2 Challenge Dataset

1. Locations: 67 Retail Stores

  • Store IDs: 1-67
  • Import via: import_vn2_dataset.js script
  • Format: VN2 Store 001 through VN2 Store 067
  • Location data: Store format, size category, capacity constraints

2. Products: 297 SKUs

  • Product IDs: 1-297
  • Import via: import_vn2_dataset.js script
  • Structure: By department (9 departments × ~33 products per department)
  • Attributes: Product group, division, department hierarchy

3. Sales History: 157 weeks (2021-04-12 to 2024-04-08)

  • Records: 67 stores × 297 products × 157 weeks = ~3.1M sales records
  • Import via: import_vn2_dataset.js script
  • Data file: Week 0 - 2024-04-08 - Sales.csv
  • Metrics: Weekly sales quantity per store-product combination
  • Granularity: Weekly (can be distributed daily as needed)

4. Newsvendor Cost Parameters

  • Holding Cost: 0.2€/unit/week
  • Shortage Cost: 1.0€/unit/week
  • Lead Time: 2 weeks
  • Critical Fractile (α): 0.8333 = shortage cost / (holding + shortage cost)
  • Target Service Level: 83.33%

5. Import Instructions

# Prerequisites: Tenant already exists (MegaRetail)
# VN2 CSV files at: docs/reference-docs/VN2-competition-datasets/

# Run import script
cd backend/
node scripts/import_vn2_dataset.js --tenant-id=f9b46ebc-97f7-4fd4-818c-f45a6deeb587

# Expected output:
# - 67 locations created
# - 297 products created
# - 599 newsvendor cost parameters (store-product combos with non-zero sales)
# - ~3.1M sales history records inserted

# Verification
psql -h 127.0.0.1 -p 54322 -U postgres -d postgres -c "
SELECT
COUNT(*) as total_locations
FROM locations
WHERE tenant_id = 'f9b46ebc-97f7-4fd4-818c-f45a6deeb587'
AND location_code LIKE 'VN2-STORE-%';"

# Should return: 67

Demo Scenario 4: Financial/Agricultural (AgroScience)

Scenario Overview

Agricultural commodity supplier tracking seasonal grain and crop markets with commodity price exposure, weather-driven demand, and financial hedging strategies.

Tenant Configuration

Company Name: AgroScience (use existing or create new)
Tenant ID: (to be assigned or created)
Industry: Agriculture/Commodity Trading
Key Products: Corn, Wheat, Soybeans, Specialty crops
Locations: 5 (Growing regions + distribution)
Time Horizon: 10 years (2015-2025) daily data
Demo Focus: Commodity price correlation, weather impact, financial reporting

Demo Data Requirements

1. Locations (5 regions)

  • Midwest Corn Belt (Iowa) - Primary growing region
  • Great Plains Wheat (Kansas) - Wheat production
  • Upper Midwest Soybeans (Minnesota) - Soy production
  • Distribution Hub (Chicago) - Central grain terminal
  • Export Terminal (New Orleans) - Mississippi River port

2. Products (4 commodity groups)

- CORN: Corn futures (contracts trade on CBOT)
- WHEAT: Wheat futures
- SOY: Soybean futures
- SPECIALTY: High-margin specialty crops (hemp, hemp-derived CBD)

3. Actuals Data (10 years daily = ~3,650 records per product)

  • Date Range: 2015-01-01 to 2025-10-27
  • Metrics:
    • demand_actual: Market volume (futures contracts, bushels)
    • revenue_actual: Commodity price × volume
    • inventory_actual: Grain storage levels
    • production_actual: Harvest volumes (seasonal)

Realistic Patterns:

  • CORN:

    • Seasonality: Plant (May), Grow (Jun-Aug), Harvest (Sep-Nov), Store (Dec-Apr)
    • Price volatility: ±30% annually based on USDA crop reports
    • Demand peaks: Fall harvest, Spring planting
  • Financial data:

    • Commodity prices from FRED API (if available)
    • Futures contracts tracking
    • Hedge ratios for price protection

4. Policy Events (8-10 commodity/trade events)

1. Commodity Crop Insurance Rule Changes (Jan 2015)
- Type: REGULATORY
- Impact: New insurance requirements for hedging

2. China Tariff War on US Agriculture (Jul 2018)
- Type: EXTERNAL (TARIFF_CHANGE)
- Impact: 25% tariff on soybeans, corn
- Duration: Through 2020

3. Donald Trump Agriculture Assistance Program (2018-2020)
- Type: INTERNAL (GOVERNMENT_SUBSIDY)
- Impact: Direct aid, crop insurance subsidies

4. Midwest Flooding Event (2019)
- Type: EXTERNAL (WEATHER_DISASTER)
- Impact: 30% crop loss in corn belt
- Duration: May-Jun 2019

5. USMCA Trade Agreement (Jul 2020)
- Type: EXTERNAL (FTA_SIGNED)
- Impact: Improved market access to Mexico/Canada
- Tariff change: -5% on US grains

6. Historic Commodity Price Spike (2021-2022)
- Type: EXTERNAL (MARKET_SHOCK)
- Impact: Russia-Ukraine war, supply disruptions
- Corn: +200%, Wheat: +300%, Soy: +150%

7. USDA Crop Quality Standards Update (2023)
- Type: REGULATORY (STANDARD_CHANGE)
- Impact: New moisture/grade requirements

8. Carbon Markets Development (2023-2024)
- Type: INTERNAL (SUSTAINABILITY_PROGRAM)
- Decision: Participate in carbon farming markets
- Impact: New revenue stream from carbon credits

9. Hemp/CBD Legalization Expansion (2024)
- Type: REGULATORY (NEW_MARKET)
- Impact: Open new cultivation opportunities
- Products: SPECIALTY crops now legal in all states

10. Climate Adaptation Incentives (2024-2025)
- Type: INTERNAL (STRATEGIC_DECISION)
- Decision: Switch 30% acreage to drought-resistant varieties
- Impact: Shift commodity mix toward climate-resilient crops

5. Financial Time Series

  • P&L Data: 10-year monthly/quarterly income statements
  • Commodity Prices: Daily historical price data from USDA/CBOT
  • Weather Data: Daily temperature, precipitation (critical for crop modeling)
  • Forecast Data: 10-year backtests with actual vs predicted yields

Firebase Email/Password Authentication Setup

Prerequisites

  1. Firebase project already configured for Google OAuth
  2. Need to enable Email/Password provider
  3. Custom claims for tenant_id mapping

Configuration Steps

1. Enable Email/Password in Firebase Console

1. Go to Firebase Console
2. Select ChainAlign project
3. Authentication → Sign-in method
4. Enable "Email/Password"
5. Optionally enable "Email link (passwordless sign-in)"

2. Create Demo Admin Users via Firebase CLI

# Install/update Firebase CLI
npm install -g firebase-tools

# Login to Firebase
firebase login

# Create users for each tenant
firebase auth:import users.json --hash-algo=bcrypt

# Or create individually via Firebase Console
# Email: admin@techemfg.demo
# Password: (auto-generated, provided separately)

3. Set Custom Claims for tenant_id Mapping

// Backend function to set custom claims (called during org creation)
// backend/src/services/AuthService.js

async function setTenantCustomClaim(uid, tenantId) {
const admin = require('firebase-admin');

await admin.auth().setCustomUserClaims(uid, {
tenant_id: tenantId,
role: 'ADMIN'
});
}

4. Frontend: Update AuthContext

// frontend/src/context/AuthContext.jsx
// Already supports Firebase auth, just needs email/password UI update

// In OrganizationAdminPage.jsx:
// Add option to create users with email/password instead of just Google OAuth

Demo User Accounts to Create

TenantEmailPasswordRolePurpose
TechElectronicsadmin@techemfg.demo[secure]ADMINDiscrete mfg demo
PharmaCoadmin@pharma.demo[secure]ADMINPharma demo
MegaRetailadmin@megaretail.demo[secure]ADMINRetail VN2 demo
AgroScienceadmin@agro.demo[secure]ADMINCommodity demo

Data Seeding Execution Plan

Phase 1: Tenant & Admin User Creation (Manual via UI)

Step 1: Create Tenants

Using OrganizationAdminPage:

  1. Create "TechElectronics" tenant → Capture tenant_id
  2. Create "PharmaCo" tenant → Capture tenant_id
  3. Verify MegaRetail exists → f9b46ebc-97f7-4fd4-818c-f45a6deeb587
  4. Create or find "AgroScience" tenant → Capture tenant_id

Step 2: Create Admin Users (Email/Password)

Using Firebase Console or CLI:

firebase auth:create-user \
--email admin@techemfg.demo \
--password 'SecurePassword123!' \
--display-name 'TechElectronics Admin'

Map users to tenants via custom claims:

firebase auth:set-custom-claims <UID> --claims '{"tenant_id":"<TENANT_ID>","role":"ADMIN"}'

Phase 2: Schema & Locations Setup

Step 1: Run Migrations

cd backend/
npx knex migrate:latest --env development

Step 2: Create Locations via SQL

For each tenant, insert locations:

-- TechElectronics: 4 locations
INSERT INTO locations (tenant_id, location_name, location_code, location_type, ...)
VALUES
(..., 'US Manufacturing Plant 1', 'TECH-MFG-001', 'manufacturing', ...),
(..., 'US Manufacturing Plant 2', 'TECH-MFG-002', 'manufacturing', ...),
(..., 'Regional Warehouse', 'TECH-WH-001', 'warehouse', ...),
(..., 'Supplier Hub', 'TECH-HUB-001', 'distribution', ...);

-- PharmaCo: 3 locations
INSERT INTO locations (tenant_id, location_name, location_code, location_type, ...)
VALUES
(..., 'Primary Manufacturing', 'PHARMA-MFG-001', 'manufacturing', ...),
(..., 'Quality Assurance', 'PHARMA-QA-001', 'facility', ...),
(..., 'Finished Goods Warehouse', 'PHARMA-WH-001', 'warehouse', ...);

-- AgroScience: 5 locations
INSERT INTO locations (...)
VALUES
(..., 'Midwest Corn Belt', 'AGRO-MID-001', 'growing_region', ...),
-- ... etc

Or create custom seed file: backend/seeds/demo_locations.js

Phase 3: Products Setup

Create backend/seeds/demo_products.js:

// Seed TechElectronics products
// Seed PharmaCo products
// AgroScience products
// MegaRetail handled by import_vn2_dataset.js

Phase 4: Demand Data (Actuals) Loading

For TechElectronics & PharmaCo:

Create custom script: backend/scripts/seed_demo_actuals.js

// Generate 5-year daily demand with realistic patterns
// - Baseline per product
// - Seasonality (sine wave patterns)
// - Disruption events (step function drops/spikes)
// - Insert into actuals table in daily chunks

For MegaRetail:

cd backend/
node scripts/import_vn2_dataset.js --tenant-id=f9b46ebc-97f7-4fd4-818c-f45a6deeb587

For AgroScience:

Create script: backend/scripts/seed_commodity_data.js

// Load 10-year daily commodity price data
// Integrate USDA/FRED API data if available
// Otherwise use synthetic realistic commodity price patterns

Phase 5: Policy Events & Compliance Records

Create backend/scripts/seed_policy_events.js:

// Insert policy_events for each tenant
// Insert supplier_routing_history for supply chain changes
// Insert policy_compliance_records for regulatory tracking

Phase 6: Verification & Validation

Run validation queries:

node backend/scripts/validate_demo_data.js

# Checks:
# - Each tenant has admin user
# - Each tenant has locations (correct counts)
# - Each location has products
# - Actuals data spans date ranges
# - Policy events exist and are linked
# - No NULL foreign keys
# - Daily data density is maintained

Timeline & Responsibilities

PhaseTaskEffortOwnerStatus
1Create 4 demo tenants30 minPramodPending
1Create Firebase email/pwd users1 hourDevPending
2Create locations for 3 new tenants2 hoursScriptsPending
3Create 30+ products across tenants1.5 hoursScriptsPending
4aTechElectronics: 5-year demand (40 SKUs × 5 yrs × 4 locations)2 hoursScriptPending
4bPharmaCo: 5-year demand (5 SKUs × 5 yrs × 3 locations)1 hourScriptPending
4cMegaRetail: VN2 dataset import2 hoursScriptPending
4dAgroScience: 10-year commodity data3 hoursScriptPending
5Policy events & compliance records2 hoursScriptPending
6Validation & testing1 hourQAPending
Total15.5 hours

Success Criteria

  • ✅ 4 demo tenants fully seeded and ready for demo
  • ✅ Each tenant has admin user with email/password auth
  • ✅ Daily data spans required date ranges (5 or 10 years)
  • ✅ Policy events are created and linked to demand patterns
  • ✅ Compliance records reflect regulatory scenarios
  • ✅ MegaRetail has 67 stores, 297 products, 3.1M sales records
  • ✅ No foreign key violations or data integrity issues
  • ✅ Demo users can log in with email/password
  • ✅ Each tenant's dashboard shows realistic, connected data

Appendix: SQL Validation Queries

Check Tenant Data Density

SELECT
t.tenant_id,
t.company_name,
COUNT(DISTINCT l.location_id) as location_count,
COUNT(DISTINCT ce.entity_id) as product_count,
COUNT(DISTINCT a.actual_id) as actuals_count,
MIN(a.created_at) as earliest_actual,
MAX(a.created_at) as latest_actual,
COUNT(DISTINCT u.user_id) as user_count
FROM tenants t
LEFT JOIN locations l ON t.tenant_id = l.tenant_id
LEFT JOIN core_entities ce ON t.tenant_id = ce.tenant_id
LEFT JOIN actuals a ON t.tenant_id = a.tenant_id
LEFT JOIN users u ON t.tenant_id = u.tenant_id
WHERE t.tenant_id IN (
'f47ac10b-58cc-4372-a567-0e02b2c3d479', -- Test Tenant
'f9b46ebc-97f7-4fd4-818c-f45a6deeb587', -- MegaRetail
-- Add TechElectronics, PharmaCo, AgroScience IDs here
)
GROUP BY t.tenant_id, t.company_name
ORDER BY t.company_name;

Verify MegaRetail VN2 Import

SELECT
COUNT(*) as total_locations,
COUNT(DISTINCT location_code) as unique_stores,
MAX(SUBSTRING(location_code, 11)) as max_store_num
FROM locations
WHERE tenant_id = 'f9b46ebc-97f7-4fd4-818c-f45a6deeb587'
AND location_code LIKE 'VN2-STORE-%';

-- Should return: 67 locations, 67 unique stores

Check Daily Data Density

SELECT
COUNT(*) as total_actuals,
COUNT(DISTINCT DATE(created_at)) as unique_dates,
ROUND(100.0 * COUNT(DISTINCT DATE(created_at)) / 1826, 1) as pct_coverage_5yr
FROM actuals
WHERE tenant_id = 'YOUR_TENANT_ID'
AND created_at >= '2020-01-01'
AND created_at < '2025-10-27';

-- Should be close to 100% for 5-year period (1,826 days)

Notes & Decisions

  1. Why not regenerate data via script?

    • Previous generate-test-data.sh runs were unreproducible and chaotic
    • Demo data is seeded once, then manually reviewed and validated
    • Any updates are tracked as explicit migrations/seed files
  2. Why daily granularity?

    • ML models require daily density for training
    • Monthly/quarterly aggregates lose patterns needed for correlation discovery
    • Demo emphasizes "external factors explain daily movements"
  3. Why email/password auth?

    • Avoids need for multiple Google accounts
    • Allows persistent demo credentials that don't change
    • Separate from production auth for security
  4. Why separate seeding from migrations?

    • Migrations handle schema (policy tables, etc.)
    • Seeding handles demo data (once-only, non-repeating)
    • Easier to debug and modify demo data without schema changes
  5. MegaRetail choice for VN2:

    • Existing tenant ready to use
    • VN2 dataset is comprehensive (67×297 store-product matrix)
    • Real competition dataset adds authenticity to demo

Next Steps: Execute Phase 1 (tenant creation) and Phase 2 (locations), then run validation.