docs: rules engine design spec — Phase 1 (engine + storage + API + worklist)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:10:57 +05:30
parent 1d1b271227
commit 41dbbbb0fe

# Rules Engine — Design Spec
**Date**: 2026-03-31
**Status**: Draft
**Phase**: 1 (Engine + Storage + API + Worklist Integration)
---
## Overview
A configurable rules engine that governs how leads flow through the hospital's call center — which leads get called first, which agent handles them, when to escalate, and when to mark them lost. Each hospital defines its own rules. No code changes needed to change behavior.
**Product pitch**: "Your hospital defines the rules, the call center follows them automatically."
---
## Architecture
Self-contained NestJS module inside helix-engage-server (sidecar). Designed for extraction into a standalone microservice when needed.
```
helix-engage-server/src/rules-engine/
├── rules-engine.module.ts          # NestJS module (self-contained)
├── rules-engine.service.ts         # Core: json-rules-engine wrapper
├── rules-engine.controller.ts      # REST API: CRUD + evaluate
├── rules-storage.service.ts        # Redis (hot) + JSON file (backup)
├── types/
│   ├── rule.types.ts               # Rule schema
│   ├── fact.types.ts               # Fact definitions
│   └── action.types.ts             # Action definitions
├── facts/
│   ├── lead-facts.provider.ts      # Lead/campaign data facts
│   ├── call-facts.provider.ts      # Call/SLA data facts
│   └── agent-facts.provider.ts     # Agent availability facts
├── actions/
│   ├── score.action.ts             # Priority scoring action
│   ├── assign.action.ts            # Lead-to-agent assignment
│   ├── escalate.action.ts          # SLA breach alerts
│   ├── update.action.ts            # Update entity field
│   └── notify.action.ts            # Send notification
├── consumers/
│   └── worklist.consumer.ts        # Applies scoring rules to worklist
└── templates/
    └── hospital-starter.json       # Pre-built rule set for new hospitals
```
### Dependencies
- `json-rules-engine` (npm) — rule evaluation
- Redis — active rule storage, score cache
- Platform GraphQL — fact data (leads, calls, campaigns, agents)
- No imports from other sidecar modules except via constructor injection
### Communication
- Own Redis namespace: `rules:*`
- Own route prefix: `/api/rules/*`
- Other modules call `RulesEngineService.evaluate()` — they don't import internals
---
## Rule Schema
```typescript
type Rule = {
  id: string;                     // UUID
  name: string;                   // Human-readable: "High priority for IVF missed calls"
  description?: string;           // BA-friendly explanation
  enabled: boolean;               // Toggle on/off without deleting
  priority: number;               // Evaluation order (lower = first). json-rules-engine runs
                                  // higher priorities first, so the wrapper must map this value
  trigger: RuleTrigger;           // When to evaluate
  conditions: RuleConditionGroup; // What to check
  action: RuleAction;             // What to do
  metadata: {
    createdAt: string;
    updatedAt: string;
    createdBy: string;            // User who created
    category: RuleCategory;       // For UI grouping
    tags?: string[];              // Optional tags for filtering
  };
};

type RuleTrigger =
  | { type: 'on_request'; request: 'worklist' | 'assignment' }
  | { type: 'on_event'; event: 'lead.created' | 'lead.updated' | 'call.created' | 'call.ended' | 'call.missed' | 'disposition.submitted' }
  | { type: 'on_schedule'; interval: string } // cron expression or "5m", "1h"
  | { type: 'always' };                       // evaluated in all contexts

type RuleCategory =
  | 'priority'       // Worklist scoring
  | 'assignment'     // Lead/call routing to agent
  | 'escalation'     // SLA breach handling
  | 'lifecycle'      // Lead status transitions
  | 'qualification'; // Lead quality scoring

type RuleConditionGroup = {
  all?: RuleCondition[]; // AND
  any?: RuleCondition[]; // OR
};

type RuleCondition = {
  fact: string;          // Fact name (see Fact Registry below)
  operator: RuleOperator;
  value: any;
  path?: string;         // JSON path for nested facts
} | RuleConditionGroup;  // Nested group for complex logic

type RuleOperator =
  | 'equal' | 'notEqual'
  | 'greaterThan' | 'greaterThanInclusive'
  | 'lessThan' | 'lessThanInclusive'
  | 'in' | 'notIn'
  | 'contains' | 'doesNotContain'
  | 'exists' | 'doesNotExist'; // Not built into json-rules-engine; register via engine.addOperator()

type RuleAction = {
  type: 'score' | 'assign' | 'escalate' | 'update' | 'notify';
  params: Record<string, any>;
};

// Score action params
type ScoreActionParams = {
  weight: number;               // Base weight, nominally 0-10; negative values deprioritize
                                // (the starter template uses -3 for spam)
  slaMultiplier?: boolean;      // Apply SLA urgency curve
  campaignMultiplier?: boolean; // Apply campaign weight
};

// Assign action params
type AssignActionParams = {
  agentId?: string;     // Specific agent
  agentPool?: string[]; // Round-robin from pool
  strategy: 'specific' | 'round-robin' | 'least-loaded' | 'skill-based';
};

// Escalate action params
type EscalateActionParams = {
  channel: 'toast' | 'notification' | 'sms' | 'email';
  recipients: 'supervisor' | 'agent' | string[]; // Specific user IDs
  message: string;                               // Template with {{variables}}
  severity: 'warning' | 'critical';
};

// Update action params
type UpdateActionParams = {
  entity: 'lead' | 'call' | 'followUp';
  field: string;
  value: any;
};

// Notify action params
type NotifyActionParams = {
  channel: 'toast' | 'bell' | 'sms';
  message: string;
  target: 'agent' | 'supervisor' | 'all';
};
```
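
To make the `all` / `any` nesting concrete, here is a minimal stand-in evaluator for the condition schema. It is a sketch, not `json-rules-engine` itself: the `evaluate` and `compare` helpers are hypothetical names, facts are assumed to be a flat map keyed by fact name, `path` is ignored, and a group with neither `all` nor `any` is assumed to pass.

```typescript
// Illustrative only: mirrors the schema's condition shapes with local types.
type Operator =
  | 'equal' | 'notEqual'
  | 'greaterThan' | 'greaterThanInclusive'
  | 'lessThan' | 'lessThanInclusive'
  | 'in' | 'notIn'
  | 'contains' | 'doesNotContain'
  | 'exists' | 'doesNotExist';

type Leaf = { fact: string; operator: Operator; value?: unknown };
type Group = { all?: Condition[]; any?: Condition[] };
type Condition = Leaf | Group;
type Facts = Record<string, unknown>;

function isLeaf(condition: Condition): condition is Leaf {
  return 'fact' in condition;
}

// Compare one fact value against the expected value for a given operator.
function compare(actual: unknown, operator: Operator, expected: unknown): boolean {
  const numeric = typeof actual === 'number' && typeof expected === 'number';
  switch (operator) {
    case 'equal': return actual === expected;
    case 'notEqual': return actual !== expected;
    case 'greaterThan': return numeric && (actual as number) > (expected as number);
    case 'greaterThanInclusive': return numeric && (actual as number) >= (expected as number);
    case 'lessThan': return numeric && (actual as number) < (expected as number);
    case 'lessThanInclusive': return numeric && (actual as number) <= (expected as number);
    case 'in': return Array.isArray(expected) && expected.indexOf(actual) !== -1;
    case 'notIn': return Array.isArray(expected) && expected.indexOf(actual) === -1;
    case 'contains': return Array.isArray(actual) && actual.indexOf(expected) !== -1;
    case 'doesNotContain': return Array.isArray(actual) && actual.indexOf(expected) === -1;
    case 'exists': return actual !== undefined && actual !== null;
    case 'doesNotExist': return actual === undefined || actual === null;
  }
}

// `all` behaves as AND, `any` as OR; nested groups recurse.
function evaluate(condition: Condition, facts: Facts): boolean {
  if (isLeaf(condition)) {
    return compare(facts[condition.fact], condition.operator, condition.value);
  }
  const allOk = condition.all === undefined || condition.all.every((c) => evaluate(c, facts));
  const anyOk = condition.any === undefined || condition.any.some((c) => evaluate(c, facts));
  return allOk && anyOk;
}

// Example: missed call AND (IVF interest OR high/urgent priority).
const missedAndImportant: Condition = {
  all: [
    { fact: 'call.taskType', operator: 'equal', value: 'missed_call' },
    {
      any: [
        { fact: 'lead.interestedService', operator: 'equal', value: 'IVF' },
        { fact: 'lead.priority', operator: 'in', value: ['HIGH', 'URGENT'] },
      ],
    },
  ],
};
```

In the real module this evaluation would be delegated to `json-rules-engine`; the sketch only shows what the nested group shape is expected to mean.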
---
## Fact Registry
Facts are the data points rules can check against. Each fact has a provider that fetches/computes the value.
### Lead Facts (`lead-facts.provider.ts`)
| Fact Name | Type | Description |
|---|---|---|
| `lead.source` | string | Lead source (FACEBOOK_AD, GOOGLE_AD, PHONE, etc.) |
| `lead.status` | string | Lead status (NEW, CONTACTED, QUALIFIED, etc.) |
| `lead.priority` | string | Manual priority (LOW, NORMAL, HIGH, URGENT) |
| `lead.campaignId` | string | Associated campaign ID |
| `lead.campaignName` | string | Campaign name (resolved) |
| `lead.campaignPlatform` | string | Campaign platform (FACEBOOK, GOOGLE, etc.) |
| `lead.interestedService` | string | Service interest |
| `lead.contactAttempts` | number | Number of contact attempts |
| `lead.ageMinutes` | number | Minutes since lead created |
| `lead.ageDays` | number | Days since lead created |
| `lead.lastContactedMinutes` | number | Minutes since last contact |
| `lead.hasPatient` | boolean | Whether linked to a patient |
| `lead.isDuplicate` | boolean | Whether marked as duplicate |
| `lead.isSpam` | boolean | Whether marked as spam |
| `lead.spamScore` | number | Spam prediction score |
| `lead.leadScore` | number | Lead quality score |
### Call Facts (`call-facts.provider.ts`)
| Fact Name | Type | Description |
|---|---|---|
| `call.direction` | string | INBOUND or OUTBOUND |
| `call.status` | string | MISSED, COMPLETED, etc. |
| `call.disposition` | string | Call outcome |
| `call.durationSeconds` | number | Call duration |
| `call.callbackStatus` | string | PENDING_CALLBACK, ATTEMPTED, etc. |
| `call.slaElapsedPercent` | number | % of SLA time elapsed (0-100+) |
| `call.slaBreached` | boolean | Whether SLA is breached |
| `call.missedCount` | number | Times this number was missed |
| `call.taskType` | string | missed_call, follow_up, campaign_lead, attempt_2, attempt_3 |
### Agent Facts (`agent-facts.provider.ts`)
| Fact Name | Type | Description |
|---|---|---|
| `agent.status` | string | READY, ON_CALL, BREAK, OFFLINE |
| `agent.activeCallCount` | number | Current active calls |
| `agent.todayCallCount` | number | Calls handled today |
| `agent.skills` | string[] | Agent skill tags |
| `agent.campaigns` | string[] | Assigned campaign IDs |
| `agent.idleMinutes` | number | Minutes idle |
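
Several of the facts above (`lead.ageMinutes`, `lead.ageDays`, `lead.lastContactedMinutes`, `call.slaElapsedPercent`, `call.slaBreached`) are computed from timestamps rather than read directly. The sketch below shows one way a provider might derive them. The input field names (`createdAt`, `lastContactedAt`, `startedAt`, `slaMinutes`) are assumptions for illustration; the spec does not define the underlying records.

```typescript
const MS_PER_MINUTE = 60000;
const MINUTES_PER_DAY = 1440;

// Hypothetical input shapes; the real platform records may differ.
type LeadRecord = { createdAt: string; lastContactedAt?: string };
type SlaRecord = { startedAt: string; slaMinutes: number };

// Whole minutes elapsed between an ISO timestamp and `now`.
function minutesSince(iso: string, now: Date): number {
  return Math.floor((now.getTime() - new Date(iso).getTime()) / MS_PER_MINUTE);
}

function leadTimeFacts(lead: LeadRecord, now: Date) {
  const ageMinutes = minutesSince(lead.createdAt, now);
  return {
    'lead.ageMinutes': ageMinutes,
    'lead.ageDays': Math.floor(ageMinutes / MINUTES_PER_DAY),
    // null when the lead has never been contacted
    'lead.lastContactedMinutes':
      lead.lastContactedAt === undefined ? null : minutesSince(lead.lastContactedAt, now),
  };
}

function slaFacts(sla: SlaRecord, now: Date) {
  const elapsedMs = now.getTime() - new Date(sla.startedAt).getTime();
  const elapsedPercent = (elapsedMs / (sla.slaMinutes * MS_PER_MINUTE)) * 100;
  return {
    'call.slaElapsedPercent': elapsedPercent,
    'call.slaBreached': elapsedPercent > 100,
  };
}
```

Passing `now` explicitly keeps the computation deterministic, which makes the providers easier to test.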
---
## Scoring System
The worklist consumer uses scoring rules to rank items. The formula:
```
finalScore = baseScore × slaMultiplier × campaignMultiplier
```
### Base Score
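Determined by the rule's `weight` param (nominally 0-10; negative weights deprioritize, as in the starter template's spam rule). Multiple rules can fire for the same item, in which case their weights are **summed**.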
### SLA Multiplier (time-sensitive, computed at request time)
```
if slaElapsed <= 100%: multiplier = (slaElapsed / 100) ^ 1.6
if slaElapsed > 100%: multiplier = 1.0 + (excess × 0.05)
```
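The curve is non-linear: urgency accelerates as the deadline approaches, reaches 1.0 at exactly 100% elapsed, and keeps increasing linearly past breach. At 0% elapsed the multiplier is 0, so an item scored with `slaMultiplier` enabled stays at 0 until some SLA time has elapsed.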
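
The two branches of the SLA formula can be sketched as a single function. This assumes `excess` means percentage points past 100 (so 150% elapsed gives an excess of 50), which the spec does not state explicitly. Clamping negative input to 0 is also an assumption.

```typescript
// slaElapsedPercent: 0-100+ as defined by the call.slaElapsedPercent fact.
function slaMultiplier(slaElapsedPercent: number): number {
  const elapsed = Math.max(0, slaElapsedPercent); // assumption: never negative
  if (elapsed <= 100) {
    // Convex curve: slow growth early, steep near the deadline.
    return Math.pow(elapsed / 100, 1.6);
  }
  // Assumption: excess is measured in percentage points past 100.
  const excess = elapsed - 100;
  return 1.0 + excess * 0.05;
}
```

Under these assumptions an item at 50% of its SLA gets a multiplier of roughly 0.33, one at 100% gets exactly 1.0, and one at 150% gets about 3.5.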
### Campaign Multiplier
```
campaignWeight (0-10) / 10 × sourceWeight (0-10) / 10
```
IVF(9) × WhatsApp(9) = 0.81. Health(7) × Instagram(5) = 0.35.
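
The sketch below combines the campaign multiplier with the overall formula. One point the spec leaves open is how the per-rule `slaMultiplier` and `campaignMultiplier` flags combine when several rules fire. This example assumes a multiplier is applied if any fired rule requests it, and is treated as 1 otherwise. The `FiredRule` shape and function names are hypothetical.

```typescript
// Both weights are on a 0-10 scale, per the formula above.
function campaignMultiplier(campaignWeight: number, sourceWeight: number): number {
  return (campaignWeight / 10) * (sourceWeight / 10);
}

// Hypothetical summary of a score rule that matched an item.
type FiredRule = {
  name: string;
  weight: number;
  slaMultiplier?: boolean;
  campaignMultiplier?: boolean;
};

function finalScore(fired: FiredRule[], sla: number, campaign: number) {
  // Base score: weights of all fired rules are summed.
  const baseScore = fired.reduce((sum, rule) => sum + rule.weight, 0);
  // Assumption: a multiplier applies when at least one fired rule enables it.
  const slaFactor = fired.some((rule) => rule.slaMultiplier === true) ? sla : 1;
  const campaignFactor = fired.some((rule) => rule.campaignMultiplier === true) ? campaign : 1;
  return {
    score: baseScore * slaFactor * campaignFactor,
    baseScore,
    slaMultiplier: slaFactor,
    campaignMultiplier: campaignFactor,
    rulesApplied: fired.map((rule) => rule.name),
  };
}
```

For example, a campaign lead (weight 7) that also matches the spam rule (weight -3) has a base score of 4. With an SLA multiplier of 0.5 and the IVF × WhatsApp campaign multiplier of 0.81, the final score is 4 × 0.5 × 0.81 = 1.62.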
### Score Caching
- Base scores cached in Redis on data change events (`rules:scores:{itemId}`)
- SLA multiplier computed at request time (changes every minute)
- Cache TTL: 5 minutes (safety — events should invalidate earlier)
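
The caching rules above can be sketched with an in-memory stand-in for Redis. The `ScoreCache` class is hypothetical. It assumes the rule version is stored alongside each cached score, so that incrementing the version makes every older entry a miss. The spec names the `rules:scores:version` key but does not say how the comparison is done.

```typescript
type CachedScore = { baseScore: number; rulesVersion: number; expiresAtMs: number };

class ScoreCache {
  // Plain object standing in for Redis; keys follow rules:scores:{itemId}.
  private readonly entries: Record<string, CachedScore | undefined> = {};
  private rulesVersion = 0;
  private readonly ttlMs: number;

  constructor(ttlMs: number = 5 * 60 * 1000) {
    this.ttlMs = ttlMs; // 5-minute safety TTL by default
  }

  private key(itemId: string): string {
    return `rules:scores:${itemId}`;
  }

  set(itemId: string, baseScore: number, nowMs: number): void {
    this.entries[this.key(itemId)] = {
      baseScore,
      rulesVersion: this.rulesVersion,
      expiresAtMs: nowMs + this.ttlMs,
    };
  }

  // Returns null on a miss: absent, expired, or written under an older rule version.
  get(itemId: string, nowMs: number): number | null {
    const entry = this.entries[this.key(itemId)];
    if (entry === undefined) return null;
    if (entry.rulesVersion !== this.rulesVersion || nowMs >= entry.expiresAtMs) return null;
    return entry.baseScore;
  }

  // Called on a data change event for a single item.
  invalidate(itemId: string): void {
    delete this.entries[this.key(itemId)];
  }

  // Called on any rule change; invalidates all cached scores at once.
  bumpRulesVersion(): void {
    this.rulesVersion += 1;
  }
}
```

Only the base score is cached. The SLA multiplier is deliberately left out because it changes with the clock rather than with data.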
---
## Storage
### Redis Keys
```
rules:config # JSON array of all Rule objects
rules:config:backup_path # Path to JSON backup file
rules:scores:{itemId} # Cached base score per worklist item
rules:scores:version # Incremented on rule change (invalidates all scores)
rules:eval:log:{ruleId} # Last evaluation result (debug)
```
### JSON File Backup
On every rule change:
1. Write to Redis
2. Persist `rules:config` to `data/rules-config.json` in the sidecar working directory

On sidecar startup, if Redis is empty, load the rules from the JSON file.
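
The write path and the startup fallback can be sketched as two small functions. The `KeyValueStore` and `BackupFile` interfaces are hypothetical abstractions over Redis and the JSON file, kept synchronous for brevity. A real Redis client and file I/O would be asynchronous.

```typescript
// Hypothetical abstractions; the real service would wrap a Redis client and the file system.
interface KeyValueStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}
interface BackupFile {
  read(): string | null; // null when the file does not exist
  write(contents: string): void;
}

const CONFIG_KEY = 'rules:config';

// On every rule change: write to the store first, then persist the backup.
function saveRules(rules: unknown[], store: KeyValueStore, backup: BackupFile): void {
  const json = JSON.stringify(rules);
  store.set(CONFIG_KEY, json);
  backup.write(json);
}

type RestoreSource = 'redis' | 'backup' | 'empty';

// On startup: keep what the store has, otherwise reload from the backup file.
function restoreOnStartup(store: KeyValueStore, backup: BackupFile): RestoreSource {
  if (store.get(CONFIG_KEY) !== null) return 'redis';
  const saved = backup.read();
  if (saved === null) return 'empty';
  store.set(CONFIG_KEY, saved);
  return 'backup';
}

// In-memory stand-ins, useful for trying the flow without Redis or disk.
function memoryStore(): KeyValueStore {
  const data: Record<string, string | undefined> = {};
  return {
    get: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
    set: (key, value) => {
      data[key] = value;
    },
  };
}

function memoryFile(): BackupFile {
  let contents: string | null = null;
  return {
    read: () => contents,
    write: (next) => {
      contents = next;
    },
  };
}
```

Returning the restore source makes it easy to log whether a restart recovered from the backup, which is the case worth alerting on.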
---
## API Endpoints
### Rule CRUD
```
GET /api/rules # List all rules
GET /api/rules/:id # Get single rule
POST /api/rules # Create rule
PUT /api/rules/:id # Update rule
DELETE /api/rules/:id # Delete rule
PATCH /api/rules/:id/toggle # Enable/disable
POST /api/rules/reorder # Change evaluation order
```
### Evaluation
```
POST /api/rules/evaluate # Evaluate rules against provided facts
GET /api/rules/explain/:itemId # Why is this item scored this way?
```
### Templates
```
GET /api/rules/templates # List available rule templates
POST /api/rules/templates/:id/apply # Apply a template (creates rules)
```
---
## Worklist Integration (First Consumer)
### Current Flow
```
GET /api/worklist → returns leads + missed calls + follow-ups → frontend sorts by priority + createdAt
```
### New Flow
```
GET /api/worklist → fetch items → RulesEngineService.scoreWorklist(items) → return items with scores → frontend displays by score
```
### Response Change
Each worklist item gains:
```typescript
{
  ...existingFields,
  score: number;              // Computed priority score
  scoreBreakdown: {           // Explainability
    baseScore: number;
    slaMultiplier: number;
    campaignMultiplier: number;
    rulesApplied: string[];   // Rule names that fired
  };
  slaStatus: 'low' | 'medium' | 'high' | 'critical';
  slaElapsedPercent: number;
}
```
### Frontend Changes
- Worklist sorts by `score` descending instead of hardcoded priority
- SLA status dot (green/amber/red/dark-red) replaces priority badge
- Tooltip on score shows breakdown ("IVF campaign ×0.81, Missed call weight 9, SLA 72% elapsed")
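
The spec names four `slaStatus` levels but not the thresholds between them. The sketch below uses placeholder thresholds (50%, 80%, 100%) purely for illustration; they are not from the spec and would need to be agreed on. It also shows the descending sort the frontend is expected to apply.

```typescript
type SlaStatus = 'low' | 'medium' | 'high' | 'critical';

// Placeholder thresholds, NOT defined by the spec.
function slaStatusFor(slaElapsedPercent: number): SlaStatus {
  if (slaElapsedPercent > 100) return 'critical'; // breached
  if (slaElapsedPercent >= 80) return 'high';
  if (slaElapsedPercent >= 50) return 'medium';
  return 'low';
}

type ScoredItem = { id: string; score: number };

// Highest score first; copies the array so the caller's list is not mutated.
function sortByScoreDescending<T extends ScoredItem>(items: T[]): T[] {
  return items.slice().sort((a, b) => b.score - a.score);
}
```

Mapping the four levels to the green, amber, red, and dark-red dots is then a simple lookup on the frontend.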
---
## Hospital Starter Template
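Pre-configured rules for a typical hospital, applied on first setup. Entries list only `name`, `category`, `trigger`, `conditions`, and `action`; the remaining `Rule` fields (`id`, `enabled`, `priority`, `metadata`) are not part of the template and must be assigned when it is applied. The SLA breach rule uses an `on_schedule` trigger and an `escalate` action, both outside Phase 1 scope, so it is stored but has no effect until those are implemented.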
```json
[
  {
    "name": "Missed calls — high urgency",
    "category": "priority",
    "trigger": { "type": "on_request", "request": "worklist" },
    "conditions": { "all": [{ "fact": "call.taskType", "operator": "equal", "value": "missed_call" }] },
    "action": { "type": "score", "params": { "weight": 9, "slaMultiplier": true } }
  },
  {
    "name": "Scheduled follow-ups",
    "category": "priority",
    "trigger": { "type": "on_request", "request": "worklist" },
    "conditions": { "all": [{ "fact": "call.taskType", "operator": "equal", "value": "follow_up" }] },
    "action": { "type": "score", "params": { "weight": 8, "slaMultiplier": true } }
  },
  {
    "name": "Campaign leads — weighted by campaign",
    "category": "priority",
    "trigger": { "type": "on_request", "request": "worklist" },
    "conditions": { "all": [{ "fact": "call.taskType", "operator": "equal", "value": "campaign_lead" }] },
    "action": { "type": "score", "params": { "weight": 7, "slaMultiplier": true, "campaignMultiplier": true } }
  },
  {
    "name": "SLA breach — escalate to supervisor",
    "category": "escalation",
    "trigger": { "type": "on_schedule", "interval": "5m" },
    "conditions": { "all": [{ "fact": "call.slaBreached", "operator": "equal", "value": true }, { "fact": "call.callbackStatus", "operator": "equal", "value": "PENDING_CALLBACK" }] },
    "action": { "type": "escalate", "params": { "channel": "notification", "recipients": "supervisor", "message": "SLA breached for {{lead.name}} — no callback attempted", "severity": "critical" } }
  },
  {
    "name": "Spam leads — deprioritize",
    "category": "priority",
    "trigger": { "type": "on_request", "request": "worklist" },
    "conditions": { "all": [{ "fact": "lead.spamScore", "operator": "greaterThan", "value": 60 }] },
    "action": { "type": "score", "params": { "weight": -3 } }
  }
]
```
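
Template entries carry fewer fields than the `Rule` type, so applying a template has to fill in the rest. The sketch below is one possible approach. It assumes rules are enabled by default, that `priority` follows template order, and that the top-level `category` moves into `metadata`. The `applyTemplate` function and its parameters are hypothetical.

```typescript
type TemplateEntry = {
  name: string;
  category: string;
  trigger: object;
  conditions: object;
  action: object;
};

type StoredRule = {
  id: string;
  name: string;
  enabled: boolean;
  priority: number;
  trigger: object;
  conditions: object;
  action: object;
  metadata: { createdAt: string; updatedAt: string; createdBy: string; category: string };
};

// newId is injected so callers can supply a UUID generator of their choice.
function applyTemplate(
  entries: TemplateEntry[],
  createdBy: string,
  now: Date,
  newId: () => string,
): StoredRule[] {
  const timestamp = now.toISOString();
  return entries.map((entry, index) => ({
    id: newId(),
    name: entry.name,
    enabled: true,          // assumption: template rules start enabled
    priority: index + 1,    // assumption: template order becomes evaluation order
    trigger: entry.trigger,
    conditions: entry.conditions,
    action: entry.action,
    metadata: {
      createdAt: timestamp,
      updatedAt: timestamp,
      createdBy,
      category: entry.category, // moved from top level into metadata
    },
  }));
}
```

Injecting `now` and `newId` keeps the function pure, so the same template always produces the same rules in tests.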
---
## Phase 2 (Future — UI)
Not in this spec, but the engine is designed for:
- Supervisor settings page with visual rule builder
- Untitled UI components: Slider (weights), Toggle (enable/disable), Select (conditions), Tabs (categories)
- Live preview — change a weight, watch worklist re-rank in real-time
- Rule templates — "Hospital starter pack" one-click apply
- Explainability — agent sees why a lead is ranked where it is
---
## Scope Boundaries
**In scope (Phase 1):**
- `json-rules-engine` integration in sidecar
- Complete rule schema (scoring, assignment, escalation, lifecycle, qualification)
- Fact providers (lead, call, agent)
- Action handlers (score only — others are stubs)
- Redis storage + JSON backup
- CRUD API endpoints
- Worklist consumer (scoring integration)
- Hospital starter template
- Score explainability on API response
**Out of scope (Phase 2+):**
- Configuration UI
- Assignment action handler (stub only)
- Escalation action handler (stub only)
- Update and notify action handlers (stub only)
- Event-driven rule evaluation (on_event triggers)
- Scheduled rule evaluation (on_schedule triggers)
- Frontend live preview
- Multi-tenant rule isolation (currently single workspace)