Live Call Assist — Design Spec
Problem
CC agents have no real-time intelligence during calls. The AI sidebar shows a static pre-call summary and a chat interface that requires manual typing — useless when the agent is on the phone. The agent has to rely on memory for lead history, doctor availability, and past interactions.
Solution
Stream the call's remote audio (customer voice) to the sidecar, transcribe via Deepgram Nova, and every 10 seconds feed the accumulated transcript + full lead context to OpenAI for real-time suggestions. Display a scrolling transcript with AI suggestion cards in the sidebar.
Architecture
Browser (WebRTC call)
 │
 ├─ Remote audio track (customer) ──► AudioWorklet (PCM 16-bit, 16kHz)
 │                                        │
 │                               WebSocket to sidecar
 │                                        │
 │                             ┌──────────▼──────────┐
 │                             │   Sidecar Gateway   │
 │                             │ ws://api/call-assist│
 │                             └──────────┬──────────┘
 │                                        │
 │                          ┌─────────────┴─────────────┐
 │                          ▼                           ▼
 │                 Deepgram Nova WS             Every 10s: OpenAI
 │                 (audio → text)               (transcript + context
 │                          │                    → suggestions)
 │                          ▼                           ▼
 │                 Transcript lines             AI suggestion cards
 │                          │                           │
 │                          └─────────────┬─────────────┘
 │                                        │
 │                             WebSocket to browser
 │                                        │
 └────────────────────────────────────────▼
                                     AI Sidebar
                           (transcript + suggestions)
Components
1. Browser: Audio capture + WebSocket client
Audio capture: When call becomes active, grab the remote audio track from the peer connection. Use an AudioWorklet to downsample to 16-bit PCM at 16kHz (Deepgram's preferred format). Send raw audio chunks (~100ms each) over WebSocket.
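The float-to-PCM conversion at the heart of that worklet can be sketched as a pure function (a sketch only: the naive nearest-sample decimation and the 48kHz example rate are assumptions — production code would low-pass filter before decimating, and the real input rate comes from `AudioContext.sampleRate`):

```typescript
// Convert a Float32 audio buffer at `inputRate` Hz into 16-bit PCM at 16kHz.
// Naive decimation: picks the nearest source sample for each output sample.
function downsampleTo16kPcm(input: Float32Array, inputRate: number): Int16Array {
  const targetRate = 16000;
  const ratio = inputRate / targetRate;
  const out = new Int16Array(Math.floor(input.length / ratio));
  for (let i = 0; i < out.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, input[Math.floor(i * ratio)]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```

Inside the worklet's `process()` callback this would run on `inputs[0][0]`, and the resulting buffer would be posted to the main thread for the WebSocket send.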
WebSocket client: Connects to wss://engage-api.srv1477139.hstgr.cloud/api/call-assist.
Sends:
- Initial message: `{ type: "start", ucid, leadId, callerPhone }`
- Audio chunks: binary PCM data
- End: `{ type: "stop" }`
Receives:
- `{ type: "transcript", text: "...", isFinal: boolean }` — real-time transcript lines
- `{ type: "suggestion", text: "...", action?: "book_appointment" | "transfer" }` — AI suggestions
- `{ type: "context_loaded", leadName: "...", summary: "..." }` — confirmation that lead context was loaded
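On the client, the protocol above is naturally a pair of discriminated unions (a sketch; the type and function names are invented, the wire shapes follow the spec):

```typescript
// Messages the browser sends to the sidecar.
// (Audio chunks travel as binary frames, not JSON.)
type ClientMessage =
  | { type: "start"; ucid: string; leadId: string; callerPhone: string }
  | { type: "stop" };

// Messages the sidecar sends back.
type ServerMessage =
  | { type: "transcript"; text: string; isFinal: boolean }
  | { type: "suggestion"; text: string; action?: "book_appointment" | "transfer" }
  | { type: "context_loaded"; leadName: string; summary: string };

// Parse an incoming text frame; returns null for unknown message types so
// the client silently ignores protocol additions it doesn't understand.
function parseServerMessage(raw: string): ServerMessage | null {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case "transcript":
    case "suggestion":
    case "context_loaded":
      return msg as ServerMessage;
    default:
      return null;
  }
}
```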
2. Sidecar: WebSocket Gateway
NestJS WebSocket Gateway at /api/call-assist. On connection:
- Receives `start` message with `ucid`, `leadId`, `callerPhone`
- Loads lead context from platform: lead details, past calls, appointments, doctors, follow-ups
- Opens Deepgram Nova WebSocket (`wss://api.deepgram.com/v1/listen`)
- Pipes incoming audio chunks to Deepgram
- Deepgram returns transcript chunks — forwards them to the browser
- Every 10 seconds, sends accumulated transcript + lead context to OpenAI `gpt-4o-mini` for suggestions
- Returns suggestions to browser
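The per-connection transcript state behind the 10-second loop can be sketched as a small accumulator (illustrative only: the class and method names are invented, and the skip-when-idle check is a cost optimization the spec doesn't mandate):

```typescript
// Accumulates final transcript lines between suggestion ticks. The gateway
// would call addFinal() on each final Deepgram result, and a 10-second
// setInterval would check hasNewContent() before spending an OpenAI call.
class TranscriptBuffer {
  private lines: string[] = [];
  private lastSuggestedAt = 0; // count of lines already sent to OpenAI

  addFinal(text: string): void {
    if (text.trim()) this.lines.push(text.trim());
  }

  // Full conversation so far (goes into the OpenAI user message).
  fullTranscript(): string {
    return this.lines.join("\n");
  }

  // True only when something new arrived since the last tick, so idle
  // stretches of the call don't trigger redundant OpenAI calls.
  hasNewContent(): boolean {
    return this.lines.length > this.lastSuggestedAt;
  }

  markSuggested(): void {
    this.lastSuggestedAt = this.lines.length;
  }
}
```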
System prompt for OpenAI (loaded once with lead context):
You are a real-time call assistant for Global Hospital Bangalore.
You listen to the conversation and provide brief, actionable suggestions.
CALLER CONTEXT:
- Name: {leadName}
- Phone: {phone}
- Source: {source} ({campaign})
- Previous calls: {callCount} (last: {lastCallDate}, disposition: {lastDisposition})
- Appointments: {appointmentHistory}
- Interested in: {interestedService}
- AI Summary: {aiSummary}
AVAILABLE RESOURCES:
- Doctors: {doctorList with departments and clinics}
- Next available slots: {availableSlots}
RULES:
- Keep suggestions under 2 sentences
- Focus on actionable next steps
- If customer mentions a doctor/department, show available slots
- If customer wants to cancel, note the appointment ID
- Flag if customer sounds upset or mentions a complaint
- Do NOT repeat information the agent already said
OpenAI call (every 10 seconds):
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{ role: 'system', content: systemPrompt },
{ role: 'user', content: `Conversation so far:\n${transcript}\n\nProvide a brief suggestion for the agent.` },
],
max_tokens: 150,
});
3. Frontend: Live transcript sidebar
Replace the AI chat tab content during active calls with a live transcript view:
- Scrolling transcript with timestamps
- Customer lines in one color, suggestions in a highlighted card
- Auto-scroll to bottom as new lines arrive
- Suggestions appear as colored cards between transcript lines
- When call ends, transcript stays visible for reference during disposition
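One way to keep suggestion cards interleaved at the right spot (a sketch; the `FeedItem` shape is an assumption, not the component's actual state): hold transcript lines and suggestions in a single timestamp-ordered array so the component renders one flat list.

```typescript
type FeedItem =
  | { kind: "line"; at: number; text: string }        // finalized transcript line
  | { kind: "suggestion"; at: number; text: string }; // AI suggestion card

// Insert keeping the feed ordered by timestamp, so a suggestion renders
// between the transcript lines it was generated from.
function insertFeedItem(feed: FeedItem[], item: FeedItem): FeedItem[] {
  const next = [...feed, item];
  next.sort((a, b) => a.at - b.at);
  return next;
}
```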
4. Context loading
On start message, the sidecar queries the platform for:
# Lead details
{ leads(filter: { id: { eq: "{leadId}" } }) { edges { node { ... } } } }
# Past appointments
{ appointments(filter: { patientId: { eq: "{leadId}" } }) { edges { node { ... } } } }
# Doctors
{ doctors(first: 20) { edges { node { id fullName department clinic } } } }
This context is loaded once and injected into the system prompt. No mid-call refresh needed.
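Filling the system prompt from the loaded context is then a plain template function (a sketch covering a subset of the placeholders; the `LeadContext` shape and function name are assumptions mirroring the prompt template above):

```typescript
// Assumed shape of the context assembled from the GraphQL queries above.
interface LeadContext {
  leadName: string;
  phone: string;
  source: string;
  campaign: string;
  callCount: number;
  lastCallDate: string;
  lastDisposition: string;
}

// Fill the CALLER CONTEXT section of the system prompt.
function buildCallerContextSection(ctx: LeadContext): string {
  return [
    "CALLER CONTEXT:",
    `- Name: ${ctx.leadName}`,
    `- Phone: ${ctx.phone}`,
    `- Source: ${ctx.source} (${ctx.campaign})`,
    `- Previous calls: ${ctx.callCount} (last: ${ctx.lastCallDate}, disposition: ${ctx.lastDisposition})`,
  ].join("\n");
}
```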
File structure
Sidecar (helix-engage-server)
| File | Responsibility |
|---|---|
| `src/call-assist/call-assist.gateway.ts` | WebSocket gateway — handles audio streaming, Deepgram connection, OpenAI calls |
| `src/call-assist/call-assist.module.ts` | Module registration |
| `src/call-assist/call-assist.service.ts` | Context loading from platform, OpenAI prompt building |
Frontend (helix-engage)
| File | Responsibility |
|---|---|
| `src/lib/audio-capture.ts` | AudioWorklet to capture + downsample remote audio track |
| `src/hooks/use-call-assist.ts` | WebSocket connection to sidecar, manages transcript + suggestion state |
| `src/components/call-desk/live-transcript.tsx` | Scrolling transcript + suggestion cards UI |
| `src/components/call-desk/context-panel.tsx` | Modify: show LiveTranscript instead of AiChatPanel during active calls |
| `src/pages/call-desk.tsx` | Modify: remove CallPrepCard during active calls |
Dependencies
- Deepgram SDK: `@deepgram/sdk` in sidecar (or raw WebSocket)
- `DEEPGRAM_API_KEY`: environment variable in sidecar
- AudioWorklet: browser API, no dependencies (supported in all modern browsers)
- OpenAI: already configured in sidecar (`gpt-4o-mini`)
Cost estimate
Per 5-minute call:
- Deepgram Nova: ~$0.02 (at $0.0043/min)
- OpenAI gpt-4o-mini: ~$0.005 (30 calls × ~500 tokens each)
- Total: ~$0.025 per call (≈₹2)
Out of scope
- Agent mic transcription (only customer audio for now — agent's words are visible in the AI suggestions context)
- Voice response from AI (text only)
- Persistent transcript storage (future: save to Call record after call ends)
- Multi-language support (English only for now)