helix-engage/docs/developer-operations-runbook.md
saridsa2 f09250f3ef docs: update developer runbook for EC2, remove duplicate
Rewrote developer-operations-runbook.md to reflect the current EC2
multi-tenant deployment (was VPS-only). Covers SSH key setup, all
containers, accounts, deploy steps, E2E tests, Redis ops, DB access,
and troubleshooting. Removed duplicate runbook.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:06:49 +05:30

Helix Engage — Developer Operations Runbook

Architecture

See architecture.md for the full multi-tenant topology diagram.

Browser (India)
    ↓ HTTPS
Caddy (reverse proxy, TLS, host-routed)
    ├── ramaiah.engage.healix360.net  → sidecar-ramaiah:4100
    ├── global.engage.healix360.net   → sidecar-global:4100
    ├── telephony.engage.healix360.net → telephony:4200
    ├── *.app.healix360.net           → server:4000 (platform)
    └── engage.healix360.net          → 404 (no catchall)

Docker Compose stack (EC2 — 13.234.31.194):
    ├── caddy              — Reverse proxy + TLS (Let's Encrypt)
    ├── server             — FortyTwo platform (NestJS, port 4000)
    ├── worker             — BullMQ background jobs
    ├── sidecar-ramaiah    — Ramaiah sidecar (NestJS, port 4100)
    ├── sidecar-global     — Global sidecar (NestJS, port 4100)
    ├── telephony          — Event dispatcher (NestJS, port 4200)
    ├── redis-ramaiah      — Ramaiah sidecar Redis
    ├── redis-global       — Global sidecar Redis
    ├── redis-telephony    — Telephony dispatcher Redis
    ├── redis              — Platform Redis
    ├── db                 — PostgreSQL 16 (workspace-per-schema)
    ├── clickhouse         — Analytics
    ├── minio              — S3-compatible object storage
    └── redpanda           — Event bus (Kafka-compatible)

EC2 Access

# SSH into EC2
ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194

| Detail | Value |
|---|---|
| Host | 13.234.31.194 |
| User | ubuntu |
| SSH key | /tmp/ramaiah-ec2-key (decrypted from ~/Downloads/fortytwoai_hostinger) |
| Docker compose dir | /opt/fortytwo |
| Frontend static files | /opt/fortytwo/helix-engage-frontend |
| Caddyfile | /opt/fortytwo/Caddyfile |

SSH Key Setup

The key at ~/Downloads/fortytwoai_hostinger is passphrase-protected (SasiSuman@2007). Create a decrypted copy for non-interactive use:

# One-time setup
openssl pkey -in ~/Downloads/fortytwoai_hostinger -out /tmp/ramaiah-ec2-key
chmod 600 /tmp/ramaiah-ec2-key

# Verify
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 hostname

Handy alias

alias ec2="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"

URLs

| Service | URL |
|---|---|
| Ramaiah Engage (Frontend + API) | https://ramaiah.engage.healix360.net |
| Global Engage (Frontend + API) | https://global.engage.healix360.net |
| Ramaiah Platform | https://ramaiah.app.healix360.net |
| Global Platform | https://global.app.healix360.net |
| Telephony Dispatcher | https://telephony.engage.healix360.net |
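
A quick reachability pass over the URLs above can be sketched as follows. The function name `check_urls` is hypothetical, and treating any HTTP status below 500 as "reachable" is an assumption (a login redirect or a 404 from Caddy still shows that the host answers).

```shell
# Hypothetical helper: print the HTTP status for each URL passed as an argument.
check_urls() {
  local url code failed=0
  for url in "$@"; do
    # -s silent, -o discard the body, -m 10s timeout, -w print only the status code.
    # curl reports 000 when it gets no HTTP response at all.
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$url")
    echo "$code $url"
    # Assumption: any status below 500 counts as reachable.
    if [ "$code" = "000" ] || [ "$code" -ge 500 ]; then
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_urls https://ramaiah.engage.healix360.net https://global.engage.healix360.net
```

The function returns non-zero if any URL fails, so it can gate a later step in a script.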

Login Credentials

Ramaiah Workspace

Role Email Password
Marketing Executive marketing@ramaiahcare.com AdRamaiah@2026
Marketing Executive supervisor@ramaiahcare.com MrRamaiah@2026
CC Agent ccagent@ramaiahcare.com CcRamaiah@2026
Platform Admin dev@fortytwo.dev tim@apple.dev

Ozonetel

Field Value
API Key KK8110e6c3de02527f7243ffaa924fa93e
Username global_healthx
Ramaiah Campaign Inbound_918041763400
Ramaiah Agent ramaiahadmin / ext 524435

Local Development

Frontend (Vite dev server)

cd helix-engage
npm run dev          # http://localhost:5173
npx tsc --noEmit     # Type check
npm run build        # Production build

The .env.local file controls which sidecar the frontend talks to:

# Remote (default — uses EC2 backend)
VITE_API_URL=https://ramaiah.engage.healix360.net

# Local sidecar
# VITE_API_URL=http://localhost:4100

Sidecar (NestJS dev server)

cd helix-engage-server
npm run start:dev    # http://localhost:4100 (watch mode)
npm run build        # Build only

The sidecar .env must define:

PLATFORM_GRAPHQL_URL=https://ramaiah.app.healix360.net/graphql
PLATFORM_API_KEY=<Ramaiah workspace API key>
PLATFORM_WORKSPACE_SUBDOMAIN=ramaiah
REDIS_URL=redis://localhost:6379
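
A missing key usually surfaces only at startup, so it can help to check the file first. The sketch below is not part of the sidecar; `missing_env_keys` is a hypothetical helper, and it assumes plain `KEY=value` lines with no `export` prefix.

```shell
# Keys the sidecar .env must define, as listed above.
REQUIRED_KEYS="PLATFORM_GRAPHQL_URL PLATFORM_API_KEY PLATFORM_WORKSPACE_SUBDOMAIN REDIS_URL"

# Hypothetical helper: print the required keys that are absent or empty in an env file.
missing_env_keys() {
  local file="$1" key missing=""
  for key in $REQUIRED_KEYS; do
    # A key counts as present only with a non-empty value on an uncommented line.
    if ! grep -Eq "^${key}=.+" "$file"; then
      missing="${missing}${missing:+ }${key}"
    fi
  done
  [ -z "$missing" ] && return 0
  echo "$missing"
  return 1
}

# Usage:
#   missing_env_keys .env || echo "define the keys listed above"
```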

Pre-deploy checklist

  1. npx tsc --noEmit — passes (frontend)
  2. npm run build — succeeds (sidecar)
  3. Test the changed feature locally
  4. Check package.json for new dependencies → decides quick vs full deploy
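
Step 4 of the checklist can be scripted as a sketch. `deps_changed` is a hypothetical helper that reads a list of changed paths, and the `deployed` git tag in the usage line is an assumption, since this runbook does not define how the last deployed commit is recorded.

```shell
# Hypothetical helper: succeed if a package.json appears in a list of changed paths.
# Reads one path per line on stdin, for example from `git diff --name-only`.
deps_changed() {
  # Match package.json at the repo root or inside any subdirectory, and nothing else.
  grep -Eq '(^|/)package\.json$'
}

# Usage (assumes a tag named "deployed" marks the last deployed commit):
#   git diff --name-only deployed HEAD | deps_changed && echo "dependencies changed"
```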

Deployment

Frontend

cd helix-engage && npm run build

rsync -avz -e "ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no" \
  dist/ ubuntu@13.234.31.194:/opt/fortytwo/helix-engage-frontend/

ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "cd /opt/fortytwo && sudo docker compose restart caddy"

Sidecar (quick — code only, no new dependencies)

cd helix-engage-server

aws ecr get-login-password --region ap-south-1 | \
  docker login --username AWS --password-stdin 043728036361.dkr.ecr.ap-south-1.amazonaws.com

docker buildx build --platform linux/amd64 \
  -t 043728036361.dkr.ecr.ap-south-1.amazonaws.com/fortytwo-eap/helix-engage-sidecar:alpha \
  --push .

ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "cd /opt/fortytwo && sudo docker compose pull sidecar-ramaiah sidecar-global && sudo docker compose up -d sidecar-ramaiah sidecar-global"

How to decide

Did package.json change?
  ├── YES → ECR build + push + pull (above)
  └── NO  → Same steps (ECR is the only deploy path for EC2)

Post-Deploy: E2E Smoke Tests

cd helix-engage
npx playwright test

The suite runs 27 tests covering login, all CC Agent pages, all Supervisor pages, and sign-out. The last test signs out so that the agent session is released for the next run.


Checking Logs

# Ramaiah sidecar
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 30 2>&1"

# Follow live
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-sidecar-ramaiah-1 -f --tail 10 2>&1"

# Filter errors
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 100 2>&1" | grep -i "error\|fail"

# Telephony dispatcher
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"

# Caddy
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-caddy-1 --tail 20 2>&1"

# Platform server
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-server-1 --tail 30 2>&1"

# All container status
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker ps --format 'table {{.Names}}\t{{.Status}}'"

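The log commands above differ only in the service name and line count, so they can be wrapped in one function. This is a sketch: `container_name` and `ec2_logs` are hypothetical names, and the `ramaiah-prod-<service>-1` pattern is inferred from the container names used in this section.

```shell
# Hypothetical helper: map a compose service name to its container name.
# The ramaiah-prod-<service>-1 pattern is inferred from the commands above.
container_name() {
  echo "ramaiah-prod-$1-1"
}

# Hypothetical helper: tail a service's logs over SSH (default: last 30 lines).
ec2_logs() {
  local service="$1" lines="${2:-30}"
  ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
    "docker logs $(container_name "$service") --tail $lines 2>&1"
}

# Usage:
#   ec2_logs sidecar-ramaiah 100
#   ec2_logs telephony | grep -i "error\|fail"
```
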
Healthy startup

Look for these in sidecar logs:

[NestApplication] Nest application successfully started
Helix Engage Server running on port 4100
[SessionService] Redis connected

Common failure patterns

| Log pattern | Meaning | Fix |
|---|---|---|
| Cannot find module 'xxx' | Missing npm dependency | Rebuild ECR image |
| UndefinedModuleException | Circular dependency | Fix code, redeploy |
| ECONNREFUSED redis:6379 | Redis not ready | docker compose up -d redis-ramaiah |
| Forbidden resource | Platform permission issue | Check user roles |
| 429 Too Many Requests | Ozonetel rate limit | Wait, reduce polling |
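
The table above can be turned into a small triage filter. This is a sketch with a hypothetical function name (`suggest_fixes`). It matches `ECONNREFUSED` more loosely than the table does, on the assumption that the logged host may be an IP address rather than `redis`.

```shell
# Hypothetical helper: read logs on stdin and print the suggested fix for each known pattern found.
suggest_fixes() {
  local log pattern fix
  log=$(cat)
  # Each heredoc line is "pattern|fix", taken from the table above.
  while IFS='|' read -r pattern fix; do
    if printf '%s\n' "$log" | grep -Fq -- "$pattern"; then
      echo "$pattern -> $fix"
    fi
  done <<'EOF'
Cannot find module|Rebuild ECR image
UndefinedModuleException|Fix code, redeploy
ECONNREFUSED|docker compose up -d redis-ramaiah
Forbidden resource|Check user roles
429 Too Many Requests|Wait, reduce polling
EOF
}

# Usage:
#   ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
#     "docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 200 2>&1" | suggest_fixes
```

Confirm that an `ECONNREFUSED` line refers to port 6379 before restarting Redis.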

Redis Operations

SSH="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
REDIS="docker exec ramaiah-prod-redis-ramaiah-1 redis-cli"

# Clear agent session lock (fixes "already logged in from another device")
$SSH "$REDIS DEL agent:session:ramaiahadmin"

# List all keys
$SSH "$REDIS KEYS '*'"

# Clear caller cache (stale patient names)
$SSH "$REDIS --scan --pattern 'caller:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"

# Clear masterdata cache (departments/doctors/clinics/slots)
$SSH "$REDIS --scan --pattern 'masterdata:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"

# Clear recording analysis cache
$SSH "$REDIS --scan --pattern 'call:analysis:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"

# Clear agent name cache
$SSH "$REDIS --scan --pattern 'agent:name:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"

# Nuclear: flush the whole Ramaiah sidecar Redis database (also resets theme/branding, see Troubleshooting)
$SSH "$REDIS FLUSHDB"
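
The four cache-clearing commands above share one shape, so a pattern-driven helper is one way to cut the repetition. This is a sketch: `clear_cmd` and `redis_clear` are hypothetical names. It keeps the `--scan` approach used above, which iterates incrementally rather than blocking the server the way `KEYS` can on a large keyspace.

```shell
SSH="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
REDIS="docker exec ramaiah-prod-redis-ramaiah-1 redis-cli"

# Hypothetical helper: build the remote command that deletes every key matching a pattern.
clear_cmd() {
  echo "$REDIS --scan --pattern '$1' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
}

# Hypothetical helper: run that command on EC2.
redis_clear() {
  $SSH "$(clear_cmd "$1")"
}

# Usage:
#   clear_cmd 'caller:*'        # print the remote command without running it
#   redis_clear 'masterdata:*'  # delete matching keys on the Ramaiah sidecar Redis
```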

Database Access

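# -t allocates a TTY on the SSH side; without it, docker exec -it fails with "the input device is not a TTY"
ssh -t -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker exec -it ramaiah-prod-db-1 psql -U fortytwo -d fortytwo_eap"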

Useful queries

-- List workspace schemas
SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'workspace_%';

-- List custom entities
SELECT "nameSingular", "isCustom" FROM core."objectMetadata" ORDER BY "nameSingular";

-- List users
SELECT u.email, u."firstName", u."lastName", uw.id as workspace_id
FROM core."user" u
JOIN core."userWorkspace" uw ON uw."userId" = u.id;

-- List roles
SELECT r.label, rt."userWorkspaceId"
FROM core."roleTarget" rt
JOIN core."role" r ON r.id = rt."roleId";
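
For one-off queries from a script, the SQL can be piped over SSH instead of opening an interactive session. This is a sketch with a hypothetical function name (`ec2_sql`). It relies on psql reading statements from stdin when neither `-c` nor `-f` is given, and uses `-A` and `-t` to print unaligned rows without headers.

```shell
# Hypothetical helper: run one SQL statement non-interactively and print bare rows.
ec2_sql() {
  # The SQL travels on stdin, so double-quoted identifiers such as core."user"
  # need no extra escaping for the remote shell. -i keeps stdin open for docker exec.
  printf '%s\n' "$1" | ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
    "docker exec -i ramaiah-prod-db-1 psql -U fortytwo -d fortytwo_eap -At"
}

# Usage:
#   ec2_sql 'SELECT email FROM core."user" ORDER BY email;'
```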

Troubleshooting

"Already logged in from another device"

Each Ozonetel agent is limited to a single session, enforced by a lock in the sidecar Redis. Clear the lock:

ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker exec ramaiah-prod-redis-ramaiah-1 redis-cli DEL agent:session:ramaiahadmin"

Agent stuck in ACW / Wrapping Up

curl -X POST https://ramaiah.engage.healix360.net/api/maint/force-ready \
  -H "Content-Type: application/json" \
  -d '{"agentId": "ramaiahadmin"}'

Telephony events not routing

# Check dispatcher logs
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"

# Check service discovery registry
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
  "docker exec ramaiah-prod-redis-telephony-1 redis-cli KEYS '*'"

Theme/branding reset after Redis flush

curl -X PUT https://ramaiah.engage.healix360.net/api/config/theme \
  -H "Content-Type: application/json" \
  -d '{"defaults": {"brandName": "Helix Engage", "hospitalName": "Ramaiah Hospitals"}}'

Rollback

Frontend

Check out the previous commit → npm run build → rsync to EC2.

Sidecar

Check out the previous commit → ECR build + push → pull on EC2.

For immediate rollback, re-tag a known-good ECR image as :alpha and pull.
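
One way to re-tag without rebuilding is to copy the image manifest with the AWS CLI, as sketched below. `retag_alpha` is a hypothetical helper. It identifies the known-good image by digest, on the assumption that earlier images lose the `alpha` tag whenever a new build is pushed and still exist in the repository.

```shell
REGION="ap-south-1"
REPO="fortytwo-eap/helix-engage-sidecar"

# Hypothetical helper: point the alpha tag at an existing image, identified by its digest.
retag_alpha() {
  local digest="$1" manifest
  # Fetch the manifest of the known-good image.
  manifest=$(aws ecr batch-get-image --region "$REGION" --repository-name "$REPO" \
    --image-ids imageDigest="$digest" \
    --query 'images[0].imageManifest' --output text) || return 1
  # Push the same manifest back under the alpha tag.
  aws ecr put-image --region "$REGION" --repository-name "$REPO" \
    --image-tag alpha --image-manifest "$manifest"
}

# Usage:
#   retag_alpha sha256:<digest of the known-good image>
```

Candidate digests can be listed with `aws ecr describe-images --repository-name fortytwo-eap/helix-engage-sidecar --region ap-south-1`. After re-tagging, run the same `docker compose pull` and `up -d` steps on EC2 as in a normal deploy. Images pushed with buildx may also need `--image-manifest-media-type` on `put-image`; verify this against a non-production tag first.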


Git Repositories

| Repo | Azure DevOps | Branch |
|---|---|---|
| Frontend | helix-engage in Patient Engagement Platform | feature/omnichannel-widget |
| Sidecar | helix-engage-server in Patient Engagement Platform | master |
| SDK App | FortyTwoApps/helix-engage/ (monorepo) | dev |
| Telephony | helix-engage-telephony in Patient Engagement Platform | master |

ECR Details

| Detail | Value |
|---|---|
| Registry | 043728036361.dkr.ecr.ap-south-1.amazonaws.com |
| Sidecar repo | fortytwo-eap/helix-engage-sidecar |
| Tag | alpha |
| Region | ap-south-1 (Mumbai) |
Region ap-south-1 (Mumbai)