docs: update developer runbook for EC2, remove duplicate

Rewrote developer-operations-runbook.md to reflect the current EC2
multi-tenant deployment (was VPS-only). Covers SSH key setup, all
containers, accounts, deploy steps, E2E tests, Redis ops, DB access,
and troubleshooting. Removed duplicate runbook.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:06:49 +05:30
parent 1cdb7fe9e7
commit f09250f3ef
2 changed files with 222 additions and 586 deletions

@@ -2,326 +2,285 @@
## Architecture ## Architecture
See [architecture.md](./architecture.md) for the full multi-tenant topology diagram.
``` ```
Browser (India) Browser (India)
↓ HTTPS ↓ HTTPS
Caddy (reverse proxy, TLS, static files) Caddy (reverse proxy, TLS, host-routed)
├── engage.srv1477139.hstgr.cloud → /srv/engage (static frontend) ├── ramaiah.engage.healix360.net → sidecar-ramaiah:4100
├── engage-api.srv1477139.hstgr.cloud → sidecar:4100 ├── global.engage.healix360.net → sidecar-global:4100
└── *.srv1477139.hstgr.cloud → server:4000 (platform) ├── telephony.engage.healix360.net → telephony:4200
├── *.app.healix360.net → server:4000 (platform)
└── engage.healix360.net → 404 (no catchall)
Docker Compose stack: Docker Compose stack (EC2 — 13.234.31.194):
├── caddy — Reverse proxy + TLS ├── caddy — Reverse proxy + TLS (Let's Encrypt)
├── server — FortyTwo platform (ECR image) ├── server — FortyTwo platform (NestJS, port 4000)
├── worker — Background jobs ├── worker — BullMQ background jobs
├── sidecar — Helix Engage NestJS API (ECR image) ├── sidecar-ramaiah — Ramaiah sidecar (NestJS, port 4100)
├── db — PostgreSQL 16 ├── sidecar-global — Global sidecar (NestJS, port 4100)
├── redis — Session + cache ├── telephony — Event dispatcher (NestJS, port 4200)
├── redis-ramaiah — Ramaiah sidecar Redis
├── redis-global — Global sidecar Redis
├── redis-telephony — Telephony dispatcher Redis
├── redis — Platform Redis
├── db — PostgreSQL 16 (workspace-per-schema)
├── clickhouse — Analytics ├── clickhouse — Analytics
├── minio — Object storage ├── minio — S3-compatible object storage
└── redpanda — Event bus (Kafka) └── redpanda — Event bus (Kafka-compatible)
``` ```
## VPS Access ---
## EC2 Access
```bash ```bash
# SSH into the VPS # SSH into EC2
sshpass -p 'SasiSuman@2007' ssh -o StrictHostKeyChecking=no root@148.230.67.184 ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194
# Or with SSH key (if configured)
ssh -i ~/Downloads/fortytwoai_hostinger root@148.230.67.184
``` ```
| Detail | Value | | Detail | Value |
|---|---| |---|---|
| Host | 148.230.67.184 | | Host | `13.234.31.194` |
| User | root | | User | `ubuntu` |
| Password | SasiSuman@2007 | | SSH key | `/tmp/ramaiah-ec2-key` (decrypted from `~/Downloads/fortytwoai_hostinger`) |
| Docker compose dir | /opt/fortytwo | | Docker compose dir | `/opt/fortytwo` |
| Frontend static files | /opt/fortytwo/helix-engage-frontend | | Frontend static files | `/opt/fortytwo/helix-engage-frontend` |
| Caddyfile | /opt/fortytwo/Caddyfile | | Caddyfile | `/opt/fortytwo/Caddyfile` |
### SSH Key Setup
The key at `~/Downloads/fortytwoai_hostinger` is passphrase-protected (`SasiSuman@2007`).
Create a decrypted copy for non-interactive use:
```bash
# One-time setup
openssl pkey -in ~/Downloads/fortytwoai_hostinger -out /tmp/ramaiah-ec2-key
chmod 600 /tmp/ramaiah-ec2-key
# Verify
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 hostname
```
### Handy alias
```bash
alias ec2="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
```
---
## URLs ## URLs
| Service | URL | | Service | URL |
|---|---| |---|---|
| Frontend | https://engage.srv1477139.hstgr.cloud | | Ramaiah Engage (Frontend + API) | `https://ramaiah.engage.healix360.net` |
| Sidecar API | https://engage-api.srv1477139.hstgr.cloud | | Global Engage (Frontend + API) | `https://global.engage.healix360.net` |
| Platform | https://fortytwo-dev.srv1477139.hstgr.cloud | | Ramaiah Platform | `https://ramaiah.app.healix360.net` |
| Global Platform | `https://global.app.healix360.net` |
## Login Credentials | Telephony Dispatcher | `https://telephony.engage.healix360.net` |
| Role | Email | Password |
|---|---|---|
| CC Agent | rekha.cc@globalhospital.com | Global@123 |
| CC Agent | ganesh.cc@globalhospital.com | Global@123 |
| Marketing | sanjay.marketing@globalhospital.com | Global@123 |
| Admin/Supervisor | dr.ramesh@globalhospital.com | Global@123 |
--- ---
## Local Testing ## Login Credentials
Always test locally before deploying to staging. ### Ramaiah Workspace
| Role | Email | Password |
|---|---|---|
| Marketing Executive | `marketing@ramaiahcare.com` | `AdRamaiah@2026` |
| Marketing Executive | `supervisor@ramaiahcare.com` | `MrRamaiah@2026` |
| CC Agent | `ccagent@ramaiahcare.com` | `CcRamaiah@2026` |
| Platform Admin | `dev@fortytwo.dev` | `tim@apple.dev` |
### Ozonetel
| Field | Value |
|---|---|
| API Key | `KK8110e6c3de02527f7243ffaa924fa93e` |
| Username | `global_healthx` |
| Ramaiah Campaign | `Inbound_918041763400` |
| Ramaiah Agent | `ramaiahadmin` / ext `524435` |
---
## Local Development
### Frontend (Vite dev server) ### Frontend (Vite dev server)
```bash ```bash
cd helix-engage cd helix-engage
npm run dev # http://localhost:5173
# Start dev server (hot reload) npx tsc --noEmit # Type check
npm run dev npm run build # Production build
# → http://localhost:5173
# Type check (catches production build errors)
npx tsc --noEmit
# Production build (same as deploy)
npm run build
``` ```
The `.env.local` controls which sidecar the frontend talks to: The `.env.local` controls which sidecar the frontend talks to:
```bash ```bash
# Remote sidecar (default — uses deployed backend) # Remote (default — uses EC2 backend)
VITE_API_URL=https://engage-api.srv1477139.hstgr.cloud VITE_API_URL=https://ramaiah.engage.healix360.net
VITE_SIDECAR_URL=https://engage-api.srv1477139.hstgr.cloud
# Local sidecar (for testing sidecar changes) # Local sidecar
# VITE_API_URL=http://localhost:4100 # VITE_API_URL=http://localhost:4100
# VITE_SIDECAR_URL=http://localhost:4100
# Split — theme endpoint local, everything else remote
# VITE_THEME_API_URL=http://localhost:4100
``` ```
**Important:** When `VITE_API_URL` points to `localhost:4100`, login and GraphQL only work if the local sidecar can reach the platform. The local sidecar's `.env` must have valid `PLATFORM_GRAPHQL_URL` and `PLATFORM_API_KEY`.
### Sidecar (NestJS dev server) ### Sidecar (NestJS dev server)
```bash ```bash
cd helix-engage-server cd helix-engage-server
npm run start:dev # http://localhost:4100 (watch mode)
# Start with watch mode (auto-restart on changes) npm run build # Build only
npm run start:dev
# → http://localhost:4100
# Build only (no run)
npm run build
# Production start
npm run start:prod
``` ```
The sidecar `.env` must have: Sidecar `.env` must have:
```bash ```bash
PLATFORM_GRAPHQL_URL=... # Platform GraphQL endpoint PLATFORM_GRAPHQL_URL=https://ramaiah.app.healix360.net/graphql
PLATFORM_API_KEY=... # Platform API key for server-to-server calls PLATFORM_API_KEY=<Ramaiah workspace API key>
PLATFORM_WORKSPACE_SUBDOMAIN=fortytwo-dev PLATFORM_WORKSPACE_SUBDOMAIN=ramaiah
REDIS_URL=redis://localhost:6379 # Local Redis required REDIS_URL=redis://localhost:6379
``` ```
### Local Docker stack (full environment)
For testing with a local platform + database + Redis:
```bash
cd helix-engage-local
# First time — pull images + start
./deploy-local.sh up
# Deploy frontend to local stack
./deploy-local.sh frontend
# Deploy sidecar to local stack
./deploy-local.sh sidecar
# Both
./deploy-local.sh all
# Logs
./deploy-local.sh logs
# Stop
./deploy-local.sh down
```
Local stack URLs:
- Platform: `http://localhost:5001`
- Sidecar: `http://localhost:5100`
- Frontend: `http://localhost:5080`
### Pre-deploy checklist ### Pre-deploy checklist
Before running `deploy.sh`: 1. `npx tsc --noEmit` — passes (frontend)
1. `npx tsc --noEmit` — passes with no errors (frontend)
2. `npm run build` — succeeds (sidecar) 2. `npm run build` — succeeds (sidecar)
3. Test the changed feature locally (dev server or local stack) 3. Test the changed feature locally
4. Check `package.json` for new dependencies → decides quick vs full deploy 4. Check `package.json` for new dependencies → decides quick vs full deploy
--- ---
## Deployment ## Deployment
### Prerequisites (local machine) ### Frontend
```bash ```bash
# Required tools cd helix-engage && npm run build
brew install sshpass # SSH with password
aws configure # AWS CLI (for ECR)
docker desktop # Docker with buildx
# Verify AWS access rsync -avz -e "ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no" \
aws sts get-caller-identity # Should show account 043728036361 dist/ ubuntu@13.234.31.194:/opt/fortytwo/helix-engage-frontend/
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"cd /opt/fortytwo && sudo docker compose restart caddy"
``` ```
### Path 1: Quick Deploy (no new dependencies) ### Sidecar (quick — code only, no new dependencies)
Use when only code changes — no new npm packages.
```bash ```bash
cd /path/to/fortytwo-eap cd helix-engage-server
# Deploy frontend only aws ecr get-login-password --region ap-south-1 | \
bash deploy.sh frontend docker login --username AWS --password-stdin 043728036361.dkr.ecr.ap-south-1.amazonaws.com
# Deploy sidecar only
bash deploy.sh sidecar
# Deploy both
bash deploy.sh all
```
**What it does:**
- Frontend: `npm run build` → tar `dist/` → SCP to VPS → extract to `/opt/fortytwo/helix-engage-frontend`
- Sidecar: `nest build` → tar `dist/` + `src/` → docker cp into running container → `docker compose restart sidecar`
### Path 2: Full Deploy (new dependencies)
Use when `package.json` changed (new npm packages added).
```bash
cd /path/to/fortytwo-eap/helix-engage-server
# 1. Login to ECR
aws ecr get-login-password --region ap-south-1 | docker login --username AWS --password-stdin 043728036361.dkr.ecr.ap-south-1.amazonaws.com
# 2. Build cross-platform image and push
docker buildx build --platform linux/amd64 \ docker buildx build --platform linux/amd64 \
-t 043728036361.dkr.ecr.ap-south-1.amazonaws.com/fortytwo-eap/helix-engage-sidecar:alpha \ -t 043728036361.dkr.ecr.ap-south-1.amazonaws.com/fortytwo-eap/helix-engage-sidecar:alpha \
--push . --push .
# 3. Pull and restart on VPS ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
ECR_TOKEN=$(aws ecr get-login-password --region ap-south-1) "cd /opt/fortytwo && sudo docker compose pull sidecar-ramaiah sidecar-global && sudo docker compose up -d sidecar-ramaiah sidecar-global"
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "
echo '$ECR_TOKEN' | docker login --username AWS --password-stdin 043728036361.dkr.ecr.ap-south-1.amazonaws.com
cd /opt/fortytwo
docker compose pull sidecar
docker compose up -d sidecar
"
``` ```
### How to decide which path ### How to decide
``` ```
Did package.json change? Did package.json change?
├── YES → Path 2 (ECR build + push + pull) ├── YES → ECR build + push + pull (above)
└── NO → Path 1 (deploy.sh) └── NO → Same steps (ECR is the only deploy path for EC2)
``` ```
--- ---
## Post-Deploy: E2E Smoke Tests
```bash
cd helix-engage
npx playwright test
```
27 tests covering login, all CC Agent pages, all Supervisor pages, and sign-out.
The last test completes sign-out so the agent session is released for the next run.
---
## Checking Logs ## Checking Logs
### Sidecar logs
```bash ```bash
# SSH into VPS first, or run remotely: # Ramaiah sidecar
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker logs fortytwo-staging-sidecar-1 --tail 30" ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 30 2>&1"
# Follow live # Follow live
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker logs fortytwo-staging-sidecar-1 -f --tail 10" ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 -f --tail 10 2>&1"
# Filter for errors # Filter errors
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker logs fortytwo-staging-sidecar-1 --tail 100 2>&1 | grep -i error" ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 100 2>&1" | grep -i "error\|fail"
# Via deploy.sh # Telephony dispatcher
bash deploy.sh logs ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"
# Caddy
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-caddy-1 --tail 20 2>&1"
# Platform server
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-server-1 --tail 30 2>&1"
# All container status
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker ps --format 'table {{.Names}}\t{{.Status}}'"
``` ```
### Caddy logs ### Healthy startup
```bash Look for these in sidecar logs:
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker logs fortytwo-staging-caddy-1 --tail 30"
```
### Platform server logs
```bash
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker logs fortytwo-staging-server-1 --tail 30"
```
### All container status
```bash
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'"
```
---
## Health Checks
### Sidecar healthy startup
Look for these lines in sidecar logs:
``` ```
[NestApplication] Nest application successfully started [NestApplication] Nest application successfully started
Helix Engage Server running on port 4100 Helix Engage Server running on port 4100
[SessionService] Redis connected [SessionService] Redis connected
[ThemeService] Theme loaded from file (or "Using default theme")
[RulesStorageService] Initialized empty rules config
``` ```
### Common failure patterns ### Common failure patterns
| Log pattern | Meaning | Fix | | Log pattern | Meaning | Fix |
|---|---|---| |---|---|---|
| `Cannot find module 'xxx'` | Missing npm dependency | Path 2 deploy (rebuild ECR image) | | `Cannot find module 'xxx'` | Missing npm dependency | Rebuild ECR image |
| `UndefinedModuleException` | Circular dependency or missing import | Fix code, redeploy | | `UndefinedModuleException` | Circular dependency | Fix code, redeploy |
| `ECONNREFUSED redis:6379` | Redis not ready | `docker compose restart redis sidecar` | | `ECONNREFUSED redis:6379` | Redis not ready | `docker compose up -d redis-ramaiah` |
| `Forbidden resource` | Platform permission issue | Check user roles | | `Forbidden resource` | Platform permission issue | Check user roles |
| `429 Too Many Requests` | Ozonetel rate limit | Wait, reduce polling frequency | | `429 Too Many Requests` | Ozonetel rate limit | Wait, reduce polling |
--- ---
## Redis Cache Operations ## Redis Operations
### Clear caller resolution cache
```bash ```bash
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-redis-1 redis-cli KEYS 'caller:*'" SSH="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
REDIS="docker exec ramaiah-prod-redis-ramaiah-1 redis-cli"
# Clear all caller cache # Clear agent session lock (fixes "already logged in from another device")
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-redis-1 redis-cli --scan --pattern 'caller:*' | xargs -r docker exec -i fortytwo-staging-redis-1 redis-cli DEL" $SSH "$REDIS DEL agent:session:ramaiahadmin"
```
### Clear recording analysis cache # List all keys
$SSH "$REDIS KEYS '*'"
```bash # Clear caller cache (stale patient names)
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-redis-1 redis-cli --scan --pattern 'call:analysis:*' | xargs -r docker exec -i fortytwo-staging-redis-1 redis-cli DEL" $SSH "$REDIS --scan --pattern 'caller:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
```
### Clear agent name cache # Clear masterdata cache (departments/doctors/clinics/slots)
$SSH "$REDIS --scan --pattern 'masterdata:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
```bash # Clear recording analysis cache
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-redis-1 redis-cli --scan --pattern 'agent:name:*' | xargs -r docker exec -i fortytwo-staging-redis-1 redis-cli DEL" $SSH "$REDIS --scan --pattern 'call:analysis:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
```
### Clear all session/cache keys # Clear agent name cache
$SSH "$REDIS --scan --pattern 'agent:name:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
```bash # Nuclear: flush all sidecar Redis
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-redis-1 redis-cli FLUSHDB" $SSH "$REDIS FLUSHDB"
``` ```
--- ---
@@ -329,7 +288,8 @@ sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-stagin
## Database Access ## Database Access
```bash ```bash
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "docker exec fortytwo-staging-db-1 psql -U fortytwo -d fortytwo_staging" ssh -t -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker exec -it ramaiah-prod-db-1 psql -U fortytwo -d fortytwo_eap"
``` ```
### Useful queries ### Useful queries
@@ -354,70 +314,68 @@ JOIN core."role" r ON r.id = rt."roleId";
--- ---
## Rollback ## Troubleshooting
### Frontend rollback ### "Already logged in from another device"
The previous frontend build is overwritten. To rollback: Single-session enforcement per Ozonetel agent. Clear the lock:
1. Checkout the previous git commit ```bash
2. `npm run build` ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
3. `bash deploy.sh frontend` "docker exec ramaiah-prod-redis-ramaiah-1 redis-cli DEL agent:session:ramaiahadmin"
```
### Sidecar rollback (quick deploy) ### Agent stuck in ACW / Wrapping Up
Same as frontend — checkout previous commit, rebuild, redeploy.
### Sidecar rollback (ECR)
```bash ```bash
# Tag the current image as rollback curl -X POST https://ramaiah.engage.healix360.net/api/maint/force-ready \
# Then re-tag the previous image as :alpha -H "Content-Type: application/json" \
# Or use a specific tag/digest -d '{"agentId": "ramaiahadmin"}'
```
# On VPS: ### Telephony events not routing
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 "
cd /opt/fortytwo ```bash
docker compose restart sidecar # Check dispatcher logs
" ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"
# Check service discovery registry
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker exec ramaiah-prod-redis-telephony-1 redis-cli KEYS '*'"
```
### Theme/branding reset after Redis flush
```bash
curl -X PUT https://ramaiah.engage.healix360.net/api/config/theme \
-H "Content-Type: application/json" \
-d '{"defaults": {"brandName": "Helix Engage", "hospitalName": "Ramaiah Hospitals"}}'
``` ```
--- ---
## Theme Management ## Rollback
### View current theme ### Frontend
```bash
curl -s https://engage-api.srv1477139.hstgr.cloud/api/config/theme | python3 -m json.tool
```
### Reset theme to defaults Checkout previous commit → `npm run build` → rsync to EC2.
```bash
curl -s -X POST https://engage-api.srv1477139.hstgr.cloud/api/config/theme/reset | python3 -m json.tool
```
### Theme backups ### Sidecar
Stored on the sidecar container at `/app/data/theme-backups/`. Each save creates a timestamped backup.
Checkout previous commit → ECR build + push → pull on EC2.
For immediate rollback, re-tag a known-good ECR image as `:alpha` and pull.
--- ---
## Git Repositories ## Git Repositories
| Repo | Azure DevOps URL | Branch | | Repo | Azure DevOps | Branch |
|---|---|---| |---|---|---|
| Frontend | `https://dev.azure.com/globalhealthx/EMR/_git/helix-engage` | `dev` | | Frontend | `helix-engage` in Patient Engagement Platform | `feature/omnichannel-widget` |
| Sidecar | `https://dev.azure.com/globalhealthx/EMR/_git/helix-engage-server` | `dev` | | Sidecar | `helix-engage-server` in Patient Engagement Platform | `master` |
| SDK App | `FortyTwoApps/helix-engage/` (in fortytwo-eap monorepo) | `dev` | | SDK App | `FortyTwoApps/helix-engage/` (monorepo) | `dev` |
| Telephony | `helix-engage-telephony` in Patient Engagement Platform | `master` |
### Commit and push pattern
```bash
# Frontend
cd helix-engage
git add -A && git commit -m "feat: description" && git push origin dev
# Sidecar
cd helix-engage-server
git add -A && git commit -m "feat: description" && git push origin dev
```
--- ---
@@ -425,7 +383,7 @@ git add -A && git commit -m "feat: description" && git push origin dev
| Detail | Value | | Detail | Value |
|---|---| |---|---|
| Registry | 043728036361.dkr.ecr.ap-south-1.amazonaws.com | | Registry | `043728036361.dkr.ecr.ap-south-1.amazonaws.com` |
| Repository | fortytwo-eap/helix-engage-sidecar | | Sidecar repo | `fortytwo-eap/helix-engage-sidecar` |
| Tag | alpha | | Tag | `alpha` |
| Region | ap-south-1 (Mumbai) | | Region | `ap-south-1` (Mumbai) |

@@ -1,322 +0,0 @@
# Helix Engage — Operations Runbook
Day-to-day operations guide for deploying, debugging, and maintaining Helix Engage.
---
## Environments
| | **VPS (Global)** | **EC2 (Ramaiah)** |
|---|---|---|
| **Host** | `148.230.67.184` | `13.234.31.194` |
| **Domain** | `engage-api.srv1477139.hstgr.cloud` | `*.engage.healix360.net` |
| **Docker path** | `/opt/fortytwo` | `/opt/fortytwo` |
| **Topology** | Single-tenant | Multi-tenant (2 sidecars + telephony) |
---
## SSH Access
### VPS (Global)
```bash
sshpass -p 'SasiSuman@2007' ssh -o StrictHostKeyChecking=no root@148.230.67.184
```
### EC2 (Ramaiah)
The SSH key is at `~/Downloads/fortytwoai_hostinger` (passphrase-protected).
A decrypted copy must exist at `/tmp/ramaiah-ec2-key`.
**First-time setup (one of these):**
```bash
# Option A: Decrypt key file for later non-interactive use (openssl prompts once for the passphrase: SasiSuman@2007)
openssl pkey -in ~/Downloads/fortytwoai_hostinger -out /tmp/ramaiah-ec2-key
chmod 600 /tmp/ramaiah-ec2-key
# Option B: Add to ssh-agent (interactive — prompts for passphrase)
ssh-add ~/Downloads/fortytwoai_hostinger
```
**After setup:**
```bash
ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194
```
**Quick alias for repeated use:**
```bash
alias ec2="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
alias vps="sshpass -p 'SasiSuman@2007' ssh -o StrictHostKeyChecking=no root@148.230.67.184"
```
---
## Accounts
### Ramaiah (EC2)
| Role | Email | Password | Notes |
|------|-------|----------|-------|
| Marketing Executive | `marketing@ramaiahcare.com` | `AdRamaiah@2026` | Landing: Lead Workspace |
| Marketing Executive | `supervisor@ramaiahcare.com` | `MrRamaiah@2026` | Landing: Lead Workspace |
| CC Agent | `ccagent@ramaiahcare.com` | `CcRamaiah@2026` | Ozonetel agent: `ramaiahadmin` |
| Platform Admin | `dev@fortytwo.dev` | `tim@apple.dev` | Break-glass admin. **NEVER delete.** |
### Ozonetel
| Field | Value |
|-------|-------|
| API Key | `KK8110e6c3de02527f7243ffaa924fa93e` |
| Username | `global_healthx` |
| Ramaiah Campaign | `Inbound_918041763400` |
| Global Campaign | `Inbound_918041763265` |
| Ramaiah Agent | `ramaiahadmin` / ext `524435` |
---
## EC2 Containers
```bash
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker ps --format 'table {{.Names}}\t{{.Status}}'"
```
| Container | Purpose | Port |
|-----------|---------|------|
| `ramaiah-prod-caddy-1` | Reverse proxy + TLS | 80, 443 |
| `ramaiah-prod-server-1` | Platform API | 4000 |
| `ramaiah-prod-worker-1` | BullMQ worker | — |
| `ramaiah-prod-sidecar-ramaiah-1` | Ramaiah sidecar | 4100 |
| `ramaiah-prod-sidecar-global-1` | Global sidecar | 4100 |
| `ramaiah-prod-telephony-1` | Event dispatcher | 4200 |
| `ramaiah-prod-redis-ramaiah-1` | Ramaiah Redis | 6379 |
| `ramaiah-prod-redis-global-1` | Global Redis | 6379 |
| `ramaiah-prod-redis-telephony-1` | Telephony Redis | 6379 |
| `ramaiah-prod-redis-1` | Platform Redis | 6379 |
| `ramaiah-prod-db-1` | PostgreSQL | 5432 |
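The table above can double as a checklist. The following is a minimal sketch, assuming container names are piped in one per line (for example from `docker ps --format '{{.Names}}'`). The function name `missing_containers` is hypothetical and not part of the repo, and the expected list is copied from the table only.

```bash
# Hypothetical helper: reads running container names on stdin (one per line)
# and prints every container from the table above that is not in that list.
# Returns 0 when nothing is missing, 1 otherwise.
missing_containers() {
  running=$(cat)
  rc=0
  for name in \
    ramaiah-prod-caddy-1 ramaiah-prod-server-1 ramaiah-prod-worker-1 \
    ramaiah-prod-sidecar-ramaiah-1 ramaiah-prod-sidecar-global-1 \
    ramaiah-prod-telephony-1 ramaiah-prod-redis-ramaiah-1 \
    ramaiah-prod-redis-global-1 ramaiah-prod-redis-telephony-1 \
    ramaiah-prod-redis-1 ramaiah-prod-db-1
  do
    # -x matches the whole line, -F treats the name as a fixed string
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}

# Usage (not executed here):
#   ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
#     "docker ps --format '{{.Names}}'" | missing_containers
```

An empty result means every container in the table is up. Any printed name is a candidate for `docker compose up -d <service>`.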
---
## Checking Logs
```bash
# EC2 — Ramaiah sidecar
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 30 2>&1"
# EC2 — Telephony dispatcher
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"
# EC2 — Platform server
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-server-1 --tail 30 2>&1"
# EC2 — Caddy
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-caddy-1 --tail 20 2>&1"
# EC2 — Filter errors only
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 100 2>&1" | grep -i "error\|fail\|crash"
# VPS — Sidecar
sshpass -p 'SasiSuman@2007' ssh root@148.230.67.184 \
"docker logs fortytwo-staging-sidecar-1 --tail 30 2>&1"
```
**Healthy sidecar output:**
- `Nest application successfully started`
- `Helix Engage Server running on port 4100`
- `SessionService Redis connected`
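
The three markers above can be checked mechanically instead of by eye. This is a minimal sketch, assuming the log text is piped on stdin. The function name `sidecar_healthy` is hypothetical, and the last marker is matched as `Redis connected` so that it works with or without the bracketed logger prefix.

```bash
# Hypothetical helper: reads sidecar log text on stdin and reports whether
# all three healthy-startup markers listed above are present.
sidecar_healthy() {
  logs=$(cat)
  for marker in \
    "Nest application successfully started" \
    "Helix Engage Server running on port 4100" \
    "Redis connected"
  do
    case "$logs" in
      *"$marker"*) ;;                       # marker found, keep checking
      *) echo "missing: $marker"; return 1 ;;
    esac
  done
  echo "healthy"
}

# Usage (not executed here):
#   ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
#     "docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 200 2>&1" | sidecar_healthy
```

On a long-running container the startup lines may have scrolled past the `--tail` window, so a `missing:` result is only meaningful shortly after a restart.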
---
## Deploying
### Pre-flight checks
```bash
# Frontend type check
cd helix-engage && npx tsc --noEmit
# Sidecar build check
cd helix-engage-server && npm run build
```
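
The two checks can be wrapped so that the first failure stops the run. This is a minimal sketch. The function name `preflight` is hypothetical, and the two directory arguments are assumed to be the local checkouts of `helix-engage` and `helix-engage-server`.

```bash
# Hypothetical wrapper around the two pre-flight checks above.
# $1 = frontend checkout, $2 = sidecar checkout.
# Each check runs in a subshell so the caller's working directory is untouched.
preflight() {
  ( cd "$1" && npx tsc --noEmit ) || {
    echo "frontend type check failed" >&2
    return 1
  }
  ( cd "$2" && npm run build ) || {
    echo "sidecar build failed" >&2
    return 1
  }
  echo "preflight ok"
}

# Usage (not executed here):
#   preflight helix-engage helix-engage-server && echo "safe to deploy"
```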
### Frontend (EC2)
```bash
cd helix-engage && npm run build
rsync -avz -e "ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no" \
dist/ ubuntu@13.234.31.194:/opt/fortytwo/helix-engage-frontend/
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"cd /opt/fortytwo && sudo docker compose restart caddy"
```
### Sidecar (EC2 — via ECR)
```bash
cd helix-engage-server
# ECR login + build + push
aws ecr get-login-password --region ap-south-1 | \
docker login --username AWS --password-stdin 043728036361.dkr.ecr.ap-south-1.amazonaws.com
docker buildx build --platform linux/amd64 \
-t 043728036361.dkr.ecr.ap-south-1.amazonaws.com/fortytwo-eap/helix-engage-sidecar:alpha \
--push .
# Pull + restart on EC2
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"cd /opt/fortytwo && sudo docker compose pull sidecar-ramaiah sidecar-global && sudo docker compose up -d sidecar-ramaiah sidecar-global"
```
### VPS (Global)
```bash
cd /Users/satyasumansaridae/Downloads/fortytwo-eap
bash deploy.sh frontend # Frontend only
bash deploy.sh sidecar # Sidecar only
bash deploy.sh all # Both
```
---
## Post-Deploy: E2E Smoke Tests
```bash
cd helix-engage
# Run against EC2 (default)
npx playwright test
# Run against VPS
E2E_BASE_URL=https://engage-api.srv1477139.hstgr.cloud npx playwright test
```
27 tests covering login (invalid creds, CC Agent, Supervisor), every page
for both roles, and sign-out. The last test completes sign-out so the agent
session is released for the next run.
---
## Redis Operations
### EC2 (Ramaiah sidecar Redis)
```bash
SSH="ssh -i /tmp/ramaiah-ec2-key -o StrictHostKeyChecking=no ubuntu@13.234.31.194"
REDIS="docker exec ramaiah-prod-redis-ramaiah-1 redis-cli"
# Clear agent session lock (fixes "already logged in from another device")
$SSH "$REDIS DEL agent:session:ramaiahadmin"
# List all keys
$SSH "$REDIS KEYS '*'"
# Clear caller cache (stale patient names)
$SSH "$REDIS --scan --pattern 'caller:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
# Clear masterdata cache
$SSH "$REDIS --scan --pattern 'masterdata:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
# Clear agent name cache
$SSH "$REDIS --scan --pattern 'agent:name:*' | xargs -r docker exec -i ramaiah-prod-redis-ramaiah-1 redis-cli DEL"
# Nuclear: flush all
$SSH "$REDIS FLUSHDB"
```
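
The scan-and-delete commands above differ only in the key pattern, so the remote command can be generated. This is a minimal sketch. The function name `redis_clear_cmd` is hypothetical, and it assumes the pattern contains no single quotes.

```bash
# Hypothetical helper: prints the remote command that deletes every key
# matching a pattern in the given Redis container. It only builds the
# string; nothing is executed until the result is passed to ssh.
redis_clear_cmd() {
  container=$1
  pattern=$2
  printf "docker exec %s redis-cli --scan --pattern '%s' | xargs -r docker exec -i %s redis-cli DEL\n" \
    "$container" "$pattern" "$container"
}

# Usage (not executed here), with $SSH defined as in the block above:
#   $SSH "$(redis_clear_cmd ramaiah-prod-redis-ramaiah-1 'caller:*')"
```

Printing the command first makes it easy to review exactly what will run on the host before anything is deleted.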
### VPS (Global sidecar Redis)
```bash
SSH="sshpass -p 'SasiSuman@2007' ssh -o StrictHostKeyChecking=no root@148.230.67.184"
REDIS="docker exec fortytwo-staging-redis-1 redis-cli"
$SSH "$REDIS DEL agent:session:<agentId>"
$SSH "$REDIS FLUSHDB"
```
---
## Troubleshooting
### "Already logged in from another device"
The sidecar enforces single-session per Ozonetel agent. Clear the lock:
```bash
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker exec ramaiah-prod-redis-ramaiah-1 redis-cli DEL agent:session:ramaiahadmin"
```
### Agent stuck in ACW / Wrapping Up
Three protection layers exist (beforeunload → sendBeacon → server 30s timer).
If all fail, force-ready:
```bash
curl -X POST https://ramaiah.engage.healix360.net/api/maint/force-ready \
-H "Content-Type: application/json" \
-d '{"agentId": "ramaiahadmin"}'
```
### Container restart loop
```bash
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 50 2>&1" | grep -i "error\|fail\|crash"
```
Common causes:
- `Cannot find module` → need ECR rebuild (new dependencies)
- `UndefinedModuleException` → circular dependency in code
- `ECONNREFUSED` to Redis → Redis container down, `docker compose up -d redis-ramaiah`
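
The mapping above from log pattern to fix can be written as a small lookup. This is a minimal sketch. The function name `suggest_fix` is hypothetical, and it covers only the three causes listed here.

```bash
# Hypothetical helper: maps one log line to the fix suggested in the list above.
# Returns 1 when the line matches none of the known patterns.
suggest_fix() {
  case "$1" in
    *"Cannot find module"*)       echo "rebuild ECR image" ;;
    *"UndefinedModuleException"*) echo "fix circular dependency, redeploy" ;;
    *"ECONNREFUSED"*)             echo "docker compose up -d redis-ramaiah" ;;
    *)                            echo "unknown"; return 1 ;;
  esac
}

# Usage (not executed here): feed it the first error line from the grep above
#   suggest_fix "$(ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
#     "docker logs ramaiah-prod-sidecar-ramaiah-1 --tail 50 2>&1" | grep -i -m1 "error")"
```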
### Theme/branding reset after sidecar restart
Config is in Redis. If flushed, re-apply:
```bash
curl -X PUT https://ramaiah.engage.healix360.net/api/config/theme \
-H "Content-Type: application/json" \
-d '{"defaults": {"brandName": "Helix Engage", "hospitalName": "Ramaiah Hospitals"}}'
```
### Telephony events not routing
Check dispatcher logs and verify sidecar registration:
```bash
# Dispatcher logs
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker logs ramaiah-prod-telephony-1 --tail 30 2>&1"
# Check service discovery registry
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 \
"docker exec ramaiah-prod-redis-telephony-1 redis-cli KEYS '*'"
```
### Full DB Reset (nuclear — destroys all data)
Use this only when field metadata is missing (0 rows in `core.fieldMetadata`):
```bash
ssh -i /tmp/ramaiah-ec2-key ubuntu@13.234.31.194 << 'EOF'
cd /opt/fortytwo
sudo docker compose stop server worker
sudo docker exec ramaiah-prod-db-1 psql -U fortytwo -d fortytwo_eap -c "DELETE FROM core.workspace;"
# Find and drop orphaned workspace schemas
sudo docker exec ramaiah-prod-db-1 psql -U fortytwo -d fortytwo_eap -c "SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'workspace_%';"
# DROP SCHEMA ... CASCADE for each
sudo docker exec ramaiah-prod-redis-1 redis-cli FLUSHALL
sudo docker compose up -d server worker
EOF
```