Incident Response Playbook

Africa Watch — Savvy Ventures Limited  ·  Last updated:

Severity Matrix & SLAs

Severity | Description | Detect SLA | Contain SLA | Examples
P0 Critical | Platform down, auth breach, mass data exposure, confirmed misinfo causing field harm | <15 min | <1 hour | JWT secret leaked, DB exposed publicly, XSS exfiltrating tokens
P1 High | Significant feed contamination, auth bypass, critical route returning wrong data | <1 hour | <4 hours | Chad-person articles in live feed, admin route accessible to free user
P2 Medium | Degraded accuracy, partial service failure, elevated false-positive rate | <4 hours | <24 hours | Social search returning >30% irrelevant results, LLM analysis timing out
P3 Low | Minor UI defects, non-critical metric drift, cosmetic issues | <24 hours | <72 hours | Confidence badge missing on some items, timestamp formatting wrong
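
The SLA columns above can be encoded as data so that tooling can compute deadlines. The following is a minimal sketch, not part of the platform: the helper names containDeadline and isContainBreached are hypothetical, and it assumes the containment clock starts at the moment of detection, which the matrix does not state explicitly.

```javascript
// SLA targets from the severity matrix, expressed in minutes.
const SLA_MINUTES = {
  P0: { detect: 15, contain: 60 },
  P1: { detect: 60, contain: 240 },
  P2: { detect: 240, contain: 1440 },
  P3: { detect: 1440, contain: 4320 },
};

// Hypothetical helper: latest acceptable containment time for an incident.
// Assumption: the contain clock starts when the incident is detected.
function containDeadline(severity, detectedAt) {
  const sla = SLA_MINUTES[severity];
  if (!sla) throw new Error(`Unknown severity: ${severity}`);
  return new Date(detectedAt.getTime() + sla.contain * 60 * 1000);
}

// Hypothetical helper: true once the containment SLA has been missed.
// The matrix uses strict "<" targets, so reaching the deadline counts as a breach.
function isContainBreached(severity, detectedAt, now = new Date()) {
  return now.getTime() >= containDeadline(severity, detectedAt).getTime();
}
```

For example, a P1 detected at 10:00 UTC would have a containment deadline of 14:00 UTC under these assumptions.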

Escalation RACI

Incident Type | Responsible | Accountable | Consulted | Informed
Data integrity / feed contamination | Backend Engineer | Engineering Lead | Data/Intel Lead | All users via status page
Auth / access control breach | Security Lead | CISO / Founder | Backend Engineer | Affected users, legal if data exposed
Misinfo injection / LLM manipulation | Data/Intel Lead | Engineering Lead | Security Lead | Field operators using affected country data
Platform outage (P0) | DevOps/SRE | Engineering Lead | All engineers | All users, management

Playbooks

Playbook 1 — Feed Contamination / Geo-Disambiguation Failure P1

Trigger

Live Incident Feed or social-search fallback showing sports/entertainment/person-name content for an African country. Example: MLB articles appearing for Chad, music articles for Mali.

Response Steps

  1. Detect: Monitor /health and check the meta object returned by /social-search?q=Chad&county=Chad for the dropped_geo_irrelevant counter. If the counter is 0 while the feed shows bad content, the filter is not firing.
  2. Isolate: Check if AMBIGUOUS_COUNTRY_NAMES Set includes the affected country. Check if isGeoRelevant and topicScore are being called in the affected code path.
  3. Contain: If an immediate fix is not possible, temporarily disable the fallback path by returning an empty array for the affected country until the fix is deployed.
  4. Fix: Add country to AMBIGUOUS_COUNTRY_NAMES if missing. Verify fix covers all three paths: monitor ingestion, /africa/events, /social-search.
  5. Validate: Run a manual query such as curl "/social-search?q=Chad%20Africa&county=Chad" (space URL-encoded) and confirm that meta shows dropped_geo_irrelevant > 0 and that the results are geopolitical.
  6. Deploy & monitor: Deploy fix via scp + pm2 restart. Watch PM2 logs for 10 minutes post-deploy.

Evidence to preserve: a screenshot of the feed, the browser network tab showing the API response JSON with the results array, and a server log excerpt showing the dropped counters.

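The filter behaviour that steps 1, 2 and 4 of Playbook 1 check for can be sketched as follows. This is an illustration under stated assumptions, not the platform's actual code: the contents of AMBIGUOUS_COUNTRY_NAMES, the GEO_CONTEXT_TERMS keyword list, the article shape, and the filterByGeo wrapper are all hypothetical. Only the identifiers AMBIGUOUS_COUNTRY_NAMES, isGeoRelevant and dropped_geo_irrelevant come from the playbook.

```javascript
// Countries whose names collide with person names or other topics.
// Illustrative contents only; the real set lives in the application code.
const AMBIGUOUS_COUNTRY_NAMES = new Set(["chad", "mali"]);

// Hypothetical context keywords; the real relevance logic is not shown in the playbook.
const GEO_CONTEXT_TERMS = ["africa", "n'djamena", "bamako", "sahel", "government", "military"];

// For ambiguous names, require at least one geopolitical context term in the text.
function isGeoRelevant(article, country) {
  if (!AMBIGUOUS_COUNTRY_NAMES.has(country.toLowerCase())) return true;
  const text = `${article.title} ${article.summary || ""}`.toLowerCase();
  return GEO_CONTEXT_TERMS.some((term) => text.includes(term));
}

// Hypothetical wrapper: filters articles and reports how many were dropped,
// mirroring the dropped_geo_irrelevant counter inspected in step 1.
function filterByGeo(articles, country) {
  const meta = { dropped_geo_irrelevant: 0 };
  const results = articles.filter((article) => {
    const keep = isGeoRelevant(article, country);
    if (!keep) meta.dropped_geo_irrelevant += 1;
    return keep;
  });
  return { results, meta };
}
```

If a country is missing from the set, every article passes and the counter stays at 0, which matches the "filter is not firing" symptom described in step 1.
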
Playbook 2 — Authentication / Authorization Breach P0

Trigger

Unauthorized access to admin routes, JWT tokens accepted after revocation, API key bypass, privilege escalation from free-tier to admin.

Response Steps

  1. Detect: Check audit log at /audit for unexpected admin actions. Check /admin/route-matrix to confirm all admin routes require requireRole('admin').
  2. Contain immediately: If active breach suspected — rotate JWT_SECRET in .env and restart server. This invalidates ALL active sessions (all users must re-login).
  3. Rotate secrets: Generate new JWT_SECRET (32+ bytes), new WEBHOOK_SIGNING_KEY. Update /opt/africa-watch/.env. Restart: pm2 restart africa-watch.
  4. Audit accounts: Query SELECT * FROM users WHERE role='admin' — verify no unexpected admin accounts.
  5. Review logs: Pull full audit log for past 48h. Identify all actions taken by compromised session/token.
  6. Root cause: Check if bypass was via x-api-key header (should now return 401 "not configured"), stale JWT, or role escalation.
  7. Notify: If user data was accessed — notify affected users within 72h per data protection obligations.

Evidence: audit_log table export, PM2 logs from the incident window, the compromised JWT (decode at jwt.io for claims), and IP addresses from logs.

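Two of the steps in Playbook 2 lend themselves to a short sketch: generating a replacement secret of at least 32 bytes (step 3) and the kind of role guard that step 1 expects on every admin route. The code below is an assumption-laden illustration, not the platform's implementation: generateSecret is a hypothetical helper, and this requireRole assumes an Express-style app in which earlier middleware has already verified the JWT and set req.user.role.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: returns a hex-encoded random secret.
// Defaults to 48 bytes, above the 32-byte minimum named in step 3.
function generateSecret(bytes = 48) {
  if (bytes < 32) throw new Error("Secret must be at least 32 bytes");
  return crypto.randomBytes(bytes).toString("hex");
}

// Sketch of an Express-style role guard.
// Assumption: upstream auth middleware verified the token and set req.user = { role }.
function requireRole(role) {
  return (req, res, next) => {
    if (!req.user) return res.status(401).json({ error: "unauthenticated" });
    if (req.user.role !== role) return res.status(403).json({ error: "forbidden" });
    return next();
  };
}
```

A route protected this way would be registered as, for example, app.get("/admin/route-matrix", requireRole("admin"), handler), so a free-tier token reaches the guard and is rejected with 403 before the handler runs.
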
Playbook 3 — LLM Misinfo Injection / Prompt Manipulation P1

Trigger

LLM analysis output contains fabricated events, contradicts known facts, or shows signs of prompt injection (unusual instruction-following tone, policy-violating content, off-topic analysis).

Response Steps

  1. Detect: Cross-reference LLM output against raw articles in the analysis modal. If LLM claims X but no source article supports it — likely hallucination or injection.
  2. Isolate: Capture the exact prompt sent to the LLM: add temporary debug logging to buildPrompt() in llm-analysis.js.
  3. Check inputs: Review the article text that fed the prompt. Check sanitizePromptInput() was applied. Look for injection patterns: "Ignore previous", "System:", "You are now".
  4. Contain: If active injection detected — add the offending article source to BLOCKED_DOMAINS in /social-search. Clear _explainCache in memory (restart server).
  5. Review sanitizer: Update INJECTION_PATTERN regex in security-middleware.js to catch new pattern.
  6. Validate: Re-run the affected location's analysis and confirm output is grounded in cited articles only.

Evidence: the LLM prompt (log it), the LLM response verbatim, the offending source article URL, and the browser console showing the analysis API response.
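
The input check in step 3 and the sanitizer update in step 5 of Playbook 3 can be sketched as below. This is an assumption, not the contents of security-middleware.js: the regex covers only the three phrases the playbook names, findInjectionPatterns is a hypothetical helper, and replacing matches with "[removed]" is one possible policy among several (rejecting the article outright is another).

```javascript
// Illustrative pattern covering only the phrases named in the playbook:
// "Ignore previous", "System:", "You are now". Word boundaries avoid
// matching inside longer words such as "ecosystem:".
const INJECTION_PATTERN = /\bignore\s+previous\b|\bsystem\s*:|\byou\s+are\s+now\b/gi;

// Hypothetical helper: lists every suspicious phrase found in the article text.
function findInjectionPatterns(text) {
  return [...text.matchAll(INJECTION_PATTERN)].map((match) => match[0]);
}

// Sketch of a sanitizer: neutralises matched phrases before prompt assembly.
function sanitizePromptInput(text) {
  return text.replace(INJECTION_PATTERN, "[removed]");
}
```

When a new injection phrase is observed during an incident, it would be added as another alternative in the pattern and the affected analysis re-run, as in step 6.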

Communications Templates

Internal (Slack / WhatsApp)

🚨 INCIDENT DECLARED — [P0/P1/P2] [SHORT TITLE]
Time detected: [HH:MM UTC]
Affected: [system/feature/users]
Current status: [investigating / contained / resolved]
Incident lead: [@name]
Next update: [HH:MM UTC]
Thread for updates ↓

External (User-Facing Status)

We are aware of an issue affecting [feature] on the Africa Watch platform. Our team is investigating and working to resolve this as quickly as possible.
Current status: [Investigating / Fix deployed / Monitoring]
We will provide an update by [time]. We apologise for any inconvenience.
— Africa Watch Team

Drill Log

Date | Playbook | Participants | Outcome | Actions Raised