06. Risk Assessment

1. Risk Scoring Method

Probability: 1 (low) to 5 (high)
Impact: 1 (low) to 5 (high)
Risk score = Probability x Impact
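
The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the project's tooling: the `Risk` class and its field names are assumptions made for the example, and the sample values are taken from the Top Risks table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """Hypothetical record for one register entry (names are illustrative)."""
    risk_id: str
    probability: int  # 1 (low) to 5 (high)
    impact: int       # 1 (low) to 5 (high)

    def __post_init__(self):
        # Reject values outside the 1-5 scale defined by the scoring method.
        for name in ("probability", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def score(self) -> int:
        # Risk score = Probability x Impact
        return self.probability * self.impact


# Sample entries copied from the Top Risks table.
risks = [Risk("R1", 3, 5), Risk("R4", 4, 4), Risk("R6", 3, 3)]

# Highest score first, as one would order a weekly review.
ranked = sorted(risks, key=lambda r: r.score, reverse=True)
```

With a 1 to 5 scale on both axes, scores range from 1 to 25, so R4 (16) ranks above R1 (15) and R6 (9).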

2. Top Risks

ID	Risk	Prob	Impact	Score	Mitigation	Contingency Trigger
R1	Secret exfiltration via unredacted outbound payload	3	5	15	Multi-layer redaction tests, egress deny-by-default policy, seeded canary secrets	Any unredacted canary secret seen outside tenant
R2	Command gating bypass due to misclassification	3	5	15	Deterministic policy engine, contract tests per class, human-readable reason logging	Red/Critical executes without approval in tests
R3	OpenClaw upstream changes break plugin behavior	3	4	12	Pin stable tags, adapter compatibility suite, staged upgrade canaries	Hook contract test fails against new tag
R4	Provisioner regressions reduce provisioning success	4	4	16	Idempotent checkpoints, replay tests, synthetic VPS CI	First-attempt success < 90%
R5	Billing usage mismatch vs provider costs	3	4	12	Dual-entry usage checks, nightly reconciliation jobs, alert thresholds	>1% sustained variance for 24h
R6	Mobile approval notifications delayed or dropped	3	3	9	Push retries + in-app queue fallback + email fallback	p95 approval notify > 30s
R7	Performance overhead exceeds Lite-tier budget	2	4	8	Memory profiling budget gates, disable non-essential plugins, tune browser lifecycle	LetsBe overhead > 800MB sustained
R8	Tool API churn breaks adapters	4	3	12	Adapter integration tests against pinned versions, fallback to browser playbook	Adapter failure rate > 5%
R9	Security debt from AI-generated code quality	4	4	16	Mandatory senior review on security modules, lint rules, banned-pattern checks	Critical static-analysis finding unresolved >48h
R10	Legal/compliance drift (license/source disclosure pages)	2	4	8	Automated license manifest publishing, pre-release legal checklist	Missing OSS disclosure page at RC freeze
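
As an illustration of how a contingency trigger could be made machine-checkable, the sketch below evaluates the R5 trigger (>1% sustained variance for 24h). It assumes hourly paired samples of billed usage and provider cost, and it reads "sustained" as every sample in the window exceeding the threshold. The function names and the hourly sampling are assumptions made for the example.

```python
def relative_variance(billed: float, provider: float) -> float:
    """Relative difference between billed usage and provider cost."""
    if provider == 0:
        # Assumption: any billing against zero provider cost is a full mismatch.
        return 0.0 if billed == 0 else float("inf")
    return abs(billed - provider) / provider


def variance_sustained(billed_hourly, provider_hourly,
                       threshold=0.01, window=24):
    """True if each of the last `window` hourly samples exceeds `threshold`."""
    pairs = list(zip(billed_hourly, provider_hourly))[-window:]
    if len(pairs) < window:
        return False  # not enough data to call the variance sustained
    return all(relative_variance(b, p) > threshold for b, p in pairs)
```

A single in-tolerance hour inside the window resets the trigger under this reading. A stricter or looser definition (for example, the mean over 24h) is equally plausible and would need to be agreed with billing owners.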

3. Risk Register By Domain

3.1 Security Risks

Redaction misses non-standard secret formats.
External comms gate incorrectly tied to autonomy level.
Local logs/transcripts persist raw secret material.
Local execution adapters allow shell metacharacter bypass.

3.2 Delivery Risks

Too much simultaneous change across Hub + provisioner + tenant runtime.
Underestimated migration effort from deprecated orchestrator/sysadmin behaviors.
Browser automation migration complexity for setup scripts.

3.3 Operational Risks

Dual-region Hub operations increase DB and deploy complexity.
Insufficient on-call runbooks for approval outages and provisioning failures.
Canary rollout without automated rollback criteria.

4. Mitigation Program

4.1 Pre-Launch Controls

Security invariants are encoded as executable tests (not checklist-only).
Every release candidate must pass redaction canary probes.
Dry-run provisioning must pass on both Netcup and Hetzner targets.
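
A redaction canary probe written as an executable test might look like the following. This is a sketch under stated assumptions: the canary values, the `redact` stand-in, and the helper names are hypothetical and do not describe the project's redaction layer. A real probe would route the payload through the production redaction path and inspect what reaches the egress boundary.

```python
import re

# Hypothetical seeded canary values; real ones would be generated per tenant.
CANARY_SECRETS = ["CANARY-7f3a9c1e", "CANARY-b20d44aa"]


def redact(payload: str) -> str:
    """Stand-in redactor: replaces any seeded canary with a placeholder."""
    pattern = re.compile("|".join(re.escape(s) for s in CANARY_SECRETS))
    return pattern.sub("[REDACTED]", payload)


def leaked_canaries(outbound: str) -> list[str]:
    """Return every seeded canary that still appears in an outbound payload."""
    return [s for s in CANARY_SECRETS if s in outbound]


def test_outbound_payload_has_no_canaries():
    # The probe seeds a canary, runs redaction, and fails on any survivor.
    raw = "token=CANARY-7f3a9c1e; note=hello"
    assert leaked_canaries(redact(raw)) == []
```

Because the check is an assertion rather than a checklist item, a release candidate that leaks a canary fails the test run directly.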

4.2 Runtime Controls

Alert on heartbeat freshness degradation.
Alert on approval queue lag and expiration spikes.
Alert on sudden drop in cache-read ratio (cost anomaly indicator).
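
Two of the alerts above reduce to simple threshold checks. The sketch below shows one possible shape. The 90-second heartbeat age and the 50% relative drop are placeholder thresholds chosen for the example, not values from this plan, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_is_stale(last_seen: datetime, now: datetime,
                       max_age: timedelta = timedelta(seconds=90)) -> bool:
    """True when the last heartbeat is older than the allowed age."""
    return now - last_seen > max_age


def cache_ratio_dropped(baseline: float, current: float,
                        max_relative_drop: float = 0.5) -> bool:
    """True when the cache-read ratio fell by more than the allowed fraction."""
    if baseline <= 0:
        return False  # no meaningful baseline to compare against
    return (baseline - current) / baseline > max_relative_drop


now = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
stale = heartbeat_is_stale(now - timedelta(minutes=5), now)
dropped = cache_ratio_dropped(baseline=0.8, current=0.3)
```

Comparing against a rolling baseline rather than a fixed floor keeps the cache-read alert focused on sudden drops, which is what makes it useful as a cost anomaly indicator.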

4.3 Governance Controls

Security design review required for changes in Safety Wrapper, redaction, or secrets flows.
Migration freeze on deprecated paths after Phase 0.
Weekly risk review with updated probability/impact re-scoring.

5. Launch Go/No-Go Risk Gates

Do not launch if any of the following conditions is true:

unresolved severity-1 security defect
redaction tests fail for any supported secret class
command gating matrix not fully passing
usage reconciliation error >1% over 72h canary
provisioning first-attempt success below 85% in final week
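
The gates above can be evaluated mechanically so that the go/no-go decision is reproducible. The following is a minimal sketch: the `LaunchMetrics` fields and the `blocking_conditions` function are hypothetical names, and only the 1% and 85% thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class LaunchMetrics:
    """Hypothetical snapshot of the inputs to the launch decision."""
    open_sev1_security_defects: int
    failing_redaction_classes: int     # supported secret classes with failures
    gating_matrix_failures: int        # failing cells in the gating matrix
    usage_reconciliation_error: float  # fraction, over the 72h canary
    first_attempt_success: float       # fraction, over the final week


def blocking_conditions(m: LaunchMetrics) -> list[str]:
    """Return the list of gates that block launch (empty means go)."""
    blockers = []
    if m.open_sev1_security_defects > 0:
        blockers.append("unresolved severity-1 security defect")
    if m.failing_redaction_classes > 0:
        blockers.append("redaction tests failing")
    if m.gating_matrix_failures > 0:
        blockers.append("command gating matrix not fully passing")
    if m.usage_reconciliation_error > 0.01:
        blockers.append("usage reconciliation error >1%")
    if m.first_attempt_success < 0.85:
        blockers.append("provisioning first-attempt success below 85%")
    return blockers


candidate = LaunchMetrics(0, 0, 0, 0.004, 0.83)
print(blocking_conditions(candidate))
```

Returning the full list of blockers, rather than a single boolean, gives the review a concrete record of which gate failed.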