LetsBe Biz — Risk Assessment

Date: February 27, 2026
Team: Claude Opus 4.6 Architecture Team
Document: 06 of 09
Status: Proposal (competing with independent team)

Table of Contents

1. Risk Matrix Overview
2. HIGH Risks
3. MEDIUM Risks
4. LOW Risks
5. Known Unknowns
6. Security-Specific Risks
7. Business & Operational Risks
8. Dependency Risks
9. Risk Monitoring Plan

1. Risk Matrix Overview

Scoring

Impact: How bad is it if this happens? (1-5, where 5 = catastrophic)
Likelihood: How likely is it? (1-5, where 5 = almost certain)
Risk Score: Impact × Likelihood
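Severity: HIGH (≥15), MEDIUM (8-14), LOW (≤7). Where a risk's severity is overridden (for example, elevated because of impact severity, or lowered because a fix is already planned), the override and its reason are noted next to the score.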
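
The scoring rule above is plain arithmetic, so it can be sketched in a few lines. The following Python is illustrative only: the `Risk` class and `severity` function are hypothetical names, not part of the LetsBe codebase, and the sketch computes the raw band before any manual override.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    risk_id: str
    impact: int      # 1-5, where 5 = catastrophic
    likelihood: int  # 1-5, where 5 = almost certain

    @property
    def score(self) -> int:
        # Risk Score = Impact x Likelihood
        return self.impact * self.likelihood


def severity(score: int) -> str:
    """Map a raw risk score to the severity bands defined above."""
    if score >= 15:
        return "HIGH"
    if score >= 8:
        return "MEDIUM"
    return "LOW"


h1 = Risk("H1", impact=5, likelihood=3)
l7 = Risk("L7", impact=3, likelihood=1)
print(h1.risk_id, h1.score, severity(h1.score))
print(l7.risk_id, l7.score, severity(l7.score))
```

Risks such as H2 (raw score 10) land in the MEDIUM band under this rule and reach HIGH only through the documented override.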

Summary

Severity	Count	Action Required
HIGH	6	Active mitigation required; block launch if unresolved
MEDIUM	9	Mitigation planned; monitor weekly
LOW	7	Accepted; monitor monthly

2. HIGH Risks

H1 — Secrets Redaction Bypass

Attribute	Value
Impact	5 (Catastrophic — customer secrets sent to LLM provider)
Likelihood	3 (Possible — novel encoding/nesting could evade patterns)
Risk Score	15
Category	Security

Description: The 4-layer redaction pipeline (Aho-Corasick → regex → entropy → JSON keys) may fail to catch secrets in edge cases: base64-encoded values, URL-encoded strings, secrets split across multiple JSON fields, secrets embedded in error messages from tools, or secrets in non-UTF-8 encodings.

Mitigation:

TDD approach — write adversarial tests BEFORE implementation (Phase 1, week 3)
Adversarial testing matrix from Technical Architecture §19.2: Unicode edge cases, base64, URL-encoded, nested JSON, YAML, log output
Shannon entropy filter (Layer 3) as catch-all for unknown patterns (≥4.5 bits/char, ≥32 chars)
Dedicated security audit in Phase 4 (week 13) with crafted bypass payloads
Post-launch: bug bounty program for redaction bypass (internal at first, public later)
Monitoring: log all redaction events; alert on suspiciously high entropy in outbound LLM calls
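
As a rough illustration of the Layer 3 check, the sketch below computes Shannon entropy in bits per character and applies the two thresholds quoted above (≥4.5 bits/char, ≥32 chars). It is written in Python for brevity; the function names are hypothetical, and how the real pipeline tokenizes input before this check is not specified here.

```python
import math
from collections import Counter

MIN_LENGTH = 32          # threshold from the mitigation list above
MIN_BITS_PER_CHAR = 4.5  # threshold from the mitigation list above


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_high_entropy(token: str) -> bool:
    """Flag tokens that are both long enough and random-looking enough."""
    return len(token) >= MIN_LENGTH and shannon_entropy(token) >= MIN_BITS_PER_CHAR
```

One consequence worth testing explicitly: a hex-encoded string uses only 16 symbols, so its entropy cannot exceed 4 bits per character and it will never trip a 4.5 bits/char threshold. Hex-encoded secrets therefore have to be caught by the earlier layers.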

Residual risk: MEDIUM after mitigation. The entropy filter is the safety net, but it has false-positive trade-offs.

H2 — OpenClaw Hook Gap (before_tool_call not bridged to external plugins)

Attribute	Value
Impact	5 (Catastrophic — Safety Wrapper cannot intercept tool calls)
Likelihood	2 (Unlikely — we've already planned for this via separate process)
Risk Score	10 → Elevated to HIGH due to impact severity
Category	Technical / Dependency

Description: The Technical Architecture v1.2 proposes the Safety Wrapper as an in-process OpenClaw extension using before_tool_call / after_tool_call hooks. Our analysis (GitHub Discussion #20575) found these hooks are NOT bridged to external plugins — they only work for bundled/internal hooks. This means the in-process extension model proposed in the Technical Architecture does not work as documented.

Mitigation:

Already addressed: Our architecture uses the Safety Wrapper as a SEPARATE PROCESS (localhost:8200). OpenClaw's tool calls are configured to route through the Safety Wrapper's HTTP API, not through in-process hooks.
OpenClaw's exec tool is configured to call the Safety Wrapper's execute endpoint instead of running commands directly.
OpenClaw's model provider is configured to proxy through the Secrets Proxy (localhost:8100) for LLM calls.
This approach is hook-independent — it works regardless of OpenClaw's internal hook architecture.

Residual risk: LOW after mitigation. The separate-process architecture was specifically designed to avoid this risk.

H3 — OpenClaw Upstream Breaking Changes

Attribute	Value
Impact	4 (Major — could break tool routing, sessions, or agent management)
Likelihood	4 (Likely — OpenClaw is actively developed with calendar-versioned releases)
Risk Score	16
Category	Dependency

Description: OpenClaw uses calendar versioning (for example, 2026.2.6-3) and is under active development. Breaking changes to the config format, tool system, session management, or API could break our integration. Our review of the v1.2 architecture has already surfaced one such integration problem: the hook bridging gap described in H2.

Mitigation:

Pin to a specific release tag (e.g., v2026.2.6-3). Never float to latest.
Monthly review of OpenClaw releases during development; quarterly post-launch.
Staging-first rollout: test new releases on staging VPS before any production deployment.
Canary deployment: staging → 5% → 25% → 100% (see 03-DEPLOYMENT-STRATEGY).
Maintain a compatibility test suite: 20-30 tests verifying our integration points (tool routing, LLM proxy, session management, config loading).
Document all integration points in a single "OpenClaw Integration Surface" document.

Residual risk: MEDIUM. We control the pin, but upstream changes may require adaptation work that delays feature development.

H4 — Provisioner Reliability (Zero Tests)

Attribute	Value
Impact	5 (Catastrophic — new customers can't be onboarded)
Likelihood	3 (Possible — 4,477 LOC Bash with zero tests, complex SSH-based provisioning)
Risk Score	15
Category	Technical

Description: The provisioner (letsbe-provisioner) is ~4,477 LOC of Bash scripts with zero automated tests. It performs 10-step SSH-based provisioning including Docker deployment, secret generation, nginx configuration, and SSL certificate setup. Any failure in this pipeline blocks new customer onboarding. The step 10 rewrite (replacing orchestrator/sysadmin with OpenClaw/Safety Wrapper) adds significant risk.

Mitigation:

Containerized integration test: run provisioner inside Docker against a test VPS (or mock SSH target). Phase 4, week 14.
Incremental testing during development: test each provisioner step independently.
Keep the existing provisioner working alongside the new step 10 until verified.
Pre-provisioned server pool: have 3-5 servers ready so provisioner failures don't block immediate customer needs.
Rollback procedure: if new provisioner fails, manually deploy the existing stack and convert later.
Manual verification checklist for the first 5 provisioning runs.

Residual risk: MEDIUM. The lack of automated tests is a persistent concern, but manual verification and the pre-provisioned pool mitigate the immediate impact.

H5 — CVE-2026-25253 (Cross-Site WebSocket Hijacking in OpenClaw)

Attribute	Value
Impact	4 (Major — potential unauthorized session access)
Likelihood	2 (Unlikely — patched in v2026.1.29, but must verify pin includes fix)
Risk Score	8 → Elevated to HIGH due to security nature
Category	Security / Dependency

Description: CVE-2026-25253 (CVSS 8.8) is a cross-site WebSocket hijacking vulnerability in OpenClaw. Patched 2026-01-29. Our pinned version (v2026.2.6-3) includes the fix, but any downgrade or use of an older version would reintroduce it.

Mitigation:

Verify pinned version ≥ v2026.1.29 during CI build (automated check).
OpenClaw bound to loopback (127.0.0.1) — not exposed to external network, reducing attack surface.
Run openclaw security audit --deep during provisioning (catches known CVEs).
Include CVE check in monthly OpenClaw review process.
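
The automated CI check could look roughly like the Python sketch below. The tag format is inferred from the two versions quoted in this section (v2026.1.29 and v2026.2.6-3); the function names are hypothetical, and how the pinned tag reaches the check (environment variable, lockfile, Compose file) is left open.

```python
import re

MIN_PATCHED = "v2026.1.29"  # first release with the CVE-2026-25253 fix, per this section

# Assumed tag shape: optional "v", then YYYY.M.D, then an optional "-N" revision.
_VERSION_RE = re.compile(r"^v?(\d{4})\.(\d{1,2})\.(\d{1,2})(?:-(\d+))?$")


def parse_calver(tag: str) -> tuple[int, int, int, int]:
    """Parse a calendar-versioned tag into a tuple that compares numerically."""
    match = _VERSION_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    year, month, day, revision = match.groups()
    return int(year), int(month), int(day), int(revision or 0)


def includes_fix(pinned: str, minimum: str = MIN_PATCHED) -> bool:
    """Return True if the pinned tag is at or after the patched release."""
    return parse_calver(pinned) >= parse_calver(minimum)


print(includes_fix("v2026.2.6-3"))
```

Comparing parsed integers rather than raw strings matters here: as strings, "v2026.1.3" sorts after "v2026.1.29", which would let an unpatched version pass.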

Residual risk: LOW after mitigation. Loopback binding means external exploitation requires prior VPS access.

H6 — Single Point of Failure: Safety Wrapper Lead

Attribute	Value
Impact	4 (Major — critical path stalls; no one else understands security layer)
Likelihood	3 (Possible — single senior engineer on core IP)
Risk Score	12 → Elevated to HIGH due to critical path impact
Category	Organizational

Description: The Safety Wrapper is the core IP and critical path item. It requires a senior engineer with security expertise. If this person is unavailable (illness, departure, burnout), the entire project stalls.

Mitigation:

Pair programming on all safety-critical code (classification, redaction, injection).
Weekly architecture reviews where the second engineer (Hub or DevOps) reviews Safety Wrapper changes.
Comprehensive documentation: every design decision, every edge case, every test rationale.
Cross-training: Hub Backend engineer should be able to make minor Safety Wrapper changes by week 8.
Code review culture: no Safety Wrapper PR merges without review from at least one other engineer.

Residual risk: MEDIUM. Documentation and cross-training reduce bus factor from 1 to ~1.5 by week 8.

3. MEDIUM Risks

M1 — Mobile App Platform Inconsistencies

Attribute	Value
Impact	3 (Moderate — degraded experience on one platform)
Likelihood	4 (Likely — iOS/Android differences are common with Expo)
Risk Score	12
Category	Technical

Description: Expo Bare Workflow mitigates many platform differences, but push notification behavior, background app refresh, secure storage, and SSE streaming can differ between iOS and Android.

Mitigation:

Test on both platforms from week 9 (not just week 14).
Focus on Android first (more forgiving platform for initial testing), polish iOS separately.
Use Expo's managed push notification service (Expo Push) which abstracts APNs/FCM differences.
Secure storage: use expo-secure-store which wraps Keychain (iOS) and EncryptedSharedPreferences (Android).
Keep mobile app simple for v1 — chat, approvals, basic dashboard. Advanced features post-launch.

M2 — Stripe Billing Meters Complexity

Attribute	Value
Impact	3 (Moderate — billing inaccurate or overage not triggered)
Likelihood	3 (Possible — Stripe Billing Meters API is relatively new)
Risk Score	9
Category	Technical

Description: Token overage billing requires Stripe Billing Meters to track usage and generate invoices. This API is newer and has less community documentation than standard Stripe subscriptions.

Mitigation:

Prototype Stripe Billing Meters in week 1-2 (during Prisma model planning) — verify the API works as expected.
Fallback: if Billing Meters are too complex, use Stripe usage records on subscription items (older, well-documented API).
Overage billing is in the scope cut table — can be deferred (hard stop at pool limit instead).

M3 — Tool API Stability

Attribute	Value
Impact	3 (Moderate — specific tool becomes unusable until cheat sheet updated)
Likelihood	3 (Possible — open-source tools update APIs between major versions)
Risk Score	9
Category	Technical

Description: Cheat sheets document specific API endpoints for tools like Portainer, Nextcloud, Chatwoot, etc. If a tool updates its API (breaking changes), the agent's cheat sheet becomes inaccurate, causing failed API calls.

Mitigation:

Pin Docker image versions for all tools (already done in provisioner Compose files).
Cheat sheets include tool version they were tested against.
Agent behavior: if API call fails, retry with browser fallback automatically.
Post-launch: automated cheat sheet validation tests (curl against running tools, verify endpoints return expected shapes).

M4 — Hub Performance Under Tenant Load

Attribute	Value
Impact	3 (Moderate — slow approvals, delayed heartbeats)
Likelihood	3 (Possible — Hub was designed for admin use, not 100+ tenant heartbeats)
Risk Score	9
Category	Technical

Description: The Hub currently handles admin dashboard requests. With 100+ tenants sending heartbeats every 60 seconds, token usage every hour, approval requests, and customer portal requests, the load profile changes significantly.

Mitigation:

Heartbeat endpoint must be lightweight: accept payload, queue for async processing, return 200 immediately.
Database: add indexes on ServerConnection.status, TokenUsageBucket.periodId, CommandApproval.status.
Connection pooling: Prisma's default connection pool (10 connections) may need to increase.
Load test with simulated tenants before launch (week 14-15).
Horizontal scaling: Hub runs behind nginx — add second instance if needed (session storage is JWT, no sticky sessions required).
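
The "accept, queue, return 200" pattern for the heartbeat endpoint can be sketched as follows. This is a Python illustration, not the Hub's implementation: the in-memory queue, the `processed` list standing in for database writes, and the 503 response when the queue is full are all assumptions made for the example.

```python
import queue
import threading

# Bounded queue so a backlog cannot grow without limit (size is an assumption).
heartbeats: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)
processed: list = []  # stand-in for rows written to the database


def accept_heartbeat(payload: dict) -> int:
    """Request path: enqueue the payload and return an HTTP status immediately."""
    try:
        heartbeats.put_nowait(payload)
    except queue.Full:
        return 503  # assumption: shed load rather than block the request
    return 200


def _drain() -> None:
    """Background path: the slow work (database writes) would happen here."""
    while True:
        payload = heartbeats.get()
        processed.append(payload)
        heartbeats.task_done()


def start_worker() -> threading.Thread:
    """Start the background consumer as a daemon thread."""
    worker = threading.Thread(target=_drain, daemon=True)
    worker.start()
    return worker
```

The request handler never touches the database, so its latency stays flat as tenant count grows; the trade-off is that queued heartbeats are lost if the process dies before the worker drains them.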

M5 — Secrets Proxy Latency Impact

Attribute	Value
Impact	3 (Moderate — noticeable delay in agent responses)
Likelihood	3 (Possible — 4-layer pipeline on every LLM call)
Risk Score	9
Category	Performance

Description: Every LLM call routes through the Secrets Proxy, which runs 4 layers of redaction. With 50+ secrets in the registry, the Aho-Corasick pattern matching, regex scanning, entropy analysis, and JSON key scanning must complete within the 10ms latency budget.

Mitigation:

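Aho-Corasick matching is O(n + z), where n is the input length and z is the number of matches; once the automaton is built, scan time does not grow with the number of patterns. This is inherently fast.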
Pre-compile regex patterns at startup, not per-request.
Entropy filter only runs on strings ≥32 chars that weren't caught by earlier layers.
Benchmark at startup: if latency exceeds 10ms with the current secret count, log a warning.
Cache the Aho-Corasick automaton and rebuild it only when secrets change, not per-request.
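
The "build once, rebuild only on change" idea can be sketched as below. Note the substitution: to keep the example short and dependency-free, a compiled regular-expression alternation stands in for the Aho-Corasick automaton. The class name, the `[REDACTED]` placeholder, and the `rebuild_count` counter are hypothetical.

```python
import re


class SecretMatcher:
    """Holds a compiled matcher and rebuilds it only when the secret set changes."""

    def __init__(self):
        self._secrets = frozenset()
        self._pattern = None
        self.rebuild_count = 0  # exposed so tests can confirm caching works

    def update_secrets(self, secrets) -> None:
        new = frozenset(s for s in secrets if s)
        if new == self._secrets:
            return  # unchanged: keep the existing compiled matcher
        self._secrets = new
        if new:
            # Longest first, so a secret that contains another is matched whole.
            ordered = sorted(new, key=len, reverse=True)
            self._pattern = re.compile("|".join(re.escape(s) for s in ordered))
        else:
            self._pattern = None
        self.rebuild_count += 1

    def redact(self, text: str) -> str:
        """Per-request path: uses the cached matcher, never rebuilds."""
        if self._pattern is None:
            return text
        return self._pattern.sub("[REDACTED]", text)


matcher = SecretMatcher()
matcher.update_secrets({"hunter2", "s3cr3t-token"})
print(matcher.redact("login with hunter2"))
```

Separating `update_secrets` from `redact` keeps the expensive build step off the request path, which is what the 10ms budget depends on.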

M6 — LLM Provider Reliability

Attribute	Value
Impact	3 (Moderate — agents unable to respond during outage)
Likelihood	4 (Likely — OpenRouter/Anthropic/Google have periodic outages)
Risk Score	12
Category	External Dependency

Description: If the LLM provider (OpenRouter or direct provider) goes down, agents cannot respond. This directly impacts user experience.

Mitigation:

OpenClaw's native model failover chains: primary → fallback1 → fallback2.
Auth profile rotation before model fallback (OpenClaw native feature).
Graceful degradation: agent reports "I'm having trouble reaching my AI backend right now. I'll try again in a few minutes."
Heartbeat keep-warm (heartbeat.every: "55m") prevents cold starts after brief outages.
Multiple OpenRouter API keys for rate limit distribution.

M7 — Config.json Plaintext Password (Existing Critical Bug)

Attribute	Value
Impact	4 (Major — root password exposed on provisioned servers)
Likelihood	5 (Almost certain — it's a known issue documented in the repo analysis)
Risk Score	20 → Classified as MEDIUM because fix is already planned
Category	Security

Description: The provisioner's config.json contains the root password in plaintext after provisioning. This is a known issue from the repo analysis.

Mitigation:

Already in scope: Task 11.3 in implementation plan — 0.5 day effort in week 11.
Fix: delete config.json after provisioning completes (or redact sensitive fields).
Additional: ensure config.json is not committed to any git repository.
Verify fix during provisioner integration testing (week 14).

M8 — Token Metering Accuracy

Attribute	Value
Impact	3 (Moderate — billing disputes, lost revenue, or overcharges)
Likelihood	3 (Possible — token counting varies by provider, model, and caching)
Risk Score	9
Category	Business

Description: Token metering captures counts from OpenRouter response headers. But different providers count tokens differently (e.g., cache-read vs. cache-write, system prompt tokens, tool use tokens). Inaccurate metering leads to billing disputes or revenue leakage.

Mitigation:

Trust OpenRouter's x-openrouter-usage headers as source of truth (they normalize across providers).
Track input/output/cache-read/cache-write separately (OpenClaw native).
Reconciliation: compare Safety Wrapper's local aggregation with OpenRouter's billing dashboard monthly.
Buffer: include a 5% tolerance in pool tracking to handle rounding differences.
Alert on anomalies: if hourly usage spikes >3× average, flag for investigation.
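
Two of the checks above reduce to short comparisons, sketched here in Python. The function names are hypothetical, and applying the 5% tolerance to the monthly reconciliation (rather than to pool tracking itself) is an assumption made for the example.

```python
SPIKE_FACTOR = 3.0  # alert when hourly usage exceeds 3x the average
TOLERANCE = 0.05    # 5% tolerance for rounding differences


def is_spike(hourly_usage: float, history: list) -> bool:
    """Return True if this hour's usage exceeds SPIKE_FACTOR times the historical mean."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    average = sum(history) / len(history)
    return hourly_usage > SPIKE_FACTOR * average


def within_tolerance(local_total: float, provider_total: float,
                     tolerance: float = TOLERANCE) -> bool:
    """Compare the locally aggregated token count with the provider's figure."""
    return abs(local_total - provider_total) <= tolerance * provider_total


print(is_spike(400, [100, 100, 100]))
print(within_tolerance(1040, 1000))
```

A plain mean is sensitive to a single earlier spike inflating the baseline; a median or a rolling window may be a better choice once real usage data is available.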

M9 — n8n Cleanup Completeness

Attribute	Value
Impact	2 (Minor — leftover references cause confusion, not functional failure)
Likelihood	4 (Likely — n8n references are scattered across provisioner, compose, scripts)
Risk Score	8
Category	Technical Debt

Description: n8n was removed from the tool stack (Sustainable Use License issue), but references remain in Playwright scripts, Docker Compose stacks, adapter code, and config files. Incomplete cleanup leads to provisioning errors or wasted container resources.

Mitigation:

Comprehensive grep: grep -rn "n8n" letsbe-provisioner/ — enumerate all references.
Remove systematically: Compose services, nginx configs, Playwright scripts, environment templates, tool registry entries.
Verify: run provisioner on staging after cleanup — confirm no n8n containers start.
Replace in tool inventory: n8n's P1 cheat sheet slot → Activepieces.

4. LOW Risks

L1 — Expo SDK Upgrade During Development

Attribute	Value
Impact	2 (Minor — time spent on SDK migration instead of features)
Likelihood	3 (Possible — Expo releases new SDK every ~3 months)
Risk Score	6
Category	Technical

Mitigation: Pin to Expo SDK 52 for development. Upgrade post-launch.

L2 — Gitea Actions Limitations

Attribute	Value
Impact	2 (Minor — workarounds needed for CI/CD edge cases)
Likelihood	3 (Possible — Gitea Actions is younger than GitHub Actions)
Risk Score	6
Category	Tooling

Mitigation: Use simple, well-tested workflow patterns. Avoid advanced GitHub Actions features that may not have Gitea equivalents.

L3 — Domain/DNS Automation Failure

Attribute	Value
Impact	2 (Minor — manual DNS record creation as fallback)
Likelihood	3 (Possible — Cloudflare/Entri API integration complexity)
Risk Score	6
Category	Technical

Mitigation: DNS automation is in the scope cut table. Manual DNS creation is the existing, proven flow.

L4 — Chromium Memory Usage on Lite Tier

Attribute	Value
Impact	3 (Moderate — Lite tier too constrained for browser tool)
Likelihood	2 (Unlikely — Chromium headless is ~128MB, within budget)
Risk Score	6
Category	Performance

Mitigation: Monitor Chromium memory on Lite tier. If excessive, limit browser tool to single tab. Chromium is only active during browser automation — it doesn't run permanently.

L5 — Founding Member Churn

Attribute	Value
Impact	2 (Minor — reduced early feedback, not technical failure)
Likelihood	3 (Possible — early product may not meet all expectations)
Risk Score	6
Category	Business

Mitigation: Hands-on onboarding for first 10 customers. Weekly check-ins. Fast iteration on feedback. Founding member 2× token bonus incentivizes retention.

L6 — Time Zone Coordination (Distributed Team)

Attribute	Value
Impact	2 (Minor — slower iteration cycles)
Likelihood	2 (Unlikely — team likely EU-based)
Risk Score	4
Category	Organizational

Mitigation: Async communication culture. Overlap hours for critical decisions. Written architecture documents (this proposal) reduce synchronous dependency.

L7 — Image Registry Availability

Attribute	Value
Impact	3 (Moderate — can't deploy or provision if registry down)
Likelihood	1 (Rare — self-hosted Gitea registry)
Risk Score	3
Category	Infrastructure

Mitigation: Cache images on all provisioned servers. Provisioner pre-pulls during off-peak. Registry backup via Gitea's built-in backup.

5. Known Unknowns

Things we know we don't know — areas requiring investigation during Phase 1-2.

U1 — Exact OpenClaw Tool Routing Configuration

Unknown: How exactly do we configure OpenClaw to route tool calls to our Safety Wrapper HTTP API instead of executing them directly?

Options under investigation:

A) Configure exec tool to call Safety Wrapper endpoint via curl
B) Use OpenClaw's custom tool definition to register Safety Wrapper as a tool provider
C) Override the exec tool's handler via plugin registration

Investigation timeline: Week 1-2 (during Safety Wrapper skeleton work)
Impact if unresolved: HIGH (blocks all tool integration)

U2 — OpenClaw LLM Proxy Configuration

Unknown: How do we tell OpenClaw to route LLM calls through our Secrets Proxy (localhost:8100) instead of directly to OpenRouter?

Expected approach: Configure the model provider's apiBaseUrl to point to http://127.0.0.1:8100 instead of the actual provider URL. The Secrets Proxy forwards to the real provider after redaction.

Investigation timeline: Week 1 (during Secrets Proxy skeleton)
Impact if unresolved: HIGH (secrets redaction won't work)

U3 — Expo Push Notification Reliability for Time-Sensitive Approvals

Unknown: How reliable are Expo Push notifications for time-sensitive approval requests? What's the delivery latency? What happens if the notification is delayed by 30+ seconds?

Investigation timeline: Week 9-10 (during mobile app development)
Fallback: If push notifications are unreliable, add polling fallback in the mobile app (check for pending approvals every 30 seconds when app is foregrounded).

U4 — Stripe Billing Meters Invoice Timing

Unknown: When do Stripe Billing Meters generate invoices? At the end of the billing period? Can we trigger mid-period for real-time usage updates?

Investigation timeline: Week 5-6 (during billing pipeline development)
Fallback: If Billing Meters don't support real-time, use webhook events from usage threshold alerts instead.

U5 — Secrets in Tool Output (Post-Execution Redaction)

Unknown: When a tool returns output that contains secrets (e.g., docker inspect returns environment variables with passwords), are those redacted before reaching the LLM?

Expected approach: The Safety Wrapper redacts tool output before returning it to OpenClaw. This requires the Safety Wrapper to see the output, which it does because it is the execution layer.

Verification needed: Confirm that tool output flows through Safety Wrapper → redacted → returned to OpenClaw, not bypassed.

Investigation timeline: Week 4 (during OpenClaw integration)
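
A minimal Python sketch of post-execution redaction follows, assuming the Safety Wrapper holds a name-to-value map of registered secrets. The function names and the `[SECRET_REF:name]` placeholder format are hypothetical; the document only states that SECRET_REF placeholders exist, not how they are written.

```python
import base64
from urllib.parse import quote


def encoded_variants(value: str) -> set:
    """Return the raw secret plus common encodings of it."""
    raw = value.encode("utf-8")
    return {
        value,
        base64.b64encode(raw).decode("ascii"),
        base64.urlsafe_b64encode(raw).decode("ascii"),
        quote(value, safe=""),
    }


def redact_tool_output(output: str, secrets: dict) -> str:
    """Replace every known secret (raw or encoded) with its placeholder."""
    replacements = []
    for name, value in secrets.items():
        if not value:
            continue  # an empty value would match everywhere
        for variant in encoded_variants(value):
            replacements.append((variant, f"[SECRET_REF:{name}]"))
    # Longest variants first, so a longer match is never split by a shorter one.
    for variant, placeholder in sorted(replacements, key=lambda p: len(p[0]), reverse=True):
        output = output.replace(variant, placeholder)
    return output


print(redact_tool_output("DB_PASS=p@ss w0rd!", {"db_password": "p@ss w0rd!"}))
```

This only catches a secret that was encoded on its own. A secret embedded in a larger base64 blob encodes differently depending on its byte offset, which is the case the decode-before-scanning step and the entropy filter are meant to cover.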

U6 — OpenClaw Session Persistence Across Restarts

Unknown: When OpenClaw restarts (e.g., after a Docker container restart), do agent sessions resume cleanly? Do in-flight tool calls get replayed or lost?

Investigation timeline: Week 4 (integration testing)
Impact: If sessions don't survive restarts, users may lose conversation context after Safety Wrapper or OpenClaw crashes.

6. Security-Specific Risks

Attack Surface Analysis

Attack Vector	Component	Severity	Mitigation
Prompt injection via tool output	Safety Wrapper → OpenClaw	HIGH	Redact secrets from tool output; validate tool responses; OpenClaw's native context safety
Shell command injection	Safety Wrapper shell executor	HIGH	Allowlist-based execution; no shell metacharacters; execFile (not exec); path validation
Path traversal in file operations	Safety Wrapper file executor	HIGH	Jail to allowed directories; reject `..`, symlinks outside jail; canonical path resolution
SSRF via browser tool	OpenClaw browser → internal network	MEDIUM	SSRF protection lists (OpenClaw native); restrict to localhost ports
Credential exfiltration via encoding	Secrets Proxy	HIGH	4-layer pipeline including entropy filter; base64/URL-decode before scanning
Approval bypass via race condition	Safety Wrapper approval queue	MEDIUM	Atomic approval state transitions; database locking on approval check
Hub API key theft	Tenant server → Hub	MEDIUM	API keys stored encrypted; transmitted via TLS; rotatable
Cross-tenant data leakage	Hub database	LOW	One customer = one VPS; Hub enforces tenant isolation via API key scoping
DoS via LLM token exhaustion	Safety Wrapper token metering	MEDIUM	Per-hour rate limits; automatic pause at pool exhaustion; alert at 80/90/100%
WebSocket hijacking	OpenClaw WebSocket	LOW	CVE-2026-25253 patched; OpenClaw bound to loopback
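
The path traversal mitigation in the table (jail to allowed directories, canonical path resolution, rejecting `..` and symlinks that leave the jail) can be sketched as below. This is a Python illustration with a hypothetical function name; the real file executor's allowed directories and error handling are not specified in this document.

```python
import os


def resolve_in_jail(jail_root: str, requested: str) -> str:
    """Resolve a requested path and refuse anything that lands outside jail_root."""
    root = os.path.realpath(jail_root)
    # realpath collapses ".." segments and follows symlinks, so the check below
    # runs against the location the path really points to.
    candidate = os.path.realpath(os.path.join(root, requested))
    # commonpath compares whole path components, so /jail-other is not
    # mistaken for a child of /jail.
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError(f"path escapes jail: {requested!r}")
    return candidate
```

A check like this is only as good as the gap between checking and opening: if an attacker can swap a directory for a symlink after validation, the result is stale. Opening the file and then validating the opened descriptor's path is one way to narrow that window.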

Security Invariants (Must Hold Under All Conditions)

Invariant	Enforcement	Verification
Secrets never reach LLM providers	Secrets Proxy transport-layer redaction	P0 test suite + adversarial audit
AI never sees raw credential values	SECRET_REF placeholders; injection at execution time	Integration tests
Destructive operations require human approval (at levels 1-2)	Safety Wrapper autonomy engine	P0 test suite
External comms always gated by default	External Comms Gate (independent of autonomy)	Configuration verification
Audit trail captures every tool call	Append-only SQLite audit log	Log completeness check
Container runs as non-root	Docker security configuration	Provisioner verification
OpenClaw not accessible from external network	Loopback binding	Network scan
Elevated Mode permanently disabled	OpenClaw configuration	Config verification
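
One possible way to enforce the append-only audit log invariant at the database level is with SQLite triggers that abort any UPDATE or DELETE. The Python sketch below uses a hypothetical schema and function names; the document does not say which enforcement mechanism the Safety Wrapper uses, so treat this as one option rather than the chosen design.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id     INTEGER PRIMARY KEY AUTOINCREMENT,
    ts     TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    tool   TEXT NOT NULL,
    detail TEXT NOT NULL
);
CREATE TRIGGER IF NOT EXISTS audit_no_update BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
CREATE TRIGGER IF NOT EXISTS audit_no_delete BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
"""


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and install the append-only triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_tool_call(conn: sqlite3.Connection, tool: str, detail: str) -> None:
    """Append one audit row; the context manager commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO audit_log (tool, detail) VALUES (?, ?)",
            (tool, detail),
        )
```

Triggers protect against modification through SQL, including bugs in the application's own queries. They do not stop someone with write access to the database file from dropping the triggers, so file permissions and the non-root container invariant still matter.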

7. Business & Operational Risks

B1 — Market Timing

Attribute	Value
Risk	AI agent platforms are proliferating rapidly. Delay risks competitor capturing the SMB privacy-first niche.
Impact	3 (Moderate)
Likelihood	3 (Possible)
Mitigation	Focus on the privacy moat — competitors would need to redesign their architecture to match the secrets-never-leave guarantee. Ship fast on the core differentiator.

B2 — Unit Economics at Scale

Attribute	Value
Risk	Token costs, LLM API prices, and VPS costs may shift. The current pricing model (€29-109/mo) assumes specific cost structures.
Impact	3 (Moderate)
Likelihood	3 (Possible — LLM prices are dropping, but usage patterns are unpredictable)
Mitigation	Token pool sizes are configurable in Hub settings. Markup thresholds are configurable. Pricing tiers can be adjusted without code changes. Monitor unit economics from founding member data.

B3 — Customer Support at Scale

Attribute	Value
Risk	Each customer has their own VPS with unique configuration. Debugging customer issues is more complex than multi-tenant SaaS.
Impact	3 (Moderate)
Likelihood	4 (Likely — one-VPS-per-customer means one-off issues)
Mitigation	Hub monitoring dashboard. Tenant health heartbeats. Centralized logging via Hub. Remote diagnostic commands via Hub API. Consider adding remote shell access for LetsBe staff (gated by customer approval).

B4 — Regulatory Risk (EU AI Act)

Attribute	Value
Risk	EU AI Act may impose requirements on AI agents acting autonomously on behalf of businesses.
Impact	2 (Minor — likely "limited risk" category for business tools)
Likelihood	2 (Unlikely to affect v1 launch)
Mitigation	Audit trail captures every AI decision. Human-in-the-loop via approval system. Transparency via agent activity feed. Monitor EU AI Act implementation timeline.

8. Dependency Risks

External Dependencies

Dependency	Version	Risk	Mitigation
OpenClaw	v2026.2.6-3	Breaking changes; hook gaps	Pin release; compatibility tests; separate-process architecture
OpenRouter	API v1	Rate limits; outages; pricing changes	Failover chains; multiple API keys; direct provider fallback
Stripe	v17.7.0	API deprecations; Billing Meters maturity	Use stable APIs; test mode validation; fallback to usage records
Expo SDK	52	Breaking changes in SDK upgrades	Pin SDK; upgrade post-launch
Netcup SCP API	OAuth2	API changes; rate limits	Existing integration proven; Hetzner as overflow provider
PostgreSQL	16	Minimal risk — mature and stable	Standard backup strategy
Node.js	22	LTS until April 2027	Aligned with OpenClaw's runtime requirement
better-sqlite3	Latest	Native compilation on different platforms	Pin version; test in CI Docker
Prisma	7.0.0	Migration compatibility; query performance	Well-established ORM; large community

Internal Dependencies

Dependency	Owner	Risk	Mitigation
Hub (existing codebase)	Hub Backend Engineer	80+ endpoints to maintain alongside new development	Additive-only changes; no breaking existing endpoints
Provisioner (Bash scripts)	DevOps Engineer	Zero tests; complex SSH operations	Integration tests; manual verification; incremental changes
Gitea (self-hosted)	DevOps Engineer	Single point of failure for source control and CI	Regular backups; consider mirror to external Git provider

9. Risk Monitoring Plan

Weekly Risk Review (Every Friday)

Activity	Owner	Output
Review risk register	Project Lead	Updated risk scores; new risks added
Check milestone progress vs. plan	Project Lead	Buffer consumption tracked
Security invariant spot-check	Safety Wrapper Lead	Random adversarial test run
Dependency version check	DevOps	Alert on new OpenClaw releases or CVEs

Automated Monitoring (Post-Deployment)

Monitor	Frequency	Alert Threshold
Secrets redaction miss rate	Per-request	Any non-zero rate
Safety Wrapper uptime	Every 60s	Downtime > 30s
Hub ↔ SW heartbeat	Every 60s	2 missed heartbeats
Token usage anomaly	Hourly	>3× average hourly usage
Provisioner success rate	Per-provisioning	Any failure
LLM provider latency	Per-request	p95 > 30s
Memory usage per component	Every 5min	>90% of budget

Risk Escalation Matrix

Risk Score Change	Action
Score increases by ≥5	Escalate to project lead; discuss in weekly review
New HIGH risk identified	Immediate team notification; mitigation plan within 24h
Milestone at risk (>3 days behind)	Scope cut discussion; buffer reallocation
Security invariant violation	STOP DEPLOYMENT. All hands on fix. No exceptions.

End of Document — 06 Risk Assessment
