Initial commit: LetsBe Biz project with openclaw source

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 16:24:23 +01:00
commit 14ff8fd54c
93 changed files with 31651 additions and 0 deletions
--- a/docs/architecture-proposal/gpt/00-executive-summary.md
+++ b/docs/architecture-proposal/gpt/00-executive-summary.md
@@ -0,0 +1,37 @@
+# 00. Executive Summary
+
+## Recommended Direction
+
+- Retain and extend `letsbe-hub` instead of rewriting backend.
+- Build Safety Wrapper as OpenClaw plugin with a separate local egress redaction proxy.
+- Treat OpenClaw as a pinned upstream dependency (no fork).
+- Make `n8n`/deprecated stack removal and plaintext credential leak fixes the first gate.
+- Launch mobile with React Native + Expo and web onboarding as separate frontend app.
+- Move first-party code to a monorepo for shared contracts and coordinated CI.
+
+## Delivery Window
+
+- Start: March 2, 2026
+- Founding member launch target: May 24, 2026
+- Buffer: May 25-31, 2026
+
+## Hard Requirements Preserved
+
+- 4-layer security model
+- secrets-never-leave-server invariant
+- 3-tier autonomy with independent external-comms gate
+- one customer per VPS
+
+## Most Critical Risks
+
+- security bypass in redaction/gating
+- provisioner migration instability
+- billing metering accuracy drift
+
+## First Build Gate
+
+Do not start feature tracks until:
+
+1. all `n8n` production references removed
+2. deprecated deploy paths disabled
+3. plaintext provisioning secret storage eliminated
--- a/docs/architecture-proposal/gpt/01-architecture-and-dataflows.md
+++ b/docs/architecture-proposal/gpt/01-architecture-and-dataflows.md
@@ -0,0 +1,250 @@
+# 01. Architecture And Data Flows
+
+## 1. Scope And Non-Negotiables
+
+This proposal is explicitly designed around the fixed constraints from the Architecture Brief:
+
+- 4-layer security model is mandatory.
+- Secrets never leave tenant server is mandatory.
+- 3-tier autonomy + external communications gate is mandatory.
+- OpenClaw is upstream dependency (no fork by default).
+- One customer = one VPS is mandatory.
+- `n8n` removal is prerequisite.
+
+## 2. Proposed Target Architecture
+
+### 2.1 Core Decisions
+
+| Decision | Proposal | Why |
+|---|---|---|
+| Hub stack | Keep Next.js + Prisma + PostgreSQL | Existing app already has major workflows and 80+ APIs; rewrite is timeline-risky for 3-month launch. |
+| OpenClaw integration | Use pinned upstream release, no fork | Maximizes upgrade velocity and avoids merge debt. |
+| Safety Wrapper shape | Hybrid: OpenClaw plugin + local egress proxy + local execution adapters | Gives direct hook interception plus transport-level redaction guarantee. |
+| Mobile | React Native + Expo | Fastest path to iOS/Android with TypeScript contract reuse. |
+| Website | Separate public web app (same monorepo) + Hub public APIs | Security isolation between public onboarding and admin/customer portal. |
+| Repo strategy | Monorepo for first-party services; OpenClaw kept separate upstream repo | Strong contract sharing + CI simplicity without violating upstream dependency model. |
+
+### 2.2 System Context Diagram
+
+```mermaid
+flowchart LR
+    subgraph Client[Client Layer]
+      M[Mobile App\nReact Native + Expo]
+      W[Website\nOnboarding + Checkout]
+      C[Customer Portal Web]
+      A[Admin Portal Web]
+    end
+
+    subgraph Control[Central Platform]
+      H[Hub API + UI\nNext.js + Prisma]
+      DB[(PostgreSQL)]
+      Q[Background Workers\nAutomation + Metering]
+      N[Notification Service\nPush/Email]
+      ST[Stripe]
+      NC[Netcup/Hetzner]
+    end
+
+    subgraph Tenant[Per-Customer VPS]
+      OC[OpenClaw Gateway\nUpstream]
+      SW[Safety Wrapper Plugin\nHooks + Classification]
+      SP[LLM Egress Proxy\nSecrets Firewall]
+      SV[(Secrets Vault SQLite\nEncrypted)]
+      TA[Tool Adapters + Exec Guards]
+      TS[(Tool Stacks 25+)]
+      AP[(Approval Cache SQLite)]
+      TU[(Token Usage Buckets)]
+    end
+
+    M --> H
+    W --> H
+    C --> H
+    A --> H
+
+    H --> DB
+    H --> Q
+    H --> N
+    H <--> ST
+    H <--> NC
+
+    H <--> OC
+    OC --> SW
+    SW --> SP
+    SP --> LLM[(LLM Providers)]
+
+    SW <--> SV
+    SW <--> TA
+    TA <--> TS
+    SW <--> AP
+    SW --> TU
+    TU --> H
+```
+
+## 3. Tenant Runtime Architecture
+
+### 3.1 4-Layer Security Enforcement
+
+| Layer | Enforcement Point | Implementation |
+|---|---|---|
+| 1. Sandbox | OpenClaw runtime/tool sandbox settings | OpenClaw native sandbox + process/container isolation. |
+| 2. Tool Policy | OpenClaw agent tool allow/deny | Per-agent tool manifest; tools not listed are unreachable. |
+| 3. Command Gating | Safety Wrapper `before_tool_call` | Green/Yellow/Yellow+External/Red/Critical Red classification + approval flow. |
+| 4. Secrets Redaction | Local egress proxy + transcript hooks | Outbound prompt redaction before network egress, plus log/transcript redaction hooks. |
+
+### 3.2 Safety Wrapper Components
+
+- `classification-engine`: deterministic rules engine with signed policy bundle from Hub.
+- `approval-gateway`: sync/async approval requests to Hub, with 24h expiry.
+- `secret-ref-resolver`: resolves `SECRET_REF(...)` at execution time only.
+- `adapter-runtime`: executes tool API adapters and guarded shell/docker/file actions.
+- `metering-collector`: captures per-agent/per-model token usage and aggregates hourly.
+- `hub-sync-client`: registration, heartbeat, config pull, backup status, command results.
+
+### 3.3 OpenClaw Hook Usage (No Fork)
+
+Safety Wrapper plugin uses upstream hook points for enforcement and observability:
+
+- `before_tool_call`: classify/gate/block/require approval.
+- `after_tool_call`: audit capture + normalization.
+- `message_sending`: outbound content redaction.
+- `before_message_write`, `tool_result_persist`: local persistence redaction.
+- `llm_output`: token accounting and per-model usage capture.
+- `before_prompt_build`: inject cacheable SOUL/TOOLS prefix metadata.
+- `subagent_spawning`: enforce max depth/budget.
+- `gateway_start`: health checks + Hub session bootstrap.
+
+## 4. Primary Data Flows
+
+### 4.1 Signup To Provisioning Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Site as Website
+    participant Hub
+    participant Stripe
+    participant Worker as Automation Worker
+    participant Provider as Netcup/Hetzner
+    participant Prov as Provisioner
+    participant VPS as Tenant VPS
+
+    User->>Site: Describe business + pick tools
+    Site->>Hub: Create onboarding draft
+    Site->>Stripe: Checkout session
+    Stripe-->>Hub: checkout.session.completed
+    Hub->>Worker: Create order (PAYMENT_CONFIRMED)
+    Worker->>Provider: Allocate VPS
+    Provider-->>Worker: VPS ready (IP + creds)
+    Worker->>Hub: DNS_PENDING -> DNS_READY
+    Worker->>Prov: Start provisioning job
+    Prov->>VPS: Install stacks + OpenClaw + Safety
+    Prov->>VPS: Seed secrets vault + tool registry
+    Prov->>VPS: Register tenant with Hub
+    VPS-->>Hub: register + first heartbeat
+    Hub-->>User: Provisioning complete + app links
+```
+
+### 4.2 Agent Tool Call With Gating
+
+```mermaid
+sequenceDiagram
+    participant U as User
+    participant OC as OpenClaw
+    participant SW as Safety Wrapper
+    participant H as Hub
+    participant T as Tool/API
+
+    U->>OC: "Publish this newsletter"
+    OC->>SW: tool call proposal
+    SW->>SW: classify = Yellow+External
+    SW->>H: approval request
+    H-->>U: push approval request
+    U->>H: approve
+    H-->>SW: approval grant
+    SW->>T: execute with SECRET_REF injection
+    T-->>SW: result
+    SW-->>OC: redacted result
+    OC-->>U: completion summary
+```
+
+### 4.3 Secrets Redaction Outbound Flow
+
+```mermaid
+flowchart LR
+    A[OpenClaw Prompt Payload] --> B[Safety Wrapper Pre-Redaction]
+    B --> C[Secrets Registry Match]
+    C --> D[Pattern Safety Net]
+    D --> E[Function-Call SecretRef Rebinding]
+    E --> F[Local Egress Proxy]
+    F --> G[Provider API]
+
+    C --> C1[(Vault SQLite)]
+    D --> D1[(Regex + Entropy Rules)]
+    F --> F1[Transport-Level Block if bypass attempt]
+```
+
+### 4.4 Token Metering And Billing
+
+```mermaid
+flowchart LR
+    O[OpenClaw llm_output hook] --> M[Metering Collector]
+    M --> B[(Hourly Buckets SQLite)]
+    B --> H[Hub Usage Ingest API]
+    H --> P[(Billing Period + Usage Tables)]
+    P --> S[Stripe Usage/Billing]
+    H --> UI[Usage Dashboard + Alerts]
+```
+
+## 5. Prompt Caching Architecture
+
+- SOUL.md and TOOLS.md are split into stable cacheable prefix blocks and dynamic suffix blocks.
+- Stable prefix hash is generated per agent version.
+- Prefix changes only when agent config changes; day-to-day conversations hit cache-read pricing.
+- Metering persists `input/output/cache_read/cache_write` separately to preserve margin analytics.
+
+## 6. Mobile, Website, And Channel Architecture
+
+### 6.1 Mobile App
+
+- React Native + Expo app as primary interface.
+- Real-time chat via Hub websocket gateway.
+- Approvals as push notifications (approve/deny quick actions).
+- Fallback channel switchboard in Hub for WhatsApp/Telegram relay adapters.
+
+### 6.2 Website + Onboarding
+
+- Dedicated public frontend app (`apps/website`) with strict network boundary to Hub public APIs.
+- Onboarding classifier service (cheap model profile) performs 1-2 message business classification.
+- Tool bundle recommendation engine returns editable stack + resource calculator.
+- Checkout remains Stripe-hosted.
+
+## 7. First-Hour Workflow Templates (Architecture Proof)
+
+| Template | Cross-Tool Actions | Gating Profile |
+|---|---|---|
+| Freelancer First Hour | Connect mail + calendar, create folders, configure intake form, first daily brief | Mostly Green/Yellow |
+| Agency First Hour | Chat inbox setup, project board scaffolding, proposal template generation, shared KB setup | Yellow + Yellow+External approval |
+| E-commerce First Hour | Inventory import, support inbox routing, analytics dashboard baseline, recovery email draft | Mixed Yellow/Yellow+External |
+| Consulting First Hour | Scheduling links, client doc signature template, CRM stages, weekly report automation | Mostly Yellow + one external gate |
+
+These templates are codified as audited workflow blueprints executed through the same command classification path as ad-hoc agent actions.
+
+## 8. Interactive Demo Architecture (Pre-Purchase)
+
+Proposal: shared but isolated "Demo Tenant Pool" instead of a single static demo VPS.
+
+- Each prospect gets a short-lived demo tenant snapshot (TTL 2 hours).
+- Demo runs synthetic data and fake outbound integrations only.
+- Same Safety Wrapper + approvals UI as production to demonstrate trust model.
+- Recycled automatically after session expiry.
+
+This is safer and more realistic than one long-lived shared "Bella's Bakery" host.
+
+## 9. Required Pre-Launch Cleanup Baseline
+
+Before core build starts, execute repository cleanup gate:
+
+- Remove all `n8n` references from Hub, Provisioner, stacks, scripts, tests, and docs used for production behavior.
+- Remove deployment references to deprecated `orchestrator` and `sysadmin-agent` from active provisioning paths.
+- Close plaintext credential leak path (`jobs/*/config.json` root password exposure) by moving to one-time secret files + immediate secure deletion.
+
+No feature work should proceed until this baseline passes CI policy checks.
--- a/docs/architecture-proposal/gpt/02-components-and-api-contracts.md
+++ b/docs/architecture-proposal/gpt/02-components-and-api-contracts.md
@@ -0,0 +1,334 @@
+# 02. Component Breakdown And API Contracts
+
+## 1. Component Breakdown
+
+## 1.1 Control Plane Components
+
+| Component | Runtime | Responsibility | Notes |
+|---|---|---|---|
+| Hub Web/API | Next.js 16 + Node | Admin UI, customer portal, public APIs, tenant APIs | Keep existing app, add route groups and API contracts below. |
+| Billing Engine | Node worker + Prisma | Usage aggregation, pool accounting, overage invoicing | Hourly usage compaction + end-of-period invoice sync. |
+| Provisioning Orchestrator | Existing automation worker | Order state machine and provisioning job dispatch | Keep and harden existing job pipeline. |
+| Notification Gateway | Node service | Push notifications, email alerts, approval prompts | Expo push + email provider adapters. |
+| Onboarding Classifier | Lightweight service | Business-type classification + starter bundle recommendation | Cheap fast model profile; capped context. |
+
+## 1.2 Tenant Components (Per VPS)
+
+| Component | Runtime | Responsibility | State Store |
+|---|---|---|---|
+| OpenClaw Gateway | Node 22+ upstream | Agent runtime, sessions, tool orchestration | OpenClaw JSON/JSONL storage |
+| Safety Wrapper Plugin | TypeScript package | Classification, gating, hooks, metering, Hub sync | SQLite (`safety.db`) |
+| Egress Proxy | Node/Rust sidecar | Outbound redaction + transport enforcement | In-memory + policy cache |
+| Execution Adapters | Local modules | Shell/Docker/file/env and tool REST adapters | Audit log in SQLite |
+| Secrets Vault | SQLite + encryption | Secret values, rotation history, fingerprints | `vault.db` |
+
+## 1.3 Deprecated Components (Explicitly Out)
+
+- `letsbe-orchestrator`: behavior studied for migration inputs only.
+- `letsbe-sysadmin-agent`: executor patterns ported, service itself not retained.
+- `letsbe-mcp-browser`: replaced by OpenClaw native browser tooling.
+
+## 2. API Design Rules (Applies To All Contracts)
+
+- Base path versioning: `/api/v1/...`
+- JSON request/response with strict schema validation.
+- Idempotency required on mutating tenant commands (`Idempotency-Key` header).
+- Authn/authz split by channel:
+  - Tenant channel: `Bearer <tenant_api_key>` (hash stored server-side)
+  - Mobile/customer channel: session JWT + RBAC
+  - Public website onboarding: scoped API key + anti-abuse limits
+- All mutating endpoints emit audit event rows.
+- All time fields are ISO 8601 UTC.
+
+## 3. Hub ↔ Tenant API Contracts
+
+## 3.1 Register Tenant Node
+
+`POST /api/v1/tenant/register`
+
+Purpose: first boot registration from Safety Wrapper.
+
+Request:
+
+```json
+{
+  "registrationToken": "rt_...",
+  "orderId": "ord_...",
+  "agentVersion": "safety-wrapper@0.1.0",
+  "openclawVersion": "2026.2.26",
+  "hostname": "cust-vps-001",
+  "capabilities": ["browser", "exec", "docker", "approval_queue"]
+}
+```
+
+Response `201`:
+
+```json
+{
+  "tenantApiKey": "tk_live_...",
+  "tenantId": "ten_...",
+  "heartbeatIntervalSec": 30,
+  "configEtag": "cfg_9f1a...",
+  "time": "2026-02-26T20:15:00Z"
+}
+```
+
+## 3.2 Heartbeat + Pull Deltas
+
+`POST /api/v1/tenant/heartbeat`
+
+Purpose: status signal plus lightweight config/update pull.
+
+Request:
+
+```json
+{
+  "tenantId": "ten_...",
+  "server": {
+    "uptimeSec": 86400,
+    "diskPct": 61.2,
+    "memPct": 57.8,
+    "openclawHealthy": true
+  },
+  "agents": [
+    {"agentId": "marketing", "status": "online", "autonomyLevel": 2}
+  ],
+  "pendingApprovals": 1,
+  "lastAppliedConfigEtag": "cfg_9f1a..."
+}
+```
+
+Response `200`:
+
+```json
+{
+  "configChanged": true,
+  "nextConfigEtag": "cfg_9f1b...",
+  "commands": [],
+  "clock": "2026-02-26T20:15:30Z"
+}
+```
+
+## 3.3 Pull Full Tenant Config
+
+`GET /api/v1/tenant/config?etag=cfg_9f1a...`
+
+Response `200` includes:
+
+- agent definitions (SOUL/TOOLS refs, model profile)
+- autonomy policy
+- external comms gate unlock map
+- command classification ruleset checksum
+- tool registry template version
+
+## 3.4 Approval Request / Resolve
+
+`POST /api/v1/tenant/approval-requests`
+
+```json
+{
+  "tenantId": "ten_...",
+  "requestId": "apr_...",
+  "agentId": "marketing",
+  "class": "yellow_external",
+  "tool": "listmonk.send_campaign",
+  "humanSummary": "Send campaign 'March Offer' to 1,204 recipients",
+  "expiresAt": "2026-02-27T20:15:30Z",
+  "context": {"recipientCount": 1204}
+}
+```
+
+`GET /api/v1/tenant/approval-requests/{requestId}` returns `PENDING|APPROVED|DENIED|EXPIRED`.
+
+## 3.5 Usage Ingestion
+
+`POST /api/v1/tenant/usage-buckets`
+
+```json
+{
+  "tenantId": "ten_...",
+  "buckets": [
+    {
+      "hour": "2026-02-26T20:00:00Z",
+      "agentId": "marketing",
+      "model": "openrouter/deepseek-v3.2",
+      "inputTokens": 12000,
+      "outputTokens": 3800,
+      "cacheReadTokens": 6400,
+      "cacheWriteTokens": 0,
+      "webSearchCalls": 3,
+      "webFetchCalls": 1
+    }
+  ]
+}
+```
+
+## 3.6 Backup Status
+
+`POST /api/v1/tenant/backup-status`
+
+Tracks last run, duration, snapshot ID, integrity verification state.
+
+## 4. Customer/Mobile API Contracts
+
+## 4.1 Agent And Autonomy Management
+
+- `GET /api/v1/customer/agents`
+- `PATCH /api/v1/customer/agents/{agentId}`
+- `PATCH /api/v1/customer/agents/{agentId}/autonomy`
+- `PATCH /api/v1/customer/agents/{agentId}/external-comms-gate`
+
+Autonomy update request:
+
+```json
+{
+  "autonomyLevel": 2,
+  "externalComms": {
+    "defaultLocked": true,
+    "toolUnlocks": [
+      {"tool": "chatwoot.reply_external", "enabled": true, "expiresAt": null}
+    ]
+  }
+}
+```
+
+## 4.2 Approval Queue
+
+- `GET /api/v1/customer/approvals?status=pending`
+- `POST /api/v1/customer/approvals/{id}` with `{ "decision": "approve" | "deny" }`
+
+## 4.3 Usage And Billing
+
+- `GET /api/v1/customer/usage/summary`
+- `GET /api/v1/customer/usage/by-agent`
+- `GET /api/v1/customer/billing/current-period`
+- `POST /api/v1/customer/billing/payment-method`
+
+## 4.4 Realtime Channels
+
+- `GET /api/v1/customer/events/stream` (SSE fallback)
+- `WS /api/v1/customer/ws` (chat updates, approvals, status)
+
+## 5. Public Website/Onboarding API Contracts
+
+## 5.1 Business Classification
+
+`POST /api/v1/public/onboarding/classify`
+
+```json
+{
+  "sessionId": "onb_...",
+  "messages": [
+    {"role": "user", "content": "I run a 5-person digital agency"}
+  ]
+}
+```
+
+Response:
+
+```json
+{
+  "businessType": "agency",
+  "confidence": 0.91,
+  "recommendedBundle": "agency_core_v1",
+  "followUpQuestion": "Do you need ticketing or only chat?"
+}
+```
+
+## 5.2 Bundle Quote
+
+`POST /api/v1/public/onboarding/quote`
+
+Returns min tier, projected token pool, monthly estimate, and Stripe checkout seed payload.
+
+## 5.3 Order Creation
+
+`POST /api/v1/public/orders` with strict schema + anti-fraud controls.
+
+## 6. Safety Wrapper Internal Contract (Local Only)
+
+Local Unix socket JSON-RPC interface between plugin orchestration and execution layer.
+
+Method examples:
+
+- `exec.run`
+- `docker.compose`
+- `file.read`
+- `file.write`
+- `env.update`
+- `tool.http.call`
+
+Example request:
+
+```json
+{
+  "id": "rpc_1",
+  "method": "tool.http.call",
+  "params": {
+    "tool": "ghost",
+    "operation": "posts.create",
+    "secretRefs": ["ghost_admin_key"],
+    "payload": {"title": "..."}
+  }
+}
+```
+
+Guarantees:
+
+- Secrets passed only as references, never raw values in request logs.
+- Execution engine resolves references inside isolated process boundary.
+- Full request/result hashes persisted for audit traceability.
+
+## 7. Tool Registry Contract
+
+`tool-registry.json` shape (tenant-local):
+
+```json
+{
+  "version": "2026-02-26",
+  "tools": [
+    {
+      "id": "chatwoot",
+      "baseUrl": "https://chat.customer-domain.tld",
+      "auth": {"type": "bearer_secret_ref", "ref": "chatwoot_api_token"},
+      "adapters": ["contacts.list", "conversation.reply"],
+      "externalCommsOperations": ["conversation.reply_external"],
+      "cheatsheet": "/opt/letsbe/cheatsheets/chatwoot.md"
+    }
+  ]
+}
+```
+
+## 8. Error Contract And Retries
+
+Standard error envelope:
+
+```json
+{
+  "error": {
+    "code": "APPROVAL_REQUIRED",
+    "message": "Operation requires approval",
+    "requestId": "req_...",
+    "retryable": true,
+    "details": {"approvalRequestId": "apr_..."}
+  }
+}
+```
+
+Common error codes:
+
+- `AUTH_INVALID`
+- `TENANT_UNKNOWN`
+- `APPROVAL_REQUIRED`
+- `APPROVAL_EXPIRED`
+- `CLASSIFICATION_BLOCKED`
+- `SECRET_REF_UNRESOLVED`
+- `POLICY_VERSION_MISMATCH`
+- `RATE_LIMITED`
+
+## 9. API Compatibility And Change Policy
+
+- Backward-compatible additions: allowed in-place.
+- Breaking changes: new version path (`/api/v2`).
+- Deprecation window: minimum 60 days for tenant APIs.
+- Contract tests run in CI for Hub, Safety Wrapper, Mobile, and Website clients.
--- a/docs/architecture-proposal/gpt/03-deployment-strategy.md
+++ b/docs/architecture-proposal/gpt/03-deployment-strategy.md
@@ -0,0 +1,145 @@
+# 03. Deployment Strategy
+
+## 1. Goals
+
+- Ship to founding members in ~12 weeks without compromising security invariants.
+- Maintain one-VPS-per-customer isolation.
+- Keep OpenClaw upstream-pinned and independently upgradeable.
+- Make tenant rollout reversible with fast rollback paths.
+
+## 2. Environment Topology
+
+## 2.1 Control Plane Environments
+
+| Environment | Purpose | Data |
+|---|---|---|
+| `dev` | Rapid feature iteration | Synthetic/local data |
+| `staging` | Release-candidate validation, e2e, load, security checks | Sanitized fixtures |
+| `prod-eu` | EU customers (default EU routing) | Real customer data |
+| `prod-us` | NA customers (default NA routing) | Real customer data |
+
+Control plane services (Hub + worker + notifications) are region-deployed with independent DBs and clear region affinity.
+
+## 2.2 Tenant Environments
+
+- `sandbox tenants`: internal QA and interactive demo pool.
+- `canary tenants`: first real-production update recipients.
+- `general tenants`: full customer fleet.
+
+## 3. Deployment Units
+
+## 3.1 Control Plane Units
+
+- `hub-web-api` container (Next.js standalone runtime)
+- `hub-worker` container (automation + billing jobs)
+- `notifications` container (push/email delivery)
+- `postgres` (managed or self-hosted HA)
+
+## 3.2 Tenant Units (Per Customer VPS)
+
+- `openclaw` container (upstream image/tag pinned)
+- `safety-wrapper` plugin package mounted into OpenClaw extension dir
+- `egress-proxy` service (localhost-only)
+- tool containers and nginx from provisioner
+- local SQLite data stores for secrets/approvals/metering
+
+## 4. Provisioning Deployment Plan
+
+## 4.1 Provisioner Mode
+
+Continue with existing one-shot SSH provisioner flow, retooled to:
+
+- deploy OpenClaw + Safety components
+- remove legacy orchestrator/sysadmin deployment
+- strip deprecated stacks and n8n references
+- write secrets into encrypted vault only (no plaintext long-lived config)
+
+## 4.2 Immutable Artifact Inputs
+
+Provisioning uses pinned artifacts only:
+
+- OpenClaw release tag (`stable` channel pin)
+- Safety Wrapper image/package digest
+- Tool stack compose templates with hash
+- policy bundle version + checksum
+
+## 5. Secrets And Credential Deployment
+
+- Registration token is one-time and short-lived.
+- Tenant API key returned at registration; only hash stored in Hub DB.
+- Provisioner writes bootstrap secrets to tmpfs file, consumed once, then shredded.
+- Existing plaintext job config path (`jobs/<id>/config.json`) replaced by encrypted payload + ephemeral decrypt-on-run.
+
+## 6. Release Strategy
+
+## 6.1 Control Plane
+
+- Trunk-based merges behind feature flags.
+- Deploy via Gitea Actions with staged promotions (`dev -> staging -> prod`).
+- DB migrations run in expand/contract pattern.
+
+## 6.2 Tenant Plane
+
+Tenant updates split into independent channels:
+
+- `policy-only`: classification/autonomy/tool policy updates (no binary change)
+- `wrapper patch`: Safety Wrapper version bump
+- `openclaw bump`: upstream release bump (separate tracked campaign)
+
+Rollout:
+
+1. Internal sandbox tenants
+2. 5% canary customer tenants
+3. 25%
+4. 100%
+
+Auto-stop criteria:
+
+- redaction test failure
+- approval-routing failure >1%
+- tenant heartbeat drop >3%
+
+## 7. Rollback Strategy
+
+## 7.1 Control Plane Rollback
+
+- Keep last two container digests deployable.
+- Migration rollback policy: only for reversible migrations; otherwise hotfix-forward.
+
+## 7.2 Tenant Rollback
+
+- Policy rollback via previous signed policy bundle.
+- Wrapper rollback to previous plugin package.
+- OpenClaw rollback to previous pinned stable tag after compatibility check.
+
+## 8. Observability And SLOs
+
+## 8.1 Required Telemetry
+
+- tenant heartbeat latency and freshness
+- approval queue latency (request -> decision)
+- redaction pipeline counters (matches by layer)
+- token usage ingest lag
+- provisioning success/failure per step
+
+## 8.2 Launch SLO Targets
+
+- Hub API availability: 99.9%
+- Tenant heartbeat freshness: 99% under 2 minutes
+- Approval propagation: p95 < 5 seconds (Hub to mobile push)
+- Provisioning success first-attempt: >= 90%
+
+## 9. Dual-Provider Strategy (Netcup + Hetzner)
+
+- Primary capacity pool on Netcup (EU/US).
+- Overflow path on Hetzner with same provisioner scripts and hardened baseline.
+- Provider adapter abstraction lives in Hub `server-provisioning` module; provisioner remains Debian-focused and provider-agnostic.
+
+## 10. Cutover Plan From Current State
+
+1. Freeze legacy orchestrator/sysadmin deployment paths.
+2. Land prerequisite cleanup release (n8n/deprecated removal + credential leak fix).
+3. Enable new tenant register/heartbeat APIs in Hub.
+4. Provision first new-architecture internal tenant.
+5. Execute parallel-run window (old and new provisioning flows side-by-side for internal only).
+6. Flip default provisioning to new flow for production orders.
--- a/docs/architecture-proposal/gpt/04-implementation-plan-and-dependency-graph.md
+++ b/docs/architecture-proposal/gpt/04-implementation-plan-and-dependency-graph.md
@@ -0,0 +1,176 @@
+# 04. Detailed Implementation Plan And Dependency Graph
+
+## 1. Planning Assumptions
+
+- Target launch window: 12 weeks.
+- Team model assumed for schedule below:
+  - 2 backend/platform engineers
+  - 1 mobile/fullstack engineer
+  - 1 DevOps/SRE engineer
+  - 1 QA/security engineer (shared)
+- Existing Hub codebase is retained and extended.
+
+## 2. Work Breakdown Structure (WBS)
+
+## Phase 0: Prerequisite Cleanup And Hardening (Week 1)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P0-1 | Remove all `n8n` code references (Hub, provisioner, stacks, scripts, tests) | 3d | - | `rg -n n8n` clean in production code paths; CI policy check added |
+| P0-2 | Remove deprecated deploy targets (`orchestrator`, `sysadmin`) from active provisioning | 2d | P0-1 | No new orders can deploy deprecated services |
+| P0-3 | Fix plaintext provisioning secret leak (`jobs/*/config.json`) | 2d | P0-1 | No root/server password persisted in plaintext job files |
+| P0-4 | Baseline security regression tests for cleanup changes | 1d | P0-2,P0-3 | Green CI + sign-off |
+
+## Phase 1: Safety Substrate (Weeks 2-3)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P1-1 | Build encrypted secrets vault SQLite schema + key management | 3d | P0-4 | CRUD, rotation, audit log implemented |
+| P1-2 | Implement egress redaction proxy (registry + regex + entropy layers) | 4d | P1-1 | Redaction test suite pass with seeded secrets |
+| P1-3 | Implement command classification engine (5-tier + external gate) | 3d | P1-1 | Deterministic policy tests pass |
+| P1-4 | Implement approval state cache + retry logic (tenant-local) | 2d | P1-3 | Approval resilience tests pass |
+| P1-5 | OpenClaw plugin skeleton with hooks + telemetry envelope | 3d | P1-2,P1-3 | Hook smoke tests green against pinned OpenClaw tag |
+
+## Phase 2: Hub Tenant APIs + Data Model (Weeks 3-4)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P2-1 | Add Prisma models: approval queue, usage buckets, agent policy, comms unlocks | 2d | P0-4 | Migration applied in staging |
+| P2-2 | Implement tenant register/heartbeat/config APIs | 3d | P2-1 | Contract tests pass |
+| P2-3 | Implement tenant approval-request APIs + customer approval endpoints | 3d | P2-1 | End-to-end approval cycle works |
+| P2-4 | Implement usage ingest + billing period updates | 3d | P2-1 | Usage events visible in dashboard |
+| P2-5 | Add push notification pipeline for approvals | 2d | P2-3 | Mobile push test path validated |
+
+## Phase 3: Safety Wrapper Execution Layer (Weeks 4-6)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P3-1 | Port shell/docker/file/env guarded executors from sysadmin patterns | 5d | P1-5 | Security unit tests pass |
+| P3-2 | Implement tool registry loader + SECRET_REF resolver | 3d | P1-1,P3-1 | Tool calls run without raw secret exposure |
+| P3-3 | Implement core adapters (Chatwoot, Ghost, Nextcloud, Cal.com, Odoo, Listmonk) | 6d | P3-2 | Adapter contract tests pass |
+| P3-4 | Implement metering capture and hourly bucket compaction | 2d | P1-5,P2-4 | Buckets reliably posted to Hub |
+| P3-5 | Add subagent budget/depth limits and policy enforcement | 2d | P1-5 | Policy tests and abuse tests pass |
+
+## Phase 4: Provisioner Retool (Weeks 5-7)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P4-1 | Add OpenClaw + Safety deployment steps to provisioner | 4d | P3-2 | Fresh VPS comes online with heartbeat |
+| P4-2 | Remove legacy stack templates and nginx configs from default deployment path | 2d | P0-2 | Deprecated stacks excluded from installs |
+| P4-3 | Generate and deploy tenant configs/policies during provisioning | 3d | P2-2,P4-1 | Config sync succeeds on first boot |
+| P4-4 | Migrate initial browser setup scenarios to OpenClaw browser tool | 4d | P4-1 | 8 scenarios replaced or retired |
+| P4-5 | Add idempotent recovery checkpoints per provisioning step | 2d | P4-1 | Retry from failed step validated |
+
+## Phase 5: Customer Interfaces (Weeks 6-9)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P5-1 | Customer web portal for approvals, agent settings, usage | 5d | P2-3,P2-4 | Beta usable on staging |
+| P5-2 | Mobile app MVP (chat, approvals, health, usage) | 8d | P2-5,P5-1 | TestFlight/internal distribution ready |
+| P5-3 | Public onboarding website + classifier + bundle calculator | 6d | P2-1 | Stripe flow works end-to-end |
+| P5-4 | WhatsApp/Telegram fallback relay (minimal) | 3d | P2-3 | Approval fallback path works |
+
+## Phase 6: Workflow Templates + Demo Experience (Weeks 8-10)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P6-1 | Implement 4 first-hour workflow templates as auditable blueprints | 5d | P3-3,P5-1 | Templates executable end-to-end |
+| P6-2 | Build interactive demo tenant pool manager (TTL snapshots) | 4d | P4-1,P5-3 | Demo session provisioning <5 min |
+| P6-3 | Add product telemetry for template completion and demo conversion | 2d | P6-1,P6-2 | Metrics dashboards live |
+
+## Phase 7: Quality, Hardening, Launch (Weeks 10-12)
+
+| ID | Task | Duration | Depends On | Exit Criteria |
+|---|---|---:|---|---|
+| P7-1 | Full security test suite (redaction, gating, injection, auth) | 4d | P3-5,P4-5 | Critical findings resolved |
+| P7-2 | Load, soak, and chaos tests on staging fleet | 3d | P6-1 | SLO gates met |
+| P7-3 | Canary launch (5% -> 25% -> 100%) with rollback drills | 4d | P7-1,P7-2 | Canary metrics stable |
+| P7-4 | Launch readiness review + runbook finalization | 2d | P7-3 | Founding member launch sign-off |
+
+## 3. Dependency Graph
+
+```mermaid
+graph TD
+  P0_1[P0-1 n8n cleanup] --> P0_2[P0-2 deprecated deploy removal]
+  P0_1 --> P0_3[P0-3 plaintext secret fix]
+  P0_2 --> P0_4[P0-4 baseline security tests]
+  P0_3 --> P0_4
+
+  P0_4 --> P1_1[P1-1 vault]
+  P1_1 --> P1_2[P1-2 egress proxy]
+  P1_1 --> P1_3[P1-3 classification]
+  P1_3 --> P1_4[P1-4 approval cache]
+  P1_2 --> P1_5[P1-5 openclaw plugin skeleton]
+  P1_3 --> P1_5
+
+  P0_4 --> P2_1[P2-1 hub prisma models]
+  P2_1 --> P2_2[P2-2 tenant register/heartbeat/config]
+  P2_1 --> P2_3[P2-3 approval APIs]
+  P2_1 --> P2_4[P2-4 usage ingest]
+  P2_3 --> P2_5[P2-5 push notifications]
+
+  P1_5 --> P3_1[P3-1 guarded executors]
+  P1_1 --> P3_2[P3-2 tool registry + secret ref]
+  P3_1 --> P3_2
+  P3_2 --> P3_3[P3-3 tool adapters]
+  P1_5 --> P3_4[P3-4 metering]
+  P2_4 --> P3_4
+  P1_5 --> P3_5[P3-5 subagent controls]
+
+  P3_2 --> P4_1[P4-1 provisioner openclaw+safety]
+  P0_2 --> P4_2[P4-2 legacy stack template removal]
+  P2_2 --> P4_3[P4-3 config generation]
+  P4_1 --> P4_3
+  P4_1 --> P4_4[P4-4 browser scenario migration]
+  P4_1 --> P4_5[P4-5 idempotent checkpoints]
+
+  P2_3 --> P5_1[P5-1 customer portal]
+  P2_4 --> P5_1
+  P2_5 --> P5_2[P5-2 mobile app MVP]
+  P5_1 --> P5_2
+  P2_1 --> P5_3[P5-3 onboarding website]
+  P2_3 --> P5_4[P5-4 whatsapp/telegram fallback]
+
+  P3_3 --> P6_1[P6-1 first-hour templates]
+  P5_1 --> P6_1
+  P4_1 --> P6_2[P6-2 interactive demo pool]
+  P5_3 --> P6_2
+  P6_1 --> P6_3[P6-3 template/demo telemetry]
+  P6_2 --> P6_3
+
+  P3_5 --> P7_1[P7-1 full security suite]
+  P4_5 --> P7_1
+  P6_1 --> P7_2[P7-2 load/soak/chaos]
+  P7_1 --> P7_3[P7-3 canary launch]
+  P7_2 --> P7_3
+  P7_3 --> P7_4[P7-4 launch readiness]
+```
+
+## 4. Critical Path
+
+Primary critical chain:
+
+`P0 cleanup -> P1 safety substrate -> P3 execution layer -> P4 provisioner retool -> P7 hardening/canary`
+
+Secondary critical chain:
+
+`P2 Hub APIs -> P5 mobile approvals -> P7 canary`
+
+## 5. Parallelization Strategy
+
+To meet 12 weeks, run these in parallel after Week 3:
+
+- Track A: Safety Wrapper + adapters (P3)
+- Track B: Provisioner retool (P4)
+- Track C: Customer interfaces (P5)
+
+## 6. Definition Of Done (Program-Level)
+
+Launch gate passes only when all are true:
+
+- secrets-never-leave-server invariant passes automated red-team test suite
+- gating matrix works exactly for all 5 command classes and 3 autonomy levels
+- external comms gate enforces lock-by-default at all autonomy levels
+- provisioning succeeds >=90% first attempt and >=99% with retries
+- approval path works across web + mobile push with audit completeness
+- usage metering reconciles with provider usage within <=1% variance
--- a/docs/architecture-proposal/gpt/05-estimated-timelines.md
+++ b/docs/architecture-proposal/gpt/05-estimated-timelines.md
@@ -0,0 +1,75 @@
+# 05. Estimated Timelines
+
+## 1. Date Anchors
+
+- Planning baseline date: **Thursday, February 26, 2026**
+- Proposed execution start: **Monday, March 2, 2026**
+- 12-week target launch window end: **Sunday, May 24, 2026**
+- Recommended contingency buffer: **May 25-31, 2026**
+
+## 2. Timeline Summary
+
+| Phase | Dates | Duration | Confidence |
+|---|---|---:|---|
+| Phase 0 prerequisites | Mar 2 - Mar 8 | 1 week | High |
+| Phase 1 safety substrate | Mar 9 - Mar 22 | 2 weeks | Medium |
+| Phase 2 Hub APIs/models | Mar 16 - Mar 29 | 2 weeks (overlap) | High |
+| Phase 3 wrapper execution layer | Mar 23 - Apr 12 | 3 weeks | Medium |
+| Phase 4 provisioner retool | Mar 30 - Apr 19 | 3 weeks (overlap) | Medium |
+| Phase 5 mobile + website + portal | Apr 6 - May 3 | 4 weeks | Medium |
+| Phase 6 templates + demo | Apr 27 - May 10 | 2 weeks | Medium |
+| Phase 7 hardening + canary + launch | May 4 - May 24 | 3 weeks | Medium-Low |
+
+## 3. Milestones
+
+| Milestone | Target Date | Exit Condition |
+|---|---|---|
+| M1: Cleanup gate passed | Mar 8, 2026 | n8n and deprecated deploy paths removed; plaintext secret leak fixed |
+| M2: Security substrate alpha | Mar 22, 2026 | redaction proxy + classifier + plugin skeleton integrated |
+| M3: Hub tenant APIs beta | Mar 29, 2026 | register/heartbeat/approval/usage contracts stable |
+| M4: First full tenant provision | Apr 12, 2026 | new VPS boots with OpenClaw + Safety + heartbeat |
+| M5: Customer interface beta | May 3, 2026 | web portal + mobile approvals + onboarding flow functional |
+| M6: Launch candidate | May 17, 2026 | full security/perf test pass; canary starts |
+| M7: Founding member launch | May 24, 2026 | canary complete; runbooks and rollback drills signed off |
+
+## 4. Weekly View (Condensed)
+
+```text
+Week 1  (Mar 2)   : Phase 0 prerequisite cleanup
+Week 2-3          : Phase 1 safety substrate begins
+Week 3-4          : Phase 2 Hub API/data model work (parallel)
+Week 4-6          : Phase 3 wrapper execution + adapters
+Week 5-7          : Phase 4 provisioner retool and browser migration
+Week 6-9          : Phase 5 customer portal/mobile/website
+Week 9-10         : Phase 6 templates + interactive demo
+Week 10-12        : Phase 7 hardening, canary rollout, launch
+Buffer Week       : May 25-31 contingency
+```
+
+## 5. Critical Timeline Risks
+
+| Risk | Schedule Impact If Realized |
+|---|---|
+| OpenClaw hook behavior drift or undocumented edge cases | +1 to +2 weeks |
+| Provisioner migration instability on fresh VPS images | +1 week |
+| Mobile push approval reliability issues (iOS/Android differences) | +0.5 to +1 week |
+| Token billing reconciliation defects with Stripe meter events | +1 week |
+| Security findings in redaction/gating late in cycle | +1 to +3 weeks |
+
+## 6. Confidence Ranges
+
+| Scenario | Launch Window |
+|---|---|
+| Optimistic | May 17-24, 2026 |
+| Most likely | May 24-31, 2026 |
+| Conservative | June 7-14, 2026 |
+
+## 7. Scope Compression Options (If Needed)
+
+To preserve security and launch by May 24-31, de-scope in this order:
+
+1. Delay WhatsApp/Telegram fallback to post-launch.
+2. Limit initial tool adapter set to top 8 usage tools, keep others on browser fallback.
+3. Ship 3 first-hour templates at launch, add the 4th in first patch.
+
+Do **not** cut redaction, gating, approval, or metering correctness work.
--- a/docs/architecture-proposal/gpt/06-risk-assessment.md
+++ b/docs/architecture-proposal/gpt/06-risk-assessment.md
@@ -0,0 +1,73 @@
+# 06. Risk Assessment
+
+## 1. Risk Scoring Method
+
+- Probability: 1 (low) to 5 (high)
+- Impact: 1 (low) to 5 (high)
+- Risk score = Probability x Impact
+
+## 2. Top Risks
+
+| ID | Risk | Prob | Impact | Score | Mitigation | Contingency Trigger |
+|---|---|---:|---:|---:|---|---|
+| R1 | Secret exfiltration via unredacted outbound payload | 3 | 5 | 15 | Multi-layer redaction tests, egress deny-by-default policy, seeded canary secrets | Any unredacted canary secret seen outside tenant |
+| R2 | Command gating bypass due misclassification | 3 | 5 | 15 | Deterministic policy engine, contract tests per class, human-readable reason logging | Red/Critical executes without approval in tests |
+| R3 | OpenClaw upstream changes break plugin behavior | 3 | 4 | 12 | Pin stable tags, adapter compatibility suite, staged upgrade canaries | Hook contract test fails against new tag |
+| R4 | Provisioner regressions reduce provisioning success | 4 | 4 | 16 | Idempotent checkpoints, replay tests, synthetic VPS CI | First-attempt success < 90% |
+| R5 | Billing usage mismatch vs provider costs | 3 | 4 | 12 | Dual-entry usage checks, nightly reconciliation jobs, alert thresholds | >1% sustained variance for 24h |
+| R6 | Mobile approval notification delays/drop | 3 | 3 | 9 | Push retries + in-app queue fallback + email fallback | p95 approval notify > 30s |
+| R7 | Performance overhead exceeds Lite-tier budget | 2 | 4 | 8 | Memory profiling budget gates, disable non-essential plugins, tune browser lifecycle | LetsBe overhead > 800MB sustained |
+| R8 | Tool API churn breaks adapters | 4 | 3 | 12 | Adapter integration tests against pinned versions, fallback to browser playbook | Adapter failure rate > 5% |
+| R9 | Security debt from AI-generated code quality | 4 | 4 | 16 | Mandatory senior review on security modules, lint rules, banned patterns checks | Critical static-analysis finding unresolved >48h |
+| R10 | Legal/compliance drift (license/source disclosure pages) | 2 | 4 | 8 | Automated license manifest publishing, pre-release legal checklist | Missing OSS disclosure page at RC freeze |
+
+## 3. Risk Register By Domain
+
+## 3.1 Security Risks
+
+- Redaction misses non-standard secret formats.
+- External comms gate incorrectly tied to autonomy level.
+- Local logs/transcripts persist raw secret material.
+- Local execution adapters allow shell metacharacter bypass.
+
+## 3.2 Delivery Risks
+
+- Too much simultaneous change across Hub + provisioner + tenant runtime.
+- Underestimated migration effort from deprecated orchestrator/sysadmin behaviors.
+- Browser automation migration complexity for setup scripts.
+
+## 3.3 Operational Risks
+
+- Dual-region Hub operations increase DB and deploy complexity.
+- Insufficient on-call runbooks for approval outages and provisioning failures.
+- Canary rollout without automated rollback criteria.
+
+## 4. Mitigation Program
+
+## 4.1 Pre-Launch Controls
+
+- Security invariants are encoded as executable tests (not checklist-only).
+- Every release candidate must pass redaction canary probes.
+- Dry-run provisioning must pass on both Netcup and Hetzner targets.
+
+## 4.2 Runtime Controls
+
+- Alert on heartbeat freshness degradation.
+- Alert on approval queue lag and expiration spikes.
+- Alert on sudden drop in cache-read ratio (cost anomaly indicator).
+
+## 4.3 Governance Controls
+
+- Security design review required for changes in Safety Wrapper, redaction, or secrets flows.
+- Migration freeze on deprecated paths after Phase 0.
+- Weekly risk review with updated probability/impact re-scoring.
+
+## 5. Launch Go/No-Go Risk Gates
+
+No launch if any condition is true:
+
+- unresolved severity-1 security defect
+- redaction tests fail for any supported secret class
+- command gating matrix not fully passing
+- usage reconciliation error >1% over 72h canary
+- provisioning first-attempt success below 85% in final week
--- a/docs/architecture-proposal/gpt/07-testing-strategy.md
+++ b/docs/architecture-proposal/gpt/07-testing-strategy.md
@@ -0,0 +1,111 @@
+# 07. Testing Strategy Proposal
+
+## 1. Testing Principles
+
+- Security-critical behavior is verified with invariant tests, not only unit coverage.
+- Contract-first testing between Hub, Safety Wrapper, Mobile, Website, and Provisioner.
+- Fast feedback in CI, deep verification in staging and nightly runs.
+- AI-generated code receives stricter review and mutation testing on critical paths.
+
+## 2. Test Pyramid By Component
+
+| Layer | Hub | Safety Wrapper | Provisioner | Mobile/Website |
+|---|---|---|---|---|
+| Unit | services, validators, policy logic | classifier, redactor, secret resolver, adapters | parser/utils/template render | UI logic, state stores, hooks |
+| Integration | Prisma + API handlers + auth | plugin hooks vs OpenClaw test harness | SSH runner against disposable VM | API integration against mock Hub |
+| End-to-end | full order/provision/approval/billing flow | tenant command execution path | full 10-step provisioning with checkpoints | chat/approval/onboarding user journeys |
+| Security | authz, rate-limit, session hardening | secret exfil tests, gating bypass tests | credential leakage scans | token storage, deep link auth |
+| Performance | API p95 and DB load | per-turn latency overhead, memory usage | provisioning duration and retry cost | startup latency, push receipt latency |
+
+## 3. Mandatory Security Invariant Suite
+
+The following automated tests are required before each release:
+
+1. **Secrets Never Leave Server Test**
+- Seed known secrets in vault and files.
+- Trigger prompts/tool outputs containing these values.
+- Assert outbound payloads and persisted logs contain only placeholders.
+
+2. **Command Classification Matrix Test**
+- Execute fixtures for each command class (Green/Yellow/Yellow+External/Red/Critical).
+- Validate behavior across autonomy levels 1-3.
+
+3. **External Comms Independence Test**
+- At autonomy level 3, external action remains blocked when comms gate locked.
+- Unlock only targeted tool; validate others remain blocked.
+
+4. **Approval Expiry Test**
+- Approval request expires at 24h.
+- Late approval cannot be replayed.
+
+5. **SECRET_REF Boundary Test**
+- Secrets cannot be requested directly by raw name/value.
+- Only valid references in allowlisted tool operations resolve.
+
+## 4. Provisioning Test Strategy
+
+## 4.1 Fast Checks
+
+- Shellcheck + static checks for bash scripts.
+- Template substitution tests (all placeholders resolved, none leaked).
+- Stack inventory policy tests (no banned tools like n8n).
+
+## 4.2 Disposable VPS E2E
+
+Nightly automated runs:
+
+- create disposable Debian VPS
+- run full provisioning
+- run smoke checks on selected tool endpoints
+- verify tenant registration + heartbeat + approvals
+- tear down VPS and collect artifacts
+
+## 5. Contract Testing
+
+- OpenAPI specs for Hub APIs and tenant APIs.
+- Consumer-driven contract tests for:
+  - Safety Wrapper against Hub tenant endpoints
+  - Mobile app against customer endpoints
+  - Website onboarding against public endpoints
+- Contract break blocks merge.
+
+## 6. Data And Billing Validation
+
+- Synthetic token event generator with known totals.
+- Reconcile tenant usage buckets against Hub aggregated totals.
+- Reconcile Hub totals against Stripe meter/invoice preview.
+- Fail build if variance exceeds threshold.
+
+## 7. Quality Gates (CI)
+
+- Unit + integration tests must pass.
+- Security invariants must pass.
+- Critical package diff review for Safety Wrapper and Provisioner.
+- Minimum thresholds:
+  - security-critical modules: >=90% branch coverage
+  - overall backend: >=75% branch coverage
+- Mutation testing on classifier and redactor modules.
+
+## 8. Human Review Workflow (Anti-AI-Slop)
+
+Required for security-critical PRs:
+
+- one reviewer validates threat model assumptions
+- one reviewer validates test completeness and failure cases
+- checklist includes: error paths, rollback behavior, idempotency, logging hygiene
+
+No direct auto-merge for changes in:
+
+- redaction engine
+- command classifier
+- secret storage/resolution
+- provisioning credential handling
+
+## 9. Launch Validation Checklist
+
+Before founding-member launch:
+
+- 7-day staging soak with no sev-1/2 defects
+- two successful rollback drills (control plane and tenant plane)
+- production canary with live approval + billing reconciliation
+- first-hour templates executed successfully on staging tenants
--- a/docs/architecture-proposal/gpt/08-cicd-strategy-gitea.md
+++ b/docs/architecture-proposal/gpt/08-cicd-strategy-gitea.md
@@ -0,0 +1,128 @@
+# 08. CI/CD Strategy (Gitea-Based)
+
+## 1. Objectives
+
+- Keep release cadence high without bypassing security checks.
+- Provide deterministic, reproducible artifacts for Hub, Safety components, and Provisioner.
+- Enforce policy gates (security invariants, banned tools, contract compatibility) in CI.
+
+## 2. Platform Baseline
+
+- CI engine: **Gitea Actions** with self-hosted **act_runner**.
+- Artifact registry: private container registry (`code.letsbe.solutions/...`).
+- Deployment target:
+  - Control plane: Docker hosts (EU + US)
+  - Tenant plane: provisioner-managed customer VPS rollout jobs
+
+## 3. Branch And Release Model
+
+- `main`: releasable at all times.
+- short-lived feature branches.
+- release tags: `hub/vX.Y.Z`, `safety/vX.Y.Z`, `provisioner/vX.Y.Z`.
+- hotfix branch only for production incidents, merged back to `main` immediately.
+
+## 4. Pipeline Stages
+
+## 4.1 Pull Request Pipeline
+
+1. `lint-typecheck`
+2. `unit-tests`
+3. `integration-tests`
+4. `contract-tests`
+5. `security-scan` (SAST, dependency vulnerabilities, secret scan)
+6. `policy-checks`:
+   - banned stack/reference detector (`n8n`, deprecated deploy targets)
+   - no plaintext credentials in artifacts/config
+7. `build-preview-images`
+
+## 4.2 Main Branch Pipeline
+
+1. re-run all PR checks
+2. build immutable release images
+3. generate SBOMs
+4. image signing (cosign/sigstore-compatible)
+5. push to registry with digest pins
+6. deploy to `dev` automatically
+
+## 4.3 Promotion Pipelines
+
+- `promote-staging`: manual approval gate + smoke tests
+- `promote-prod-eu`: manual approval + canary checks
+- `promote-prod-us`: separate manual gate after EU health confirmation
+
+## 5. Tenant Rollout Pipeline
+
+Separate workflow for tenant-plane updates:
+
+- policy-only rollout job
+- wrapper package rollout job
+- OpenClaw version rollout campaign
+
+Rollout controller enforces:
+
+- canary percentages
+- halt thresholds
+- automated rollback trigger execution
+
+## 6. Required Checks Per Package
+
+| Package | Required Jobs |
+|---|---|
+| Hub | lint, unit, integration, Prisma migration check, API contract tests |
+| Safety Wrapper | unit, hook integration (OpenClaw pinned tag), redaction/gating invariants |
+| Egress Proxy | redaction corpus tests, outbound policy tests, perf checks |
+| Provisioner | shellcheck, template checks, disposable VPS smoke run |
+| Mobile | typecheck, unit/UI tests, API contract tests, build verification |
+| Website | lint/typecheck, onboarding flow tests, pricing/quote tests |
+
+## 7. Example Gitea Workflow Skeleton
+
+```yaml
+name: pr-checks
+on: [pull_request]
+
+jobs:
+  lint-test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - run: pnpm install --frozen-lockfile
+      - run: pnpm lint && pnpm typecheck
+      - run: pnpm test:unit
+
+  security-policy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - run: pnpm test:security-invariants
+      - run: ./scripts/ci/check-banned-references.sh
+      - run: ./scripts/ci/check-no-plaintext-secrets.sh
+```
+
+## 8. Secrets And Runner Security
+
+- Gitea secrets scoped by environment (`dev/staging/prod`).
+- Runner hosts are isolated and ephemeral where possible.
+- No production credentials in PR jobs.
+- OIDC-based short-lived cloud/provider credentials preferred over long-lived static tokens.
+
+## 9. Change Management Gates
+
+Security-critical paths require extra gate:
+
+- files under `safety-wrapper/`, `egress-proxy/`, `provisioner/scripts/credentials*`
+- mandatory 2 reviewers
+- security test suite pass required
+- no force-merge override
+
+## 10. Metrics For CI/CD Quality
+
+Track weekly:
+
+- median PR cycle time
+- flaky test rate
+- change failure rate
+- mean time to rollback
+- canary abort count
+
+Use these metrics in weekly engineering ops review to keep speed/quality balance aligned with launch target.
--- a/docs/architecture-proposal/gpt/09-repository-structure-proposal.md
+++ b/docs/architecture-proposal/gpt/09-repository-structure-proposal.md
@@ -0,0 +1,105 @@
+# 09. Repository Structure Proposal
+
+## 1. Decision
+
+**Choose: Monorepo for LetsBe first-party code, with OpenClaw kept as separate pinned upstream dependency.**
+
+This is the best speed/quality tradeoff for a 3-month launch while preserving the non-fork requirement.
+
+## 2. Why This Over Multi-Repo
+
+## 2.1 Benefits
+
+- Shared TypeScript contracts across Hub, Mobile, Website, and Safety services.
+- One CI graph with selective test execution and consistent policy checks.
+- Easier cross-cutting refactors (API shape changes, auth, telemetry schema updates).
+- Better fit for AI-assisted coding workflows where context continuity matters.
+
+## 2.2 Risks
+
+- Larger repo and CI complexity.
+- Migration effort from existing repo layout.
+
+## 2.3 Mitigations
+
+- Use path-based CI execution and build caching.
+- Keep OpenClaw external to avoid massive vendor code in monorepo.
+- Execute migration in controlled steps with history-preserving imports.
+
+## 3. Proposed Structure
+
+```text
+letsbe-platform/
+  apps/
+    hub/                    # Next.js admin + customer portal + APIs
+    website/                # public onboarding and marketing app
+    mobile/                 # React Native + Expo
+  services/
+    safety-wrapper/         # OpenClaw plugin package
+    egress-proxy/           # LLM redaction proxy
+    provisioner/            # provisioning controller + scripts/templates
+  packages/
+    api-contracts/          # OpenAPI specs + TS SDKs
+    policy-engine/          # shared classification and gate logic
+    tooling-sdk/            # adapter framework + SECRET_REF utilities
+    ui-kit/                 # shared design components (web/mobile where possible)
+    config/                 # eslint/tsconfig/jest/shared tooling
+  infra/
+    gitea-workflows/
+    docker/
+    scripts/
+  docs/
+    architecture-proposal/
+    runbooks/
+```
+
+## 4. OpenClaw Upstream Strategy (No Fork)
+
+OpenClaw remains outside monorepo as independent upstream source:
+
+- Track pinned release tag in `services/safety-wrapper/openclaw-version.lock`.
+- CI job pulls pinned OpenClaw version for compatibility tests.
+- Upgrade workflow:
+  1. open compatibility PR bumping lock file
+  2. run hook-contract test suite
+  3. run staging canary tenants
+  4. promote if green
+
+If a temporary patch is unavoidable, maintain patch as isolated overlay and upstream contribution plan; do not maintain long-lived fork branch.
+
+## 5. Migration Plan From Current Repos
+
+## 5.1 Current Inputs
+
+- `letsbe-hub`
+- `letsbe-ansible-runner`
+- `letsbe-orchestrator` (reference only, not migrated as active runtime)
+- `letsbe-sysadmin-agent` (reference only, patterns ported into Safety)
+- `openclaw` (kept external)
+
+## 5.2 Migration Steps
+
+1. Create monorepo skeleton and shared package manager workspace.
+2. Import `letsbe-hub` into `apps/hub` with history.
+3. Import `letsbe-ansible-runner` into `services/provisioner`.
+4. Create new `services/safety-wrapper` and `services/egress-proxy`.
+5. Scaffold `apps/mobile` and `apps/website`.
+6. Extract shared contracts from hub into `packages/api-contracts`.
+7. Add compatibility adapters so existing deployments continue during transition.
+8. Archive deprecated repos as read-only references after cutover.
+
+## 6. Governance Model
+
+- CODEOWNERS by area (`hub`, `safety`, `provisioner`, `mobile`, `website`).
+- Required reviewer policy:
+  - 2 reviewers for `safety-wrapper`, `egress-proxy`, `provisioner` secrets paths.
+  - 1 reviewer for non-security UI changes.
+- Architectural Decision Records (ADR) stored under `docs/adr`.
+
+## 7. Alternative Considered: Keep Multi-Repo
+
+Rejected for v1 because cross-repo contract drift is already visible in current state (legacy APIs, deprecated stacks, stale references). Under a 12-week launch window, contract drift risk is higher than monorepo migration overhead.
+
+## 8. Post-Launch Option
+
+After launch, if team scaling or compliance requirements demand stricter isolation, split out mobile and website into separate repos while preserving shared contract package publication.
--- a/docs/architecture-proposal/gpt/10-technology-validation-sources.md
+++ b/docs/architecture-proposal/gpt/10-technology-validation-sources.md
@@ -0,0 +1,102 @@
+# 10. Technology Validation Sources
+
+Validation date: **2026-02-26**
+
+This proposal uses current official documentation (and release notes where relevant) for each major recommended technology.
+
+## 1. OpenClaw
+
+- Docs home: https://docs.openclaw.ai/
+- Plugin development/hooks: https://docs.openclaw.ai/guide/developers/plugins/overview/
+- Browser tool docs: https://docs.openclaw.ai/guide/tools/browser/
+- OpenClaw GitHub releases/readme: https://github.com/openclawai/openclaw
+
+Used for:
+- hook names and plugin lifecycle
+- browser capabilities and profile modes
+- upstream release/update model
+
+## 2. Next.js
+
+- Official docs: https://nextjs.org/docs
+- Release notes: https://nextjs.org/blog
+
+Used for:
+- app router patterns
+- deployment/runtime guidance
+- version-aware migration planning
+
+## 3. Prisma
+
+- Official docs: https://www.prisma.io/docs
+- ORM release notes: https://github.com/prisma/prisma/releases
+
+Used for:
+- schema/migration guidance
+- Prisma Client behavior and deployment practices
+
+## 4. React Native + Expo
+
+- React Native docs: https://reactnative.dev/docs/getting-started
+- React Native releases: https://github.com/facebook/react-native/releases
+- Expo docs: https://docs.expo.dev/
+- Expo SDK changelog: https://expo.dev/changelog
+
+Used for:
+- mobile stack decision
+- push notification and build pipeline planning
+
+## 5. Flutter (evaluated alternative)
+
+- Flutter docs: https://docs.flutter.dev/
+- Flutter releases: https://github.com/flutter/flutter/releases
+
+Used for:
+- alternative comparison for mobile stack decision
+
+## 6. Playwright
+
+- Official docs: https://playwright.dev/docs/intro
+- Release notes: https://playwright.dev/docs/release-notes
+
+Used for:
+- browser automation fallback strategy
+- testing and scenario migration approach
+
+## 7. SQLite
+
+- SQLite docs: https://www.sqlite.org/docs.html
+- SQLite file format/security references: https://www.sqlite.org/fileformat.html
+
+Used for:
+- tenant-local vault, approval cache, and usage bucket storage design
+
+## 8. Stripe
+
+- Stripe API docs: https://docs.stripe.com/api
+- Usage-based billing/meter events: https://docs.stripe.com/billing/subscriptions/usage-based
+
+Used for:
+- overage billing architecture
+- usage ingestion and invoice flow design
+
+## 9. Gitea Actions / Act Runner
+
+- Gitea Actions docs: https://docs.gitea.com/usage/actions/overview
+- Act runner docs: https://docs.gitea.com/usage/actions/act-runner
+
+Used for:
+- CI/CD workflow strategy
+- runner security and deployment pipeline design
+
+## 10. Additional Provider References
+
+- Netcup API context (existing integration baseline): https://www.netcup.com/en
+- Hetzner Cloud docs (overflow strategy): https://docs.hetzner.cloud/
+
+Used for:
+- provider-agnostic provisioning strategy
+
+## 11. Note On Source Priority
+
+For technical decisions, this proposal prioritizes primary official documentation and release notes over secondary summaries.
--- a/docs/architecture-proposal/gpt/README.md
+++ b/docs/architecture-proposal/gpt/README.md
@@ -0,0 +1,41 @@
+# LetsBe Biz Architecture Proposal (GPT Team)
+
+Date: 2026-02-26  
+Author: GPT Architecture Team
+
+This folder contains the complete architecture development plan requested in `docs/technical/LetsBe_Biz_Architecture_Brief.md` Section 1.
+
+## Deliverables Index
+
+0. [00-executive-summary.md](./00-executive-summary.md)  
+Executive direction and launch gating summary.
+1. [01-architecture-and-dataflows.md](./01-architecture-and-dataflows.md)  
+Architecture document with system diagrams and data flow diagrams.
+2. [02-components-and-api-contracts.md](./02-components-and-api-contracts.md)  
+Component breakdown and API contracts.
+3. [03-deployment-strategy.md](./03-deployment-strategy.md)  
+Deployment strategy for control plane and tenant plane.
+4. [04-implementation-plan-and-dependency-graph.md](./04-implementation-plan-and-dependency-graph.md)  
+Detailed implementation plan, task breakdown, and dependency graph.
+5. [05-estimated-timelines.md](./05-estimated-timelines.md)  
+Estimated timelines and milestone schedule.
+6. [06-risk-assessment.md](./06-risk-assessment.md)  
+Risk assessment and mitigation plan.
+7. [07-testing-strategy.md](./07-testing-strategy.md)  
+Testing strategy proposal.
+8. [08-cicd-strategy-gitea.md](./08-cicd-strategy-gitea.md)  
+Gitea-based CI/CD strategy.
+9. [09-repository-structure-proposal.md](./09-repository-structure-proposal.md)  
+Repository structure proposal and migration plan.
+10. [10-technology-validation-sources.md](./10-technology-validation-sources.md)  
+Current official documentation references used to validate technology choices.
+
+## Executive Direction (One-Page Summary)
+
+- Keep `letsbe-hub` (Next.js + Prisma) and retool it; do not rewrite core backend in v1 launch window.
+- Build Safety Wrapper as OpenClaw plugin + local egress secrets proxy; keep OpenClaw upstream and un-forked.
+- Remove all `n8n` and deprecated-stack references as a hard prerequisite (Week 1).
+- Replace orchestrator/sysadmin responsibilities with explicit Hub↔Safety APIs and local execution adapters.
+- Build mobile app with React Native + Expo for speed, push approvals, and shared TypeScript contracts.
+- Use monorepo for first-party LetsBe code (Hub, Mobile, Safety services, Provisioner), while consuming OpenClaw as pinned upstream dependency.
+- Target 12-week founding-member launch with strict security quality gates, canary rollout, and staged feature hardening.