Initial commit: LetsBe Biz project with openclaw source
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docs/architecture-proposal/gpt/00-executive-summary.md
@@ -0,0 +1,37 @@
# 00. Executive Summary

## Recommended Direction

- Retain and extend `letsbe-hub` instead of rewriting backend.
- Build Safety Wrapper as OpenClaw plugin with a separate local egress redaction proxy.
- Treat OpenClaw as a pinned upstream dependency (no fork).
- Make `n8n`/deprecated stack removal and plaintext credential leak fixes the first gate.
- Launch mobile with React Native + Expo and web onboarding as separate frontend app.
- Move first-party code to a monorepo for shared contracts and coordinated CI.

## Delivery Window

- Start: March 2, 2026
- Founding member launch target: May 24, 2026
- Buffer: May 25-31, 2026

## Hard Requirements Preserved

- 4-layer security model
- secrets-never-leave-server invariant
- 3-tier autonomy with independent external-comms gate
- one customer per VPS

## Most Critical Risks

- security bypass in redaction/gating
- provisioner migration instability
- billing metering accuracy drift

## First Build Gate

Do not start feature tracks until:

1. all `n8n` production references removed
2. deprecated deploy paths disabled
3. plaintext provisioning secret storage eliminated

docs/architecture-proposal/gpt/01-architecture-and-dataflows.md
@@ -0,0 +1,250 @@
# 01. Architecture And Data Flows

## 1. Scope And Non-Negotiables

This proposal is explicitly designed around the fixed constraints from the Architecture Brief:

- 4-layer security model is mandatory.
- Secrets never leave tenant server is mandatory.
- 3-tier autonomy + external communications gate is mandatory.
- OpenClaw is upstream dependency (no fork by default).
- One customer = one VPS is mandatory.
- `n8n` removal is prerequisite.

## 2. Proposed Target Architecture

### 2.1 Core Decisions

| Decision | Proposal | Why |
|---|---|---|
| Hub stack | Keep Next.js + Prisma + PostgreSQL | Existing app already has major workflows and 80+ APIs; rewrite is timeline-risky for 3-month launch. |
| OpenClaw integration | Use pinned upstream release, no fork | Maximizes upgrade velocity and avoids merge debt. |
| Safety Wrapper shape | Hybrid: OpenClaw plugin + local egress proxy + local execution adapters | Gives direct hook interception plus transport-level redaction guarantee. |
| Mobile | React Native + Expo | Fastest path to iOS/Android with TypeScript contract reuse. |
| Website | Separate public web app (same monorepo) + Hub public APIs | Security isolation between public onboarding and admin/customer portal. |
| Repo strategy | Monorepo for first-party services; OpenClaw kept separate upstream repo | Strong contract sharing + CI simplicity without violating upstream dependency model. |

### 2.2 System Context Diagram

```mermaid
flowchart LR
    subgraph Client[Client Layer]
        M[Mobile App\nReact Native + Expo]
        W[Website\nOnboarding + Checkout]
        C[Customer Portal Web]
        A[Admin Portal Web]
    end

    subgraph Control[Central Platform]
        H[Hub API + UI\nNext.js + Prisma]
        DB[(PostgreSQL)]
        Q[Background Workers\nAutomation + Metering]
        N[Notification Service\nPush/Email]
        ST[Stripe]
        NC[Netcup/Hetzner]
    end

    subgraph Tenant[Per-Customer VPS]
        OC[OpenClaw Gateway\nUpstream]
        SW[Safety Wrapper Plugin\nHooks + Classification]
        SP[LLM Egress Proxy\nSecrets Firewall]
        SV[(Secrets Vault SQLite\nEncrypted)]
        TA[Tool Adapters + Exec Guards]
        TS[(Tool Stacks 25+)]
        AP[(Approval Cache SQLite)]
        TU[(Token Usage Buckets)]
    end

    M --> H
    W --> H
    C --> H
    A --> H

    H --> DB
    H --> Q
    H --> N
    H <--> ST
    H <--> NC

    H <--> OC
    OC --> SW
    SW --> SP
    SP --> LLM[(LLM Providers)]

    SW <--> SV
    SW <--> TA
    TA <--> TS
    SW <--> AP
    SW --> TU
    TU --> H
```

## 3. Tenant Runtime Architecture

### 3.1 4-Layer Security Enforcement

| Layer | Enforcement Point | Implementation |
|---|---|---|
| 1. Sandbox | OpenClaw runtime/tool sandbox settings | OpenClaw native sandbox + process/container isolation. |
| 2. Tool Policy | OpenClaw agent tool allow/deny | Per-agent tool manifest; tools not listed are unreachable. |
| 3. Command Gating | Safety Wrapper `before_tool_call` | Green/Yellow/Yellow+External/Red/Critical Red classification + approval flow. |
| 4. Secrets Redaction | Local egress proxy + transcript hooks | Outbound prompt redaction before network egress, plus log/transcript redaction hooks. |

### 3.2 Safety Wrapper Components

- `classification-engine`: deterministic rules engine with signed policy bundle from Hub.
- `approval-gateway`: sync/async approval requests to Hub, with 24h expiry.
- `secret-ref-resolver`: resolves `SECRET_REF(...)` at execution time only.
- `adapter-runtime`: executes tool API adapters and guarded shell/docker/file actions.
- `metering-collector`: captures per-agent/per-model token usage and aggregates hourly.
- `hub-sync-client`: registration, heartbeat, config pull, backup status, command results.
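
To make the `secret-ref-resolver` idea concrete, here is a minimal sketch of execution-time resolution. The placeholder grammar (`SECRET_REF(name)` with a simple identifier), the `Map`-backed vault, and the names `resolveSecretRefs` and `SecretRefUnresolvedError` are illustrative assumptions, not the proposal's actual implementation.

```typescript
// Hypothetical sketch: substitute SECRET_REF(...) placeholders immediately
// before a tool call executes. The vault is stubbed with an in-memory Map.
type Vault = ReadonlyMap<string, string>;

// Assumed placeholder grammar: SECRET_REF(<letters, digits, _ . ->).
const SECRET_REF_PATTERN = /SECRET_REF\(([A-Za-z0-9_.-]+)\)/g;

class SecretRefUnresolvedError extends Error {
  readonly ref: string;
  constructor(ref: string) {
    super(`SECRET_REF_UNRESOLVED: ${ref}`);
    this.name = "SecretRefUnresolvedError";
    this.ref = ref;
  }
}

// Returns the template with every placeholder replaced by its secret value.
// Throws instead of passing an unresolved placeholder through to the tool.
function resolveSecretRefs(template: string, vault: Vault): string {
  return template.replace(SECRET_REF_PATTERN, (_match: string, ref: string) => {
    const value = vault.get(ref);
    if (value === undefined) {
      throw new SecretRefUnresolvedError(ref);
    }
    return value;
  });
}
```

In this sketch the resolved string should only ever exist inside the execution boundary; logs and transcripts would keep the unresolved `SECRET_REF(...)` form.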

### 3.3 OpenClaw Hook Usage (No Fork)

Safety Wrapper plugin uses upstream hook points for enforcement and observability:

- `before_tool_call`: classify/gate/block/require approval.
- `after_tool_call`: audit capture + normalization.
- `message_sending`: outbound content redaction.
- `before_message_write`, `tool_result_persist`: local persistence redaction.
- `llm_output`: token accounting and per-model usage capture.
- `before_prompt_build`: inject cacheable SOUL/TOOLS prefix metadata.
- `subagent_spawning`: enforce max depth/budget.
- `gateway_start`: health checks + Hub session bootstrap.

## 4. Primary Data Flows

### 4.1 Signup To Provisioning Flow

```mermaid
sequenceDiagram
    participant User
    participant Site as Website
    participant Hub
    participant Stripe
    participant Worker as Automation Worker
    participant Provider as Netcup/Hetzner
    participant Prov as Provisioner
    participant VPS as Tenant VPS

    User->>Site: Describe business + pick tools
    Site->>Hub: Create onboarding draft
    Site->>Stripe: Checkout session
    Stripe-->>Hub: checkout.session.completed
    Hub->>Worker: Create order (PAYMENT_CONFIRMED)
    Worker->>Provider: Allocate VPS
    Provider-->>Worker: VPS ready (IP + creds)
    Worker->>Hub: DNS_PENDING -> DNS_READY
    Worker->>Prov: Start provisioning job
    Prov->>VPS: Install stacks + OpenClaw + Safety
    Prov->>VPS: Seed secrets vault + tool registry
    Prov->>VPS: Register tenant with Hub
    VPS-->>Hub: register + first heartbeat
    Hub-->>User: Provisioning complete + app links
```

### 4.2 Agent Tool Call With Gating

```mermaid
sequenceDiagram
    participant U as User
    participant OC as OpenClaw
    participant SW as Safety Wrapper
    participant H as Hub
    participant T as Tool/API

    U->>OC: "Publish this newsletter"
    OC->>SW: tool call proposal
    SW->>SW: classify = Yellow+External
    SW->>H: approval request
    H-->>U: push approval request
    U->>H: approve
    H-->>SW: approval grant
    SW->>T: execute with SECRET_REF injection
    T-->>SW: result
    SW-->>OC: redacted result
    OC-->>U: completion summary
```

### 4.3 Secrets Redaction Outbound Flow

```mermaid
flowchart LR
    A[OpenClaw Prompt Payload] --> B[Safety Wrapper Pre-Redaction]
    B --> C[Secrets Registry Match]
    C --> D[Pattern Safety Net]
    D --> E[Function-Call SecretRef Rebinding]
    E --> F[Local Egress Proxy]
    F --> G[Provider API]

    C --> C1[(Vault SQLite)]
    D --> D1[(Regex + Entropy Rules)]
    F --> F1[Transport-Level Block if bypass attempt]
```
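
The registry-match and pattern-safety-net stages of this flow can be sketched roughly as below. This is an illustration only: the two regex rules, the 24-character minimum token length, the 4.0 bits-per-character entropy threshold, and the `[REDACTED]` marker are all assumptions chosen for the example, and the SecretRef rebinding and transport-level block stages are not covered.

```typescript
// Hypothetical sketch of three redaction layers applied to an outbound payload:
// exact registry match, regex patterns, then an entropy check on opaque tokens.
const REDACTED = "[REDACTED]";

// Shannon entropy in bits per character; 0 for empty or single-symbol strings.
function shannonEntropy(text: string): number {
  const chars = Array.from(text);
  const counts = new Map<string, number>();
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

// Assumed example patterns; a real rule set would come from the policy bundle.
const PATTERN_RULES: RegExp[] = [
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
  /\bBearer\s+[A-Za-z0-9._~+\/-]{16,}=*/g,
];

function redact(payload: string, knownSecrets: string[], entropyThreshold = 4.0): string {
  let output = payload;

  // Layer 1: exact match on registered secret values, longest first so that a
  // secret containing a shorter one is replaced as a whole.
  const sorted = [...knownSecrets]
    .filter((s) => s.length > 0)
    .sort((a, b) => b.length - a.length);
  for (const secret of sorted) {
    output = output.split(secret).join(REDACTED);
  }

  // Layer 2: well-known credential shapes.
  for (const rule of PATTERN_RULES) {
    output = output.replace(rule, REDACTED);
  }

  // Layer 3: long opaque tokens whose character distribution looks random.
  output = output.replace(/[A-Za-z0-9+\/_=-]{24,}/g, (token: string) =>
    shannonEntropy(token) >= entropyThreshold ? REDACTED : token,
  );
  return output;
}
```

Running the exact-match layer first keeps the fuzzier layers as a safety net rather than the primary control, which mirrors the ordering in the diagram.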

### 4.4 Token Metering And Billing

```mermaid
flowchart LR
    O[OpenClaw llm_output hook] --> M[Metering Collector]
    M --> B[(Hourly Buckets SQLite)]
    B --> H[Hub Usage Ingest API]
    H --> P[(Billing Period + Usage Tables)]
    P --> S[Stripe Usage/Billing]
    H --> UI[Usage Dashboard + Alerts]
```

## 5. Prompt Caching Architecture

- SOUL.md and TOOLS.md are split into stable cacheable prefix blocks and dynamic suffix blocks.
- Stable prefix hash is generated per agent version.
- Prefix changes only when agent config changes; day-to-day conversations hit cache-read pricing.
- Metering persists `input/output/cache_read/cache_write` separately to preserve margin analytics.
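
One way to derive the stable prefix hash is sketched below. The `AgentPromptParts` shape, the choice of SHA-256, and the length-prefixed field encoding are assumptions made for illustration; the proposal does not specify them.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: hash only the stable prefix blocks so that the key
// changes when the agent config changes, not on every conversation turn.
interface AgentPromptParts {
  agentVersion: string;
  soulPrefix: string;    // stable block split out of SOUL.md
  toolsPrefix: string;   // stable block split out of TOOLS.md
  dynamicSuffix: string; // per-conversation content, deliberately not hashed
}

function stablePrefixHash(parts: AgentPromptParts): string {
  const hash = createHash("sha256");
  // Length-prefix each field so that moving text between fields changes the hash.
  for (const field of [parts.agentVersion, parts.soulPrefix, parts.toolsPrefix]) {
    hash.update(`${field.length}:`, "utf8");
    hash.update(field, "utf8");
  }
  return hash.digest("hex");
}
```

Two prompts that differ only in `dynamicSuffix` produce the same hash, which is the property the cache-read pricing argument relies on.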

## 6. Mobile, Website, And Channel Architecture

### 6.1 Mobile App

- React Native + Expo app as primary interface.
- Real-time chat via Hub websocket gateway.
- Approvals as push notifications (approve/deny quick actions).
- Fallback channel switchboard in Hub for WhatsApp/Telegram relay adapters.

### 6.2 Website + Onboarding

- Dedicated public frontend app (`apps/website`) with strict network boundary to Hub public APIs.
- Onboarding classifier service (cheap model profile) performs 1-2 message business classification.
- Tool bundle recommendation engine returns editable stack + resource calculator.
- Checkout remains Stripe-hosted.

## 7. First-Hour Workflow Templates (Architecture Proof)

| Template | Cross-Tool Actions | Gating Profile |
|---|---|---|
| Freelancer First Hour | Connect mail + calendar, create folders, configure intake form, first daily brief | Mostly Green/Yellow |
| Agency First Hour | Chat inbox setup, project board scaffolding, proposal template generation, shared KB setup | Yellow + Yellow+External approval |
| E-commerce First Hour | Inventory import, support inbox routing, analytics dashboard baseline, recovery email draft | Mixed Yellow/Yellow+External |
| Consulting First Hour | Scheduling links, client doc signature template, CRM stages, weekly report automation | Mostly Yellow + one external gate |

These templates are codified as audited workflow blueprints executed through the same command classification path as ad-hoc agent actions.

## 8. Interactive Demo Architecture (Pre-Purchase)

Proposal: shared but isolated "Demo Tenant Pool" instead of a single static demo VPS.

- Each prospect gets a short-lived demo tenant snapshot (TTL 2 hours).
- Demo runs synthetic data and fake outbound integrations only.
- Same Safety Wrapper + approvals UI as production to demonstrate trust model.
- Recycled automatically after session expiry.

This is safer and more realistic than one long-lived shared "Bella's Bakery" host.

## 9. Required Pre-Launch Cleanup Baseline

Before core build starts, execute repository cleanup gate:

- Remove all `n8n` references from Hub, Provisioner, stacks, scripts, tests, and docs used for production behavior.
- Remove deployment references to deprecated `orchestrator` and `sysadmin-agent` from active provisioning paths.
- Close plaintext credential leak path (`jobs/*/config.json` root password exposure) by moving to one-time secret files + immediate secure deletion.

No feature work should proceed until this baseline passes CI policy checks.

@@ -0,0 +1,334 @@
# 02. Component Breakdown And API Contracts

## 1. Component Breakdown

## 1.1 Control Plane Components

| Component | Runtime | Responsibility | Notes |
|---|---|---|---|
| Hub Web/API | Next.js 16 + Node | Admin UI, customer portal, public APIs, tenant APIs | Keep existing app, add route groups and API contracts below. |
| Billing Engine | Node worker + Prisma | Usage aggregation, pool accounting, overage invoicing | Hourly usage compaction + end-of-period invoice sync. |
| Provisioning Orchestrator | Existing automation worker | Order state machine and provisioning job dispatch | Keep and harden existing job pipeline. |
| Notification Gateway | Node service | Push notifications, email alerts, approval prompts | Expo push + email provider adapters. |
| Onboarding Classifier | Lightweight service | Business-type classification + starter bundle recommendation | Cheap fast model profile; capped context. |

## 1.2 Tenant Components (Per VPS)

| Component | Runtime | Responsibility | State Store |
|---|---|---|---|
| OpenClaw Gateway | Node 22+ upstream | Agent runtime, sessions, tool orchestration | OpenClaw JSON/JSONL storage |
| Safety Wrapper Plugin | TypeScript package | Classification, gating, hooks, metering, Hub sync | SQLite (`safety.db`) |
| Egress Proxy | Node/Rust sidecar | Outbound redaction + transport enforcement | In-memory + policy cache |
| Execution Adapters | Local modules | Shell/Docker/file/env and tool REST adapters | Audit log in SQLite |
| Secrets Vault | SQLite + encryption | Secret values, rotation history, fingerprints | `vault.db` |

## 1.3 Deprecated Components (Explicitly Out)

- `letsbe-orchestrator`: behavior studied for migration inputs only.
- `letsbe-sysadmin-agent`: executor patterns ported, service itself not retained.
- `letsbe-mcp-browser`: replaced by OpenClaw native browser tooling.

## 2. API Design Rules (Applies To All Contracts)

- Base path versioning: `/api/v1/...`
- JSON request/response with strict schema validation.
- Idempotency required on mutating tenant commands (`Idempotency-Key` header).
- Authn/authz split by channel:
  - Tenant channel: `Bearer <tenant_api_key>` (hash stored server-side)
  - Mobile/customer channel: session JWT + RBAC
  - Public website onboarding: scoped API key + anti-abuse limits
- All mutating endpoints emit audit event rows.
- All time fields are ISO 8601 UTC.
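
The tenant-channel rule (bearer key presented by the tenant, only a hash stored server-side) might look like the following on the Hub side. This is a sketch under assumptions: unsalted SHA-256 is used on the premise that tenant API keys are long random values, and the function names are hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical sketch: the Hub persists only this digest, never the raw key.
function hashTenantApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey, "utf8").digest("hex");
}

// Extracts the token from an "Authorization: Bearer <token>" header value.
function parseBearerToken(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = /^Bearer\s+(\S+)$/.exec(header.trim());
  return match?.[1] ?? null;
}

// Compares the presented key's digest with the stored digest in constant time.
function verifyTenantApiKey(header: string | undefined, storedHashHex: string): boolean {
  const token = parseBearerToken(header);
  if (token === null) return false;
  const presented = Buffer.from(hashTenantApiKey(token), "hex");
  const stored = Buffer.from(storedHashHex, "hex");
  return presented.length === stored.length && timingSafeEqual(presented, stored);
}
```

If keys were ever user-chosen or short, a slow password-hashing function would be the safer choice; the plain digest here only suits high-entropy generated keys.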

## 3. Hub ↔ Tenant API Contracts

## 3.1 Register Tenant Node

`POST /api/v1/tenant/register`

Purpose: first boot registration from Safety Wrapper.

Request:

```json
{
  "registrationToken": "rt_...",
  "orderId": "ord_...",
  "agentVersion": "safety-wrapper@0.1.0",
  "openclawVersion": "2026.2.26",
  "hostname": "cust-vps-001",
  "capabilities": ["browser", "exec", "docker", "approval_queue"]
}
```

Response `201`:

```json
{
  "tenantApiKey": "tk_live_...",
  "tenantId": "ten_...",
  "heartbeatIntervalSec": 30,
  "configEtag": "cfg_9f1a...",
  "time": "2026-02-26T20:15:00Z"
}
```

## 3.2 Heartbeat + Pull Deltas

`POST /api/v1/tenant/heartbeat`

Purpose: status signal plus lightweight config/update pull.

Request:

```json
{
  "tenantId": "ten_...",
  "server": {
    "uptimeSec": 86400,
    "diskPct": 61.2,
    "memPct": 57.8,
    "openclawHealthy": true
  },
  "agents": [
    {"agentId": "marketing", "status": "online", "autonomyLevel": 2}
  ],
  "pendingApprovals": 1,
  "lastAppliedConfigEtag": "cfg_9f1a..."
}
```

Response `200`:

```json
{
  "configChanged": true,
  "nextConfigEtag": "cfg_9f1b...",
  "commands": [],
  "clock": "2026-02-26T20:15:30Z"
}
```

## 3.3 Pull Full Tenant Config

`GET /api/v1/tenant/config?etag=cfg_9f1a...`

Response `200` includes:

- agent definitions (SOUL/TOOLS refs, model profile)
- autonomy policy
- external comms gate unlock map
- command classification ruleset checksum
- tool registry template version

## 3.4 Approval Request / Resolve

`POST /api/v1/tenant/approval-requests`

```json
{
  "tenantId": "ten_...",
  "requestId": "apr_...",
  "agentId": "marketing",
  "class": "yellow_external",
  "tool": "listmonk.send_campaign",
  "humanSummary": "Send campaign 'March Offer' to 1,204 recipients",
  "expiresAt": "2026-02-27T20:15:30Z",
  "context": {"recipientCount": 1204}
}
```

`GET /api/v1/tenant/approval-requests/{requestId}` returns `PENDING|APPROVED|DENIED|EXPIRED`.
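
How the four states could be derived from a stored request is sketched below. The `ApprovalRecord` shape is hypothetical, and treating a decision recorded after `expiresAt` as `EXPIRED` is an assumption rather than something the contract states.

```typescript
// Hypothetical sketch: compute the status returned by the GET endpoint.
type ApprovalStatus = "PENDING" | "APPROVED" | "DENIED" | "EXPIRED";

interface ApprovalRecord {
  requestId: string;
  expiresAt: string;                  // ISO 8601 UTC
  decision: "approve" | "deny" | null;
  decidedAt: string | null;           // ISO 8601 UTC, set when a decision exists
}

function approvalStatus(record: ApprovalRecord, now: Date): ApprovalStatus {
  const expiresAtMs = Date.parse(record.expiresAt);
  if (record.decision !== null && record.decidedAt !== null) {
    // Assumption: a decision only counts if it was made before expiry.
    if (Date.parse(record.decidedAt) <= expiresAtMs) {
      return record.decision === "approve" ? "APPROVED" : "DENIED";
    }
    return "EXPIRED";
  }
  return now.getTime() >= expiresAtMs ? "EXPIRED" : "PENDING";
}

// Default expiry of 24 hours after creation, matching the approval-gateway TTL.
function defaultExpiry(createdAt: Date, ttlHours = 24): string {
  return new Date(createdAt.getTime() + ttlHours * 3600000).toISOString();
}
```

Passing `now` in explicitly keeps the function deterministic, so the expiry boundary can be covered by the contract tests without clock mocking.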

## 3.5 Usage Ingestion

`POST /api/v1/tenant/usage-buckets`

```json
{
  "tenantId": "ten_...",
  "buckets": [
    {
      "hour": "2026-02-26T20:00:00Z",
      "agentId": "marketing",
      "model": "openrouter/deepseek-v3.2",
      "inputTokens": 12000,
      "outputTokens": 3800,
      "cacheReadTokens": 6400,
      "cacheWriteTokens": 0,
      "webSearchCalls": 3,
      "webFetchCalls": 1
    }
  ]
}
```
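
On the tenant side, producing these buckets amounts to grouping per-call usage events by hour, agent, and model. The sketch below assumes a hypothetical `UsageEvent` shape for what the metering collector records per LLM call, and omits the `webSearchCalls`/`webFetchCalls` counters for brevity.

```typescript
// Hypothetical sketch: compact per-call usage events into hourly buckets.
interface UsageEvent {
  timestamp: string; // ISO 8601 UTC
  agentId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

interface UsageBucket {
  hour: string; // start of the UTC hour, e.g. "2026-02-26T20:00:00Z"
  agentId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

// Truncates a timestamp to the start of its UTC hour.
function hourStart(timestamp: string): string {
  const d = new Date(timestamp);
  d.setUTCMinutes(0, 0, 0);
  return d.toISOString().replace(".000Z", "Z");
}

function aggregateHourly(events: UsageEvent[]): UsageBucket[] {
  const buckets = new Map<string, UsageBucket>();
  for (const e of events) {
    const hour = hourStart(e.timestamp);
    const key = `${hour}|${e.agentId}|${e.model}`;
    const bucket: UsageBucket = buckets.get(key) ?? {
      hour,
      agentId: e.agentId,
      model: e.model,
      inputTokens: 0,
      outputTokens: 0,
      cacheReadTokens: 0,
      cacheWriteTokens: 0,
    };
    bucket.inputTokens += e.inputTokens;
    bucket.outputTokens += e.outputTokens;
    bucket.cacheReadTokens += e.cacheReadTokens;
    bucket.cacheWriteTokens += e.cacheWriteTokens;
    buckets.set(key, bucket);
  }
  return Array.from(buckets.values());
}
```

Because a bucket is keyed by `(hour, agentId, model)`, re-posting the same bucket can be treated by the Hub as an upsert, which fits the idempotency rule for mutating tenant calls.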

## 3.6 Backup Status

`POST /api/v1/tenant/backup-status`

Tracks last run, duration, snapshot ID, integrity verification state.

## 4. Customer/Mobile API Contracts

## 4.1 Agent And Autonomy Management

- `GET /api/v1/customer/agents`
- `PATCH /api/v1/customer/agents/{agentId}`
- `PATCH /api/v1/customer/agents/{agentId}/autonomy`
- `PATCH /api/v1/customer/agents/{agentId}/external-comms-gate`

Autonomy update request:

```json
{
  "autonomyLevel": 2,
  "externalComms": {
    "defaultLocked": true,
    "toolUnlocks": [
      {"tool": "chatwoot.reply_external", "enabled": true, "expiresAt": null}
    ]
  }
}
```

## 4.2 Approval Queue

- `GET /api/v1/customer/approvals?status=pending`
- `POST /api/v1/customer/approvals/{id}` with `{ "decision": "approve" | "deny" }`

## 4.3 Usage And Billing

- `GET /api/v1/customer/usage/summary`
- `GET /api/v1/customer/usage/by-agent`
- `GET /api/v1/customer/billing/current-period`
- `POST /api/v1/customer/billing/payment-method`

## 4.4 Realtime Channels

- `GET /api/v1/customer/events/stream` (SSE fallback)
- `WS /api/v1/customer/ws` (chat updates, approvals, status)

## 5. Public Website/Onboarding API Contracts

## 5.1 Business Classification

`POST /api/v1/public/onboarding/classify`

```json
{
  "sessionId": "onb_...",
  "messages": [
    {"role": "user", "content": "I run a 5-person digital agency"}
  ]
}
```

Response:

```json
{
  "businessType": "agency",
  "confidence": 0.91,
  "recommendedBundle": "agency_core_v1",
  "followUpQuestion": "Do you need ticketing or only chat?"
}
```

## 5.2 Bundle Quote

`POST /api/v1/public/onboarding/quote`

Returns min tier, projected token pool, monthly estimate, and Stripe checkout seed payload.

## 5.3 Order Creation

`POST /api/v1/public/orders` with strict schema + anti-fraud controls.

## 6. Safety Wrapper Internal Contract (Local Only)

Local Unix socket JSON-RPC interface between plugin orchestration and execution layer.

Method examples:

- `exec.run`
- `docker.compose`
- `file.read`
- `file.write`
- `env.update`
- `tool.http.call`

Example request:

```json
{
  "id": "rpc_1",
  "method": "tool.http.call",
  "params": {
    "tool": "ghost",
    "operation": "posts.create",
    "secretRefs": ["ghost_admin_key"],
    "payload": {"title": "..."}
  }
}
```

Guarantees:

- Secrets passed only as references, never raw values in request logs.
- Execution engine resolves references inside isolated process boundary.
- Full request/result hashes persisted for audit traceability.

## 7. Tool Registry Contract

`tool-registry.json` shape (tenant-local):

```json
{
  "version": "2026-02-26",
  "tools": [
    {
      "id": "chatwoot",
      "baseUrl": "https://chat.customer-domain.tld",
      "auth": {"type": "bearer_secret_ref", "ref": "chatwoot_api_token"},
      "adapters": ["contacts.list", "conversation.reply"],
      "externalCommsOperations": ["conversation.reply_external"],
      "cheatsheet": "/opt/letsbe/cheatsheets/chatwoot.md"
    }
  ]
}
```
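
A typed view of this registry, with a lookup that flags external-communications operations, could be sketched as follows. The interfaces mirror the JSON fields above; the `lookupOperation` helper and its rule that an operation listed only under `externalCommsOperations` still counts as known are assumptions.

```typescript
// Hypothetical sketch: typed registry plus an operation lookup for the gate.
interface ToolAuth {
  type: string;
  ref: string; // name of the secret reference, never the secret value
}

interface ToolEntry {
  id: string;
  baseUrl: string;
  auth: ToolAuth;
  adapters: string[];
  externalCommsOperations: string[];
  cheatsheet: string;
}

interface ToolRegistry {
  version: string;
  tools: ToolEntry[];
}

type OperationLookup =
  | { kind: "unknown_tool" }
  | { kind: "unknown_operation" }
  | { kind: "allowed"; externalComms: boolean; secretRef: string };

function lookupOperation(registry: ToolRegistry, toolId: string, operation: string): OperationLookup {
  const tool = registry.tools.find((t) => t.id === toolId);
  if (tool === undefined) return { kind: "unknown_tool" };
  const external = tool.externalCommsOperations.indexOf(operation) >= 0;
  const listed = tool.adapters.indexOf(operation) >= 0;
  if (!external && !listed) return { kind: "unknown_operation" };
  return { kind: "allowed", externalComms: external, secretRef: tool.auth.ref };
}
```

The `externalComms` flag is what the independent external-comms gate would consult, separately from the autonomy tier.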

## 8. Error Contract And Retries

Standard error envelope:

```json
{
  "error": {
    "code": "APPROVAL_REQUIRED",
    "message": "Operation requires approval",
    "requestId": "req_...",
    "retryable": true,
    "details": {"approvalRequestId": "apr_..."}
  }
}
```

Common error codes:

- `AUTH_INVALID`
- `TENANT_UNKNOWN`
- `APPROVAL_REQUIRED`
- `APPROVAL_EXPIRED`
- `CLASSIFICATION_BLOCKED`
- `SECRET_REF_UNRESOLVED`
- `POLICY_VERSION_MISMATCH`
- `RATE_LIMITED`
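
A client might interpret this envelope roughly as sketched below. The retry policy (five attempts, exponential backoff from 1 s capped at 60 s) and the special handling of `APPROVAL_REQUIRED` as "wait for the approval" are assumptions; the contract above only defines the envelope fields and codes.

```typescript
// Hypothetical sketch: decide what a client does next from an error envelope.
type ErrorCode =
  | "AUTH_INVALID"
  | "TENANT_UNKNOWN"
  | "APPROVAL_REQUIRED"
  | "APPROVAL_EXPIRED"
  | "CLASSIFICATION_BLOCKED"
  | "SECRET_REF_UNRESOLVED"
  | "POLICY_VERSION_MISMATCH"
  | "RATE_LIMITED";

interface ErrorEnvelope {
  error: {
    code: ErrorCode;
    message: string;
    requestId: string;
    retryable: boolean;
    details?: Record<string, unknown>;
  };
}

type NextStep =
  | { action: "fail" }
  | { action: "await_approval"; approvalRequestId: string }
  | { action: "retry"; delayMs: number };

// `attempt` is 1-based: the number of tries already made.
function nextStep(envelope: ErrorEnvelope, attempt: number, maxAttempts = 5): NextStep {
  const { code, retryable, details } = envelope.error;
  if (!retryable || attempt >= maxAttempts) return { action: "fail" };
  if (code === "APPROVAL_REQUIRED") {
    const id = details?.["approvalRequestId"];
    return typeof id === "string"
      ? { action: "await_approval", approvalRequestId: id }
      : { action: "fail" };
  }
  // Assumed policy: exponential backoff starting at 1 s, capped at 60 s.
  const delayMs = Math.min(60000, 1000 * 2 ** (attempt - 1));
  return { action: "retry", delayMs };
}
```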

## 9. API Compatibility And Change Policy

- Backward-compatible additions: allowed in-place.
- Breaking changes: new version path (`/api/v2`).
- Deprecation window: minimum 60 days for tenant APIs.
- Contract tests run in CI for Hub, Safety Wrapper, Mobile, and Website clients.

docs/architecture-proposal/gpt/03-deployment-strategy.md
@@ -0,0 +1,145 @@
# 03. Deployment Strategy

## 1. Goals

- Ship to founding members in ~12 weeks without compromising security invariants.
- Maintain one-VPS-per-customer isolation.
- Keep OpenClaw upstream-pinned and independently upgradeable.
- Make tenant rollout reversible with fast rollback paths.

## 2. Environment Topology

## 2.1 Control Plane Environments

| Environment | Purpose | Data |
|---|---|---|
| `dev` | Rapid feature iteration | Synthetic/local data |
| `staging` | Release-candidate validation, e2e, load, security checks | Sanitized fixtures |
| `prod-eu` | EU customers (default EU routing) | Real customer data |
| `prod-us` | NA customers (default NA routing) | Real customer data |

Control plane services (Hub + worker + notifications) are region-deployed with independent DBs and clear region affinity.

## 2.2 Tenant Environments

- `sandbox tenants`: internal QA and interactive demo pool.
- `canary tenants`: first real-production update recipients.
- `general tenants`: full customer fleet.

## 3. Deployment Units

## 3.1 Control Plane Units

- `hub-web-api` container (Next.js standalone runtime)
- `hub-worker` container (automation + billing jobs)
- `notifications` container (push/email delivery)
- `postgres` (managed or self-hosted HA)

## 3.2 Tenant Units (Per Customer VPS)

- `openclaw` container (upstream image/tag pinned)
- `safety-wrapper` plugin package mounted into OpenClaw extension dir
- `egress-proxy` service (localhost-only)
- tool containers and nginx from provisioner
- local SQLite data stores for secrets/approvals/metering

## 4. Provisioning Deployment Plan

## 4.1 Provisioner Mode

Continue with existing one-shot SSH provisioner flow, retooled to:

- deploy OpenClaw + Safety components
- remove legacy orchestrator/sysadmin deployment
- strip deprecated stacks and n8n references
- write secrets into encrypted vault only (no plaintext long-lived config)

## 4.2 Immutable Artifact Inputs

Provisioning uses pinned artifacts only:

- OpenClaw release tag (`stable` channel pin)
- Safety Wrapper image/package digest
- Tool stack compose templates with hash
- policy bundle version + checksum

## 5. Secrets And Credential Deployment

- Registration token is one-time and short-lived.
- Tenant API key returned at registration; only hash stored in Hub DB.
- Provisioner writes bootstrap secrets to tmpfs file, consumed once, then shredded.
- Existing plaintext job config path (`jobs/<id>/config.json`) replaced by encrypted payload + ephemeral decrypt-on-run.

## 6. Release Strategy

## 6.1 Control Plane

- Trunk-based merges behind feature flags.
- Deploy via Gitea Actions with staged promotions (`dev -> staging -> prod`).
- DB migrations run in expand/contract pattern.

## 6.2 Tenant Plane

Tenant updates split into independent channels:

- `policy-only`: classification/autonomy/tool policy updates (no binary change)
- `wrapper patch`: Safety Wrapper version bump
- `openclaw bump`: upstream release bump (separate tracked campaign)

Rollout:

1. Internal sandbox tenants
2. 5% canary customer tenants
3. 25%
4. 100%

Auto-stop criteria:

- redaction test failure
- approval-routing failure >1%
- tenant heartbeat drop >3%
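
The auto-stop check between rollout stages can be expressed as a small pure function. The thresholds come from the criteria above; the `StageMetrics` field names and the way each rate is computed (failures over requests, missing heartbeats over expected tenants) are assumptions for illustration.

```typescript
// Hypothetical sketch: evaluate auto-stop criteria for one rollout stage.
interface StageMetrics {
  redactionTestsFailed: number;
  approvalRequests: number;
  approvalRoutingFailures: number;
  tenantsExpected: number;
  tenantsHeartbeating: number;
}

// Returns the list of tripped criteria; an empty list means the rollout may proceed.
function autoStopReasons(m: StageMetrics): string[] {
  const reasons: string[] = [];

  if (m.redactionTestsFailed > 0) {
    reasons.push("redaction test failure");
  }

  const approvalFailureRate =
    m.approvalRequests > 0 ? m.approvalRoutingFailures / m.approvalRequests : 0;
  if (approvalFailureRate > 0.01) {
    reasons.push("approval-routing failure >1%");
  }

  const heartbeatDrop =
    m.tenantsExpected > 0 ? (m.tenantsExpected - m.tenantsHeartbeating) / m.tenantsExpected : 0;
  if (heartbeatDrop > 0.03) {
    reasons.push("tenant heartbeat drop >3%");
  }

  return reasons;
}
```

Returning every tripped criterion, rather than stopping at the first, gives the on-call engineer the full picture when a stage halts.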

## 7. Rollback Strategy

## 7.1 Control Plane Rollback

- Keep last two container digests deployable.
- Migration rollback policy: only for reversible migrations; otherwise hotfix-forward.

## 7.2 Tenant Rollback

- Policy rollback via previous signed policy bundle.
- Wrapper rollback to previous plugin package.
- OpenClaw rollback to previous pinned stable tag after compatibility check.

## 8. Observability And SLOs

## 8.1 Required Telemetry

- tenant heartbeat latency and freshness
- approval queue latency (request -> decision)
- redaction pipeline counters (matches by layer)
- token usage ingest lag
- provisioning success/failure per step

## 8.2 Launch SLO Targets

- Hub API availability: 99.9%
- Tenant heartbeat freshness: 99% under 2 minutes
- Approval propagation: p95 < 5 seconds (Hub to mobile push)
- Provisioning success first-attempt: >= 90%

## 9. Dual-Provider Strategy (Netcup + Hetzner)

- Primary capacity pool on Netcup (EU/US).
- Overflow path on Hetzner with same provisioner scripts and hardened baseline.
- Provider adapter abstraction lives in Hub `server-provisioning` module; provisioner remains Debian-focused and provider-agnostic.

## 10. Cutover Plan From Current State

1. Freeze legacy orchestrator/sysadmin deployment paths.
2. Land prerequisite cleanup release (n8n/deprecated removal + credential leak fix).
3. Enable new tenant register/heartbeat APIs in Hub.
4. Provision first new-architecture internal tenant.
5. Execute parallel-run window (old and new provisioning flows side-by-side for internal only).
6. Flip default provisioning to new flow for production orders.

@@ -0,0 +1,176 @@
# 04. Detailed Implementation Plan And Dependency Graph

## 1. Planning Assumptions

- Target launch window: 12 weeks.
- Team model assumed for schedule below:
  - 2 backend/platform engineers
  - 1 mobile/fullstack engineer
  - 1 DevOps/SRE engineer
  - 1 QA/security engineer (shared)
- Existing Hub codebase is retained and extended.

## 2. Work Breakdown Structure (WBS)

## Phase 0: Prerequisite Cleanup And Hardening (Week 1)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P0-1 | Remove all `n8n` code references (Hub, provisioner, stacks, scripts, tests) | 3d | - | `rg -n n8n` clean in production code paths; CI policy check added |
| P0-2 | Remove deprecated deploy targets (`orchestrator`, `sysadmin`) from active provisioning | 2d | P0-1 | No new orders can deploy deprecated services |
| P0-3 | Fix plaintext provisioning secret leak (`jobs/*/config.json`) | 2d | P0-1 | No root/server password persisted in plaintext job files |
| P0-4 | Baseline security regression tests for cleanup changes | 1d | P0-2,P0-3 | Green CI + sign-off |

## Phase 1: Safety Substrate (Weeks 2-3)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P1-1 | Build encrypted secrets vault SQLite schema + key management | 3d | P0-4 | CRUD, rotation, audit log implemented |
| P1-2 | Implement egress redaction proxy (registry + regex + entropy layers) | 4d | P1-1 | Redaction test suite pass with seeded secrets |
| P1-3 | Implement command classification engine (5-tier + external gate) | 3d | P1-1 | Deterministic policy tests pass |
| P1-4 | Implement approval state cache + retry logic (tenant-local) | 2d | P1-3 | Approval resilience tests pass |
| P1-5 | OpenClaw plugin skeleton with hooks + telemetry envelope | 3d | P1-2,P1-3 | Hook smoke tests green against pinned OpenClaw tag |

## Phase 2: Hub Tenant APIs + Data Model (Weeks 3-4)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P2-1 | Add Prisma models: approval queue, usage buckets, agent policy, comms unlocks | 2d | P0-4 | Migration applied in staging |
| P2-2 | Implement tenant register/heartbeat/config APIs | 3d | P2-1 | Contract tests pass |
| P2-3 | Implement tenant approval-request APIs + customer approval endpoints | 3d | P2-1 | End-to-end approval cycle works |
| P2-4 | Implement usage ingest + billing period updates | 3d | P2-1 | Usage events visible in dashboard |
| P2-5 | Add push notification pipeline for approvals | 2d | P2-3 | Mobile push test path validated |

## Phase 3: Safety Wrapper Execution Layer (Weeks 4-6)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P3-1 | Port shell/docker/file/env guarded executors from sysadmin patterns | 5d | P1-5 | Security unit tests pass |
| P3-2 | Implement tool registry loader + SECRET_REF resolver | 3d | P1-1,P3-1 | Tool calls run without raw secret exposure |
| P3-3 | Implement core adapters (Chatwoot, Ghost, Nextcloud, Cal.com, Odoo, Listmonk) | 6d | P3-2 | Adapter contract tests pass |
| P3-4 | Implement metering capture and hourly bucket compaction | 2d | P1-5,P2-4 | Buckets reliably posted to Hub |
| P3-5 | Add subagent budget/depth limits and policy enforcement | 2d | P1-5 | Policy tests and abuse tests pass |

## Phase 4: Provisioner Retool (Weeks 5-7)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P4-1 | Add OpenClaw + Safety deployment steps to provisioner | 4d | P3-2 | Fresh VPS comes online with heartbeat |
| P4-2 | Remove legacy stack templates and nginx configs from default deployment path | 2d | P0-2 | Deprecated stacks excluded from installs |
| P4-3 | Generate and deploy tenant configs/policies during provisioning | 3d | P2-2,P4-1 | Config sync succeeds on first boot |
| P4-4 | Migrate initial browser setup scenarios to OpenClaw browser tool | 4d | P4-1 | 8 scenarios replaced or retired |
| P4-5 | Add idempotent recovery checkpoints per provisioning step | 2d | P4-1 | Retry from failed step validated |

## Phase 5: Customer Interfaces (Weeks 6-9)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P5-1 | Customer web portal for approvals, agent settings, usage | 5d | P2-3,P2-4 | Beta usable on staging |
| P5-2 | Mobile app MVP (chat, approvals, health, usage) | 8d | P2-5,P5-1 | TestFlight/internal distribution ready |
| P5-3 | Public onboarding website + classifier + bundle calculator | 6d | P2-1 | Stripe flow works end-to-end |
| P5-4 | WhatsApp/Telegram fallback relay (minimal) | 3d | P2-3 | Approval fallback path works |

## Phase 6: Workflow Templates + Demo Experience (Weeks 8-10)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P6-1 | Implement 4 first-hour workflow templates as auditable blueprints | 5d | P3-3,P5-1 | Templates executable end-to-end |
| P6-2 | Build interactive demo tenant pool manager (TTL snapshots) | 4d | P4-1,P5-3 | Demo session provisioning <5 min |
| P6-3 | Add product telemetry for template completion and demo conversion | 2d | P6-1,P6-2 | Metrics dashboards live |

## Phase 7: Quality, Hardening, Launch (Weeks 10-12)

| ID | Task | Duration | Depends On | Exit Criteria |
|---|---|---:|---|---|
| P7-1 | Full security test suite (redaction, gating, injection, auth) | 4d | P3-5,P4-5 | Critical findings resolved |
| P7-2 | Load, soak, and chaos tests on staging fleet | 3d | P6-1 | SLO gates met |
| P7-3 | Canary launch (5% -> 25% -> 100%) with rollback drills | 4d | P7-1,P7-2 | Canary metrics stable |
| P7-4 | Launch readiness review + runbook finalization | 2d | P7-3 | Founding member launch sign-off |

## 3. Dependency Graph

```mermaid
graph TD
  P0_1[P0-1 n8n cleanup] --> P0_2[P0-2 deprecated deploy removal]
  P0_1 --> P0_3[P0-3 plaintext secret fix]
  P0_2 --> P0_4[P0-4 baseline security tests]
  P0_3 --> P0_4

  P0_4 --> P1_1[P1-1 vault]
  P1_1 --> P1_2[P1-2 egress proxy]
  P1_1 --> P1_3[P1-3 classification]
  P1_3 --> P1_4[P1-4 approval cache]
  P1_2 --> P1_5[P1-5 openclaw plugin skeleton]
  P1_3 --> P1_5

  P0_4 --> P2_1[P2-1 hub prisma models]
  P2_1 --> P2_2[P2-2 tenant register/heartbeat/config]
  P2_1 --> P2_3[P2-3 approval APIs]
  P2_1 --> P2_4[P2-4 usage ingest]
  P2_3 --> P2_5[P2-5 push notifications]

  P1_5 --> P3_1[P3-1 guarded executors]
  P1_1 --> P3_2[P3-2 tool registry + secret ref]
  P3_1 --> P3_2
  P3_2 --> P3_3[P3-3 tool adapters]
  P1_5 --> P3_4[P3-4 metering]
  P2_4 --> P3_4
  P1_5 --> P3_5[P3-5 subagent controls]

  P3_2 --> P4_1[P4-1 provisioner openclaw+safety]
  P0_2 --> P4_2[P4-2 legacy stack template removal]
  P2_2 --> P4_3[P4-3 config generation]
  P4_1 --> P4_3
  P4_1 --> P4_4[P4-4 browser scenario migration]
  P4_1 --> P4_5[P4-5 idempotent checkpoints]

  P2_3 --> P5_1[P5-1 customer portal]
  P2_4 --> P5_1
  P2_5 --> P5_2[P5-2 mobile app MVP]
  P5_1 --> P5_2
  P2_1 --> P5_3[P5-3 onboarding website]
  P2_3 --> P5_4[P5-4 whatsapp/telegram fallback]

  P3_3 --> P6_1[P6-1 first-hour templates]
  P5_1 --> P6_1
  P4_1 --> P6_2[P6-2 interactive demo pool]
  P5_3 --> P6_2
  P6_1 --> P6_3[P6-3 template/demo telemetry]
  P6_2 --> P6_3

  P3_5 --> P7_1[P7-1 full security suite]
  P4_5 --> P7_1
  P6_1 --> P7_2[P7-2 load/soak/chaos]
  P7_1 --> P7_3[P7-3 canary launch]
  P7_2 --> P7_3
  P7_3 --> P7_4[P7-4 launch readiness]
```

## 4. Critical Path

Primary critical chain:

`P0 cleanup -> P1 safety substrate -> P3 execution layer -> P4 provisioner retool -> P7 hardening/canary`

Secondary critical chain:

`P2 Hub APIs -> P5 mobile approvals -> P7 canary`

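The primary chain can be sanity-checked by computing the longest duration path through the task graph. The sketch below is illustrative only: it copies durations and dependencies from the phase tables for a subset of tasks (starting at P0-4 and omitting P7-2), and the helper name `longestChain` is hypothetical.

```typescript
// Durations (working days) and dependencies copied from the phase tables.
// ASSUMPTION: only tasks on or near the primary chain are modeled here.
type Task = { days: number; deps: string[] };
type Chain = { totalDays: number; path: string[] };

const tasks: Record<string, Task> = {
  "P0-4": { days: 1, deps: [] }, // upstream P0-1..P0-3 omitted in this sketch
  "P1-1": { days: 3, deps: ["P0-4"] },
  "P1-2": { days: 4, deps: ["P1-1"] },
  "P1-3": { days: 3, deps: ["P1-1"] },
  "P1-5": { days: 3, deps: ["P1-2", "P1-3"] },
  "P3-1": { days: 5, deps: ["P1-5"] },
  "P3-2": { days: 3, deps: ["P1-1", "P3-1"] },
  "P3-5": { days: 2, deps: ["P1-5"] },
  "P4-1": { days: 4, deps: ["P3-2"] },
  "P4-5": { days: 2, deps: ["P4-1"] },
  "P7-1": { days: 4, deps: ["P3-5", "P4-5"] },
  "P7-3": { days: 4, deps: ["P7-1"] }, // P7-2 dependency omitted
  "P7-4": { days: 2, deps: ["P7-3"] },
};

// Longest chain ending at a task = its duration + the longest chain among its deps.
function longestChain(graph: Record<string, Task>): Chain {
  const memo = new Map<string, Chain>();

  const visit = (id: string): Chain => {
    const cached = memo.get(id);
    if (cached) return cached;
    const task = graph[id];
    if (!task) throw new Error(`unknown task ${id}`);
    let best: Chain = { totalDays: 0, path: [] };
    for (const dep of task.deps) {
      const candidate = visit(dep);
      if (candidate.totalDays > best.totalDays) best = candidate;
    }
    const result: Chain = { totalDays: best.totalDays + task.days, path: [...best.path, id] };
    memo.set(id, result);
    return result;
  };

  let overall: Chain = { totalDays: 0, path: [] };
  for (const id of Object.keys(graph)) {
    const chain = visit(id);
    if (chain.totalDays > overall.totalDays) overall = chain;
  }
  return overall;
}

const critical = longestChain(tasks);
console.log(`${critical.totalDays} working days: ${critical.path.join(" -> ")}`);
```

For this subset the longest chain sums to 35 working days (about 7 weeks of strictly sequential work), which leaves limited slack inside a 12-week window once Phase 0 and P7-2 are added.
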
## 5. Parallelization Strategy

To meet the 12-week target, run these tracks in parallel after Week 3:

- Track A: Safety Wrapper + adapters (P3)
- Track B: Provisioner retool (P4)
- Track C: Customer interfaces (P5)

## 6. Definition Of Done (Program-Level)

Launch gate passes only when all are true:

- secrets-never-leave-server invariant passes automated red-team test suite
- gating matrix works exactly for all 5 command classes and 3 autonomy levels
- external comms gate enforces lock-by-default at all autonomy levels
- provisioning succeeds >=90% first attempt and >=99% with retries
- approval path works across web + mobile push with audit completeness
- usage metering reconciles with provider usage within 1% variance

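The two gating bullets can be made concrete as a small decision function. This is a sketch under stated assumptions: the threshold table `autoExecuteFrom` is a placeholder and not the proposal's real gating matrix, and the names `decide` and `matrixFor` are hypothetical. What it does illustrate is the structural point that the external comms gate is evaluated without reading the autonomy level.

```typescript
type CommandClass = "green" | "yellow" | "yellow_external" | "red" | "critical";
type AutonomyLevel = 1 | 2 | 3;
type Decision = "execute" | "needs_approval";

// ASSUMPTION: placeholder thresholds for illustration only. Each entry is the
// lowest autonomy level at which the class runs without approval (null = never).
const autoExecuteFrom: Record<CommandClass, AutonomyLevel | null> = {
  green: 1,
  yellow: 2,
  yellow_external: 2,
  red: 3,
  critical: null,
};

function decide(
  commandClass: CommandClass,
  autonomy: AutonomyLevel,
  tool: string,
  unlockedCommsTools: ReadonlySet<string>, // empty set = locked by default
): Decision {
  // The external comms gate is checked first and never reads the autonomy level.
  if (commandClass === "yellow_external" && !unlockedCommsTools.has(tool)) {
    return "needs_approval";
  }
  const threshold = autoExecuteFrom[commandClass];
  return threshold !== null && autonomy >= threshold ? "execute" : "needs_approval";
}

// Enumerate all 5 x 3 combinations for one tool, as a matrix test would.
const classes: CommandClass[] = ["green", "yellow", "yellow_external", "red", "critical"];
const levels: AutonomyLevel[] = [1, 2, 3];

function matrixFor(tool: string, unlocked: ReadonlySet<string>): string[] {
  const rows: string[] = [];
  for (const c of classes) {
    for (const l of levels) {
      rows.push(`${c}@${l}=${decide(c, l, tool, unlocked)}`);
    }
  }
  return rows;
}

console.log(matrixFor("listmonk", new Set<string>()).join("\n"));
```

Keeping the comms check ahead of the autonomy lookup means raising a tenant's autonomy level cannot unlock outbound communication as a side effect.
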
75
docs/architecture-proposal/gpt/05-estimated-timelines.md
Normal file
@@ -0,0 +1,75 @@
# 05. Estimated Timelines

## 1. Date Anchors

- Planning baseline date: **Thursday, February 26, 2026**
- Proposed execution start: **Monday, March 2, 2026**
- 12-week target launch window end: **Sunday, May 24, 2026**
- Recommended contingency buffer: **May 25-31, 2026**

## 2. Timeline Summary

| Phase | Dates | Duration | Confidence |
|---|---|---:|---|
| Phase 0 prerequisites | Mar 2 - Mar 8 | 1 week | High |
| Phase 1 safety substrate | Mar 9 - Mar 22 | 2 weeks | Medium |
| Phase 2 Hub APIs/models | Mar 16 - Mar 29 | 2 weeks (overlap) | High |
| Phase 3 wrapper execution layer | Mar 23 - Apr 12 | 3 weeks | Medium |
| Phase 4 provisioner retool | Mar 30 - Apr 19 | 3 weeks (overlap) | Medium |
| Phase 5 mobile + website + portal | Apr 6 - May 3 | 4 weeks | Medium |
| Phase 6 templates + demo | Apr 27 - May 10 | 2 weeks | Medium |
| Phase 7 hardening + canary + launch | May 4 - May 24 | 3 weeks | Medium-Low |

## 3. Milestones

| Milestone | Target Date | Exit Condition |
|---|---|---|
| M1: Cleanup gate passed | Mar 8, 2026 | n8n and deprecated deploy paths removed; plaintext secret leak fixed |
| M2: Security substrate alpha | Mar 22, 2026 | redaction proxy + classifier + plugin skeleton integrated |
| M3: Hub tenant APIs beta | Mar 29, 2026 | register/heartbeat/approval/usage contracts stable |
| M4: First full tenant provision | Apr 12, 2026 | new VPS boots with OpenClaw + Safety + heartbeat |
| M5: Customer interface beta | May 3, 2026 | web portal + mobile approvals + onboarding flow functional |
| M6: Launch candidate | May 17, 2026 | full security/perf test pass; canary starts |
| M7: Founding member launch | May 24, 2026 | canary complete; runbooks and rollback drills signed off |

## 4. Weekly View (Condensed)

```text
Week 1 (Mar 2) : Phase 0 prerequisite cleanup
Week 2-3       : Phase 1 safety substrate begins
Week 3-4       : Phase 2 Hub API/data model work (parallel)
Week 4-6       : Phase 3 wrapper execution + adapters
Week 5-7       : Phase 4 provisioner retool and browser migration
Week 6-9       : Phase 5 customer portal/mobile/website
Week 9-10      : Phase 6 templates + interactive demo
Week 10-12     : Phase 7 hardening, canary rollout, launch
Buffer Week    : May 25-31 contingency
```

## 5. Critical Timeline Risks

| Risk | Schedule Impact If Realized |
|---|---|
| OpenClaw hook behavior drift or undocumented edge cases | +1 to +2 weeks |
| Provisioner migration instability on fresh VPS images | +1 week |
| Mobile push approval reliability issues (iOS/Android differences) | +0.5 to +1 week |
| Token billing reconciliation defects with Stripe meter events | +1 week |
| Security findings in redaction/gating late in cycle | +1 to +3 weeks |

## 6. Confidence Ranges

| Scenario | Launch Window |
|---|---|
| Optimistic | May 17-24, 2026 |
| Most likely | May 24-31, 2026 |
| Conservative | June 7-14, 2026 |

## 7. Scope Compression Options (If Needed)

To preserve security and still launch by May 24-31, de-scope in this order:

1. Delay WhatsApp/Telegram fallback to post-launch.
2. Limit the initial tool adapter set to the top 8 tools by usage; keep the others on browser fallback.
3. Ship 3 first-hour templates at launch and add the 4th in the first patch.

Do **not** cut redaction, gating, approval, or metering correctness work.
73
docs/architecture-proposal/gpt/06-risk-assessment.md
Normal file
@@ -0,0 +1,73 @@
# 06. Risk Assessment

## 1. Risk Scoring Method

- Probability: 1 (low) to 5 (high)
- Impact: 1 (low) to 5 (high)
- Risk score = Probability x Impact

## 2. Top Risks

| ID | Risk | Prob | Impact | Score | Mitigation | Contingency Trigger |
|---|---|---:|---:|---:|---|---|
| R1 | Secret exfiltration via unredacted outbound payload | 3 | 5 | 15 | Multi-layer redaction tests, egress deny-by-default policy, seeded canary secrets | Any unredacted canary secret seen outside tenant |
| R2 | Command gating bypass due to misclassification | 3 | 5 | 15 | Deterministic policy engine, contract tests per class, human-readable reason logging | Red/Critical executes without approval in tests |
| R3 | OpenClaw upstream changes break plugin behavior | 3 | 4 | 12 | Pin stable tags, adapter compatibility suite, staged upgrade canaries | Hook contract test fails against new tag |
| R4 | Provisioner regressions reduce provisioning success | 4 | 4 | 16 | Idempotent checkpoints, replay tests, synthetic VPS CI | First-attempt success < 90% |
| R5 | Billing usage mismatch vs provider costs | 3 | 4 | 12 | Dual-entry usage checks, nightly reconciliation jobs, alert thresholds | >1% sustained variance for 24h |
| R6 | Mobile approval notification delays/drop | 3 | 3 | 9 | Push retries + in-app queue fallback + email fallback | p95 approval notify > 30s |
| R7 | Performance overhead exceeds Lite-tier budget | 2 | 4 | 8 | Memory profiling budget gates, disable non-essential plugins, tune browser lifecycle | LetsBe overhead > 800MB sustained |
| R8 | Tool API churn breaks adapters | 4 | 3 | 12 | Adapter integration tests against pinned versions, fallback to browser playbook | Adapter failure rate > 5% |
| R9 | Security debt from AI-generated code quality | 4 | 4 | 16 | Mandatory senior review on security modules, lint rules, banned patterns checks | Critical static-analysis finding unresolved >48h |
| R10 | Legal/compliance drift (license/source disclosure pages) | 2 | 4 | 8 | Automated license manifest publishing, pre-release legal checklist | Missing OSS disclosure page at RC freeze |

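As a worked example of the scoring method, the sketch below recomputes Score = Probability x Impact for the ten risks in the table and ranks them. The probability and impact values are copied from the table; the type and variable names are illustrative.

```typescript
type Risk = { id: string; probability: number; impact: number };

// Probability and impact values copied from the Top Risks table.
const risks: Risk[] = [
  { id: "R1", probability: 3, impact: 5 },
  { id: "R2", probability: 3, impact: 5 },
  { id: "R3", probability: 3, impact: 4 },
  { id: "R4", probability: 4, impact: 4 },
  { id: "R5", probability: 3, impact: 4 },
  { id: "R6", probability: 3, impact: 3 },
  { id: "R7", probability: 2, impact: 4 },
  { id: "R8", probability: 4, impact: 3 },
  { id: "R9", probability: 4, impact: 4 },
  { id: "R10", probability: 2, impact: 4 },
];

// Score each risk, then sort by score descending.
// Array.prototype.sort is stable, so ties keep their table order.
const ranked = risks
  .map((r) => ({ ...r, score: r.probability * r.impact }))
  .sort((a, b) => b.score - a.score);

for (const r of ranked) {
  console.log(`${r.id}: ${r.score}`);
}
```

R4 and R9 lead at 16, followed by R1 and R2 at 15, which matches the emphasis on provisioner stability and security review elsewhere in the plan.
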
## 3. Risk Register By Domain

## 3.1 Security Risks

- Redaction misses non-standard secret formats.
- External comms gate incorrectly tied to autonomy level.
- Local logs/transcripts persist raw secret material.
- Local execution adapters allow shell metacharacter bypass.

## 3.2 Delivery Risks

- Too much simultaneous change across Hub + provisioner + tenant runtime.
- Underestimated migration effort from deprecated orchestrator/sysadmin behaviors.
- Browser automation migration complexity for setup scripts.

## 3.3 Operational Risks

- Dual-region Hub operations increase DB and deploy complexity.
- Insufficient on-call runbooks for approval outages and provisioning failures.
- Canary rollout without automated rollback criteria.

## 4. Mitigation Program

## 4.1 Pre-Launch Controls

- Security invariants are encoded as executable tests (not checklist-only).
- Every release candidate must pass redaction canary probes.
- Dry-run provisioning must pass on both Netcup and Hetzner targets.

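A redaction canary probe can be expressed as an executable check along the following lines. This is a minimal sketch covering only a registry-based layer; the `[REDACTED:NAME]` placeholder format, the canary values, and the function names are assumptions made for illustration and are not taken from the proposal.

```typescript
// ASSUMPTION: hypothetical seeded canary secrets (name -> value).
const seededCanaries = new Map<string, string>([
  ["DB_PASSWORD", "canary-7f3a9c2e51"],
  ["SMTP_TOKEN", "canary-b81d04e6aa"],
]);

// Registry layer only: replace every known secret value with a named placeholder.
function redact(payload: string, registry: ReadonlyMap<string, string>): string {
  let out = payload;
  registry.forEach((value, name) => {
    out = out.split(value).join(`[REDACTED:${name}]`);
  });
  return out;
}

// Probe: list the names of any seeded secrets still present in a payload.
function leakedSecrets(payload: string, registry: ReadonlyMap<string, string>): string[] {
  const leaked: string[] = [];
  registry.forEach((value, name) => {
    if (payload.includes(value)) leaked.push(name);
  });
  return leaked;
}

const rawPrompt = "connect using canary-7f3a9c2e51 and send via canary-b81d04e6aa";
const outbound = redact(rawPrompt, seededCanaries);
console.log(outbound);
console.log(leakedSecrets(outbound, seededCanaries)); // an empty list means the probe passes
```

A release gate would run `leakedSecrets` against every captured outbound payload and persisted log line, failing the candidate if any name is returned.
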
## 4.2 Runtime Controls

- Alert on heartbeat freshness degradation.
- Alert on approval queue lag and expiration spikes.
- Alert on sudden drop in cache-read ratio (cost anomaly indicator).

## 4.3 Governance Controls

- Security design review required for changes in Safety Wrapper, redaction, or secrets flows.
- Migration freeze on deprecated paths after Phase 0.
- Weekly risk review with updated probability/impact re-scoring.

## 5. Launch Go/No-Go Risk Gates

No launch if any condition is true:

- unresolved severity-1 security defect
- redaction tests fail for any supported secret class
- command gating matrix not fully passing
- usage reconciliation error >1% over 72h canary
- provisioning first-attempt success below 85% in final week
111
docs/architecture-proposal/gpt/07-testing-strategy.md
Normal file
@@ -0,0 +1,111 @@
# 07. Testing Strategy Proposal

## 1. Testing Principles

- Security-critical behavior is verified with invariant tests, not only unit coverage.
- Contract-first testing between Hub, Safety Wrapper, Mobile, Website, and Provisioner.
- Fast feedback in CI, deep verification in staging and nightly runs.
- AI-generated code receives stricter review and mutation testing on critical paths.

## 2. Test Pyramid By Component

| Layer | Hub | Safety Wrapper | Provisioner | Mobile/Website |
|---|---|---|---|---|
| Unit | services, validators, policy logic | classifier, redactor, secret resolver, adapters | parser/utils/template render | UI logic, state stores, hooks |
| Integration | Prisma + API handlers + auth | plugin hooks vs OpenClaw test harness | SSH runner against disposable VM | API integration against mock Hub |
| End-to-end | full order/provision/approval/billing flow | tenant command execution path | full 10-step provisioning with checkpoints | chat/approval/onboarding user journeys |
| Security | authz, rate-limit, session hardening | secret exfil tests, gating bypass tests | credential leakage scans | token storage, deep link auth |
| Performance | API p95 and DB load | per-turn latency overhead, memory usage | provisioning duration and retry cost | startup latency, push receipt latency |

## 3. Mandatory Security Invariant Suite

The following automated tests are required before each release:

1. **Secrets Never Leave Server Test**
   - Seed known secrets in vault and files.
   - Trigger prompts/tool outputs containing these values.
   - Assert outbound payloads and persisted logs contain only placeholders.

2. **Command Classification Matrix Test**
   - Execute fixtures for each command class (Green/Yellow/Yellow+External/Red/Critical).
   - Validate behavior across autonomy levels 1-3.

3. **External Comms Independence Test**
   - At autonomy level 3, external action remains blocked when comms gate locked.
   - Unlock only targeted tool; validate others remain blocked.

4. **Approval Expiry Test**
   - Approval request expires at 24h.
   - Late approval cannot be replayed.

5. **SECRET_REF Boundary Test**
   - Secrets cannot be requested directly by raw name/value.
   - Only valid references in allowlisted tool operations resolve.

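The Approval Expiry Test in item 4 could target logic shaped like the sketch below. The 24-hour window comes from the test description; the `ApprovalRequest` shape, the status names, and the treatment of exactly 24h as expired are assumptions for illustration.

```typescript
const APPROVAL_TTL_MS = 24 * 60 * 60 * 1000; // 24h expiry from the test description

type ApprovalStatus = "pending" | "approved" | "expired";
// ASSUMPTION: hypothetical record shape; the real schema lives in the Hub models.
type ApprovalRequest = { id: string; createdAtMs: number; status: ApprovalStatus };

// Attempt to approve a request at time nowMs.
function applyApproval(req: ApprovalRequest, nowMs: number): ApprovalRequest {
  // Approved and expired are terminal states, so a late approval cannot be replayed.
  if (req.status !== "pending") return req;
  if (nowMs - req.createdAtMs >= APPROVAL_TTL_MS) return { ...req, status: "expired" };
  return { ...req, status: "approved" };
}

const HOUR_MS = 60 * 60 * 1000;
const request: ApprovalRequest = { id: "apr-1", createdAtMs: 0, status: "pending" };

console.log(applyApproval(request, 23 * HOUR_MS).status); // approved inside the window
const late = applyApproval(request, 24 * HOUR_MS);
console.log(late.status); // expired at the 24h mark
console.log(applyApproval(late, 1 * HOUR_MS).status); // still expired: no replay
```

Making the transition function pure (time passed in as an argument) keeps the expiry test deterministic without mocking the system clock.
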
## 4. Provisioning Test Strategy

## 4.1 Fast Checks

- Shellcheck + static checks for bash scripts.
- Template substitution tests (all placeholders resolved, none leaked).
- Stack inventory policy tests (no banned tools like n8n).

## 4.2 Disposable VPS E2E

Nightly automated runs:

- create disposable Debian VPS
- run full provisioning
- run smoke checks on selected tool endpoints
- verify tenant registration + heartbeat + approvals
- tear down VPS and collect artifacts

## 5. Contract Testing

- OpenAPI specs for Hub APIs and tenant APIs.
- Consumer-driven contract tests for:
  - Safety Wrapper against Hub tenant endpoints
  - Mobile app against customer endpoints
  - Website onboarding against public endpoints
- Contract break blocks merge.

## 6. Data And Billing Validation

- Synthetic token event generator with known totals.
- Reconcile tenant usage buckets against Hub aggregated totals.
- Reconcile Hub totals against Stripe meter/invoice preview.
- Fail build if variance exceeds threshold.

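The two reconciliation steps above reduce to a relative-variance check at each hop. The sketch below assumes the 1% threshold used elsewhere in the proposal and treats the downstream total as the reference value; the function names and sample numbers are illustrative.

```typescript
const MAX_VARIANCE = 0.01; // 1% threshold, as used in the launch gates

// Relative difference of a measured total against a reference total.
function relativeVariance(measured: number, reference: number): number {
  if (reference === 0) return measured === 0 ? 0 : Number.POSITIVE_INFINITY;
  return Math.abs(measured - reference) / Math.abs(reference);
}

type Reconciliation = { tenantVsHub: number; hubVsProvider: number; pass: boolean };

// Hop 1: tenant usage buckets vs Hub aggregate. Hop 2: Hub aggregate vs provider/Stripe total.
function reconcile(tenantBuckets: number[], hubTotal: number, providerTotal: number): Reconciliation {
  const tenantTotal = tenantBuckets.reduce((sum, n) => sum + n, 0);
  const tenantVsHub = relativeVariance(tenantTotal, hubTotal);
  const hubVsProvider = relativeVariance(hubTotal, providerTotal);
  return {
    tenantVsHub,
    hubVsProvider,
    pass: tenantVsHub <= MAX_VARIANCE && hubVsProvider <= MAX_VARIANCE,
  };
}

// Synthetic token events with a known total of 1000.
console.log(reconcile([400, 350, 250], 1000, 1005)); // about 0.5% off at the second hop: passes
console.log(reconcile([400, 350, 250], 1000, 1020)); // about 2% off at the second hop: fails
```

Reporting both hop variances separately, rather than a single end-to-end figure, makes it clearer whether drift originates in tenant bucket compaction or in the Hub-to-billing path.
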
## 7. Quality Gates (CI)

- Unit + integration tests must pass.
- Security invariants must pass.
- Critical package diff review for Safety Wrapper and Provisioner.
- Minimum thresholds:
  - security-critical modules: >=90% branch coverage
  - overall backend: >=75% branch coverage
- Mutation testing on classifier and redactor modules.

## 8. Human Review Workflow (Anti-AI-Slop)

Required for security-critical PRs:

- one reviewer validates threat model assumptions
- one reviewer validates test completeness and failure cases
- checklist includes: error paths, rollback behavior, idempotency, logging hygiene

No direct auto-merge for changes in:

- redaction engine
- command classifier
- secret storage/resolution
- provisioning credential handling

## 9. Launch Validation Checklist

Before founding-member launch:

- 7-day staging soak with no sev-1/2 defects
- two successful rollback drills (control plane and tenant plane)
- production canary with live approval + billing reconciliation
- first-hour templates executed successfully on staging tenants
128
docs/architecture-proposal/gpt/08-cicd-strategy-gitea.md
Normal file
@@ -0,0 +1,128 @@
# 08. CI/CD Strategy (Gitea-Based)

## 1. Objectives

- Keep release cadence high without bypassing security checks.
- Provide deterministic, reproducible artifacts for Hub, Safety components, and Provisioner.
- Enforce policy gates (security invariants, banned tools, contract compatibility) in CI.

## 2. Platform Baseline

- CI engine: **Gitea Actions** with self-hosted **act_runner**.
- Artifact registry: private container registry (`code.letsbe.solutions/...`).
- Deployment target:
  - Control plane: Docker hosts (EU + US)
  - Tenant plane: provisioner-managed customer VPS rollout jobs

## 3. Branch And Release Model

- `main`: releasable at all times.
- short-lived feature branches.
- release tags: `hub/vX.Y.Z`, `safety/vX.Y.Z`, `provisioner/vX.Y.Z`.
- hotfix branch only for production incidents, merged back to `main` immediately.

## 4. Pipeline Stages

## 4.1 Pull Request Pipeline

1. `lint-typecheck`
2. `unit-tests`
3. `integration-tests`
4. `contract-tests`
5. `security-scan` (SAST, dependency vulnerabilities, secret scan)
6. `policy-checks`:
   - banned stack/reference detector (`n8n`, deprecated deploy targets)
   - no plaintext credentials in artifacts/config
7. `build-preview-images`

## 4.2 Main Branch Pipeline

1. re-run all PR checks
2. build immutable release images
3. generate SBOMs
4. image signing (cosign/sigstore-compatible)
5. push to registry with digest pins
6. deploy to `dev` automatically

## 4.3 Promotion Pipelines

- `promote-staging`: manual approval gate + smoke tests
- `promote-prod-eu`: manual approval + canary checks
- `promote-prod-us`: separate manual gate after EU health confirmation

## 5. Tenant Rollout Pipeline

Separate workflow for tenant-plane updates:

- policy-only rollout job
- wrapper package rollout job
- OpenClaw version rollout campaign

Rollout controller enforces:

- canary percentages
- halt thresholds
- automated rollback trigger execution

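The rollout controller's three responsibilities can be sketched as a single pure decision step. The stage percentages follow the 5% -> 25% -> 100% canary sequence from task P7-3; the halt thresholds, the metric names, and the `nextAction` function are hypothetical, since the proposal does not define them.

```typescript
// Canary stages from task P7-3 (5% -> 25% -> 100%).
const STAGES: readonly number[] = [5, 25, 100];

type StageMetrics = { errorRate: number; staleHeartbeatRatio: number };

// ASSUMPTION: hypothetical halt thresholds; the proposal does not specify values.
const HALT = { maxErrorRate: 0.02, maxStaleHeartbeatRatio: 0.05 };

type Action =
  | { kind: "promote"; nextPercent: number }
  | { kind: "complete" }
  | { kind: "rollback"; reason: string };

// Decide what to do after observing metrics at the current canary percentage.
function nextAction(currentPercent: number, m: StageMetrics): Action {
  const index = STAGES.indexOf(currentPercent);
  if (index === -1) return { kind: "rollback", reason: `unknown stage ${currentPercent}%` };
  if (m.errorRate > HALT.maxErrorRate) {
    return { kind: "rollback", reason: "error rate above halt threshold" };
  }
  if (m.staleHeartbeatRatio > HALT.maxStaleHeartbeatRatio) {
    return { kind: "rollback", reason: "stale heartbeats above halt threshold" };
  }
  const next = STAGES[index + 1];
  return next === undefined ? { kind: "complete" } : { kind: "promote", nextPercent: next };
}

const healthy: StageMetrics = { errorRate: 0.001, staleHeartbeatRatio: 0.0 };
console.log(nextAction(5, healthy));
console.log(nextAction(25, { errorRate: 0.1, staleHeartbeatRatio: 0.0 }));
console.log(nextAction(100, healthy));
```

Because the function returns a rollback action rather than performing one, the same logic can be exercised in rollback drills without touching tenant VPS hosts.
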
## 6. Required Checks Per Package

| Package | Required Jobs |
|---|---|
| Hub | lint, unit, integration, Prisma migration check, API contract tests |
| Safety Wrapper | unit, hook integration (OpenClaw pinned tag), redaction/gating invariants |
| Egress Proxy | redaction corpus tests, outbound policy tests, perf checks |
| Provisioner | shellcheck, template checks, disposable VPS smoke run |
| Mobile | typecheck, unit/UI tests, API contract tests, build verification |
| Website | lint/typecheck, onboarding flow tests, pricing/quote tests |

## 7. Example Gitea Workflow Skeleton

```yaml
name: pr-checks
on: [pull_request]

jobs:
  lint-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pnpm install --frozen-lockfile
      - run: pnpm lint && pnpm typecheck
      - run: pnpm test:unit

  security-policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Each job starts from a clean workspace, so dependencies are installed again here.
      - run: pnpm install --frozen-lockfile
      - run: pnpm test:security-invariants
      - run: ./scripts/ci/check-banned-references.sh
      - run: ./scripts/ci/check-no-plaintext-secrets.sh
```

## 8. Secrets And Runner Security

- Gitea secrets scoped by environment (`dev/staging/prod`).
- Runner hosts are isolated and ephemeral where possible.
- No production credentials in PR jobs.
- OIDC-based short-lived cloud/provider credentials preferred over long-lived static tokens.

## 9. Change Management Gates

Security-critical paths require an extra gate:

- files under `safety-wrapper/`, `egress-proxy/`, `provisioner/scripts/credentials*`
- mandatory 2 reviewers
- security test suite pass required
- no force-merge override

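The path-based gate could be enforced by a small check over a PR's changed files, roughly as follows. The path patterns mirror the three entries listed above; the regular expressions, the default of one reviewer for all other changes, and the function name are assumptions for this sketch.

```typescript
// Patterns for the security-critical paths named in this section.
// ASSUMPTION: matching is done on repo-relative paths with "/" separators.
const SECURITY_CRITICAL: RegExp[] = [
  /(^|\/)safety-wrapper\//,
  /(^|\/)egress-proxy\//,
  /(^|\/)provisioner\/scripts\/credentials/,
];

// Return how many approving reviewers a change set needs before merge.
function requiredReviewers(changedFiles: string[]): number {
  const touchesCritical = changedFiles.some((file) =>
    SECURITY_CRITICAL.some((pattern) => pattern.test(file)),
  );
  // ASSUMPTION: everything outside the critical paths defaults to one reviewer.
  return touchesCritical ? 2 : 1;
}

console.log(requiredReviewers(["services/safety-wrapper/src/hooks.ts"])); // 2
console.log(requiredReviewers(["apps/website/pages/index.tsx"])); // 1
console.log(requiredReviewers(["services/provisioner/scripts/credentials-rotate.sh"])); // 2
```

Running this as a CI job (in addition to CODEOWNERS) gives an auditable failure message when a security-critical change lacks its second approval.
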
## 10. Metrics For CI/CD Quality

Track weekly:

- median PR cycle time
- flaky test rate
- change failure rate
- mean time to rollback
- canary abort count

Use these metrics in the weekly engineering ops review to keep the speed/quality balance aligned with the launch target.
@@ -0,0 +1,105 @@
# 09. Repository Structure Proposal

## 1. Decision

**Choose: Monorepo for LetsBe first-party code, with OpenClaw kept as a separate pinned upstream dependency.**

This is the best speed/quality tradeoff for a 3-month launch while preserving the non-fork requirement.

## 2. Why This Over Multi-Repo

## 2.1 Benefits

- Shared TypeScript contracts across Hub, Mobile, Website, and Safety services.
- One CI graph with selective test execution and consistent policy checks.
- Easier cross-cutting refactors (API shape changes, auth, telemetry schema updates).
- Better fit for AI-assisted coding workflows where context continuity matters.

## 2.2 Risks

- Larger repo and CI complexity.
- Migration effort from existing repo layout.

## 2.3 Mitigations

- Use path-based CI execution and build caching.
- Keep OpenClaw external to avoid massive vendor code in monorepo.
- Execute migration in controlled steps with history-preserving imports.

## 3. Proposed Structure

```text
letsbe-platform/
  apps/
    hub/                    # Next.js admin + customer portal + APIs
    website/                # public onboarding and marketing app
    mobile/                 # React Native + Expo
  services/
    safety-wrapper/         # OpenClaw plugin package
    egress-proxy/           # LLM redaction proxy
    provisioner/            # provisioning controller + scripts/templates
  packages/
    api-contracts/          # OpenAPI specs + TS SDKs
    policy-engine/          # shared classification and gate logic
    tooling-sdk/            # adapter framework + SECRET_REF utilities
    ui-kit/                 # shared design components (web/mobile where possible)
    config/                 # eslint/tsconfig/jest/shared tooling
  infra/
    gitea-workflows/
    docker/
    scripts/
  docs/
    architecture-proposal/
    runbooks/
```

## 4. OpenClaw Upstream Strategy (No Fork)

OpenClaw remains outside the monorepo as an independent upstream source:

- Track pinned release tag in `services/safety-wrapper/openclaw-version.lock`.
- CI job pulls pinned OpenClaw version for compatibility tests.
- Upgrade workflow:
  1. open compatibility PR bumping lock file
  2. run hook-contract test suite
  3. run staging canary tenants
  4. promote if green

If a temporary patch is unavoidable, maintain the patch as an isolated overlay with an upstream contribution plan; do not maintain a long-lived fork branch.

## 5. Migration Plan From Current Repos

## 5.1 Current Inputs

- `letsbe-hub`
- `letsbe-ansible-runner`
- `letsbe-orchestrator` (reference only, not migrated as active runtime)
- `letsbe-sysadmin-agent` (reference only, patterns ported into Safety)
- `openclaw` (kept external)

## 5.2 Migration Steps

1. Create monorepo skeleton and shared package manager workspace.
2. Import `letsbe-hub` into `apps/hub` with history.
3. Import `letsbe-ansible-runner` into `services/provisioner`.
4. Create new `services/safety-wrapper` and `services/egress-proxy`.
5. Scaffold `apps/mobile` and `apps/website`.
6. Extract shared contracts from hub into `packages/api-contracts`.
7. Add compatibility adapters so existing deployments continue during transition.
8. Archive deprecated repos as read-only references after cutover.

## 6. Governance Model

- CODEOWNERS by area (`hub`, `safety`, `provisioner`, `mobile`, `website`).
- Required reviewer policy:
  - 2 reviewers for `safety-wrapper`, `egress-proxy`, `provisioner` secrets paths.
  - 1 reviewer for non-security UI changes.
- Architectural Decision Records (ADR) stored under `docs/adr`.

## 7. Alternative Considered: Keep Multi-Repo

Rejected for v1 because cross-repo contract drift is already visible in the current state (legacy APIs, deprecated stacks, stale references). Under a 12-week launch window, the risk from contract drift outweighs the monorepo migration overhead.

## 8. Post-Launch Option

After launch, if team scaling or compliance requirements demand stricter isolation, split out mobile and website into separate repos while preserving shared contract package publication.
@@ -0,0 +1,102 @@
# 10. Technology Validation Sources

Validation date: **2026-02-26**

This proposal uses current official documentation (and release notes where relevant) for each major recommended technology.

## 1. OpenClaw

- Docs home: https://docs.openclaw.ai/
- Plugin development/hooks: https://docs.openclaw.ai/guide/developers/plugins/overview/
- Browser tool docs: https://docs.openclaw.ai/guide/tools/browser/
- OpenClaw GitHub releases/readme: https://github.com/openclawai/openclaw

Used for:

- hook names and plugin lifecycle
- browser capabilities and profile modes
- upstream release/update model

## 2. Next.js

- Official docs: https://nextjs.org/docs
- Release notes: https://nextjs.org/blog

Used for:

- app router patterns
- deployment/runtime guidance
- version-aware migration planning

## 3. Prisma

- Official docs: https://www.prisma.io/docs
- ORM release notes: https://github.com/prisma/prisma/releases

Used for:

- schema/migration guidance
- Prisma Client behavior and deployment practices

## 4. React Native + Expo

- React Native docs: https://reactnative.dev/docs/getting-started
- React Native releases: https://github.com/facebook/react-native/releases
- Expo docs: https://docs.expo.dev/
- Expo SDK changelog: https://expo.dev/changelog

Used for:

- mobile stack decision
- push notification and build pipeline planning

## 5. Flutter (evaluated alternative)

- Flutter docs: https://docs.flutter.dev/
- Flutter releases: https://github.com/flutter/flutter/releases

Used for:

- alternative comparison for mobile stack decision

## 6. Playwright

- Official docs: https://playwright.dev/docs/intro
- Release notes: https://playwright.dev/docs/release-notes

Used for:

- browser automation fallback strategy
- testing and scenario migration approach

## 7. SQLite

- SQLite docs: https://www.sqlite.org/docs.html
- SQLite file format/security references: https://www.sqlite.org/fileformat.html

Used for:

- tenant-local vault, approval cache, and usage bucket storage design

## 8. Stripe
|
||||
|
||||
- Stripe API docs: https://docs.stripe.com/api
|
||||
- Usage-based billing/meter events: https://docs.stripe.com/billing/subscriptions/usage-based
|
||||
|
||||
Used for:
|
||||
- overage billing architecture
|
||||
- usage ingestion and invoice flow design
|
||||
|
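
One way to picture the overage and usage ingestion step is the sketch below, which computes billable overage and builds the parameters for a single meter event without calling Stripe. The field layout (`event_name`, `identifier`, `payload.stripe_customer_id`, `payload.value`) follows the meter-event request as described in the Stripe docs linked above and should be checked against them. The event name `letsbe_overage`, the once-per-period reporting model, and the identifier scheme are assumptions for illustration.

```python
import hashlib
from typing import Optional


def overage_units(used: int, included: int) -> int:
    """Units billable beyond the plan allowance; never negative."""
    return max(0, used - included)


def meter_event_params(
    customer_id: str,
    tenant_id: str,
    period: str,
    used: int,
    included: int,
    event_name: str = "letsbe_overage",  # hypothetical meter name
) -> Optional[dict]:
    """Build request parameters for one overage report, or None if nothing is owed.

    Assumes overage is reported once per tenant per billing period.
    """
    units = overage_units(used, included)
    if units == 0:
        return None
    # Deterministic identifier: retrying the same report yields the same value,
    # so a duplicate submission can be recognized instead of double-counted.
    raw = f"{tenant_id}:{period}:{event_name}".encode()
    identifier = hashlib.sha256(raw).hexdigest()[:32]
    return {
        "event_name": event_name,
        "identifier": identifier,
        "payload": {
            "stripe_customer_id": customer_id,
            "value": str(units),
        },
    }


if __name__ == "__main__":
    print(meter_event_params("cus_example", "tenant-1", "2026-05", 130, 100))
```

Deriving the identifier from the tenant and period, rather than generating a random one per attempt, is aimed at the metering accuracy risk: a retried report carries the same identifier as the original.
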
## 9. Gitea Actions / Act Runner

- Gitea Actions docs: https://docs.gitea.com/usage/actions/overview
- Act runner docs: https://docs.gitea.com/usage/actions/act-runner

Used for:
- CI/CD workflow strategy
- runner security and deployment pipeline design

## 10. Additional Provider References

- Netcup API context (existing integration baseline): https://www.netcup.com/en
- Hetzner Cloud docs (overflow strategy): https://docs.hetzner.cloud/

Used for:
- provider-agnostic provisioning strategy

## 11. Note On Source Priority

For technical decisions, this proposal prioritizes primary official documentation and release notes over secondary summaries.

41
docs/architecture-proposal/gpt/README.md
Normal file
@@ -0,0 +1,41 @@
# LetsBe Biz Architecture Proposal (GPT Team)

Date: 2026-02-26
Author: GPT Architecture Team

This folder contains the complete architecture development plan requested in `docs/technical/LetsBe_Biz_Architecture_Brief.md` Section 1.

## Deliverables Index

0. [00-executive-summary.md](./00-executive-summary.md)
Executive direction and launch gating summary.
1. [01-architecture-and-dataflows.md](./01-architecture-and-dataflows.md)
Architecture document with system diagrams and data flow diagrams.
2. [02-components-and-api-contracts.md](./02-components-and-api-contracts.md)
Component breakdown and API contracts.
3. [03-deployment-strategy.md](./03-deployment-strategy.md)
Deployment strategy for control plane and tenant plane.
4. [04-implementation-plan-and-dependency-graph.md](./04-implementation-plan-and-dependency-graph.md)
Detailed implementation plan, task breakdown, and dependency graph.
5. [05-estimated-timelines.md](./05-estimated-timelines.md)
Estimated timelines and milestone schedule.
6. [06-risk-assessment.md](./06-risk-assessment.md)
Risk assessment and mitigation plan.
7. [07-testing-strategy.md](./07-testing-strategy.md)
Testing strategy proposal.
8. [08-cicd-strategy-gitea.md](./08-cicd-strategy-gitea.md)
Gitea-based CI/CD strategy.
9. [09-repository-structure-proposal.md](./09-repository-structure-proposal.md)
Repository structure proposal and migration plan.
10. [10-technology-validation-sources.md](./10-technology-validation-sources.md)
Current official documentation references used to validate technology choices.

## Executive Direction (One-Page Summary)

- Keep `letsbe-hub` (Next.js + Prisma) and retool it; do not rewrite core backend in v1 launch window.
- Build Safety Wrapper as OpenClaw plugin + local egress secrets proxy; keep OpenClaw upstream and un-forked.
- Remove all `n8n` and deprecated-stack references as a hard prerequisite (Week 1).
- Replace orchestrator/sysadmin responsibilities with explicit Hub↔Safety APIs and local execution adapters.
- Build mobile app with React Native + Expo for speed, push approvals, and shared TypeScript contracts.
- Use monorepo for first-party LetsBe code (Hub, Mobile, Safety services, Provisioner), while consuming OpenClaw as pinned upstream dependency.
- Target 12-week founding-member launch with strict security quality gates, canary rollout, and staged feature hardening.