# 01. Architecture And Data Flows

## 1. Scope And Non-Negotiables

This proposal is designed around the fixed constraints from the Architecture Brief:

- The 4-layer security model is mandatory.
- Secrets must never leave the tenant server.
- 3-tier autonomy plus an external communications gate is mandatory.
- OpenClaw is an upstream dependency (no fork by default).
- One VPS per customer is mandatory.
- `n8n` removal is a prerequisite.

## 2. Proposed Target Architecture

### 2.1 Core Decisions

| Decision | Proposal | Why |
|---|---|---|
| Hub stack | Keep Next.js + Prisma + PostgreSQL | The existing app already has major workflows and 80+ APIs; a rewrite would put the 3-month launch timeline at risk. |
| OpenClaw integration | Use a pinned upstream release, no fork | Maximizes upgrade velocity and avoids merge debt. |
| Safety Wrapper shape | Hybrid: OpenClaw plugin + local egress proxy + local execution adapters | Gives direct hook interception plus a transport-level redaction guarantee. |
| Mobile | React Native + Expo | Fastest path to iOS/Android with TypeScript contract reuse. |
| Website | Separate public web app (same monorepo) + Hub public APIs | Security isolation between public onboarding and the admin/customer portals. |
| Repo strategy | Monorepo for first-party services; OpenClaw kept as a separate upstream repo | Strong contract sharing and CI simplicity without violating the upstream dependency model. |

### 2.2 System Context Diagram

```mermaid
flowchart LR
  subgraph Client[Client Layer]
    M[Mobile App\nReact Native + Expo]
    W[Website\nOnboarding + Checkout]
    C[Customer Portal Web]
    A[Admin Portal Web]
  end

  subgraph Control[Central Platform]
    H[Hub API + UI\nNext.js + Prisma]
    DB[(PostgreSQL)]
    Q[Background Workers\nAutomation + Metering]
    N[Notification Service\nPush/Email]
    ST[Stripe]
    NC[Netcup/Hetzner]
  end

  subgraph Tenant[Per-Customer VPS]
    OC[OpenClaw Gateway\nUpstream]
    SW[Safety Wrapper Plugin\nHooks + Classification]
    SP[LLM Egress Proxy\nSecrets Firewall]
    SV[(Secrets Vault SQLite\nEncrypted)]
    TA[Tool Adapters + Exec Guards]
    TS[(Tool Stacks 25+)]
    AP[(Approval Cache SQLite)]
    TU[(Token Usage Buckets)]
  end

  M --> H
  W --> H
  C --> H
  A --> H

  H --> DB
  H --> Q
  H --> N
  H <--> ST
  H <--> NC

  H <--> OC
  OC --> SW
  SW --> SP
  SP --> LLM[(LLM Providers)]

  SW <--> SV
  SW <--> TA
  TA <--> TS
  SW <--> AP
  SW --> TU
  TU --> H
```

## 3. Tenant Runtime Architecture

### 3.1 4-Layer Security Enforcement

| Layer | Enforcement Point | Implementation |
|---|---|---|
| 1. Sandbox | OpenClaw runtime/tool sandbox settings | OpenClaw native sandbox + process/container isolation. |
| 2. Tool Policy | OpenClaw agent tool allow/deny | Per-agent tool manifest; tools not listed are unreachable. |
| 3. Command Gating | Safety Wrapper `before_tool_call` | Green/Yellow/Yellow+External/Red/Critical Red classification + approval flow. |
| 4. Secrets Redaction | Local egress proxy + transcript hooks | Outbound prompt redaction before network egress, plus log/transcript redaction hooks. |

### 3.2 Safety Wrapper Components

- `classification-engine`: deterministic rules engine with signed policy bundle from Hub.
- `approval-gateway`: sync/async approval requests to Hub, with 24h expiry.
- `secret-ref-resolver`: resolves `SECRET_REF(...)` at execution time only.
- `adapter-runtime`: executes tool API adapters and guarded shell/docker/file actions.
- `metering-collector`: captures per-agent/per-model token usage and aggregates hourly.
- `hub-sync-client`: registration, heartbeat, config pull, backup status, command results.
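
To make the `secret-ref-resolver` idea concrete, here is a minimal sketch of resolving references only at execution time and masking raw values on the way back. The exact reference grammar is not specified in this proposal, so the `SECRET_REF(NAME)` form with a simple identifier, the in-memory `vault` object (a stand-in for the encrypted SQLite vault), and the function names are all assumptions for illustration.

```typescript
// Assumed reference grammar: SECRET_REF(NAME), where NAME is a simple identifier.
const SECRET_REF_PATTERN = /SECRET_REF\(([A-Za-z0-9_.-]+)\)/g;

// Replaces every reference with its vault value just before a tool executes.
// The plain object stands in for the encrypted vault in this sketch.
function resolveSecretRefs(template: string, vault: Record<string, string>): string {
  return template.replace(SECRET_REF_PATTERN, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vault, name)) {
      throw new Error("Unknown secret reference: " + name);
    }
    return vault[name];
  });
}

// Inverse direction: swaps raw secret values back to references before a
// tool result is returned to the agent or written to a transcript.
function maskSecrets(text: string, vault: Record<string, string>): string {
  let masked = text;
  for (const name of Object.keys(vault)) {
    const value = vault[name];
    if (value.length > 0) {
      masked = masked.split(value).join("SECRET_REF(" + name + ")");
    }
  }
  return masked;
}
```

In this shape the agent only ever sees the reference string; the raw value exists in memory for the duration of the adapter call and is masked again in the result.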

### 3.3 OpenClaw Hook Usage (No Fork)

The Safety Wrapper plugin uses upstream hook points for enforcement and observability:

- `before_tool_call`: classify/gate/block/require approval.
- `after_tool_call`: audit capture + normalization.
- `message_sending`: outbound content redaction.
- `before_message_write`, `tool_result_persist`: local persistence redaction.
- `llm_output`: token accounting and per-model usage capture.
- `before_prompt_build`: inject cacheable SOUL/TOOLS prefix metadata.
- `subagent_spawning`: enforce max depth/budget.
- `gateway_start`: health checks + Hub session bootstrap.
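
As an illustration of what a `before_tool_call` handler could delegate to, the sketch below shows a deterministic, first-match-wins classifier over the five tiers named in section 3.1. The `ToolCall` shape, the example rules, the default of Yellow for unmatched calls, and the rule that only Yellow is elevated to Yellow+External are assumptions; the real policy would come from the signed bundle and the real hook payload is defined by OpenClaw.

```typescript
type Tier = "green" | "yellow" | "yellow_external" | "red" | "critical_red";

// Hypothetical view of a tool call; not the actual OpenClaw hook payload.
interface ToolCall {
  tool: string;
  action: string;
  external: boolean; // true when the action sends content outside the tenant
}

interface Rule {
  tool: string;   // exact tool name or "*"
  action: string; // exact action name or "*"
  tier: Tier;
}

// Illustrative rules only; first match wins.
const RULES: Rule[] = [
  { tool: "shell", action: "rm_rf", tier: "critical_red" },
  { tool: "docker", action: "*", tier: "red" },
  { tool: "*", action: "read", tier: "green" },
];

function matches(pattern: string, value: string): boolean {
  return pattern === "*" || pattern === value;
}

// Deterministic: the same call and rule list always produce the same tier.
function classify(call: ToolCall, rules: Rule[] = RULES): Tier {
  let tier: Tier = "yellow"; // assumed default for calls no rule covers
  for (const rule of rules) {
    if (matches(rule.tool, call.tool) && matches(rule.action, call.action)) {
      tier = rule.tier;
      break;
    }
  }
  // External communications gate: assumed to elevate Yellow only.
  return tier === "yellow" && call.external ? "yellow_external" : tier;
}
```

Keeping the classifier a pure function of the call and the rule list makes it straightforward to unit-test each policy bundle before it is signed and shipped to tenants.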

## 4. Primary Data Flows

### 4.1 Signup To Provisioning Flow

```mermaid
sequenceDiagram
  participant User
  participant Site as Website
  participant Hub
  participant Stripe
  participant Worker as Automation Worker
  participant Provider as Netcup/Hetzner
  participant Prov as Provisioner
  participant VPS as Tenant VPS

  User->>Site: Describe business + pick tools
  Site->>Hub: Create onboarding draft
  Site->>Stripe: Checkout session
  Stripe-->>Hub: checkout.session.completed
  Hub->>Worker: Create order (PAYMENT_CONFIRMED)
  Worker->>Provider: Allocate VPS
  Provider-->>Worker: VPS ready (IP + creds)
  Worker->>Hub: DNS_PENDING -> DNS_READY
  Worker->>Prov: Start provisioning job
  Prov->>VPS: Install stacks + OpenClaw + Safety
  Prov->>VPS: Seed secrets vault + tool registry
  Prov->>VPS: Register tenant with Hub
  VPS-->>Hub: register + first heartbeat
  Hub-->>User: Provisioning complete + app links
```
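
The order progression in the diagram above can be guarded with a small transition table. Only `PAYMENT_CONFIRMED`, `DNS_PENDING`, and `DNS_READY` appear in this document; `PROVISIONING`, `ACTIVE`, and `FAILED` are hypothetical placeholders added so the sketch has terminal states, and the Hub's actual status enum may differ.

```typescript
// The first three statuses come from the flow above; the rest are hypothetical.
type OrderStatus =
  | "PAYMENT_CONFIRMED"
  | "DNS_PENDING"
  | "DNS_READY"
  | "PROVISIONING"
  | "ACTIVE"
  | "FAILED";

// Allowed next statuses for each current status.
const NEXT_STATUSES: Record<OrderStatus, OrderStatus[]> = {
  PAYMENT_CONFIRMED: ["DNS_PENDING", "FAILED"],
  DNS_PENDING: ["DNS_READY", "FAILED"],
  DNS_READY: ["PROVISIONING", "FAILED"],
  PROVISIONING: ["ACTIVE", "FAILED"],
  ACTIVE: [],
  FAILED: [],
};

// Returns the new status, or throws when the worker attempts to skip a step.
function advanceOrder(current: OrderStatus, next: OrderStatus): OrderStatus {
  if (NEXT_STATUSES[current].indexOf(next) === -1) {
    throw new Error("Illegal order transition " + current + " -> " + next);
  }
  return next;
}
```

Rejecting out-of-order transitions in one place keeps a retried or duplicated worker job from marking an order ready before DNS and provisioning have actually completed.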

### 4.2 Agent Tool Call With Gating

```mermaid
sequenceDiagram
  participant U as User
  participant OC as OpenClaw
  participant SW as Safety Wrapper
  participant H as Hub
  participant T as Tool/API

  U->>OC: "Publish this newsletter"
  OC->>SW: tool call proposal
  SW->>SW: classify = Yellow+External
  SW->>H: approval request
  H-->>U: push approval request
  U->>H: approve
  H-->>SW: approval grant
  SW->>T: execute with SECRET_REF injection
  T-->>SW: result
  SW-->>OC: redacted result
  OC-->>U: completion summary
```
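
The approval step in this flow, combined with the 24h expiry stated for `approval-gateway`, can be sketched as a small state transition. The record shape and function names are assumptions for illustration; timestamps are passed in explicitly so the logic stays deterministic and testable.

```typescript
// 24h expiry, as stated for the approval-gateway component.
const APPROVAL_TTL_MS = 24 * 60 * 60 * 1000;

type ApprovalState = "pending" | "approved" | "denied" | "expired";

// Hypothetical approval record; the real schema lives in the Hub.
interface ApprovalRequest {
  id: string;
  createdAtMs: number;
  state: ApprovalState;
}

function createApproval(id: string, nowMs: number): ApprovalRequest {
  return { id: id, createdAtMs: nowMs, state: "pending" };
}

// Applies the user's decision. A decision that arrives after the TTL marks
// the request expired instead, and already-settled requests never change.
function settleApproval(req: ApprovalRequest, approve: boolean, nowMs: number): ApprovalRequest {
  if (req.state !== "pending") return req;
  if (nowMs - req.createdAtMs >= APPROVAL_TTL_MS) {
    return { id: req.id, createdAtMs: req.createdAtMs, state: "expired" };
  }
  return { id: req.id, createdAtMs: req.createdAtMs, state: approve ? "approved" : "denied" };
}

// The Safety Wrapper would only execute the tool call when this returns true.
function mayExecute(req: ApprovalRequest): boolean {
  return req.state === "approved";
}
```

Treating a late approval as expired rather than approved means a stale push notification tapped the next day cannot trigger an action the user no longer has in mind.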

### 4.3 Secrets Redaction Outbound Flow

```mermaid
flowchart LR
  A[OpenClaw Prompt Payload] --> B[Safety Wrapper Pre-Redaction]
  B --> C[Secrets Registry Match]
  C --> D[Pattern Safety Net]
  D --> E[Function-Call SecretRef Rebinding]
  E --> F[Local Egress Proxy]
  F --> G[Provider API]

  C --> C1[(Vault SQLite)]
  D --> D1[(Regex + Entropy Rules)]
  F --> F1[Transport-Level Block if bypass attempt]
```
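
The first two stages of this pipeline (registry match, then the regex + entropy safety net) can be sketched as two string passes. The placeholder format, the token regex, the 20-character minimum, and the 3.5 bits-per-character threshold are illustrative assumptions, not tuned values from this proposal.

```typescript
// Shannon entropy in bits per character.
function shannonEntropy(s: string): number {
  const counts: Record<string, number> = {};
  for (const ch of s.split("")) counts[ch] = (counts[ch] || 0) + 1;
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / s.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Stage 1: exact match against known secret values from the registry.
function redactKnown(text: string, registry: Record<string, string>): string {
  let out = text;
  for (const name of Object.keys(registry)) {
    const value = registry[name];
    if (value.length > 0) out = out.split(value).join("[REDACTED:" + name + "]");
  }
  return out;
}

// Stage 2: safety net for unregistered secrets. Long tokens with high
// entropy are treated as suspicious (both limits are assumptions).
const TOKEN_PATTERN = /[A-Za-z0-9+\/_=-]{20,}/g;
const ENTROPY_THRESHOLD = 3.5;

function redactSuspicious(text: string): string {
  return text.replace(TOKEN_PATTERN, (token: string) =>
    shannonEntropy(token) >= ENTROPY_THRESHOLD ? "[REDACTED:pattern]" : token
  );
}

// Registry match first, then the pattern safety net, mirroring the flow above.
function redactOutbound(text: string, registry: Record<string, string>): string {
  return redactSuspicious(redactKnown(text, registry));
}
```

One caveat of this sketch: a registry name of 20 or more characters would itself be scanned by the second pass, so a real implementation would want placeholders that the token pattern cannot match.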

### 4.4 Token Metering And Billing

```mermaid
flowchart LR
  O[OpenClaw llm_output hook] --> M[Metering Collector]
  M --> B[(Hourly Buckets SQLite)]
  B --> H[Hub Usage Ingest API]
  H --> P[(Billing Period + Usage Tables)]
  P --> S[Stripe Usage/Billing]
  H --> UI[Usage Dashboard + Alerts]
```
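
The collector's hourly aggregation step might look like the sketch below, which groups usage events by hour, agent, and model. The event and bucket shapes are assumptions; the four token counters mirror the `input/output/cache_read/cache_write` split described in section 5, and an in-memory object stands in for the SQLite bucket table.

```typescript
interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Hypothetical event emitted once per LLM response.
interface UsageEvent extends TokenCounts {
  agentId: string;
  model: string;
  timestampMs: number;
}

interface HourlyBucket extends TokenCounts {
  agentId: string;
  model: string;
  hourStartMs: number;
}

const HOUR_MS = 60 * 60 * 1000;

// Groups events into one bucket per (hour, agent, model) and sums the counters.
function aggregateHourly(events: UsageEvent[]): HourlyBucket[] {
  const byKey: Record<string, HourlyBucket> = {};
  for (const e of events) {
    const hourStartMs = Math.floor(e.timestampMs / HOUR_MS) * HOUR_MS;
    const key = hourStartMs + "|" + e.agentId + "|" + e.model;
    if (!byKey[key]) {
      byKey[key] = {
        agentId: e.agentId, model: e.model, hourStartMs: hourStartMs,
        input: 0, output: 0, cacheRead: 0, cacheWrite: 0,
      };
    }
    const bucket = byKey[key];
    bucket.input += e.input;
    bucket.output += e.output;
    bucket.cacheRead += e.cacheRead;
    bucket.cacheWrite += e.cacheWrite;
  }
  return Object.keys(byKey).map((k) => byKey[k]);
}
```

Sending pre-aggregated buckets rather than raw events keeps the Hub ingest payload small and avoids shipping per-message detail off the tenant VPS.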

## 5. Prompt Caching Architecture

- SOUL.md and TOOLS.md are split into stable, cacheable prefix blocks and dynamic suffix blocks.
- A stable prefix hash is generated per agent version.
- The prefix changes only when the agent config changes, so day-to-day conversations hit cache-read pricing.
- Metering persists `input/output/cache_read/cache_write` separately to preserve margin analytics.
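
A rough sketch of the two mechanics above: deriving a stable key from the prefix blocks, and pricing the four counters separately. FNV-1a is used here only as a dependency-free stand-in (a production key would more likely use a cryptographic hash such as SHA-256), and the field names and per-million prices are hypothetical.

```typescript
// FNV-1a (32-bit) over UTF-16 code units: a non-cryptographic stand-in.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 via shifts, reduced modulo 2^32.
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical split of an agent's prompt into its stable parts.
interface StablePrefix {
  agentVersion: string;
  soulPrefix: string;
  toolsPrefix: string;
}

// The key changes only when the version or a stable block changes.
function prefixKey(p: StablePrefix): string {
  return fnv1a([p.agentVersion, p.soulPrefix, p.toolsPrefix].join("\u0000"));
}

interface TokenUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Cost in USD given per-million-token prices for each counter.
function costUsd(usage: TokenUsage, pricePerMillion: TokenUsage): number {
  return (
    usage.input * pricePerMillion.input +
    usage.output * pricePerMillion.output +
    usage.cacheRead * pricePerMillion.cacheRead +
    usage.cacheWrite * pricePerMillion.cacheWrite
  ) / 1e6;
}
```

Keeping the four counters separate is what allows margin to be computed after the fact: if cache reads were folded into plain input tokens, the discount earned by a stable prefix would be invisible in the billing tables.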

## 6. Mobile, Website, And Channel Architecture

### 6.1 Mobile App

- React Native + Expo app as the primary interface.
- Real-time chat via the Hub websocket gateway.
- Approvals delivered as push notifications with approve/deny quick actions.
- Fallback channel switchboard in the Hub for WhatsApp/Telegram relay adapters.

### 6.2 Website + Onboarding

- Dedicated public frontend app (`apps/website`) with a strict network boundary to Hub public APIs.
- Onboarding classifier service (cheap model profile) classifies the business in 1-2 messages.
- Tool bundle recommendation engine returns an editable stack plus a resource calculator.
- Checkout remains Stripe-hosted.

## 7. First-Hour Workflow Templates (Architecture Proof)

| Template | Cross-Tool Actions | Gating Profile |
|---|---|---|
| Freelancer First Hour | Connect mail + calendar, create folders, configure intake form, first daily brief | Mostly Green/Yellow |
| Agency First Hour | Chat inbox setup, project board scaffolding, proposal template generation, shared KB setup | Yellow + Yellow+External approval |
| E-commerce First Hour | Inventory import, support inbox routing, analytics dashboard baseline, recovery email draft | Mixed Yellow/Yellow+External |
| Consulting First Hour | Scheduling links, client doc signature template, CRM stages, weekly report automation | Mostly Yellow + one external gate |

These templates are codified as audited workflow blueprints executed through the same command classification path as ad-hoc agent actions.

## 8. Interactive Demo Architecture (Pre-Purchase)

Proposal: a shared but isolated "Demo Tenant Pool" instead of a single static demo VPS.

- Each prospect gets a short-lived demo tenant snapshot (TTL 2 hours).
- The demo runs on synthetic data with fake outbound integrations only.
- It uses the same Safety Wrapper and approvals UI as production to demonstrate the trust model.
- Demo tenants are recycled automatically after session expiry.

This is safer and more realistic than one long-lived shared "Bella's Bakery" host.
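
The recycling rule for the pool reduces to a TTL check over the active demo tenants. The sketch below assumes a hypothetical `DemoTenant` record and a periodic sweep that calls it; how snapshots are actually restored or destroyed is outside its scope.

```typescript
// TTL of 2 hours, as stated for demo tenant snapshots.
const DEMO_TTL_MS = 2 * 60 * 60 * 1000;

// Hypothetical record for one prospect's demo tenant.
interface DemoTenant {
  id: string;
  createdAtMs: number;
}

// Returns the tenants whose TTL has elapsed and should be recycled.
function tenantsToRecycle(pool: DemoTenant[], nowMs: number): DemoTenant[] {
  return pool.filter((t) => nowMs - t.createdAtMs >= DEMO_TTL_MS);
}
```

A sweep job could run this every few minutes and hand the returned tenants to whatever resets them to the baseline snapshot.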

## 9. Required Pre-Launch Cleanup Baseline

Before the core build starts, execute a repository cleanup gate:

- Remove all `n8n` references from Hub, Provisioner, stacks, scripts, tests, and docs used for production behavior.
- Remove deployment references to the deprecated `orchestrator` and `sysadmin-agent` from active provisioning paths.
- Close the plaintext credential leak path (`jobs/*/config.json` root password exposure) by moving to one-time secret files with immediate secure deletion.

No feature work should proceed until this baseline passes CI policy checks.
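
One possible shape for the `n8n` part of such a CI policy check is a scan that reports every offending line. The sketch takes file contents as an in-memory map so it stays self-contained; walking the repository and choosing which paths count as "production behavior" are left out, and the names used here are hypothetical.

```typescript
interface Finding {
  path: string;
  line: number; // 1-based line number
  text: string;
}

// Case-insensitive substring match, so identifiers like N8N_HOST are caught too.
const FORBIDDEN = /n8n/i;

// Scans a map of path -> file content and lists every line mentioning n8n.
function findForbiddenReferences(files: Record<string, string>): Finding[] {
  const findings: Finding[] = [];
  for (const path of Object.keys(files)) {
    const lines = files[path].split("\n");
    for (let i = 0; i < lines.length; i++) {
      if (FORBIDDEN.test(lines[i])) {
        findings.push({ path: path, line: i + 1, text: lines[i].trim() });
      }
    }
  }
  return findings;
}
```

A CI job would fail the build whenever the returned list is non-empty, which turns the cleanup baseline into a gate that cannot silently regress.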