01. Architecture And Data Flows
1. Scope And Non-Negotiables
This proposal is explicitly designed around the fixed constraints from the Architecture Brief:
- 4-layer security model is mandatory.
- "Secrets never leave the tenant server" is mandatory.
- 3-tier autonomy + external communications gate is mandatory.
- OpenClaw is an upstream dependency (no fork by default).
- One customer = one VPS is mandatory.
- n8n removal is a prerequisite.
2. Proposed Target Architecture
2.1 Core Decisions
| Decision | Proposal | Why |
|---|---|---|
| Hub stack | Keep Next.js + Prisma + PostgreSQL | Existing app already has major workflows and 80+ APIs; rewrite is timeline-risky for 3-month launch. |
| OpenClaw integration | Use pinned upstream release, no fork | Maximizes upgrade velocity and avoids merge debt. |
| Safety Wrapper shape | Hybrid: OpenClaw plugin + local egress proxy + local execution adapters | Gives direct hook interception plus transport-level redaction guarantee. |
| Mobile | React Native + Expo | Fastest path to iOS/Android with TypeScript contract reuse. |
| Website | Separate public web app (same monorepo) + Hub public APIs | Security isolation between public onboarding and admin/customer portal. |
| Repo strategy | Monorepo for first-party services; OpenClaw kept separate upstream repo | Strong contract sharing + CI simplicity without violating upstream dependency model. |
2.2 System Context Diagram
flowchart LR
subgraph Client[Client Layer]
M[Mobile App\nReact Native + Expo]
W[Website\nOnboarding + Checkout]
C[Customer Portal Web]
A[Admin Portal Web]
end
subgraph Control[Central Platform]
H[Hub API + UI\nNext.js + Prisma]
DB[(PostgreSQL)]
Q[Background Workers\nAutomation + Metering]
N[Notification Service\nPush/Email]
ST[Stripe]
NC[Netcup/Hetzner]
end
subgraph Tenant[Per-Customer VPS]
OC[OpenClaw Gateway\nUpstream]
SW[Safety Wrapper Plugin\nHooks + Classification]
SP[LLM Egress Proxy\nSecrets Firewall]
SV[(Secrets Vault SQLite\nEncrypted)]
TA[Tool Adapters + Exec Guards]
TS[(Tool Stacks 25+)]
AP[(Approval Cache SQLite)]
TU[(Token Usage Buckets)]
end
M --> H
W --> H
C --> H
A --> H
H --> DB
H --> Q
H --> N
H <--> ST
H <--> NC
H <--> OC
OC --> SW
SW --> SP
SP --> LLM[(LLM Providers)]
SW <--> SV
SW <--> TA
TA <--> TS
SW <--> AP
SW --> TU
TU --> H
3. Tenant Runtime Architecture
3.1 4-Layer Security Enforcement
| Layer | Enforcement Point | Implementation |
|---|---|---|
| 1. Sandbox | OpenClaw runtime/tool sandbox settings | OpenClaw native sandbox + process/container isolation. |
| 2. Tool Policy | OpenClaw agent tool allow/deny | Per-agent tool manifest; tools not listed are unreachable. |
| 3. Command Gating | Safety Wrapper `before_tool_call` | Green/Yellow/Yellow+External/Red/Critical Red classification + approval flow. |
| 4. Secrets Redaction | Local egress proxy + transcript hooks | Outbound prompt redaction before network egress, plus log/transcript redaction hooks. |
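As an illustration of the Layer 3 classification step, the following is a minimal sketch of a deterministic rules lookup. The tier names come from the table above; the `Rule` shape, the prefix matching, the tool identifiers, and the fail-closed default to Red are all assumptions for this example, not the proposal's actual policy format.

```typescript
// Tier names follow the table above. The rule shape, prefix matching, and
// fail-closed default are assumptions made for this sketch.
type Tier = "green" | "yellow" | "yellow_external" | "red" | "critical_red";

interface ToolCall {
  tool: string;      // hypothetical tool identifier, e.g. "mail.send"
  external: boolean; // true when the action reaches outside the tenant
}

interface Rule {
  toolPrefix: string;
  tier: Tier;
}

function classify(call: ToolCall, rules: Rule[]): Tier {
  // Longest matching prefix wins so specific rules override broad ones.
  let best: Rule | undefined;
  for (const rule of rules) {
    const matches = call.tool.slice(0, rule.toolPrefix.length) === rule.toolPrefix;
    if (matches && (best === undefined || rule.toolPrefix.length > best.toolPrefix.length)) {
      best = rule;
    }
  }
  // Unknown tools fail closed to "red" rather than running unreviewed.
  const tier: Tier = best === undefined ? "red" : best.tier;
  // External communications gate: a yellow action that leaves the tenant is escalated.
  return tier === "yellow" && call.external ? "yellow_external" : tier;
}
```

Because the function is pure and depends only on the rule list, the same policy bundle should produce the same tier on every tenant, which is the property a deterministic engine needs.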
3.2 Safety Wrapper Components
- `classification-engine`: deterministic rules engine with signed policy bundle from Hub.
- `approval-gateway`: sync/async approval requests to Hub, with 24h expiry.
- `secret-ref-resolver`: resolves `SECRET_REF(...)` at execution time only.
- `adapter-runtime`: executes tool API adapters and guarded shell/docker/file actions.
- `metering-collector`: captures per-agent/per-model token usage and aggregates hourly.
- `hub-sync-client`: registration, heartbeat, config pull, backup status, command results.
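To make the `secret-ref-resolver` idea concrete, here is a small sketch of late-bound substitution. Only the `SECRET_REF(...)` token comes from this document; the name syntax inside the parentheses, the in-memory `vault` record (standing in for the encrypted SQLite vault), and the function name are assumptions.

```typescript
// Only the SECRET_REF(...) token is from the document. The allowed name
// characters and the plain-object vault are assumptions for illustration.
const SECRET_REF_PATTERN = /SECRET_REF\(([A-Za-z0-9_.-]+)\)/g;

function resolveSecretRefs(template: string, vault: Record<string, string>): string {
  return template.replace(SECRET_REF_PATTERN, (_match: string, name: string): string => {
    const value = vault[name];
    // Reject unknown names (and inherited object properties) instead of
    // silently passing the placeholder through to the tool.
    if (typeof value !== "string") {
      throw new Error("unknown secret reference: " + name);
    }
    return value;
  });
}
```

The point of resolving at execution time is that the model and transcripts only ever see the placeholder; the real value exists in the command string just long enough to run the adapter.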
3.3 OpenClaw Hook Usage (No Fork)
Safety Wrapper plugin uses upstream hook points for enforcement and observability:
- `before_tool_call`: classify/gate/block/require approval.
- `after_tool_call`: audit capture + normalization.
- `message_sending`: outbound content redaction.
- `before_message_write`, `tool_result_persist`: local persistence redaction.
- `llm_output`: token accounting and per-model usage capture.
- `before_prompt_build`: inject cacheable SOUL/TOOLS prefix metadata.
- `subagent_spawning`: enforce max depth/budget.
- `gateway_start`: health checks + Hub session bootstrap.
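The depth/budget check behind `subagent_spawning` could look roughly like the sketch below. This is not the OpenClaw hook signature, which this document does not specify; `SpawnRequest`, `SpawnLimits`, and `checkSubagentSpawn` are hypothetical names for the decision logic a hook handler might delegate to.

```typescript
// Hypothetical types: the real hook payload is defined by upstream OpenClaw
// and is not reproduced here.
interface SpawnRequest {
  parentDepth: number;     // depth of the agent asking to spawn (root = 0)
  requestedTokens: number; // token budget the child asks for
}

interface SpawnLimits {
  maxDepth: number;
  remainingTokens: number;
}

interface SpawnDecision {
  allowed: boolean;
  reason: string;
}

function checkSubagentSpawn(request: SpawnRequest, limits: SpawnLimits): SpawnDecision {
  const childDepth = request.parentDepth + 1;
  if (childDepth > limits.maxDepth) {
    return { allowed: false, reason: `depth ${childDepth} exceeds max ${limits.maxDepth}` };
  }
  if (request.requestedTokens > limits.remainingTokens) {
    return { allowed: false, reason: "token budget exhausted" };
  }
  return { allowed: true, reason: "within limits" };
}
```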
4. Primary Data Flows
4.1 Signup To Provisioning Flow
sequenceDiagram
participant User
participant Site as Website
participant Hub
participant Stripe
participant Worker as Automation Worker
participant Provider as Netcup/Hetzner
participant Prov as Provisioner
participant VPS as Tenant VPS
User->>Site: Describe business + pick tools
Site->>Hub: Create onboarding draft
Site->>Stripe: Checkout session
Stripe-->>Hub: checkout.session.completed
Hub->>Worker: Create order (PAYMENT_CONFIRMED)
Worker->>Provider: Allocate VPS
Provider-->>Worker: VPS ready (IP + creds)
Worker->>Hub: DNS_PENDING -> DNS_READY
Worker->>Prov: Start provisioning job
Prov->>VPS: Install stacks + OpenClaw + Safety
Prov->>VPS: Seed secrets vault + tool registry
Prov->>VPS: Register tenant with Hub
VPS-->>Hub: register + first heartbeat
Hub-->>User: Provisioning complete + app links
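The order lifecycle in this flow can be modeled as an explicit state machine so that workers cannot skip steps. In the sketch below, `PAYMENT_CONFIRMED`, `DNS_PENDING`, and `DNS_READY` come from the diagram; `PROVISIONING`, `ACTIVE`, and `FAILED` are assumed names added only to complete the example.

```typescript
// PAYMENT_CONFIRMED, DNS_PENDING, DNS_READY appear in the diagram above.
// PROVISIONING, ACTIVE, FAILED are hypothetical states for this sketch.
type OrderStatus =
  | "PAYMENT_CONFIRMED"
  | "DNS_PENDING"
  | "DNS_READY"
  | "PROVISIONING"
  | "ACTIVE"
  | "FAILED";

const ALLOWED_TRANSITIONS: Record<OrderStatus, OrderStatus[]> = {
  PAYMENT_CONFIRMED: ["DNS_PENDING", "FAILED"],
  DNS_PENDING: ["DNS_READY", "FAILED"],
  DNS_READY: ["PROVISIONING", "FAILED"],
  PROVISIONING: ["ACTIVE", "FAILED"],
  ACTIVE: [],
  FAILED: [],
};

function transition(current: OrderStatus, next: OrderStatus): OrderStatus {
  // Reject any move that is not listed for the current state.
  if (ALLOWED_TRANSITIONS[current].indexOf(next) === -1) {
    throw new Error(`illegal transition ${current} -> ${next}`);
  }
  return next;
}
```

Keeping the table in one place means the Hub, the worker, and tests can share a single definition of what "next step" means for an order.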
4.2 Agent Tool Call With Gating
sequenceDiagram
participant U as User
participant OC as OpenClaw
participant SW as Safety Wrapper
participant H as Hub
participant T as Tool/API
U->>OC: "Publish this newsletter"
OC->>SW: tool call proposal
SW->>SW: classify = Yellow+External
SW->>H: approval request
H-->>U: push approval request
U->>H: approve
H-->>SW: approval grant
SW->>T: execute with SECRET_REF injection
T-->>SW: result
SW-->>OC: redacted result
OC-->>U: completion summary
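The approval step in this flow combines with the 24h expiry mentioned for the approval gateway. A minimal sketch of how the wrapper might evaluate a cached approval follows; the record shape, the `Outcome` values, and the rule that a late decision is treated as expired are assumptions.

```typescript
// The 24h window is from this document; everything else here is a
// hypothetical shape for the cached approval record.
const APPROVAL_TTL_MS = 24 * 60 * 60 * 1000;

interface ApprovalRequest {
  id: string;
  createdAtMs: number;
  decision?: { approved: boolean; decidedAtMs: number };
}

type Outcome = "execute" | "wait" | "reject";

function evaluateApproval(request: ApprovalRequest, nowMs: number): Outcome {
  const expiresAtMs = request.createdAtMs + APPROVAL_TTL_MS;
  if (request.decision === undefined) {
    // Still pending: keep waiting until the window closes, then give up.
    return nowMs >= expiresAtMs ? "reject" : "wait";
  }
  if (!request.decision.approved) {
    return "reject";
  }
  // An approval only counts if it arrived before the request expired.
  return request.decision.decidedAtMs < expiresAtMs ? "execute" : "reject";
}
```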
4.3 Secrets Redaction Outbound Flow
flowchart LR
A[OpenClaw Prompt Payload] --> B[Safety Wrapper Pre-Redaction]
B --> C[Secrets Registry Match]
C --> D[Pattern Safety Net]
D --> E[Function-Call SecretRef Rebinding]
E --> F[Local Egress Proxy]
F --> G[Provider API]
C --> C1[(Vault SQLite)]
D --> D1[(Regex + Entropy Rules)]
F --> F1[Transport-Level Block if bypass attempt]
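The first two stages of this pipeline (registry match, then a pattern safety net) can be sketched as below. The 20-character minimum, the 3.5 bits-per-character entropy threshold, the token character class, and the `[REDACTED]` marker are illustrative assumptions; a real rule set would need tuning against actual prompts.

```typescript
// Shannon entropy in bits per character; used to spot random-looking tokens.
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = (counts[ch] || 0) / text.length;
    entropy -= p * (Math.log(p) / Math.LN2);
  }
  return entropy;
}

// Thresholds below are assumptions for this sketch, not tuned values.
const MIN_TOKEN_LENGTH = 20;
const ENTROPY_THRESHOLD = 3.5;
const CANDIDATE_TOKEN = new RegExp("[A-Za-z0-9+/_=-]{" + MIN_TOKEN_LENGTH + ",}", "g");

function redactOutbound(payload: string, knownSecrets: string[]): string {
  let out = payload;
  // Stage 1: exact match against registered secrets, longest first so a
  // secret that contains another is replaced as a whole.
  const ordered = knownSecrets.slice().sort((a, b) => b.length - a.length);
  for (const secret of ordered) {
    if (secret.length > 0) {
      out = out.split(secret).join("[REDACTED]");
    }
  }
  // Stage 2: safety net for unregistered, high-entropy tokens.
  return out.replace(CANDIDATE_TOKEN, (token: string): string =>
    shannonEntropy(token) >= ENTROPY_THRESHOLD ? "[REDACTED]" : token
  );
}
```

Stage 1 catches everything the vault knows about; stage 2 exists only for secrets that were pasted into a conversation and never registered, so false positives there are an acceptable trade.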
4.4 Token Metering And Billing
flowchart LR
O[OpenClaw llm_output hook] --> M[Metering Collector]
M --> B[(Hourly Buckets SQLite)]
B --> H[Hub Usage Ingest API]
H --> P[(Billing Period + Usage Tables)]
P --> S[Stripe Usage/Billing]
H --> UI[Usage Dashboard + Alerts]
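A rough sketch of the metering collector's hourly aggregation is shown below. The four token categories match those named in this document; the event and bucket shapes, and the use of UTC hour boundaries computed from epoch milliseconds, are assumptions.

```typescript
interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Hypothetical shape of one usage record captured from the llm_output hook.
interface UsageEvent {
  agentId: string;
  model: string;
  timestampMs: number;
  counts: TokenCounts;
}

interface HourlyBucket {
  agentId: string;
  model: string;
  hourStartMs: number;
  counts: TokenCounts;
}

const HOUR_MS = 60 * 60 * 1000;

function aggregateHourly(events: UsageEvent[]): HourlyBucket[] {
  const buckets: Record<string, HourlyBucket> = {};
  const order: string[] = [];
  for (const event of events) {
    // Truncate the timestamp to the start of its hour.
    const hourStartMs = Math.floor(event.timestampMs / HOUR_MS) * HOUR_MS;
    const key = [event.agentId, event.model, hourStartMs].join("|");
    let bucket = buckets[key];
    if (bucket === undefined) {
      bucket = {
        agentId: event.agentId,
        model: event.model,
        hourStartMs,
        counts: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
      };
      buckets[key] = bucket;
      order.push(key);
    }
    bucket.counts.input += event.counts.input;
    bucket.counts.output += event.counts.output;
    bucket.counts.cacheRead += event.counts.cacheRead;
    bucket.counts.cacheWrite += event.counts.cacheWrite;
  }
  // Return buckets in first-seen order.
  return order.map((key) => buckets[key]);
}
```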
5. Prompt Caching Architecture
- SOUL.md and TOOLS.md are split into stable cacheable prefix blocks and dynamic suffix blocks.
- Stable prefix hash is generated per agent version.
- Prefix changes only when agent config changes; day-to-day conversations hit cache-read pricing.
- Metering persists `input/output/cache_read/cache_write` separately to preserve margin analytics.
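The prefix/suffix split can be sketched as follows. The hash here is a 32-bit FNV-1a variant over UTF-16 code units, used purely as a dependency-free stand-in; the document does not say which hash is intended, and a production version would more likely use a cryptographic digest. `PromptParts` and `buildPrompt` are hypothetical names.

```typescript
interface PromptParts {
  stablePrefix: string;  // cacheable SOUL/TOOLS content, changes only with agent config
  dynamicSuffix: string; // per-conversation content
}

// 32-bit FNV-1a over UTF-16 code units. A stand-in for illustration only:
// it is not collision resistant and not the document's specified hash.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts to stay in 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

function buildPrompt(parts: PromptParts): { prompt: string; prefixHash: string } {
  return {
    // The stable block always comes first so the provider can reuse its cache.
    prompt: parts.stablePrefix + "\n" + parts.dynamicSuffix,
    // Only the stable block feeds the hash, so conversation turns never change it.
    prefixHash: fnv1a32(parts.stablePrefix),
  };
}
```

Logging `prefixHash` next to the cache-read and cache-write token counts would make it easy to see whether an unexpected prefix change is what caused a drop in cache hits.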
6. Mobile, Website, And Channel Architecture
6.1 Mobile App
- React Native + Expo app as primary interface.
- Real-time chat via Hub websocket gateway.
- Approvals as push notifications (approve/deny quick actions).
- Fallback channel switchboard in Hub for WhatsApp/Telegram relay adapters.
6.2 Website + Onboarding
- Dedicated public frontend app (`apps/website`) with strict network boundary to Hub public APIs.
- Onboarding classifier service (cheap model profile) performs 1-2 message business classification.
- Tool bundle recommendation engine returns editable stack + resource calculator.
- Checkout remains Stripe-hosted.
7. First-Hour Workflow Templates (Architecture Proof)
| Template | Cross-Tool Actions | Gating Profile |
|---|---|---|
| Freelancer First Hour | Connect mail + calendar, create folders, configure intake form, first daily brief | Mostly Green/Yellow |
| Agency First Hour | Chat inbox setup, project board scaffolding, proposal template generation, shared KB setup | Yellow + Yellow+External approval |
| E-commerce First Hour | Inventory import, support inbox routing, analytics dashboard baseline, recovery email draft | Mixed Yellow/Yellow+External |
| Consulting First Hour | Scheduling links, client doc signature template, CRM stages, weekly report automation | Mostly Yellow + one external gate |
These templates are codified as audited workflow blueprints executed through the same command classification path as ad-hoc agent actions.
8. Interactive Demo Architecture (Pre-Purchase)
Proposal: shared but isolated "Demo Tenant Pool" instead of a single static demo VPS.
- Each prospect gets a short-lived demo tenant snapshot (TTL 2 hours).
- Demo runs synthetic data and fake outbound integrations only.
- Same Safety Wrapper + approvals UI as production to demonstrate trust model.
- Recycled automatically after session expiry.
This is safer and more realistic than one long-lived shared "Bella's Bakery" host.
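A recycler for the demo pool could be as simple as the sketch below, which splits the pool into tenants to keep and tenants to tear down. The 2-hour TTL is from the list above; the `DemoTenant` shape and function name are assumptions, and the actual teardown (snapshot restore, VPS release) is out of scope here.

```typescript
const DEMO_TTL_MS = 2 * 60 * 60 * 1000; // TTL of 2 hours, per the proposal

// Hypothetical record for one short-lived demo tenant.
interface DemoTenant {
  id: string;
  createdAtMs: number;
}

function partitionDemoPool(
  pool: DemoTenant[],
  nowMs: number
): { active: DemoTenant[]; expired: DemoTenant[] } {
  const active: DemoTenant[] = [];
  const expired: DemoTenant[] = [];
  for (const tenant of pool) {
    // A tenant expires once its full TTL has elapsed.
    if (nowMs - tenant.createdAtMs >= DEMO_TTL_MS) {
      expired.push(tenant);
    } else {
      active.push(tenant);
    }
  }
  return { active, expired };
}
```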
9. Required Pre-Launch Cleanup Baseline
Before core build starts, execute a repository cleanup gate:
- Remove all `n8n` references from Hub, Provisioner, stacks, scripts, tests, and docs used for production behavior.
- Remove deployment references to deprecated `orchestrator` and `sysadmin-agent` from active provisioning paths.
- Close plaintext credential leak path (`jobs/*/config.json` root password exposure) by moving to one-time secret files + immediate secure deletion.
No feature work should proceed until this baseline passes CI policy checks.
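One way such a CI policy check might work is a plain text scan that fails the build when a forbidden term appears. The sketch below operates on an in-memory map of file contents so it stays self-contained; the function name, the `Violation` shape, and the case-insensitive substring matching are assumptions, and a real check would need an allowlist for historical docs.

```typescript
interface Violation {
  path: string;
  line: number; // 1-based line number within the file
  term: string;
}

// Scans file contents (path -> text) for forbidden terms such as "n8n".
// Substring matching is deliberately blunt; it is an assumption of this sketch.
function findForbiddenReferences(files: Record<string, string>, terms: string[]): Violation[] {
  const violations: Violation[] = [];
  for (const path of Object.keys(files)) {
    const lines = (files[path] || "").split("\n");
    lines.forEach((text, index) => {
      const haystack = text.toLowerCase();
      for (const term of terms) {
        if (haystack.indexOf(term.toLowerCase()) !== -1) {
          violations.push({ path, line: index + 1, term });
        }
      }
    });
  }
  return violations;
}
```

A CI job could call this with the three terms named in the cleanup list and exit non-zero whenever the returned array is not empty.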