LetsBe Biz — Architecture Brief

Date: February 26, 2026
Author: Matt (Founder)
Purpose: Competing architecture proposals from two independent teams
Status: ACTIVE — Awaiting proposals


1. What This Brief Is

You are being asked to produce a complete architecture development plan for the LetsBe Biz platform. A second, independent team is doing the same thing from the same brief. Matt will compare both proposals and select the best approach (or combine the strongest elements from each).

Your deliverables:

  1. Architecture document with system diagrams and data flow diagrams
  2. Component breakdown with API contracts
  3. Deployment strategy
  4. Detailed implementation plan with task breakdown and dependency graph
  5. Estimated timelines
  6. Risk assessment
  7. Testing strategy proposal
  8. CI/CD strategy (Gitea-based — see Section 9)
  9. Repository structure proposal (monorepo vs. multi-repo — your call, justify it)

Read the full codebase. You have access to the existing repo. Examine the Hub, Provisioner, Docker stacks, nginx configs, and all documentation in docs/. The existing Technical Architecture document (docs/technical/LetsBe_Biz_Technical_Architecture.md) is the most detailed reference — read it thoroughly.


2. What We're Building

LetsBe Biz is a privacy-first AI workforce platform for SMBs. Each customer gets an isolated VPS running 25+ open-source business tools, managed by a team of AI agents that autonomously operate those tools on behalf of the business owner.

The platform has two domains:

  • Central Platform — Hub (admin/customer portal, billing, provisioning, monitoring) + Provisioner (one-shot VPS setup)
  • Tenant Server — OpenClaw (AI agent runtime) + Safety Wrapper (secrets redaction, command gating, Hub communication) + Tool Stacks (25+ containerized business tools)

Customers interact via a mobile app and a web portal. The AI agents talk to business tools via REST APIs and browser automation.


3. Non-Negotiables

These constraints are locked. Do not propose alternatives — design around them.

3.1 Privacy Architecture (4-Layer Security Model)

Security is enforced through four independent layers, each of which only narrows access. No layer can grant access that a layer above it has denied.

| Layer | What It Does | Enforced By |
|---|---|---|
| 1. Sandbox | Controls where code runs (container isolation) | OpenClaw native |
| 2. Tool Policy | Controls what tools each agent can see | OpenClaw native (allow/deny arrays) |
| 3. Command Gating | Controls what operations require human approval | Safety Wrapper (LetsBe layer) |
| 4. Secrets Redaction | Strips all credentials from outbound LLM traffic | Safety Wrapper (always on, non-negotiable) |

Invariant: Secrets never leave the customer's server. All credential redaction happens locally before any data reaches an LLM provider. This is enforced at the transport layer, not by trusting the AI.

3.2 AI Autonomy Levels (3-Tier System)

Customers control how much the AI does without approval:

| Level | Name | Auto-Execute | Requires Approval |
|---|---|---|---|
| 1 | Training Wheels | Green (read-only) | Yellow + Red + Critical Red |
| 2 | Trusted Assistant (default) | Green + Yellow | Red + Critical Red |
| 3 | Full Autonomy | Green + Yellow + Red | Critical Red only |

External Communications Gate: Operations that send information outside the business (publish blog posts, send emails, reply to customers) are gated by a separate mechanism, independent of autonomy levels. Even at Level 3, external comms remain gated until the user explicitly unlocks them per agent, per tool. This is a product principle — a misworded email to a client is worse than a delayed newsletter.

3.3 Command Classification (5 Tiers)

Every tool call is classified before execution:

  • Green — Non-destructive (reads, status checks, analytics) → auto-execute at all levels
  • Yellow — Modifying (restart containers, write files, update configs) → auto-execute at Level 2+
  • Yellow+External — External-facing (publish, send emails, reply to customers) → gated by External Comms Gate
  • Red — Destructive (delete files, remove containers, drop tables) → auto-execute at Level 3 only
  • Critical Red — Irreversible (drop database, modify firewall, wipe backups) → always gated
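To make the interaction between the five tiers, the three autonomy levels, and the External Comms Gate concrete, here is a minimal sketch of the gating decision as a pure function. The type and function names are illustrative only, not the Safety Wrapper's actual API:

```typescript
// Illustrative gating decision; names are assumptions, not the real API.
type Tier = "green" | "yellow" | "yellow-external" | "red" | "critical-red";
type Decision = "auto-execute" | "needs-approval";

interface AgentPolicy {
  autonomyLevel: 1 | 2 | 3;       // Training Wheels / Trusted Assistant / Full Autonomy
  externalCommsUnlocked: boolean; // per-agent, per-tool unlock state
}

function gate(tier: Tier, policy: AgentPolicy): Decision {
  // Critical Red is always gated, regardless of autonomy level.
  if (tier === "critical-red") return "needs-approval";
  // External comms are gated independently of autonomy levels.
  if (tier === "yellow-external") {
    return policy.externalCommsUnlocked ? "auto-execute" : "needs-approval";
  }
  // Minimum autonomy level at which each remaining tier auto-executes.
  const threshold: Record<Exclude<Tier, "yellow-external" | "critical-red">, number> = {
    green: 1,  // all levels
    yellow: 2, // Level 2+
    red: 3,    // Level 3 only
  };
  return policy.autonomyLevel >= threshold[tier] ? "auto-execute" : "needs-approval";
}
```

Note that the decision never widens with tier: a Red command at Level 2 waits for approval even though Yellow auto-executes.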

3.4 OpenClaw as Upstream Dependency

OpenClaw is the AI agent runtime. It is treated as a dependency, not a fork. All LetsBe-specific logic lives outside OpenClaw's codebase. Use the latest stable release. If you are genuinely convinced that modifying OpenClaw is necessary, you may propose it — but you must also propose a strategy for maintaining those modifications across upstream updates. The strong preference is to avoid forking.

3.5 One Customer = One VPS

Each customer gets their own isolated VPS. No multi-tenant servers. This is permanent for v1.


4. What Needs to Be Built (Full Tier 1 Scope)

All of the following are in scope for your architecture plan. This is the full scope for v1 launch.

4.1 Safety Wrapper (Core IP)

The competitive moat. Five responsibilities:

  1. Secrets Firewall — 4-layer redaction (registry lookup → outbound redaction → pattern safety net → function-call proxy). All LLM-bound traffic is scrubbed before leaving the VPS.
  2. Command Classification — Every tool call classified into Green/Yellow/Yellow+External/Red/Critical Red and gated based on agent's effective autonomy level.
  3. Tool Execution Layer — Capabilities ported from the deprecated sysadmin agent: shell execution (allowlisted), Docker operations, file read/write, env read/update, plus 24+ tool API adapters.
  4. Hub Communication — Registration, heartbeat, config sync, approval request routing, token usage reporting, backup status.
  5. Token Metering — Per-agent, per-model token tracking with hourly bucket aggregation for billing.

Architecture choice is yours. The current Technical Architecture proposes an OpenClaw extension (in-process) plus a separate thin secrets proxy. You may propose an alternative architecture (sidecar, full proxy, different split) as long as the five responsibilities are met and the secrets-never-leave-the-server guarantee holds.
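As a rough illustration of two of the redaction layers (registry lookup plus pattern safety net), consider the TypeScript sketch below. The registry entries, regex patterns, and function names are invented for the example; the real Safety Wrapper adds the function-call proxy and audit layers on top:

```typescript
// Sketch of outbound redaction: known secrets are swapped for references,
// then a pattern safety net catches anything that was never registered.
const registry = new Map<string, string>([
  // value -> reference label (illustrative entries)
  ["sk-live-abc123", "SECRET_REF:stripe.api_key"],
  ["pg-s3cr3t", "SECRET_REF:odoo.db_password"],
]);

// Safety-net patterns for common credential shapes (illustrative).
const suspiciousPatterns = [/sk-[A-Za-z0-9-]{8,}/g, /AKIA[0-9A-Z]{16}/g];

function redactOutbound(payload: string): string {
  let scrubbed = payload;
  for (const [value, ref] of registry) {
    scrubbed = scrubbed.split(value).join(ref); // exact-match replacement
  }
  for (const pattern of suspiciousPatterns) {
    scrubbed = scrubbed.replace(pattern, "[REDACTED]");
  }
  return scrubbed;
}
```

The key property to preserve in any design: this runs on the tenant VPS, in the transport path to the LLM provider, so an unredacted payload physically cannot leave the server.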

4.2 Tool Registry + Adapters

24+ business tools need to be accessible to AI agents. Three access patterns:

  1. REST API via exec tool (primary) — Agent runs curl commands; Safety Wrapper intercepts, injects credentials via SECRET_REF, audits.
  2. CLI binaries via exec tool — For external services (e.g., Google via gog CLI, IMAP via himalaya).
  3. Browser automation (fallback) — OpenClaw's native Playwright/CDP browser for tools without APIs.

A tool registry (tool-registry.json) describes every installed tool with its URL, auth method, credential references, and cheat sheet location. The registry is loaded into agent context.

Cheat sheets are per-tool markdown files with API documentation, common operations, and example curl commands. Loaded on-demand to conserve tokens.
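For orientation, a registry entry might look like the following. The field names and values here are assumptions for illustration; the committed tool-registry.json schema is defined in the Technical Architecture document:

```typescript
// Hypothetical shape of a tool-registry.json entry (field names assumed).
interface ToolEntry {
  name: string;
  url: string;           // internal base URL on the tenant VPS
  auth: "api-key" | "basic" | "oauth2";
  credentialRef: string; // SECRET_REF resolved by the Safety Wrapper
  cheatSheet: string;    // per-tool markdown, loaded on demand
}

const ghost: ToolEntry = {
  name: "ghost",
  url: "http://ghost:2368",
  auth: "api-key",
  credentialRef: "SECRET_REF:ghost.admin_api_key",
  cheatSheet: "cheatsheets/ghost.md",
};

// The agent writes curl commands against the reference, never the value:
//   curl -H "Authorization: Ghost SECRET_REF:ghost.admin_api_key" http://ghost:2368/...
// The Safety Wrapper substitutes the real credential at execution time.
```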

4.3 Hub Updates

The Hub is an existing Next.js + Prisma application (~15,000 LOC, 244 source files, 80+ API endpoints, 20+ Prisma models). It needs:

New capabilities:

  • Customer-facing portal API (dashboard, agent management, usage tracking, command approvals, billing)
  • Token metering and overage billing (Stripe integration exists)
  • Agent management API (SOUL.md, TOOLS.md, permissions, model selection)
  • Safety Wrapper communication endpoints (registration, heartbeat, config sync, approval routing)
  • Command approval queue (Yellow/Red commands surface for admin/customer approval)
  • Token usage analytics dashboard
  • Founding member program tracking (2× token allotment for 12 months)

You may propose a different backend stack if you can justify it. The existing Hub is production-ready for its current scope. A rewrite must account for the 80+ working endpoints and 20+ data models.

4.4 Provisioner Updates

The Provisioner (letsbe-provisioner, ~4,477 LOC Bash) does one-shot VPS provisioning via SSH. It needs:

  • Deploy OpenClaw + Safety Wrapper instead of deprecated orchestrator + sysadmin agent
  • Generate and deploy Safety Wrapper configuration (secrets registry, agent configs, Hub credentials, autonomy defaults)
  • Generate and deploy OpenClaw configuration (model provider pointing to Safety Wrapper proxy, agent definitions, prompt caching settings)
  • Migrate 8 Playwright initial-setup scenarios to run via OpenClaw's native browser tool
  • Clean up config.json post-provisioning (currently contains root password in plaintext — critical fix)
  • Remove all n8n references from Playwright scripts, Docker Compose stacks, and adapters (n8n removed from stack due to license issues)

4.5 Mobile App

Primary customer interface. Requirements:

  • Chat with agent selection ("Talk to your Marketing Agent")
  • Morning briefing from Dispatcher Agent
  • Team management (agent config, model selection, autonomy levels)
  • Command gating approvals (push notifications with one-tap approve/deny)
  • Server health overview (storage, uptime, active tools)
  • Usage dashboard (token consumption, activity)
  • External comms gate management (unlock sending per agent/tool)
  • Access channels: App at launch, WhatsApp/Telegram as fallback channels

Tech choice is yours. React Native is the current direction, but you may propose alternatives (Flutter, PWA, etc.) with justification.

4.6 Website + Onboarding Flow (letsbe.biz)

AI-powered signup flow:

  1. Landing page with chat input: "Describe your business"
  2. AI conversation (1-2 messages) → business type classification
  3. Tool recommendation (pre-selected bundle for detected business type)
  4. Customization (add/remove tools, live resource calculator)
  5. Server selection (only tiers meeting minimum shown)
  6. Domain setup (user brings domain or buys one via Netcup reselling)
  7. Agent config (optional, template-based per business type)
  8. Payment (Stripe)
  9. Provisioning status (real-time progress, email with credentials, app download links)

Website architecture is your call. Part of Hub, separate frontend, or something else — propose and justify.

AI provider for onboarding classification is your call. Requirement: cheap, fast, accurate business type classification in 1-2 messages.

4.7 Secrets Registry

Encrypted SQLite vault for all tenant credentials (50+ per server). Supports:

  • Credential rotation with history
  • Pattern-based discovery (safety net for unregistered secrets)
  • Audit logging
  • SECRET_REF resolution for tool execution

4.8 Autonomy Level System

Per-agent, per-tenant gating configuration. Synced from Hub to Safety Wrapper. Includes:

  • Per-agent autonomy level overrides
  • External comms gate with per-agent, per-tool unlock state
  • Approval request routing to Hub → mobile app
  • Approval expiry (24h default)
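The expiry rule can be stated precisely with a small sketch (names assumed; the point is that an unanswered request lapses rather than auto-executing):

```typescript
// Hypothetical approval-request expiry check (24h default TTL).
interface ApprovalRequest {
  id: string;
  createdAt: Date;
  status: "pending" | "approved" | "denied" | "expired";
}

const APPROVAL_TTL_MS = 24 * 60 * 60 * 1000; // 24h default

function effectiveStatus(req: ApprovalRequest, now: Date): ApprovalRequest["status"] {
  if (req.status === "pending" && now.getTime() - req.createdAt.getTime() > APPROVAL_TTL_MS) {
    return "expired"; // lapsed requests never execute
  }
  return req.status;
}
```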

4.9 Prompt Caching Architecture

SOUL.md and TOOLS.md structured as cacheable prompt prefixes. Cache read prices are 80-99% cheaper than standard input — direct margin multiplier. Design for maximum cache hit rates across agent conversations.
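A worked example of why cache hit rate is a margin multiplier. The prices below are illustrative placeholders at a 90% cache discount, not actual provider rates:

```typescript
// Per-turn input cost with and without a prefix cache hit.
// Prices are assumed for illustration only.
const inputPricePerMTok = 3.0;     // $ per million input tokens (assumed)
const cacheReadPricePerMTok = 0.3; // 90% cheaper cache reads (assumed)

function turnCost(prefixTok: number, freshTok: number, cacheHit: boolean): number {
  const prefixRate = cacheHit ? cacheReadPricePerMTok : inputPricePerMTok;
  return (prefixTok * prefixRate + freshTok * inputPricePerMTok) / 1_000_000;
}

// A 20k-token SOUL.md + TOOLS.md prefix with 1k tokens of fresh context:
const miss = turnCost(20_000, 1_000, false); // $0.063 per turn
const hit = turnCost(20_000, 1_000, true);   // $0.009 per turn
```

At these assumed rates a cache hit cuts the turn to one seventh of the miss cost, which is why the prefix must stay byte-stable across conversations (any edit to SOUL.md or TOOLS.md invalidates the cache).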

4.10 First-Hour Workflow Templates

Design 3-4 example workflow templates that demonstrate the architecture works end-to-end:

  • Freelancer first hour: Set up email, connect calendar, configure basic automation
  • Agency first hour: Configure client communication channels, set up project tracking
  • E-commerce first hour: Connect inventory management, set up customer chat, configure analytics
  • Consulting first hour: Set up scheduling, document management, client portal

These should prove your architecture supports real cross-tool workflows, not just individual tool access.

4.11 Interactive Demo (or Alternative)

The current plan proposes a "Bella's Bakery" sandbox — a shared VPS with fake business data where prospects can chat with the AI and watch it operate tools in real-time.

You may propose this approach or a better alternative. The requirement is: give prospects a hands-on experience of the AI workforce before they buy. Not a video — interactive.


5. What Already Exists

5.1 Hub (letsbe-hub)

  • Next.js + Prisma + PostgreSQL
  • ~15,000 LOC, 244 source files
  • 80+ API endpoints across auth, admin, customers, orders, servers, enterprise, staff, settings
  • Stripe integration (webhooks, checkout)
  • Netcup SCP API integration (OAuth2, server management)
  • Portainer integration (container management)
  • RBAC with 4 roles, 2FA, staff invitations
  • Order lifecycle: 8-state automation state machine
  • DNS verification workflow
  • Docker-based provisioning with SSE log streaming
  • AES-256-CBC credential encryption

5.2 Provisioner (letsbe-provisioner)

  • ~4,477 LOC Bash
  • 10-step server provisioning pipeline
  • 28+ Docker Compose tool stacks + 33 nginx configs
  • Template rendering with 50+ secrets generation
  • Backup system (18 PostgreSQL + 2 MySQL + 1 MongoDB + rclone remote + rotation)
  • Restore system (per-tool and full)
  • Zero tests — testing strategy is part of your proposal

5.3 Tool Stacks

  • 28 containerized applications across cloud/files, communication, project management, development, automation, CMS, ERP, analytics, design, security, monitoring, documents, chat
  • Each tool has its own Docker Compose file, nginx config, and provisioning template
  • See docs/technical/LetsBe_Biz_Tool_Catalog.md for full inventory with licensing

5.4 Deprecated Components (Do Not Build On)

  • Orchestrator (letsbe-orchestrator, ~7,500 LOC Python/FastAPI) — absorbed by OpenClaw + Safety Wrapper
  • Sysadmin Agent (letsbe-sysadmin-agent, ~7,600 LOC Python/asyncio) — capabilities become Safety Wrapper tools
  • MCP Browser (letsbe-mcp-browser, ~1,246 LOC Python/FastAPI) — replaced by OpenClaw native browser

5.5 Codebase Cleanup Required

n8n removal: n8n was removed from the tool stack due to its Sustainable Use License prohibiting managed service deployment. However, references persist in:

  • Playwright initial-setup scripts
  • Docker Compose stacks
  • Adapter/integration code
  • Various config files

Your plan must include removing all n8n references as a prerequisite task.


6. Infrastructure Context

6.1 Server Tiers

| Tier | Specs | Netcup Plan | Customer Price | Use Case |
|---|---|---|---|---|
| Lite (hidden) | 4c/8GB/256GB NVMe | RS 1000 G12 | €29/mo | 5-8 tools |
| Build (default) | 8c/16GB/512GB NVMe | RS 2000 G12 | €45/mo | 10-15 tools |
| Scale | 12c/32GB/1TB NVMe | RS 4000 G12 | €75/mo | 15-30 tools |
| Enterprise | 16c/64GB/2TB NVMe | RS 8000 G12 | €109/mo | Full stack |

6.2 Dual-Region

  • EU: Nuremberg, Germany (default for EU customers)
  • US: Manassas, Virginia (default for NA customers)
  • Same RS G12 hardware in both locations

6.3 Provider Strategy

  • Primary: Netcup RS G12 (pre-provisioned pool, 12-month contracts)
  • Overflow: Hetzner Cloud (on-demand, hourly billing)
  • Architecture must be provider-agnostic — Ansible works on any Debian VPS

6.4 Per-Tenant Resource Budget

Your architecture must fit within these constraints:

| Component | RAM Budget |
|---|---|
| OpenClaw + Safety Wrapper (in-process) | ~512MB (includes Chromium for browser tool) |
| Secrets proxy (if separate process) | ~64MB |
| nginx | ~64MB |
| Total LetsBe overhead | ~640MB |

The rest of server RAM is for the 25+ tool containers. On the Lite tier (8GB), that's ~7.3GB for tools — tight. Design accordingly.


7. Billing & Token Model

7.1 Structure

  • Flat monthly subscription (server tier)
  • Monthly token pool (configurable per tier — exact sizes TBD, architecture must support dynamic configuration)
  • Two model tiers:
    • Included: 5-6 cost-efficient models routed through OpenRouter. Pool consumption.
    • Premium: Top-tier models (Claude, GPT-5.2, Gemini Pro). Per-usage metered with sliding markup. Credit card required.
  • Overage billing when pool exhausted (Stripe)
  • Founding member program: 2× token allotment for 12 months (first 50-100 customers)

7.2 Sliding Markup

  • 25% markup on models under $1/M input tokens
  • Decreasing to 8% markup on models over $15/M input tokens
  • Configurable in Hub settings
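One possible implementation of the sliding markup, using linear interpolation between the two anchor points. The interpolation shape is an assumption (the actual curve is whatever gets configured in Hub settings):

```typescript
// Sliding markup: 25% on models <= $1/M input, tapering to 8% at >= $15/M.
// Linear interpolation between the anchors is an assumed curve shape.
function markupRate(inputPricePerMTok: number): number {
  const lowPrice = 1, highPrice = 15;        // $/M input tokens
  const lowMarkup = 0.25, highMarkup = 0.08; // anchor rates
  if (inputPricePerMTok <= lowPrice) return lowMarkup;
  if (inputPricePerMTok >= highPrice) return highMarkup;
  const t = (inputPricePerMTok - lowPrice) / (highPrice - lowPrice);
  return lowMarkup + t * (highMarkup - lowMarkup);
}
```

Whatever curve is chosen, it should be monotonically non-increasing in model price so that premium models never carry a higher percentage markup than cheap ones.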

7.3 What the Architecture Must Support

  • Per-agent, per-model token tracking (input, output, cache-read, cache-write)
  • Hourly bucket aggregation
  • Real-time pool tracking with usage alerts
  • Sub-agent token tracking (isolated from parent)
  • Web search/fetch usage counted in same pool
  • Overage billing via Stripe when pool exhausted
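The hourly-bucket requirement, sketched in TypeScript. The UsageEvent shape and the bucket key format are assumptions; the point is that raw per-call events collapse into one row per agent, model, and UTC hour:

```typescript
// Sketch of hourly-bucket aggregation for token metering (shapes assumed).
interface UsageEvent {
  agent: string;
  model: string;
  timestamp: Date;
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

type Bucket = { input: number; output: number; cacheRead: number; cacheWrite: number };

function hourKey(e: UsageEvent): string {
  const h = new Date(e.timestamp);
  h.setUTCMinutes(0, 0, 0); // truncate to the UTC hour
  return `${e.agent}|${e.model}|${h.toISOString()}`;
}

function aggregate(events: UsageEvent[]): Map<string, Bucket> {
  const buckets = new Map<string, Bucket>();
  for (const e of events) {
    const key = hourKey(e);
    const b = buckets.get(key) ?? { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 };
    b.input += e.input;
    b.output += e.output;
    b.cacheRead += e.cacheRead;
    b.cacheWrite += e.cacheWrite;
    buckets.set(key, b);
  }
  return buckets;
}
```

Tracking all four token kinds per bucket is what lets the billing layer apply different rates to cache reads versus fresh input.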

8. Agent Architecture

8.1 Default Agents

| Agent | Role | Tool Access Pattern |
|---|---|---|
| Dispatcher | Routes user messages, decomposes workflows, morning briefing | Inter-agent messaging only |
| IT Admin | Infrastructure, security, tool deployment | Shell, Docker, file ops, Portainer, broad tool access |
| Marketing | Content, campaigns, analytics | Ghost, Listmonk, Umami, browser, file read |
| Secretary | Communications, scheduling, files | Cal.com, Chatwoot, email, Nextcloud, file read |
| Sales | Leads, quotes, contracts | Chatwoot, Odoo, Cal.com, Documenso, file read |

8.2 Agent Configuration

  • SOUL.md — Personality, domain knowledge, behavioral rules, brand voice
  • Tool permissions — Allow/deny arrays per agent (OpenClaw native)
  • Model selection — Per-agent model choice (basic/advanced UX)
  • Autonomy level — Per-agent override of tenant default

8.3 Custom Agents

Users can create unlimited custom agents. Architecture must support dynamic agent creation, configuration, and removal without server restarts.


9. Operational Constraints

9.1 CI/CD

Source control is Gitea. Your CI/CD strategy should integrate with Gitea. Propose your pipeline approach.

9.2 Quality Bar

This platform is being built with AI coding tools (Claude Code and Codex). The quality bar is premium, not AI slop. Your architecture and implementation plan must account for:

  • Code review processes that catch AI-generated anti-patterns
  • Meaningful test coverage (not just coverage numbers — tests that actually validate behavior)
  • Documentation that a human developer can follow
  • Security-critical code (Safety Wrapper, secrets handling) gets extra scrutiny

9.3 Launch Target

Balance speed and quality. Target: ~3 months to founding member launch with core features. Security is non-negotiable. UX polish can iterate post-launch.


10. Reference Documents

Read these documents from the repo for full context:

| Document | Path | What It Contains |
|---|---|---|
| Technical Architecture v1.2 | docs/technical/LetsBe_Biz_Technical_Architecture.md | Most detailed reference. Full system specification, component details, all 35 architectural decisions, access control model, autonomy levels, tool integration strategy, skills system, memory architecture, inter-agent communication, provisioning pipeline. |
| Foundation Document v1.1 | docs/strategy/LetsBe_Biz_Foundation_Document.md | Business strategy, product vision, pricing, competitive landscape, go-to-market. |
| Product Vision v1.0 | docs/strategy/LetsBe_Biz_Product_Vision.md | Customer personas, product principles, customer journey, moat analysis, three-year vision. |
| Pricing Model v2.2 | docs/strategy/LetsBe_Biz_Pricing_Model.md | Per-tier cost breakdown, token cost modeling, founding member impact, unit economics. |
| Tool Catalog v2.2 | docs/technical/LetsBe_Biz_Tool_Catalog.md | Full tool inventory with licensing, resource requirements, expansion candidates. |
| Infrastructure Runbook | docs/technical/LetsBe_Biz_Infrastructure_Runbook.md | Operational procedures, server management, backup/restore. |
| Repo Analysis | docs/technical/LetsBe_Repo_Analysis.md | Codebase audit — what exists, what's deprecated, what needs cleanup. |
| Open Source Compliance Check | docs/legal/LetsBe_Biz_Open_Source_Compliance_Check.md | License compliance audit with action items. |
| Competitive Landscape | docs/strategy/LetsBe_Biz_Competitive_Landscape.md | Competitor analysis and positioning. |

Also examine the actual codebase: Hub source, Provisioner scripts, Docker Compose stacks, nginx configs.


11. What We Want to Compare

When Matt reviews both proposals, he'll be evaluating:

  1. Architectural clarity — Is the system well-decomposed? Are interfaces clean? Can each component evolve independently?
  2. Security rigor — Does the secrets-never-leave-the-server guarantee hold under all scenarios? Are there edge cases the architecture misses?
  3. Pragmatic trade-offs — Does the plan balance "do it right" with "ship it"? Are scope cuts identified if timeline pressure hits?
  4. Build order intelligence — Is the critical path identified? Can components be developed in parallel? Are dependencies mapped correctly?
  5. Testing strategy — Does it inspire confidence that security-critical code actually works? Not just coverage numbers.
  6. Innovation — Did you find a better way to solve a problem than what the existing Technical Architecture proposes? Bonus points for improvements we didn't think of.
  7. Honesty about risks — What could go wrong? What are the unknowns? Where might the timeline slip?

12. Submission

Produce your architecture plan as a set of documents (markdown preferred) with diagrams. Include everything listed in Section 1 (deliverables). Be thorough but practical — this is a real product being built, not an academic exercise.


End of Brief