04. Detailed Implementation Plan And Dependency Graph

1. Planning Assumptions

  • Target launch window: 12 weeks.
  • Team model assumed for schedule below:
    • 2 backend/platform engineers
    • 1 mobile/fullstack engineer
    • 1 DevOps/SRE engineer
    • 1 QA/security engineer (shared)
  • Existing Hub codebase is retained and extended.

2. Work Breakdown Structure (WBS)

Phase 0: Prerequisite Cleanup And Hardening (Week 1)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P0-1 | Remove all n8n code references (Hub, provisioner, stacks, scripts, tests) | 3d | - | rg -n n8n returns no matches in production code paths; CI policy check added |
| P0-2 | Remove deprecated deploy targets (orchestrator, sysadmin) from active provisioning | 2d | P0-1 | No new orders can deploy deprecated services |
| P0-3 | Fix plaintext provisioning secret leak (jobs/*/config.json) | 2d | P0-1 | No root/server password persisted in plaintext job files |
| P0-4 | Baseline security regression tests for cleanup changes | 1d | P0-2, P0-3 | Green CI + sign-off |
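
The CI policy check named in the P0-1 exit criteria could be sketched roughly as follows. This is a minimal Python stand-in for the rg -n n8n scan, not the project's actual check: the function name find_violations and the EXCLUDED_DIRS set are hypothetical, and the real exclusion list would depend on what counts as a "production code path" in the repository.

```python
import re
from pathlib import Path

# Hypothetical policy: any mention of "n8n" in a scanned file fails the check.
FORBIDDEN = re.compile(r"n8n", re.IGNORECASE)
# Assumed exclusions for illustration; adjust to the real repository layout.
EXCLUDED_DIRS = {".git", "node_modules", "docs"}


def find_violations(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for every match."""
    violations = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if any(part in EXCLUDED_DIRS for part in rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if FORBIDDEN.search(line):
                violations.append((rel.as_posix(), lineno, line.strip()))
    return violations
```

A CI job would call find_violations on the checkout root and exit non-zero if the returned list is not empty.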

Phase 1: Safety Substrate (Weeks 2-3)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P1-1 | Build encrypted secrets vault (SQLite schema + key management) | 3d | P0-4 | CRUD, rotation, audit log implemented |
| P1-2 | Implement egress redaction proxy (registry + regex + entropy layers) | 4d | P1-1 | Redaction test suite passes with seeded secrets |
| P1-3 | Implement command classification engine (5-tier + external gate) | 3d | P1-1 | Deterministic policy tests pass |
| P1-4 | Implement approval state cache + retry logic (tenant-local) | 2d | P1-3 | Approval resilience tests pass |
| P1-5 | OpenClaw plugin skeleton with hooks + telemetry envelope | 3d | P1-2, P1-3 | Hook smoke tests green against pinned OpenClaw tag |
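
To make the three redaction layers in P1-2 concrete, here is a minimal sketch of how registry, regex, and entropy passes might be chained. Everything specific in it is an assumption for illustration: the two regex patterns, the 20-character token rule, the 4.0-bit entropy threshold, and the function names are not taken from the plan.

```python
import math
import re
from collections import Counter

REDACTED = "[REDACTED]"

# Layer 2 patterns: illustrative assumptions, not the plan's actual rule set.
PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:password|token|api[_-]?key)\s*[=:]\s*\S+"),
]

# Layer 3 candidates: long runs of characters typical of keys and tokens.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redact(text: str, registry: set[str], entropy_threshold: float = 4.0) -> str:
    # Layer 1: exact matches against secrets known to the vault.
    # Longest first, so a secret containing another is masked whole.
    for secret in sorted(registry, key=len, reverse=True):
        if secret:
            text = text.replace(secret, REDACTED)
    # Layer 2: well-known secret shapes.
    for pattern in PATTERNS:
        text = pattern.sub(REDACTED, text)

    # Layer 3: mask long tokens that look random.
    def mask(match: re.Match) -> str:
        token = match.group(0)
        return REDACTED if shannon_entropy(token) >= entropy_threshold else token

    return TOKEN.sub(mask, text)
```

The ordering is a design choice in this sketch: the registry pass is exact and cheap, so it runs first, while the entropy pass is the most prone to false positives and only sees what the first two layers left behind.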

Phase 2: Hub Tenant APIs + Data Model (Weeks 3-4)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P2-1 | Add Prisma models: approval queue, usage buckets, agent policy, comms unlocks | 2d | P0-4 | Migration applied in staging |
| P2-2 | Implement tenant register/heartbeat/config APIs | 3d | P2-1 | Contract tests pass |
| P2-3 | Implement tenant approval-request APIs + customer approval endpoints | 3d | P2-1 | End-to-end approval cycle works |
| P2-4 | Implement usage ingest + billing period updates | 3d | P2-1 | Usage events visible in dashboard |
| P2-5 | Add push notification pipeline for approvals | 2d | P2-3 | Mobile push test path validated |

Phase 3: Safety Wrapper Execution Layer (Weeks 4-6)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P3-1 | Port shell/docker/file/env guarded executors from sysadmin patterns | 5d | P1-5 | Security unit tests pass |
| P3-2 | Implement tool registry loader + SECRET_REF resolver | 3d | P1-1, P3-1 | Tool calls run without raw secret exposure |
| P3-3 | Implement core adapters (Chatwoot, Ghost, Nextcloud, Cal.com, Odoo, Listmonk) | 6d | P3-2 | Adapter contract tests pass |
| P3-4 | Implement metering capture and hourly bucket compaction | 2d | P1-5, P2-4 | Buckets reliably posted to Hub |
| P3-5 | Add subagent budget/depth limits and policy enforcement | 2d | P1-5 | Policy tests and abuse tests pass |
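
The hourly bucket compaction in P3-4 can be pictured as a group-by over raw usage events. The sketch below is one possible shape, assuming hypothetical event fields (timestamp, model, input_tokens, output_tokens) and timezone-aware timestamps; the real bucket schema is whatever the P2-1 Prisma models define.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageEvent:
    # Field names are assumptions for illustration only.
    timestamp: datetime  # must be timezone-aware
    model: str
    input_tokens: int
    output_tokens: int


def compact_hourly(events: list[UsageEvent]) -> list[dict]:
    """Sum events into one bucket per (UTC hour, model)."""
    buckets = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0, "calls": 0})
    for event in events:
        # Truncate the timestamp to the start of its UTC hour.
        hour = event.timestamp.astimezone(timezone.utc).replace(
            minute=0, second=0, microsecond=0
        )
        bucket = buckets[(hour, event.model)]
        bucket["input_tokens"] += event.input_tokens
        bucket["output_tokens"] += event.output_tokens
        bucket["calls"] += 1
    return [
        {"hour": hour.isoformat(), "model": model, **totals}
        for (hour, model), totals in sorted(buckets.items())
    ]
```

Posting compacted buckets rather than raw events keeps the Hub ingest payload small, at the cost of losing per-call detail on the Hub side.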

Phase 4: Provisioner Retool (Weeks 5-7)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P4-1 | Add OpenClaw + Safety deployment steps to provisioner | 4d | P3-2 | Fresh VPS comes online with heartbeat |
| P4-2 | Remove legacy stack templates and nginx configs from default deployment path | 2d | P0-2 | Deprecated stacks excluded from installs |
| P4-3 | Generate and deploy tenant configs/policies during provisioning | 3d | P2-2, P4-1 | Config sync succeeds on first boot |
| P4-4 | Migrate initial browser setup scenarios to OpenClaw browser tool | 4d | P4-1 | 8 scenarios replaced or retired |
| P4-5 | Add idempotent recovery checkpoints per provisioning step | 2d | P4-1 | Retry from failed step validated |
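
The "retry from failed step" behaviour in P4-5 might be sketched as a runner that records each completed step and skips it on the next attempt. The checkpoint file format (a JSON list of step names) and the function name run_with_checkpoints are assumptions made for this example, not the provisioner's actual design.

```python
import json
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[], None]]


def run_with_checkpoints(steps: list[Step], checkpoint_file: Path) -> list[str]:
    """Run steps in order, skipping any already recorded as done.

    Returns the names of the steps executed in this invocation.
    """
    done: set[str] = set()
    if checkpoint_file.exists():
        done = set(json.loads(checkpoint_file.read_text(encoding="utf-8")))

    executed = []
    for name, action in steps:
        if name in done:
            continue  # completed in an earlier attempt
        action()  # if this raises, no checkpoint is written for the step
        done.add(name)
        # Persist after every step so a crash loses at most one step of work.
        checkpoint_file.write_text(json.dumps(sorted(done)), encoding="utf-8")
        executed.append(name)
    return executed
```

Skipping only helps if each step is itself safe to re-run, since a step can fail after doing part of its work; the checkpoint covers completed steps, not partial ones.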

Phase 5: Customer Interfaces (Weeks 6-9)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P5-1 | Customer web portal for approvals, agent settings, usage | 5d | P2-3, P2-4 | Beta usable on staging |
| P5-2 | Mobile app MVP (chat, approvals, health, usage) | 8d | P2-5, P5-1 | TestFlight/internal distribution ready |
| P5-3 | Public onboarding website + classifier + bundle calculator | 6d | P2-1 | Stripe flow works end-to-end |
| P5-4 | WhatsApp/Telegram fallback relay (minimal) | 3d | P2-3 | Approval fallback path works |

Phase 6: Workflow Templates + Demo Experience (Weeks 8-10)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P6-1 | Implement 4 first-hour workflow templates as auditable blueprints | 5d | P3-3, P5-1 | Templates executable end-to-end |
| P6-2 | Build interactive demo tenant pool manager (TTL snapshots) | 4d | P4-1, P5-3 | Demo session provisioning <5 min |
| P6-3 | Add product telemetry for template completion and demo conversion | 2d | P6-1, P6-2 | Metrics dashboards live |

Phase 7: Quality, Hardening, Launch (Weeks 10-12)

| ID | Task | Duration | Depends On | Exit Criteria |
| --- | --- | --- | --- | --- |
| P7-1 | Full security test suite (redaction, gating, injection, auth) | 4d | P3-5, P4-5 | Critical findings resolved |
| P7-2 | Load, soak, and chaos tests on staging fleet | 3d | P6-1 | SLO gates met |
| P7-3 | Canary launch (5% -> 25% -> 100%) with rollback drills | 4d | P7-1, P7-2 | Canary metrics stable |
| P7-4 | Launch readiness review + runbook finalization | 2d | P7-3 | Founding member launch sign-off |

3. Dependency Graph

graph TD
  P0_1[P0-1 n8n cleanup] --> P0_2[P0-2 deprecated deploy removal]
  P0_1 --> P0_3[P0-3 plaintext secret fix]
  P0_2 --> P0_4[P0-4 baseline security tests]
  P0_3 --> P0_4

  P0_4 --> P1_1[P1-1 vault]
  P1_1 --> P1_2[P1-2 egress proxy]
  P1_1 --> P1_3[P1-3 classification]
  P1_3 --> P1_4[P1-4 approval cache]
  P1_2 --> P1_5[P1-5 openclaw plugin skeleton]
  P1_3 --> P1_5

  P0_4 --> P2_1[P2-1 hub prisma models]
  P2_1 --> P2_2[P2-2 tenant register/heartbeat/config]
  P2_1 --> P2_3[P2-3 approval APIs]
  P2_1 --> P2_4[P2-4 usage ingest]
  P2_3 --> P2_5[P2-5 push notifications]

  P1_5 --> P3_1[P3-1 guarded executors]
  P1_1 --> P3_2[P3-2 tool registry + secret ref]
  P3_1 --> P3_2
  P3_2 --> P3_3[P3-3 tool adapters]
  P1_5 --> P3_4[P3-4 metering]
  P2_4 --> P3_4
  P1_5 --> P3_5[P3-5 subagent controls]

  P3_2 --> P4_1[P4-1 provisioner openclaw+safety]
  P0_2 --> P4_2[P4-2 legacy stack template removal]
  P2_2 --> P4_3[P4-3 config generation]
  P4_1 --> P4_3
  P4_1 --> P4_4[P4-4 browser scenario migration]
  P4_1 --> P4_5[P4-5 idempotent checkpoints]

  P2_3 --> P5_1[P5-1 customer portal]
  P2_4 --> P5_1
  P2_5 --> P5_2[P5-2 mobile app MVP]
  P5_1 --> P5_2
  P2_1 --> P5_3[P5-3 onboarding website]
  P2_3 --> P5_4[P5-4 whatsapp/telegram fallback]

  P3_3 --> P6_1[P6-1 first-hour templates]
  P5_1 --> P6_1
  P4_1 --> P6_2[P6-2 interactive demo pool]
  P5_3 --> P6_2
  P6_1 --> P6_3[P6-3 template/demo telemetry]
  P6_2 --> P6_3

  P3_5 --> P7_1[P7-1 full security suite]
  P4_5 --> P7_1
  P6_1 --> P7_2[P7-2 load/soak/chaos]
  P7_1 --> P7_3[P7-3 canary launch]
  P7_2 --> P7_3
  P7_3 --> P7_4[P7-4 launch readiness]
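
As a cross-check on the graph, the longest dependency chain can be computed directly from the WBS durations. The sketch below transcribes the durations (in working days) and "Depends On" columns from section 2 and walks the graph for the longest path. It assumes unlimited parallel staffing and ignores the phase week windows, so the result is a lower bound on elapsed time rather than a schedule; the function name critical_path is illustrative.

```python
# (duration in working days, dependencies), transcribed from the WBS tables.
TASKS = {
    "P0-1": (3, []),
    "P0-2": (2, ["P0-1"]),
    "P0-3": (2, ["P0-1"]),
    "P0-4": (1, ["P0-2", "P0-3"]),
    "P1-1": (3, ["P0-4"]),
    "P1-2": (4, ["P1-1"]),
    "P1-3": (3, ["P1-1"]),
    "P1-4": (2, ["P1-3"]),
    "P1-5": (3, ["P1-2", "P1-3"]),
    "P2-1": (2, ["P0-4"]),
    "P2-2": (3, ["P2-1"]),
    "P2-3": (3, ["P2-1"]),
    "P2-4": (3, ["P2-1"]),
    "P2-5": (2, ["P2-3"]),
    "P3-1": (5, ["P1-5"]),
    "P3-2": (3, ["P1-1", "P3-1"]),
    "P3-3": (6, ["P3-2"]),
    "P3-4": (2, ["P1-5", "P2-4"]),
    "P3-5": (2, ["P1-5"]),
    "P4-1": (4, ["P3-2"]),
    "P4-2": (2, ["P0-2"]),
    "P4-3": (3, ["P2-2", "P4-1"]),
    "P4-4": (4, ["P4-1"]),
    "P4-5": (2, ["P4-1"]),
    "P5-1": (5, ["P2-3", "P2-4"]),
    "P5-2": (8, ["P2-5", "P5-1"]),
    "P5-3": (6, ["P2-1"]),
    "P5-4": (3, ["P2-3"]),
    "P6-1": (5, ["P3-3", "P5-1"]),
    "P6-2": (4, ["P4-1", "P5-3"]),
    "P6-3": (2, ["P6-1", "P6-2"]),
    "P7-1": (4, ["P3-5", "P4-5"]),
    "P7-2": (3, ["P6-1"]),
    "P7-3": (4, ["P7-1", "P7-2"]),
    "P7-4": (2, ["P7-3"]),
}


def critical_path(tasks: dict) -> tuple[int, list[str]]:
    """Return (total days, task IDs) for the longest dependency chain."""
    finish: dict[str, int] = {}
    pred: dict[str, str | None] = {}

    def earliest_finish(task_id: str) -> int:
        if task_id not in finish:
            duration, deps = tasks[task_id]
            start, pred[task_id] = 0, None
            for dep in deps:
                dep_finish = earliest_finish(dep)
                if dep_finish > start:  # latest-finishing dependency gates the start
                    start, pred[task_id] = dep_finish, dep
            finish[task_id] = start + duration
        return finish[task_id]

    end = max(tasks, key=earliest_finish)
    path, node = [], end
    while node is not None:
        path.append(node)
        node = pred[node]
    return finish[end], path[::-1]
```

With the durations as listed, critical_path(TASKS) reports 44 working days, and the chain it returns runs through P3-3, P6-1, and P7-2 before reaching P7-3 and P7-4. That is just under nine five-day weeks, before any staffing contention is taken into account.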

4. Critical Path

Primary critical chain:

P0 cleanup -> P1 safety substrate -> P3 execution layer -> P4 provisioner retool -> P7 hardening/canary

Secondary critical chain:

P2 Hub APIs -> P5 mobile approvals -> P7 canary

5. Parallelization Strategy

To meet the 12-week launch window, run these three tracks in parallel after Week 3:

  • Track A: Safety Wrapper + adapters (P3)
  • Track B: Provisioner retool (P4)
  • Track C: Customer interfaces (P5)

6. Definition Of Done (Program-Level)

The launch gate passes only when all of the following are true:

  • secrets-never-leave-server invariant passes automated red-team test suite
  • gating matrix behaves as specified for every combination of the 5 command classes and 3 autonomy levels
  • external comms gate enforces lock-by-default at all autonomy levels
  • provisioning succeeds >=90% first attempt and >=99% with retries
  • approval path works across web + mobile push with audit completeness
  • usage metering reconciles with provider usage within 1% variance
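
The two quantitative criteria above (provisioning success rates and metering variance) lend themselves to automated checks. The following is a minimal sketch under stated assumptions: the function names and argument shapes are hypothetical, "with retries" is read as eventual success over all provisioning jobs, and variance is read as relative difference against the provider's figure.

```python
def provisioning_gate(first_attempt_ok: int, eventual_ok: int, total: int) -> bool:
    """True when >=90% of jobs succeed first time and >=99% succeed with retries."""
    if total <= 0:
        return False  # no data is treated as a failed gate
    return first_attempt_ok / total >= 0.90 and eventual_ok / total >= 0.99


def metering_gate(metered_units: float, provider_units: float) -> bool:
    """True when metered usage is within 1% of the provider-reported usage."""
    if provider_units == 0:
        return metered_units == 0
    return abs(metered_units - provider_units) / provider_units <= 0.01
```

For example, 90 first-attempt and 99 eventual successes out of 100 jobs pass the provisioning gate, while 1011 metered units against 1000 provider units fail the metering gate.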