# LetsBe Biz — Timeline & Milestones
**Date:** February 27, 2026
**Team:** Claude Opus 4.6 Architecture Team
**Document:** 05 of 09
**Status:** Proposal — Competing with independent team
---
## Table of Contents
1. [Timeline Overview](#1-timeline-overview)
2. [Week-by-Week Gantt Chart](#2-week-by-week-gantt-chart)
3. [Milestone Definitions](#3-milestone-definitions)
4. [Team Sizing & Roles](#4-team-sizing--roles)
5. [Weekly Deliverables](#5-weekly-deliverables)
6. [Buffer Analysis](#6-buffer-analysis)
7. [Go/No-Go Decision Points](#7-gono-go-decision-points)
8. [Post-Launch Roadmap](#8-post-launch-roadmap)
---
## 1. Timeline Overview
**Target:** Founding member launch in ~16 weeks (~4 months)
**Launch definition:** First 10 paying customers onboarded, using AI workforce via mobile app, with secrets redaction and command gating enforced.
```
                 MONTH 1             MONTH 2             MONTH 3             MONTH 4
                 Wk1  Wk2  Wk3  Wk4  Wk5  Wk6  Wk7  Wk8  Wk9  Wk10 Wk11 Wk12 Wk13 Wk14 Wk15 Wk16
                ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
Safety Wrapper  │████│████│████│████│    │    │    │    │    │    │    │    │    │    │    │    │
Secrets Proxy   │████│████│████│    │    │    │    │    │    │    │    │    │    │    │    │    │
Hub Backend     │    │    │██░░│████│████│████│████│████│████│████│    │    │    │    │    │    │
Tool Adapters   │    │    │    │    │    │    │████│████│    │    │    │    │████│    │    │    │
Mobile App      │    │    │    │    │    │    │    │    │████│████│████│████│    │████│    │    │
Website         │    │    │    │    │    │    │    │    │    │    │████│████│    │████│    │    │
Provisioner     │    │    │    │    │    │    │    │    │    │    │████│████│    │    │    │    │
Integration     │    │    │    │    │    │    │    │    │    │    │    │████│    │    │████│    │
Security Audit  │    │    │    │    │    │    │    │    │    │    │    │    │████│    │    │    │
Polish & Launch │    │    │    │    │    │    │    │    │    │    │    │    │    │████│████│████│
                └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
                 M1──────────────►M2─────────────────►M3─────────────────►M4──────────────►
Legend: ████ = primary work   ██░░ = ramp-up/planning   ░░░░ = testing/maintenance
        M1-M4 = Milestones
```
---
## 2. Week-by-Week Gantt Chart
### Phase 1 — Foundation (Weeks 1-4)
| Week | Stream A (Safety Wrapper) | Stream B (Secrets Proxy) | Stream C (Hub) | Stream D (Frontend) | Stream E (DevOps) |
|------|--------------------------|--------------------------|----------------|--------------------|--------------------|
| **1** | Monorepo setup; SW skeleton; SQLite schema; Secrets registry | Proxy skeleton; Layer 1 Aho-Corasick start | Prisma model planning; ServerConnection updates | Design system selection; wireframes | Turborepo CI; Docker base images |
| **2** | Command classification engine; Shell executor; Docker executor; File/Env executors | Layer 1 complete; Layer 2 regex; Layer 3 entropy; Layer 4 JSON keys | Token usage models; Billing period models | Wireframes: mobile chat, approvals, dashboard | Gitea pipeline: lint + test + build |
| **3** | P0 tests: classification (100+ cases) | P0 tests: redaction (TDD); Performance benchmarks (<10ms) | Tenant API design; Hub endpoint stubs | Website landing page design | OpenClaw Docker image build; Dev env setup |
| **4** | Autonomy engine; Approval queue; SECRET_REF injection; OpenClaw integration | OpenClaw LLM proxy integration; Integration tests | Hub ↔ SW protocol endpoint implementation starts | UI component library setup | Staging server provisioning |
**Phase 1 Exit: Milestone M1 — "Core Security Working"**
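The week 2 plan lists an entropy check as Layer 3 of the Secrets Proxy. As a rough illustration of what such a layer typically does, here is a minimal Python sketch of Shannon-entropy scoring. The function names, the minimum length, and the threshold are all hypothetical and are not taken from the proposal; the real layer may be built quite differently.

```python
import math
from collections import Counter

# Hypothetical tuning values for illustration only.
MIN_LENGTH = 20
ENTROPY_THRESHOLD = 4.0  # bits per character


def shannon_entropy(token: str) -> float:
    """Return the Shannon entropy of a string in bits per character."""
    if not token:
        return 0.0
    total = len(token)
    counts = Counter(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str) -> bool:
    """Flag long tokens whose character distribution looks random."""
    return len(token) >= MIN_LENGTH and shannon_entropy(token) >= ENTROPY_THRESHOLD


print(looks_like_secret("hello world hello world"))           # natural-language text
print(looks_like_secret("abcdefghijklmnopqrstuvwxyz012345"))  # 32 distinct characters
```

An entropy check of this kind is usually a fallback for secrets that the exact-match and regex layers do not know about, which is why the threshold needs tuning against false positives.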
### Phase 2 — Integration (Weeks 5-8)
| Week | Stream A (Safety Wrapper) | Stream B (Secrets + Tools) | Stream C (Hub) | Stream D (Frontend) | Stream E (DevOps) |
|------|--------------------------|---------------------------|----------------|--------------------|--------------------|
| **5** | Hub client: registration, heartbeat, config sync | Secrets API: provide/reveal/generate/rotate | /tenant/register, /tenant/heartbeat, /tenant/config endpoints | Website: onboarding flow pages 1-5 | Cheat sheet: Portainer |
| **6** | Token metering capture; hourly buckets | Secrets integration tests; Side-channel protocol | Token billing pipeline; Stripe Billing Meters; Founding member logic | Website: AI classifier (Gemini Flash); Resource calculator | Cheat sheets: Nextcloud, Chatwoot |
| **7** | Approval request routing; Config sync receiver | Tool registry generator; Master skill | Approval queue CRUD; AgentConfig model | Website: payment flow; provisioning status | Cheat sheets: Ghost, Cal.com, Stalwart |
| **8** | Integration tests: Hub ↔ SW round-trip | Tool integration tests (6 P0 tools) | Push notification skeleton; Config versioning | Mobile: auth screens (login, token storage) | CI: integration test pipeline |
**Phase 2 Exit: Milestone M2 — "Backend Pipeline Working"**
### Phase 3 — Customer Experience (Weeks 9-12)
| Week | Stream A (Safety Wrapper) | Stream B (Provisioner) | Stream C (Hub) | Stream D (Mobile + Frontend) | Stream E (DevOps) |
|------|--------------------------|------------------------|----------------|-----------------------------|--------------------|
| **9** | Monitoring endpoints; Health checks | Provisioner: step 10 rewrite (OpenClaw + SW) | Customer portal API (dashboard, agents, usage) | Mobile: chat with SSE streaming; agent selector | n8n cleanup (7 files) |
| **10** | Performance optimization; Caching tuning | Provisioner: config.json cleanup; Secret seeding | Chat relay service; WebSocket endpoint | Mobile: push notifications; approval cards | Provisioner: Playwright migration (7 scenarios) |
| **11** | Edge case hardening | Provisioner: Docker Compose for LetsBe stack | Customer portal: billing, tools, settings endpoints | Mobile: dashboard, usage, settings | Staging: full stack deployment |
| **12** | Bug fixes from integration | Integration test on real VPS | E2E test: payment → provision → AI ready | Mobile: secrets side-channel; polish | E2E test verification |
**Phase 3 Exit: Milestone M3 — "End-to-End Journey Working"**
### Phase 4 — Polish & Launch (Weeks 13-16)
| Week | Stream A (Security) | Stream B (Tools + Demo) | Stream C (Hub) | Stream D (Mobile + Frontend) | Stream E (DevOps) |
|------|--------------------|-----------------------|----------------|-----------------------------|--------------------|
| **13** | Adversarial security audit: secrets, classification, injection, SSRF | P1 cheat sheets (Odoo, Listmonk, NocoDB, Umami, Keycloak, Activepieces) | Security fixes from audit | Mobile: UI polish, error handling, offline | Channel config: WhatsApp + Telegram |
| **14** | Prompt caching optimization; Token efficiency audit | First-hour templates: Freelancer, Agency | Performance tuning; Usage alert system | Website: remaining pages, polish | Provisioner integration tests |
| **15** | Fix critical/high issues from dogfooding | Interactive demo: ephemeral containers | Deploy to staging; Dogfooding begins | Mobile: beta testing (internal) | Monitoring dashboard; Backup monitoring |
| **16** | Final security verification | Demo polish; Fix staging issues | Production deployment | App Store / Play Store prep | Founding member onboarding (10 customers) |
**Phase 4 Exit: Milestone M4 — "Founding Member Launch"**
---
## 3. Milestone Definitions
### M1 — Core Security Working (End of Week 4)
| Criterion | Verification |
|-----------|-------------|
| Secrets Proxy redacts all known patterns | P0 test suite: 100% pass |
| Redaction latency < 10ms with 50+ secrets | Benchmark test |
| Command classifier handles all 5 tiers correctly | P0 test suite: 100+ cases |
| Autonomy engine gates correctly at levels 1/2/3 | Test suite: all combinations |
| OpenClaw routes tool calls through Safety Wrapper | Integration test: tool call → execution → audit |
| OpenClaw routes LLM calls through Secrets Proxy | Integration test: LLM call → redacted outbound |
| SECRET_REF injection resolves credentials | Integration test: placeholder → real value |
| Audit log captures every tool call | Log verification test |
**Decision gate:** If M1 slips by > 1 week, escalate. Safety Wrapper is the critical path — nothing downstream works without it.
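The "redaction latency < 10ms with 50+ secrets" criterion could be checked with a benchmark along these lines. This is only a sketch: it uses a single compiled regex alternation as a stand-in for the Aho-Corasick matcher named in the plan, and the `[REDACTED]` placeholder, payload shape, and helper names are assumptions.

```python
import re
import secrets
import time


def build_redactor(known_secrets):
    """Compile one alternation over all known secrets (stand-in for Aho-Corasick)."""
    # Longest first, so a secret that contains another is replaced whole.
    ordered = sorted(known_secrets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    return lambda text: pattern.sub("[REDACTED]", text)


def median_latency_ms(redact, payload, runs=200):
    """Time repeated redaction calls and return the median in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        redact(payload)
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return samples[len(samples) // 2]


# 50 random stand-in secrets, matching the "50+ secrets" wording of the criterion.
known = [secrets.token_urlsafe(24) for _ in range(50)]
redact = build_redactor(known)
payload = "log line " * 500 + known[7] + " trailing text " + known[42]

clean = redact(payload)
latency = median_latency_ms(redact, payload)
print(f"median redaction latency: {latency:.3f} ms")
```

A real benchmark would also want to report a tail percentile and run against representative LLM payloads rather than synthetic log lines.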
### M2 — Backend Pipeline Working (End of Week 8)
| Criterion | Verification |
|-----------|-------------|
| Safety Wrapper registers with Hub | Protocol test: register → receive API key |
| Heartbeat maintains connection | 24h soak test: heartbeat + reconnect |
| Token usage flows to billing | Pipeline test: usage → bucket → billing period |
| Stripe overage billing triggers | Stripe test mode: pool exhaustion → invoice |
| 6 P0 tool cheat sheets work | Agent successfully calls each tool's API |
| Approval round-trip completes | Test: Red command → Hub → approve → execute |
| Config sync propagates | Test: change agent config in Hub → verify on SW |
**Decision gate:** If M2 slips, assess whether to cut overage billing and/or founding member logic from launch scope (both in the "scope cut" table).
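The "usage → bucket → billing period" pipeline test can be pictured with a small sketch. The event shape, the pool size, and the function names below are hypothetical; the proposal does not define the actual schema.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def bucket_usage(events):
    """Aggregate (timestamp, tokens) events into hourly totals."""
    buckets = defaultdict(int)
    for ts, tokens in events:
        buckets[hour_bucket(ts)] += tokens
    return dict(buckets)


def period_usage(buckets, start, end):
    """Sum the hourly buckets that fall inside the half-open period [start, end)."""
    return sum(tokens for hour, tokens in buckets.items() if start <= hour < end)


def overage(total_tokens, pool_size):
    """Tokens consumed beyond the included pool (never negative)."""
    return max(0, total_tokens - pool_size)


UTC = timezone.utc
events = [
    (datetime(2026, 3, 2, 9, 5, tzinfo=UTC), 1200),
    (datetime(2026, 3, 2, 9, 40, tzinfo=UTC), 800),
    (datetime(2026, 3, 2, 10, 1, tzinfo=UTC), 500),
    (datetime(2026, 4, 1, 0, 0, tzinfo=UTC), 300),  # belongs to the next period
]
buckets = bucket_usage(events)
march = period_usage(buckets, datetime(2026, 3, 1, tzinfo=UTC), datetime(2026, 4, 1, tzinfo=UTC))
POOL_SIZE = 2000  # hypothetical included-token pool
print(march, overage(march, POOL_SIZE))
```

Using a half-open period keeps an event at exactly midnight on the boundary from being billed twice.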
### M3 — End-to-End Journey Working (End of Week 12)
| Criterion | Verification |
|-----------|-------------|
| Website: signup → payment works | Stripe test mode end-to-end |
| Provisioner deploys new stack | Full provisioning on staging VPS |
| Mobile: login → chat → approve works | Device testing (iOS + Android) |
| Chat relay: App → Hub → SW → OpenClaw → response | Full round-trip with streaming |
| Push notifications for approvals | Notification received on test device |
| n8n references fully removed | `grep -r "n8n" provisioner/` returns nothing |
| config.json cleanup verified | Post-provisioning: no plaintext passwords |
**Decision gate:** If M3 slips by > 1 week, defer interactive demo, P1 tool adapters, and WhatsApp/Telegram to post-launch. Focus all effort on core launch requirements.
### M4 — Founding Member Launch (End of Week 16)
| Criterion | Verification |
|-----------|-------------|
| Security audit: no critical findings | Audit report reviewed and signed off |
| 10 founding members onboarded | Active users with functional AI workforce |
| Performance targets met | Redaction <10ms, tool calls <5s p95, heartbeat stable |
| First-hour templates prove cross-tool workflows | At least 2 templates working end-to-end |
| Monitoring and alerting operational | Hub health + tenant health dashboards live |
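The "tool calls <5s p95" target in the table above needs an agreed percentile definition before it can be verified. The sketch below uses the nearest-rank method, which is one common choice and an assumption here; the sample latencies are made up.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_target(samples, pct, limit):
    """True when the chosen percentile is strictly below the limit."""
    return percentile(samples, pct) < limit


# Hypothetical tool-call latencies in seconds: 19 fast calls and one slow outlier.
latencies = [1.0] * 19 + [9.0]
print(percentile(latencies, 95), meets_target(latencies, 95, 5.0))
```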
---
## 4. Team Sizing & Roles
### Recommended: 4-5 Engineers
| Role | Focus Area | Skills Required | Stream |
|------|-----------|-----------------|--------|
| **Safety Wrapper Lead** (Senior) | Safety Wrapper + Secrets Proxy + OpenClaw integration | Node.js, security, cryptography, SQLite | A + B |
| **Hub Backend Engineer** | Hub API, billing, tenant protocol, chat relay | TypeScript, Next.js, Prisma, Stripe | C |
| **Frontend/Mobile Engineer** | Mobile app (Expo), website (Next.js), design system | React Native, Expo, Next.js, Tailwind | D |
| **DevOps/Provisioner Engineer** | CI/CD, Docker, provisioning, tool cheat sheets, staging | Bash, Docker, Gitea Actions, Ansible concepts | E |
| **QA/Integration Engineer** (part-time or shared) | Testing, security audit, E2E verification | Testing frameworks, security testing | Cross-stream |
### Minimum Viable: 3 Engineers
| Role | Covers | Trade-off |
|------|--------|-----------|
| **Full-Stack Security** (Senior) | Streams A + B | Secrets Proxy work starts week 2 instead of week 1 |
| **Hub + Backend** | Stream C | No changes; same workload |
| **Frontend + DevOps** | Streams D + E | Website and mobile overlap handled sequentially; DevOps work spread across evenings/gaps |
### Critical Hire: Safety Wrapper Lead
The Safety Wrapper Lead is the most critical hire. This person:
- Must understand security at a deep level (cryptography, injection prevention, transport security)
- Must be comfortable with Node.js internals (HTTP proxy, process management, SQLite)
- Owns the core IP of the platform
- Is on the critical path for every downstream milestone
**Risk mitigation:** If this hire is delayed, the founder (Matt) should write the Safety Wrapper skeleton and P0 tests during weeks 1-2 while recruiting.
---
## 5. Weekly Deliverables
Each week produces demonstrable output. This prevents "dark" periods where progress can't be verified.
| Week | Key Deliverable | Demo |
|------|----------------|------|
| 1 | Monorepo running; SW responds on :8200; SQLite schema created; Secrets registry encrypts/decrypts | `curl localhost:8200/health` returns OK; secrets round-trip test |
| 2 | Commands classified correctly; Shell/Docker/File executors work | Run `classify("rm -rf /")` → CRITICAL_RED; execute a read-only command |
| 3 | Secrets Proxy redacts all patterns; P0 tests pass | Send payload with embedded JWT → verify redacted output |
| 4 | OpenClaw talks to SW; Autonomy gates work; Full Phase 1 integration | OpenClaw agent issues tool call → SW classifies → executes → returns |
| 5 | Hub accepts registration; Heartbeat flowing | SW boots → registers → heartbeat shows in Hub admin |
| 6 | Token usage tracked; Billing period accumulates | Agent makes LLM calls → usage appears in Hub dashboard |
| 7 | 6 tools callable via API; Approval queue populated | Agent uses Portainer API → container list returned |
| 8 | Approval round-trip works; Config sync confirmed | Change autonomy level in Hub → verify change on tenant |
| 9 | Mobile app renders chat; Agent responds | Open app → type message → see agent response stream |
| 10 | Push notifications arrive; Customer portal shows data | Trigger Red command → push notification on phone → approve |
| 11 | Provisioner deploys new stack; Website onboarding works | Run provisioner → verify OpenClaw + SW running on VPS |
| 12 | Full journey: signup → provision → chat | New account → Stripe test → VPS provisioned → mobile chat |
| 13 | Security audit complete; P1 tools available | Audit report; Odoo/Listmonk usable by agents |
| 14 | Prompt caching verified; First-hour templates work | Cache hit rate logged; Freelancer template runs end-to-end |
| 15 | Staging deployment stable; Internal team using it | Team dogfooding report; Bug list prioritized |
| 16 | 10 founding members onboarded | Real customers talking to their AI teams |
---
## 6. Buffer Analysis
### Critical Path Duration
The absolute minimum serial dependency chain (from 04-IMPLEMENTATION-PLAN):
```
Monorepo (2d) → SW skeleton (2d) → Classification (3d) → Executors (2d) →
Autonomy (2d) → OpenClaw integration (2d) → Hub protocol (5d) →
Billing (5d) → Approval queue (4d) → Customer portal (3d) →
Chat relay (2d) → Mobile chat (3d) → Provisioner (3d) →
E2E test (3d) → Security audit (3d) → Launch (1d)
Total: 42 working days = 8.5 weeks
```
### Available Calendar Time
- 16 weeks × 5 working days = 80 working days
- Critical path: 42 working days
- **Buffer: 38 working days (7.5 weeks)**
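The buffer arithmetic can be reproduced with a few lines of Python. The step durations are copied from the dependency chain above and the 42-day figure is the total the document states. This is a sanity-check sketch, not part of the plan itself.

```python
# (step, working days) pairs copied from the dependency chain above.
CHAIN = [
    ("Monorepo", 2), ("SW skeleton", 2), ("Classification", 3), ("Executors", 2),
    ("Autonomy", 2), ("OpenClaw integration", 2), ("Hub protocol", 5),
    ("Billing", 5), ("Approval queue", 4), ("Customer portal", 3),
    ("Chat relay", 2), ("Mobile chat", 3), ("Provisioner", 3),
    ("E2E test", 3), ("Security audit", 3), ("Launch", 1),
]
WEEKS = 16
DAYS_PER_WEEK = 5
STATED_CRITICAL_PATH = 42  # total given in the document


def buffer_days(calendar_days, critical_path_days):
    """Working days left over once the critical path is scheduled."""
    return calendar_days - critical_path_days


calendar_days = WEEKS * DAYS_PER_WEEK
listed_total = sum(days for _, days in CHAIN)

print("calendar days:", calendar_days)
print("sum of listed steps:", listed_total)
print("buffer with stated total:", buffer_days(calendar_days, STATED_CRITICAL_PATH))
print("buffer with listed steps:", buffer_days(calendar_days, listed_total))
```

As listed, the individual steps sum to 45 days rather than 42, so the step estimates and the stated total are worth reconciling against 04-IMPLEMENTATION-PLAN; the difference moves the buffer from 38 to 35 working days.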
### Buffer Distribution
| Phase | Calendar | Critical Path | Buffer | Buffer % |
|-------|----------|--------------|--------|----------|
| Phase 1 (wk 1-4) | 20 days | 13 days | 7 days | 35% |
| Phase 2 (wk 5-8) | 20 days | 14 days | 6 days | 30% |
| Phase 3 (wk 9-12) | 20 days | 11 days | 9 days | 45% |
| Phase 4 (wk 13-16) | 20 days | 4 days | 16 days | 80% |
**Phase 4 has the most buffer** because it's mostly polish, which can absorb delays from earlier phases. If Phase 1 or 2 slip, Phase 4 scope is cut first (interactive demo, channels, P2+ tools).
### Risk Scenarios & Buffer Impact
| Scenario | Probability | Days Lost | Buffer Remaining | Mitigation |
|----------|------------|-----------|-----------------|------------|
| OpenClaw integration harder than expected | HIGH | 3-5 days | 33-35 days | Start integration in week 3 instead of week 4; allocate extra time |
| Secrets redaction has edge cases requiring extra work | MEDIUM | 2-3 days | 35-36 days | TDD approach; adversarial testing starts in Phase 1, not Phase 4 |
| Mobile app iOS/Android platform bugs | MEDIUM | 3-5 days | 33-35 days | Focus on one platform first; use Expo's cross-platform abstractions |
| Stripe billing integration complexity | LOW | 2-3 days | 35-36 days | Stripe Billing Meters well-documented; test mode available |
| Provisioner testing on real VPS reveals issues | HIGH | 3-5 days | 33-35 days | Allocate staging VPS early (week 4); test incrementally |
| Key engineer leaves or is unavailable for 2 weeks | LOW | 10 days | 28 days | Document everything; pair on critical path items |
| All of the above simultaneously | VERY LOW | ~20 days | 18 days | Still launchable; cut scope per the scope cut table |
**Conclusion:** Even in the worst case (all risks materializing), the 16-week timeline has enough buffer to launch with core features. The scope cut table in 04-IMPLEMENTATION-PLAN defines what gets deferred.
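The per-scenario "Buffer Remaining" figures follow directly from the 38-day buffer, as the sketch below shows. It also computes a straight sum of all scenarios, which assumes no two delays overlap. That is more pessimistic than the table's ~20-day combined estimate, which presumably allows for delays absorbed in parallel streams; the short scenario labels are paraphrased for the example.

```python
BUFFER_DAYS = 38  # total buffer stated in the document

# (min, max) working days lost per scenario, copied from the risk table above.
SCENARIOS = {
    "OpenClaw integration": (3, 5),
    "Secrets redaction edge cases": (2, 3),
    "Mobile platform bugs": (3, 5),
    "Stripe billing complexity": (2, 3),
    "Provisioner issues on real VPS": (3, 5),
    "Key engineer unavailable": (10, 10),
}


def remaining(buffer, lost):
    """Return (worst case, best case) buffer left after losing a (min, max) range."""
    lo, hi = lost
    return (buffer - hi, buffer - lo)


for name, lost in SCENARIOS.items():
    print(f"{name}: {remaining(BUFFER_DAYS, lost)}")

# Upper bound on combined loss if every delay lands on the critical path.
total_lost = (
    sum(lo for lo, _ in SCENARIOS.values()),
    sum(hi for _, hi in SCENARIOS.values()),
)
print("no-overlap sum:", total_lost, "remaining:", remaining(BUFFER_DAYS, total_lost))
```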
---
## 7. Go/No-Go Decision Points
### Week 4 — Phase 1 Review
**Go criteria:**
- [ ] All M1 criteria met
- [ ] P0 test suites pass with >95% coverage of defined scenarios
- [ ] OpenClaw integration demonstrated
**No-go actions:**
- If secrets redaction is incomplete → STOP. Allocate all engineering to this. Delay Phase 2 start.
- If classification engine has gaps → document gaps, create follow-up tickets, proceed with caution
- If OpenClaw integration fails → investigate alternative integration approaches; consider filing upstream issue
### Week 8 — Phase 2 Review
**Go criteria:**
- [ ] All M2 criteria met
- [ ] Hub ↔ Safety Wrapper protocol stable for 48h
- [ ] At least 4 of 6 P0 tools working
**No-go actions:**
- If billing pipeline broken → defer overage billing; use flat pool with hard stop at limit
- If approval queue broken → allow admin-only approvals via Hub dashboard; defer mobile approval cards
- If < 4 tools working → focus on the most critical (Portainer, Nextcloud, Chatwoot) and defer rest
### Week 12 — Phase 3 Review (Most Critical Decision)
**Go criteria:**
- [ ] All M3 criteria met
- [ ] Full customer journey demonstrated on staging
- [ ] Mobile app functional on both iOS and Android
**No-go actions:**
- If provisioner fails → CRITICAL. Cannot launch without provisioning. All hands on provisioner until fixed.
- If mobile app not ready → launch with web-only customer portal as temporary interface; ship mobile within 2 weeks post-launch
- If E2E journey has gaps → identify gaps, create workarounds, defer non-essential features
### Week 14 — Launch Readiness Review
**Go criteria:**
- [ ] Security audit passed (no critical findings)
- [ ] Staging deployment stable for 3+ days
- [ ] At least 5 founding member candidates confirmed
**No-go actions:**
- If security audit finds critical issues → STOP LAUNCH. Fix issues. Re-audit. No exceptions.
- If staging unstable → extend dogfooding by 1 week; defer launch to week 17
- If no founding members → marketing push; consider beta invite program; launch with team-internal usage
---
## 8. Post-Launch Roadmap
Items deferred from v1 launch, prioritized for the four months following launch (weeks 17-32):
### Month 5 (Weeks 17-20) — Stabilization
| Priority | Item | Effort |
|----------|------|--------|
| P0 | Fix all critical bugs from founding member feedback | Ongoing |
| P0 | Performance optimization based on real usage data | 1 week |
| P1 | P2 tool cheat sheets (Gitea, Uptime Kuma, MinIO, Documenso, VaultWarden, WordPress) | 1 week |
| P1 | Interactive demo system (if deferred) | 1 week |
| P1 | WhatsApp + Telegram channels (if deferred) | 1 week |
| P2 | Customer portal web UI (if deferred) | 2 weeks |
### Month 6 (Weeks 21-24) — Growth
| Priority | Item | Effort |
|----------|------|--------|
| P0 | Scale to 50 founding members | Ongoing |
| P1 | Custom agent creation | 2 weeks |
| P1 | Dynamic tool installation from catalog | 2 weeks |
| P1 | P3 tool cheat sheets (Activepieces, Windmill, Redash, Penpot, Squidex, Typebot) | 1 week |
| P2 | E-commerce and Consulting first-hour templates | 1 week |
| P2 | DNS automation via Cloudflare/Entri API | 1 week |
### Month 7-8 (Weeks 25-32) — Scale
| Priority | Item | Effort |
|----------|------|--------|
| P0 | Scale to 100 customers; Hetzner overflow activation | Ongoing |
| P1 | Discord + Slack channels | 1 week |
| P1 | Cross-region backup (encrypted offsite) | 2 weeks |
| P1 | Automated backup restore testing | 1 week |
| P2 | Premium model tier (if deferred) | 1 week |
| P2 | Advanced analytics dashboard | 2 weeks |
| P2 | Multi-language support | 2 weeks |
---
## Calendar Mapping
Assuming project start on **Monday, March 2, 2026**:
| Milestone | Target Date | Calendar Week |
|-----------|------------|---------------|
| Project kickoff | March 2, 2026 | Week 1 |
| M1 Core Security Working | March 27, 2026 | End of Week 4 |
| M2 Backend Pipeline Working | April 24, 2026 | End of Week 8 |
| M3 End-to-End Journey Working | May 22, 2026 | End of Week 12 |
| Staging deployment | June 5, 2026 | Week 15 |
| M4 Founding Member Launch | June 19, 2026 | End of Week 16 |
| Stabilization complete | July 17, 2026 | End of Week 20 |
| 50 customers | August 14, 2026 | End of Week 24 |
**Holidays to account for (Germany/EU):**
- Easter: April 3-6, 2026 (2 working days lost: Good Friday in week 5, Easter Monday in week 6)
- May Day: May 1, 2026 (1 day lost in week 9)
- Ascension: May 14, 2026 (1 day lost in week 11)
- Whit Monday: May 25, 2026 (1 day lost in week 13)
**Impact:** 5 working days lost to holidays. This is absorbed by the 38-day buffer. No milestone dates need to shift, but the buffer effectively reduces to 33 working days.
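The calendar mapping and holiday impact can be cross-checked with Python's `datetime` module. This sketch snaps the stated start date to the Monday of its week, treats each project week as Monday to Friday, and counts only holiday dates that fall on a weekday; the helper names are illustrative.

```python
from datetime import date, timedelta


def monday_of(d: date) -> date:
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def week_end(kickoff_monday: date, week: int) -> date:
    """Friday of the given 1-based project week."""
    return kickoff_monday + timedelta(weeks=week - 1, days=4)


def week_of(kickoff_monday: date, d: date) -> int:
    """1-based project week that contains d."""
    return (d - kickoff_monday).days // 7 + 1


KICKOFF = monday_of(date(2026, 3, 3))

# Easter is listed as April 3-6; the other holidays are single days.
EASTER = [date(2026, 4, 3) + timedelta(days=i) for i in range(4)]
HOLIDAYS = EASTER + [date(2026, 5, 1), date(2026, 5, 14), date(2026, 5, 25)]
lost = [d for d in HOLIDAYS if d.weekday() < 5]  # Monday-Friday only

for week in (4, 8, 12, 16):
    print(f"end of week {week}: {week_end(KICKOFF, week).isoformat()}")
for d in lost:
    print(f"{d.isoformat()} falls in week {week_of(KICKOFF, d)}")
print("working days lost:", len(lost))
```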
---
*End of Document — 05 Timeline & Milestones*