LetsBe Biz — Timeline & Milestones
Date: February 27, 2026
Team: Claude Opus 4.6 Architecture Team
Document: 05 of 09
Status: Proposal (competing with independent team)
Table of Contents
- Timeline Overview
- Week-by-Week Gantt Chart
- Milestone Definitions
- Team Sizing & Roles
- Weekly Deliverables
- Buffer Analysis
- Go/No-Go Decision Points
- Post-Launch Roadmap
1. Timeline Overview
Target: Founding member launch in ~16 weeks (~4 months)
Launch definition: First 10 paying customers onboarded, using AI workforce via mobile app, with secrets redaction and command gating enforced.
MONTH 1 MONTH 2 MONTH 3 MONTH 4
Wk1 Wk2 Wk3 Wk4 Wk5 Wk6 Wk7 Wk8 Wk9 Wk10 Wk11 Wk12 Wk13 Wk14 Wk15 Wk16
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
Safety Wrapper │████│████│████│████│ │ │ │ │ │ │ │ │ │ │ │ │
Secrets Proxy │████│████│████│ │ │ │ │ │ │ │ │ │ │ │ │ │
Hub Backend │ │ │██░░│████│████│████│████│████│████│████│ │ │ │ │ │ │
Tool Adapters │ │ │ │ │ │ │████│████│ │ │ │ │████│ │ │ │
Mobile App │ │ │ │ │ │ │ │ │████│████│████│████│ │████│ │ │
Website │ │ │ │ │ │ │ │ │ │ │████│████│ │████│ │ │
Provisioner │ │ │ │ │ │ │ │ │ │ │████│████│ │ │ │ │
Integration │ │ │ │ │ │ │ │ │ │ │ │████│ │ │████│ │
Security Audit │ │ │ │ │ │ │ │ │ │ │ │ │████│ │ │ │
Polish & Launch │ │ │ │ │ │ │ │ │ │ │ │ │ │████│████│████│
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
M1──────────────►M2─────────────────►M3─────────────────►M4──────────────►
Legend: ████ = primary work ██░░ = ramp-up/planning ░░░░ = testing/maintenance
M1-M4 = Milestones
2. Week-by-Week Gantt Chart
Phase 1 — Foundation (Weeks 1-4)
| Week | Stream A (Safety Wrapper) | Stream B (Secrets Proxy) | Stream C (Hub) | Stream D (Frontend) | Stream E (DevOps) |
|---|---|---|---|---|---|
| 1 | Monorepo setup; SW skeleton; SQLite schema; Secrets registry | Proxy skeleton; Layer 1 Aho-Corasick start | Prisma model planning; ServerConnection updates | Design system selection; wireframes | Turborepo CI; Docker base images |
| 2 | Command classification engine; Shell executor; Docker executor; File/Env executors | Layer 1 complete; Layer 2 regex; Layer 3 entropy; Layer 4 JSON keys | Token usage models; Billing period models | Wireframes: mobile chat, approvals, dashboard | Gitea pipeline: lint + test + build |
| 3 | P0 tests: classification (100+ cases) | P0 tests: redaction (TDD); Performance benchmarks (<10ms) | Tenant API design; Hub endpoint stubs | Website landing page design | OpenClaw Docker image build; Dev env setup |
| 4 | Autonomy engine; Approval queue; SECRET_REF injection; OpenClaw integration | OpenClaw LLM proxy integration; Integration tests | Hub ↔ SW protocol endpoint implementation starts | UI component library setup | Staging server provisioning |
Phase 1 Exit: Milestone M1 — "Core Security Working"
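To make the Phase 1 exit concrete, here is a minimal TypeScript sketch of how command classification and autonomy gating could fit together. It is an illustration under stated assumptions, not the planned implementation: the plan names only the Red and CRITICAL_RED tiers and autonomy levels 1/2/3, so the other tier names, the regex rules, and the per-level thresholds below are hypothetical.

```typescript
// Hypothetical tier names: the plan names only RED and CRITICAL_RED out of 5 tiers.
type Tier = "GREEN" | "YELLOW" | "ORANGE" | "RED" | "CRITICAL_RED";
type AutonomyLevel = 1 | 2 | 3;
type Decision = "execute" | "needs_approval" | "block";

// Illustrative rules, ordered from most to least dangerous; the first match wins.
const TIER_RULES: Array<{ pattern: RegExp; tier: Tier }> = [
  { pattern: /\brm\s+-rf\s+\/(\s|$)/, tier: "CRITICAL_RED" },
  { pattern: /\b(rm|docker\s+rm|systemctl\s+stop)\b/, tier: "RED" },
  { pattern: /\b(docker\s+restart|apt-get\s+install)\b/, tier: "ORANGE" },
  { pattern: /\b(mv|cp|mkdir|touch)\b/, tier: "YELLOW" },
  { pattern: /^(ls|cat|pwd|df|uptime)\b/, tier: "GREEN" },
];

// Unknown commands fail closed: they are treated as RED and need approval.
function classify(command: string): Tier {
  const trimmed = command.trim();
  for (const rule of TIER_RULES) {
    if (rule.pattern.test(trimmed)) return rule.tier;
  }
  return "RED";
}

const TIER_ORDER: Tier[] = ["GREEN", "YELLOW", "ORANGE", "RED", "CRITICAL_RED"];

// Assumed policy: the highest tier each autonomy level may run without approval.
const MAX_AUTO_TIER: Record<AutonomyLevel, Tier> = {
  1: "GREEN",
  2: "YELLOW",
  3: "ORANGE",
};

// Assumed policy: CRITICAL_RED is always blocked; RED always needs approval.
function gate(tier: Tier, level: AutonomyLevel): Decision {
  if (tier === "CRITICAL_RED") return "block";
  const allowed = TIER_ORDER.indexOf(tier) <= TIER_ORDER.indexOf(MAX_AUTO_TIER[level]);
  return allowed ? "execute" : "needs_approval";
}
```

Under these assumed rules, `gate(classify("rm -rf /"), 3)` is blocked at every autonomy level, while an unrecognized command falls through to RED and waits for approval. Failing closed on unknown input is the design choice worth keeping even if the real rule set looks very different.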
Phase 2 — Integration (Weeks 5-8)
| Week | Stream A (Safety Wrapper) | Stream B (Secrets + Tools) | Stream C (Hub) | Stream D (Frontend) | Stream E (DevOps) |
|---|---|---|---|---|---|
| 5 | Hub client: registration, heartbeat, config sync | Secrets API: provide/reveal/generate/rotate | /tenant/register, /tenant/heartbeat, /tenant/config endpoints | Website: onboarding flow pages 1-5 | Cheat sheet: Portainer |
| 6 | Token metering capture; hourly buckets | Secrets integration tests; Side-channel protocol | Token billing pipeline; Stripe Billing Meters; Founding member logic | Website: AI classifier (Gemini Flash); Resource calculator | Cheat sheets: Nextcloud, Chatwoot |
| 7 | Approval request routing; Config sync receiver | Tool registry generator; Master skill | Approval queue CRUD; AgentConfig model | Website: payment flow; provisioning status | Cheat sheets: Ghost, Cal.com, Stalwart |
| 8 | Integration tests: Hub ↔ SW round-trip | Tool integration tests (6 P0 tools) | Push notification skeleton; Config versioning | Mobile: auth screens (login, token storage) | CI: integration test pipeline |
Phase 2 Exit: Milestone M2 — "Backend Pipeline Working"
Phase 3 — Customer Experience (Weeks 9-12)
| Week | Stream A (Safety Wrapper) | Stream B (Provisioner) | Stream C (Hub) | Stream D (Mobile + Frontend) | Stream E (DevOps) |
|---|---|---|---|---|---|
| 9 | Monitoring endpoints; Health checks | Provisioner: step 10 rewrite (OpenClaw + SW) | Customer portal API (dashboard, agents, usage) | Mobile: chat with SSE streaming; agent selector | n8n cleanup (7 files) |
| 10 | Performance optimization; Caching tuning | Provisioner: config.json cleanup; Secret seeding | Chat relay service; WebSocket endpoint | Mobile: push notifications; approval cards | Provisioner: Playwright migration (7 scenarios) |
| 11 | Edge case hardening | Provisioner: Docker Compose for LetsBe stack | Customer portal: billing, tools, settings endpoints | Mobile: dashboard, usage, settings | Staging: full stack deployment |
| 12 | Bug fixes from integration | Integration test on real VPS | E2E test: payment → provision → AI ready | Mobile: secrets side-channel; polish | E2E test verification |
Phase 3 Exit: Milestone M3 — "End-to-End Journey Working"
Phase 4 — Polish & Launch (Weeks 13-16)
| Week | Stream A (Security) | Stream B (Tools + Demo) | Stream C (Hub) | Stream D (Mobile + Frontend) | Stream E (DevOps) |
|---|---|---|---|---|---|
| 13 | Adversarial security audit: secrets, classification, injection, SSRF | P1 cheat sheets (Odoo, Listmonk, NocoDB, Umami, Keycloak, Activepieces) | Security fixes from audit | Mobile: UI polish, error handling, offline | Channel config: WhatsApp + Telegram |
| 14 | Prompt caching optimization; Token efficiency audit | First-hour templates: Freelancer, Agency | Performance tuning; Usage alert system | Website: remaining pages, polish | Provisioner integration tests |
| 15 | Fix critical/high issues from dogfooding | Interactive demo: ephemeral containers | Deploy to staging; Dogfooding begins | Mobile: beta testing (internal) | Monitoring dashboard; Backup monitoring |
| 16 | Final security verification | Demo polish; Fix staging issues | Production deployment | App Store / Play Store prep | Founding member onboarding (10 customers) |
Phase 4 Exit: Milestone M4 — "Founding Member Launch"
3. Milestone Definitions
M1 — Core Security Working (End of Week 4)
| Criterion | Verification |
|---|---|
| Secrets Proxy redacts all known patterns | P0 test suite: 100% pass |
| Redaction latency < 10ms with 50+ secrets | Benchmark test |
| Command classifier handles all 5 tiers correctly | P0 test suite: 100+ cases |
| Autonomy engine gates correctly at levels 1/2/3 | Test suite: all combinations |
| OpenClaw routes tool calls through Safety Wrapper | Integration test: tool call → execution → audit |
| OpenClaw routes LLM calls through Secrets Proxy | Integration test: LLM call → redacted outbound |
| SECRET_REF injection resolves credentials | Integration test: placeholder → real value |
| Audit log captures every tool call | Log verification test |
Decision gate: If M1 slips by > 1 week, escalate. Safety Wrapper is the critical path — nothing downstream works without it.
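The first M1 criterion depends on layered redaction, so a small sketch may help show how the layers compose. This is an assumption-heavy illustration rather than the Secrets Proxy design: a plain split/join stands in for the Aho-Corasick matcher of Layer 1, Layer 2 carries only a JWT-like pattern, the entropy thresholds in Layer 3 are made up, and Layer 4 (JSON keys) is omitted. All function names are hypothetical.

```typescript
const PLACEHOLDER = "[REDACTED]";

// Layer 1: exact match against known secret values.
// Plain split/join stands in for the Aho-Corasick automaton named in the plan.
function redactKnown(text: string, secrets: string[]): string {
  // Longest first, so a secret that contains another is replaced whole.
  const ordered = secrets.filter((s) => s.length > 0).sort((a, b) => b.length - a.length);
  return ordered.reduce((acc, s) => acc.split(s).join(PLACEHOLDER), text);
}

// Layer 2: regex for well-known shapes; only a JWT-like pattern is shown here.
const JWT_LIKE = /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g;

function redactPatterns(text: string): string {
  return text.replace(JWT_LIKE, PLACEHOLDER);
}

// Shannon entropy in bits per character.
function shannonEntropy(token: string): number {
  const counts = new Map<string, number>();
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i);
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let bits = 0;
  counts.forEach((n) => {
    const p = n / token.length;
    bits -= p * Math.log2(p);
  });
  return bits;
}

// Layer 3: long tokens with high entropy are treated as probable secrets.
// minLength and minBits are assumed thresholds, not values from the plan.
function redactHighEntropy(text: string, minLength = 20, minBits = 4.0): string {
  const candidate = new RegExp(`[A-Za-z0-9+/=_-]{${minLength},}`, "g");
  return text.replace(candidate, (tok) => (shannonEntropy(tok) >= minBits ? PLACEHOLDER : tok));
}

// Layers run in order; each one sees the output of the previous layer.
function redact(text: string, knownSecrets: string[]): string {
  return redactHighEntropy(redactPatterns(redactKnown(text, knownSecrets)));
}
```

Running the cheap exact-match layer first means later layers never see registered secrets, which keeps false negatives from the heuristic layers from mattering for values the registry already knows.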
M2 — Backend Pipeline Working (End of Week 8)
| Criterion | Verification |
|---|---|
| Safety Wrapper registers with Hub | Protocol test: register → receive API key |
| Heartbeat maintains connection | 24h soak test: heartbeat + reconnect |
| Token usage flows to billing | Pipeline test: usage → bucket → billing period |
| Stripe overage billing triggers | Stripe test mode: pool exhaustion → invoice |
| 6 P0 tool cheat sheets work | Agent successfully calls each tool's API |
| Approval round-trip completes | Test: Red command → Hub → approve → execute |
| Config sync propagates | Test: change agent config in Hub → verify on SW |
Decision gate: If M2 slips, assess whether to cut overage billing and/or founding member logic from launch scope; both are listed in the scope cut table in 04-IMPLEMENTATION-PLAN.
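The "usage → bucket → billing period" pipeline in the M2 criteria can be sketched as follows. This is a minimal illustration, assuming a single tenant, UTC hourly buckets, and a flat token pool; the type and function names are hypothetical, and no Stripe calls are shown.

```typescript
interface UsageEvent {
  at: Date; // when the LLM call finished
  inputTokens: number;
  outputTokens: number;
}

// Bucket key: the UTC hour the event falls in, e.g. "2026-03-02T10:00Z".
function hourKey(at: Date): string {
  return at.toISOString().slice(0, 13) + ":00Z";
}

// Step 1: collapse raw usage events into hourly token totals.
function bucketByHour(events: UsageEvent[]): Map<string, number> {
  const buckets = new Map<string, number>();
  events.forEach((e) => {
    const key = hourKey(e.at);
    buckets.set(key, (buckets.get(key) ?? 0) + e.inputTokens + e.outputTokens);
  });
  return buckets;
}

// Step 2: sum the buckets whose hour lies in [periodStart, periodEnd).
// Keys are fixed-width ISO strings, so string comparison is chronological.
function periodTotal(buckets: Map<string, number>, periodStart: Date, periodEnd: Date): number {
  const from = hourKey(periodStart);
  const to = hourKey(periodEnd);
  let total = 0;
  buckets.forEach((tokens, key) => {
    if (key >= from && key < to) total += tokens;
  });
  return total;
}

// Step 3: tokens beyond the included pool are billable overage.
function overageTokens(totalTokens: number, poolTokens: number): number {
  return Math.max(0, totalTokens - poolTokens);
}
```

Bucketing before billing keeps the data sent from tenant to Hub small and makes the billing-period sum cheap to recompute, which is useful when a pipeline test needs to replay usage.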
M3 — End-to-End Journey Working (End of Week 12)
| Criterion | Verification |
|---|---|
| Website: signup → payment works | Stripe test mode end-to-end |
| Provisioner deploys new stack | Full provisioning on staging VPS |
| Mobile: login → chat → approve works | Device testing (iOS + Android) |
| Chat relay: App → Hub → SW → OpenClaw → response | Full round-trip with streaming |
| Push notifications for approvals | Notification received on test device |
| n8n references fully removed | `grep -r "n8n" provisioner/` returns nothing |
| config.json cleanup verified | Post-provisioning: no plaintext passwords |
Decision gate: If M3 slips by > 1 week, defer interactive demo, P1 tool adapters, and WhatsApp/Telegram to post-launch. Focus all effort on core launch requirements.
M4 — Founding Member Launch (End of Week 16)
| Criterion | Verification |
|---|---|
| Security audit: no critical findings | Audit report reviewed and signed off |
| 10 founding members onboarded | Active users with functional AI workforce |
| Performance targets met | Redaction <10ms, tool calls <5s p95, heartbeat stable |
| First-hour templates prove cross-tool workflows | At least 2 templates working end-to-end |
| Monitoring and alerting operational | Hub health + tenant health dashboards live |
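The "tool calls <5s p95" target needs an agreed way to compute the percentile. One option, sketched below, is the nearest-rank method; the function names and the choice of method are assumptions, and the plan does not say which percentile definition the team will use.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Checks the M4 tool-call target: p95 latency strictly below the limit.
function meetsToolCallTarget(samplesMs: number[], limitMs = 5000): boolean {
  return percentile(samplesMs, 95) < limitMs;
}
```

With 20 samples, the p95 is the 19th-smallest value, so a single slow outlier does not fail the target but two do.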
4. Team Sizing & Roles
Recommended: 4-5 Engineers
| Role | Focus Area | Skills Required | Stream |
|---|---|---|---|
| Safety Wrapper Lead (Senior) | Safety Wrapper + Secrets Proxy + OpenClaw integration | Node.js, security, cryptography, SQLite | A + B |
| Hub Backend Engineer | Hub API, billing, tenant protocol, chat relay | TypeScript, Next.js, Prisma, Stripe | C |
| Frontend/Mobile Engineer | Mobile app (Expo), website (Next.js), design system | React Native, Expo, Next.js, Tailwind | D |
| DevOps/Provisioner Engineer | CI/CD, Docker, provisioning, tool cheat sheets, staging | Bash, Docker, Gitea Actions, Ansible concepts | E |
| QA/Integration Engineer (part-time or shared) | Testing, security audit, E2E verification | Testing frameworks, security testing | Cross-stream |
Minimum Viable: 3 Engineers
| Role | Covers | Trade-off |
|---|---|---|
| Full-Stack Security (Senior) | Streams A + B | Secrets Proxy work starts week 2 instead of week 1 |
| Hub + Backend | Stream C | No changes — same workload |
| Frontend + DevOps | Streams D + E | Website and mobile overlap handled sequentially; DevOps work spread across evenings/gaps |
Critical Hire: Safety Wrapper Lead
The Safety Wrapper Lead is the most critical hire. This person:
- Must understand security at a deep level (cryptography, injection prevention, transport security)
- Must be comfortable with Node.js internals (HTTP proxy, process management, SQLite)
- Owns the core IP of the platform
- Is on the critical path for every downstream milestone
Risk mitigation: If this hire is delayed, the founder (Matt) should write the Safety Wrapper skeleton and P0 tests during weeks 1-2 while recruiting.
5. Weekly Deliverables
Each week produces demonstrable output. This prevents "dark" periods where progress can't be verified.
| Week | Key Deliverable | Demo |
|---|---|---|
| 1 | Monorepo running; SW responds on :8200; SQLite schema created; Secrets registry encrypts/decrypts | `curl localhost:8200/health` returns OK; secrets round-trip test |
| 2 | Commands classified correctly; Shell/Docker/File executors work | Run `classify("rm -rf /")` → CRITICAL_RED; execute a read-only command |
| 3 | Secrets Proxy redacts all patterns; P0 tests pass | Send payload with JWT embedded → verify redacted output |
| 4 | OpenClaw talks to SW; Autonomy gates work; Full Phase 1 integration | OpenClaw agent issues tool call → SW classifies → executes → returns |
| 5 | Hub accepts registration; Heartbeat flowing | SW boots → registers → heartbeat shows in Hub admin |
| 6 | Token usage tracked; Billing period accumulates | Agent makes LLM calls → usage appears in Hub dashboard |
| 7 | 6 tools callable via API; Approval queue populated | Agent uses Portainer API → container list returned |
| 8 | Approval round-trip works; Config sync confirmed | Change autonomy level in Hub → verify change on tenant |
| 9 | Mobile app renders chat; Agent responds | Open app → type message → see agent response stream |
| 10 | Push notifications arrive; Customer portal shows data | Trigger Red command → push notification on phone → approve |
| 11 | Provisioner deploys new stack; Website onboarding works | Run provisioner → verify OpenClaw + SW running on VPS |
| 12 | Full journey: signup → provision → chat | New account → Stripe test → VPS provisioned → mobile chat |
| 13 | Security audit complete; P1 tools available | Audit report; Odoo/Listmonk usable by agents |
| 14 | Prompt caching verified; First-hour templates work | Cache hit rate logged; Freelancer template runs end-to-end |
| 15 | Staging deployment stable; Internal team using it | Team dogfooding report; Bug list prioritized |
| 16 | 10 founding members onboarded | Real customers talking to their AI teams |
6. Buffer Analysis
Critical Path Duration
The absolute minimum serial dependency chain (from 04-IMPLEMENTATION-PLAN):
Monorepo (2d) → SW skeleton (2d) → Classification (3d) → Executors (2d) →
Autonomy (2d) → OpenClaw integration (2d) → Hub protocol (5d) →
Billing (5d) → Approval queue (4d) → Customer portal (3d) →
Chat relay (2d) → Mobile chat (3d) → Provisioner (3d) →
E2E test (3d) → Security audit (3d) → Launch (1d)
Total: 42 working days = 8.5 weeks
Available Calendar Time
- 16 weeks × 5 working days = 80 working days
- Critical path: 42 working days
- Buffer: 38 working days (7.5 weeks)
Buffer Distribution
| Phase | Calendar | Critical Path | Buffer | Buffer % |
|---|---|---|---|---|
| Phase 1 (wk 1-4) | 20 days | 13 days | 7 days | 35% |
| Phase 2 (wk 5-8) | 20 days | 14 days | 6 days | 30% |
| Phase 3 (wk 9-12) | 20 days | 11 days | 9 days | 45% |
| Phase 4 (wk 13-16) | 20 days | 4 days | 16 days | 80% |
Phase 4 carries the most buffer because it is mostly polish work, so it can absorb delays from earlier phases. If Phase 1 or 2 slips, Phase 4 scope is cut first (interactive demo, channels, P2+ tools).
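The buffer figures in the table follow from simple arithmetic, shown here as a small sketch so the numbers can be re-derived if a phase estimate changes. The per-phase values are taken from the Buffer Distribution table; the type and function names are illustrative only.

```typescript
interface PhaseBudget {
  phase: string;
  calendarDays: number;
  criticalPathDays: number;
}

// Values copied from the Buffer Distribution table.
const PHASES: PhaseBudget[] = [
  { phase: "Phase 1 (wk 1-4)", calendarDays: 20, criticalPathDays: 13 },
  { phase: "Phase 2 (wk 5-8)", calendarDays: 20, criticalPathDays: 14 },
  { phase: "Phase 3 (wk 9-12)", calendarDays: 20, criticalPathDays: 11 },
  { phase: "Phase 4 (wk 13-16)", calendarDays: 20, criticalPathDays: 4 },
];

// Buffer = calendar days minus critical-path days, also as a share of calendar.
function bufferFor(p: PhaseBudget): { bufferDays: number; bufferPct: number } {
  const bufferDays = p.calendarDays - p.criticalPathDays;
  return { bufferDays, bufferPct: Math.round((bufferDays / p.calendarDays) * 100) };
}

const totalCalendarDays = PHASES.reduce((sum, p) => sum + p.calendarDays, 0);
const totalCriticalPathDays = PHASES.reduce((sum, p) => sum + p.criticalPathDays, 0);
const totalBufferDays = totalCalendarDays - totalCriticalPathDays;

PHASES.forEach((p) => {
  const b = bufferFor(p);
  console.log(`${p.phase}: ${b.bufferDays} days buffer (${b.bufferPct}%)`);
});
console.log(`Total: ${totalCalendarDays} calendar, ${totalCriticalPathDays} critical path, ${totalBufferDays} buffer`);
```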
Risk Scenarios & Buffer Impact
| Scenario | Probability | Days Lost | Buffer Remaining | Mitigation |
|---|---|---|---|---|
| OpenClaw integration harder than expected | HIGH | 3-5 days | 33-35 days | Start integration in week 3 instead of week 4; allocate extra time |
| Secrets redaction has edge cases requiring extra work | MEDIUM | 2-3 days | 35-36 days | TDD approach; adversarial testing starts in Phase 1, not Phase 4 |
| Mobile app iOS/Android platform bugs | MEDIUM | 3-5 days | 33-35 days | Focus on one platform first; use Expo's cross-platform abstractions |
| Stripe billing integration complexity | LOW | 2-3 days | 35-36 days | Stripe Billing Meters well-documented; test mode available |
| Provisioner testing on real VPS reveals issues | HIGH | 3-5 days | 33-35 days | Allocate staging VPS early (week 4); test incrementally |
| Key engineer leaves or is unavailable for 2 weeks | LOW | 10 days | 28 days | Document everything; pair on critical path items |
| All of the above simultaneously | VERY LOW | ~20 days | 18 days | Still launchable — cut scope per scope cut table |
Conclusion: Even in the worst case (all risks materializing), the 16-week timeline has enough buffer to launch with core features. The scope cut table in 04-IMPLEMENTATION-PLAN defines what gets deferred.
7. Go/No-Go Decision Points
Week 4 — Phase 1 Review
Go criteria:
- All M1 criteria met
- P0 test suites pass with >95% coverage of defined scenarios
- OpenClaw integration demonstrated
No-go actions:
- If secrets redaction is incomplete → STOP. Allocate all engineering to this. Delay Phase 2 start.
- If classification engine has gaps → document gaps, create follow-up tickets, proceed with caution
- If OpenClaw integration fails → investigate alternative integration approaches; consider filing upstream issue
Week 8 — Phase 2 Review
Go criteria:
- All M2 criteria met
- Hub ↔ Safety Wrapper protocol stable for 48h
- At least 4 of 6 P0 tools working
No-go actions:
- If billing pipeline broken → defer overage billing; use flat pool with hard stop at limit
- If approval queue broken → allow admin-only approvals via Hub dashboard; defer mobile approval cards
- If < 4 tools working → focus on the most critical (Portainer, Nextcloud, Chatwoot) and defer rest
Week 12 — Phase 3 Review (Most Critical Decision)
Go criteria:
- All M3 criteria met
- Full customer journey demonstrated on staging
- Mobile app functional on both iOS and Android
No-go actions:
- If provisioner fails → CRITICAL. Cannot launch without provisioning. All hands on provisioner until fixed.
- If mobile app not ready → launch with web-only customer portal as temporary interface; ship mobile within 2 weeks of launch
- If E2E journey has gaps → identify gaps, create workarounds, defer non-essential features
Week 14 — Launch Readiness Review
Go criteria:
- Security audit passed (no critical findings)
- Staging deployment stable for 3+ days
- At least 5 founding member candidates confirmed
No-go actions:
- If security audit finds critical issues → STOP LAUNCH. Fix issues. Re-audit. No exceptions.
- If staging unstable → extend dogfooding by 1 week; defer launch to week 17
- If no founding members → marketing push; consider beta invite program; launch with team-internal usage
8. Post-Launch Roadmap
Items deferred from v1 launch, prioritized for the 4 months following launch (weeks 17-32):
Month 5 (Weeks 17-20) — Stabilization
| Priority | Item | Effort |
|---|---|---|
| P0 | Fix all critical bugs from founding member feedback | Ongoing |
| P0 | Performance optimization based on real usage data | 1 week |
| P1 | P2 tool cheat sheets (Gitea, Uptime Kuma, MinIO, Documenso, VaultWarden, WordPress) | 1 week |
| P1 | Interactive demo system (if deferred) | 1 week |
| P1 | WhatsApp + Telegram channels (if deferred) | 1 week |
| P2 | Customer portal web UI (if deferred) | 2 weeks |
Month 6 (Weeks 21-24) — Growth
| Priority | Item | Effort |
|---|---|---|
| P0 | Scale to 50 founding members | Ongoing |
| P1 | Custom agent creation | 2 weeks |
| P1 | Dynamic tool installation from catalog | 2 weeks |
| P1 | P3 tool cheat sheets (Activepieces, Windmill, Redash, Penpot, Squidex, Typebot) | 1 week |
| P2 | E-commerce and Consulting first-hour templates | 1 week |
| P2 | DNS automation via Cloudflare/Entri API | 1 week |
Month 7-8 (Weeks 25-32) — Scale
| Priority | Item | Effort |
|---|---|---|
| P0 | Scale to 100 customers; Hetzner overflow activation | Ongoing |
| P1 | Discord + Slack channels | 1 week |
| P1 | Cross-region backup (encrypted offsite) | 2 weeks |
| P1 | Automated backup restore testing | 1 week |
| P2 | Premium model tier (if deferred) | 1 week |
| P2 | Advanced analytics dashboard | 2 weeks |
| P2 | Multi-language support | 2 weeks |
Calendar Mapping
Assuming project start on Monday, March 2, 2026:
| Milestone | Target Date | Calendar Week |
|---|---|---|
| Project kickoff | March 2, 2026 | Week 1 |
| M1 (Core Security Working) | March 27, 2026 | End of Week 4 |
| M2 (Backend Pipeline Working) | April 24, 2026 | End of Week 8 |
| M3 (End-to-End Journey Working) | May 22, 2026 | End of Week 12 |
| Staging deployment | June 8, 2026 | Start of Week 15 |
| M4 (Founding Member Launch) | June 19, 2026 | End of Week 16 |
| Stabilization complete | July 17, 2026 | End of Week 20 |
| 50 customers | August 14, 2026 | End of Week 24 |
Holidays to account for (Germany/EU):
- Easter: Good Friday (April 3, 2026) and Easter Monday (April 6, 2026), 2 working days lost across weeks 5-6
- May Day: May 1, 2026 (1 day lost in week 9)
- Ascension: May 14, 2026 (1 day lost in week 11)
- Whit Monday: May 25, 2026 (1 day lost in week 13)
Impact: 5 working days lost to holidays (April 4-5 fall on a weekend and cost nothing). This is absorbed by the 38-day buffer. No milestone dates need to shift, but the buffer effectively reduces to 33 working days.
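Holiday impact is easy to miscount when a holiday period spans a weekend, so a short sketch of the counting rule may be useful: only dates that fall on Monday to Friday reduce working days. The date list expands the Easter period day by day; the constant and function names are illustrative.

```typescript
// Holiday dates from the list above; the Easter period is expanded day by day.
const HOLIDAY_DATES: string[] = [
  "2026-04-03",
  "2026-04-04",
  "2026-04-05",
  "2026-04-06",
  "2026-05-01",
  "2026-05-14",
  "2026-05-25",
];

// True when the ISO date (YYYY-MM-DD) falls on Monday to Friday.
// The date is pinned to UTC midnight so the local timezone cannot shift the day.
function isWorkingDay(isoDate: string): boolean {
  const day = new Date(isoDate + "T00:00:00Z").getUTCDay(); // 0 = Sunday, 6 = Saturday
  return day >= 1 && day <= 5;
}

// Number of holiday dates that actually remove a working day.
function countWorkingDaysLost(isoDates: string[]): number {
  return isoDates.filter(isWorkingDay).length;
}

console.log(`Working days lost: ${countWorkingDaysLost(HOLIDAY_DATES)}`);
```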
End of Document — 05 Timeline & Milestones