kalei/docs/technical/kalei-infrastructure-plan.md

22 KiB
Raw Blame History

Kalei — Infrastructure & Financial Plan

The Constraint

Starting capital: €0 €2,000 max Monthly burn target: Under €30/month at launch, scaling only when revenue justifies it Goal: Ship a production-quality AI mental wellness app that can serve its first 1,000 users without going broke


1. The AI Decision (This Is Everything)

AI is 7090% of Kalei's variable cost. Every other infrastructure decision is rounding error compared to this one. Here's the full landscape:

Option A: Claude API (Anthropic Direct)

Model Input / MTok Output / MTok Quality Speed
Haiku 4.5 $1.00 $5.00 Near-frontier Fast
Haiku 3 $0.25 $1.25 Good Very fast
Sonnet 4.5 $3.00 $15.00 Frontier Medium

Haiku 4.5 is the sweet spot for Kalei. It's fast, cheap, and its emotional intelligence and nuanced language understanding are strong enough for cognitive distortion detection and reframing — the core product. Sonnet is overkill for most interactions; Opus is completely unnecessary.

Cost optimizations available:

  • Prompt caching: 90% reduction on cached system prompt tokens (cache reads cost 0.1× base input)
  • Batch API: 50% discount for non-real-time processing (Spectrum analysis, weekly insights)
  • Smart prompt design: Keep system prompts tight, reuse cached context

Option B: Open-Source Models via API Providers

Provider Model Input / MTok Output / MTok
OpenRouter / Fireworks Qwen3-235B-A22B $0.45 $1.80
Together AI Llama 3.3 70B $0.20 $0.20
Groq Qwen3-32B $0.29 ~$0.39
DeepInfra Various 770B $0.05$0.50 $0.10$1.50

Cheaper on paper, but critical trade-off: quality of emotional understanding. Kalei's entire value proposition is that the AI "gets" your thinking patterns and offers genuinely insightful reframes. A bad reframe isn't just unhelpful — in a mental wellness context, it can feel dismissive or harmful. Claude Haiku 4.5 was specifically trained for nuance, safety, and emotional calibration.

Option C: Self-Hosted GPU (Eliminated)

  • Netcup vGPU (H200, 7GB VRAM): €137/month — that's our entire budget
  • Even cheap GPU providers (vast.ai, Lambda): $50150/month for anything capable of running a 30B+ model
  • Requires DevOps expertise to maintain inference servers
  • Verdict: Not viable at our budget. Not even close. Revisit only at 5,000+ paying users.

The Decision: Hybrid API Strategy

Primary engine: Claude Haiku 4.5 via Anthropic API Batch processing: Claude Haiku 4.5 Batch API (50% off) for Spectrum analysis, weekly insights Fallback / cost ceiling: If costs spike, route simple tasks (affirmation generation, basic Lens content) to Qwen3-32B via Groq ($0.29/$0.39 per MTok) — 7085% cheaper for tasks that don't require Claude's emotional depth

This gives us Claude-quality where it matters (Mirror, Kaleidoscope, crisis detection) and an escape valve for cost control.


2. Per-User AI Cost Model

Here's what a real user session looks like in tokens:

The Mirror (Freeform Writing + AI Highlights)

Component Input Tokens Output Tokens
System prompt (cached after first call) ~800
User's writing (per session, ~300 words) ~400
Fragment detection (5 highlights avg) ~500
Inline reframe (per tap, user triggers ~2) ~200 ~150
Session Reflection ~300 ~400
Total per Mirror session ~1,700 ~1,050

With prompt caching (system prompt cached): effective input ≈ 980 tokens (800 cached at 0.1×) + 900 fresh = ~980 billable input tokens

The Kaleidoscope (One Turn)

Component Input Tokens Output Tokens
System prompt (cached) ~600
User's fragment + context ~300
3 reframe perspectives ~450
Total per Turn ~900 ~450

With caching: ~360 billable input tokens

The Lens (Daily Affirmation)

Component Input Tokens Output Tokens
System prompt (cached) ~400
User context + goals ~200
Generated affirmation ~100
Total per daily affirmation ~600 ~100

With caching: ~240 billable input tokens

Monthly Usage Per Active User Profile

Free user (3 Turns/day, 2 Mirror sessions/week, daily Lens):

Feature Sessions/Month Billable Input Tokens Output Tokens
Kaleidoscope 90 Turns 32,400 40,500
Mirror 8 sessions 7,840 8,400
Lens 30 affirmations 7,200 3,000
Total 47,440 51,900

Cost with Haiku 4.5: (47,440 × $1.00 + 51,900 × $5.00) / 1,000,000 = $0.047 + $0.260 = $0.31/month

Prism subscriber (unlimited usage, assume 2× free user + Spectrum):

Feature Sessions/Month Billable Input Tokens Output Tokens
Kaleidoscope 180 Turns 64,800 81,000
Mirror 16 sessions 15,680 16,800
Lens 30 affirmations 7,200 3,000
Spectrum (batch) 4 analyses 8,000 12,000
Total 95,680 112,800

Cost with Haiku 4.5: $0.096 + $0.564 = $0.66/month Spectrum uses Batch API (50% off): saves ~$0.03, so effective = ~$0.63/month

Reality check: Most users won't hit max usage. Expect average active user cost of $0.15$0.40/month.


3. Infrastructure Stack

Server: Netcup VPS 1000 G12

Spec Value
CPU 4 vCores (AMD EPYC)
RAM 8 GB DDR5 ECC
Storage 256 GB NVMe
Bandwidth Unlimited, 2.5 Gbps
Location Nuremberg, Germany
Price €8.45/month (~$9.20)

This runs everything: API server, database, Redis cache, reverse proxy. Comfortably handles hundreds of concurrent users. Can upgrade to VPS 2000 (€15.59/mo) when we outgrow it.

What runs on this box:

  • Node.js / Express API server (or Fastify for speed)
  • PostgreSQL 16 (direct install, not Supabase overhead)
  • Redis (session cache, rate limiting, prompt cache keys)
  • Nginx (reverse proxy, SSL termination, rate limiting)
  • Certbot (free SSL via Let's Encrypt)

Why NOT Supabase Cloud

Supabase Cloud Pro is $25/month — that's 3× our VPS cost and we'd still need a separate server for the API layer. Self-hosting Supabase via Docker is possible but adds ~2GB RAM overhead for all the services (GoTrue, PostgREST, Realtime, Storage, Kong). On an 8GB VPS, that leaves very little room.

Instead: Run PostgreSQL directly. We get all the database functionality we need (Row Level Security, triggers, functions, JSON support) without the Supabase services overhead. We build our own auth layer (JWT-based, simple) and our own API. This is leaner, cheaper, and gives us full control.

If we later want Supabase features (real-time subscriptions, storage), we can self-host just the components we need.

Domain & DNS

Item Cost
kalei.ai domain ~$5070/year (~$5/month)
Cloudflare DNS (free tier) $0
Cloudflare CDN/DDoS (free tier) $0

App Deployment & Distribution

Item Cost
Expo / EAS Build (free tier) $0 (limited builds, queue wait)
Apple Developer Program $99/year (~$8.25/month)
Google Play Developer $25 one-time
Push Notifications (Firebase Cloud Messaging) $0

Build strategy: Use Expo free tier for development. For production releases, use EAS free tier (low priority queue, ~30 min wait) or build locally. 24 builds per month is fine for the free tier.

Email & Transactional

Item Cost
Resend (transactional email, free tier) $0 (up to 100 emails/day)
Or Brevo free tier $0 (300 emails/day)

Monitoring & Error Tracking

Item Cost
Sentry (free tier) $0 (5K errors/month)
UptimeRobot (free tier) $0 (50 monitors)
Custom logging to PostgreSQL $0

4. Total Monthly Cost Breakdown

Phase 0: Pre-Launch / Development (Months 13)

Item Monthly Cost
Netcup VPS 1000 G12 €8.45
Domain (kalei.ai) ~€5.00
Claude API (dev/testing, ~$5 credit free) €0
Expo Free Tier €0
Cloudflare, Sentry, email €0
Total ~€13.50/month

Upfront costs: Apple Developer ($99) + Google Play ($25) + Domain (~$55/year) = ~€180 one-time

Phase 1: Launch (0500 users, ~50 DAU)

Assuming 50 daily active users, ~200 registered:

Item Monthly Cost
Netcup VPS 1000 G12 €8.45
Domain ~€5.00
Claude API (~50 active × $0.25 avg) ~€12.50
Expo Free Tier €0
Infrastructure (Cloudflare, etc.) €0
Total ~€26/month

Phase 2: Traction (5002,000 users, ~200 DAU)

Item Monthly Cost
Netcup VPS 2000 G12 (upgrade) €15.59
Domain ~€5.00
Claude API (~200 active × $0.25 avg) ~€50.00
Expo Starter (if needed for OTA updates) €19.00
Email (may need paid tier) €010
Total ~€90100/month

Phase 3: Growth (2,00010,000 users, ~1,000 DAU)

Item Monthly Cost
Netcup VPS 4000 G12 €26.18
Domain ~€5.00
Claude API (~1,000 active × $0.25 avg) ~€250.00
Expo Production plan €99.00
Email paid tier ~€20
Sentry paid (if needed) ~€26
Total ~€425/month

At this point, AI cost is 60% of total spend. This is where the Groq/open-source fallback for simple tasks starts saving real money.


5. Pricing Reevaluation

The Old Price: $7.99/month (Prism)

Based on the cost model above, let's check if this works:

At Phase 1 (50 DAU, ~10 paying subscribers):

  • Revenue: 10 × $7.99 = $79.90
  • Costs: ~$28
  • Margin: +$52 (65%)

At Phase 2 (200 DAU, ~40 paying subscribers @ 20% conversion):

  • Revenue: 40 × $7.99 = $319.60
  • Costs: ~$100
  • Margin: +$220 (69%)

At Phase 3 (1,000 DAU, ~150 paying subscribers @ 15% conversion):

  • Revenue: 150 × $7.99 = $1,198.50
  • Costs: ~$425
  • Margin: +$773 (65%)

The margins are healthy. But $7.99 feels like a lot for a brand-new app from an unknown brand in a competitive wellness space. Users compare against Headspace ($12.99), Calm ($14.99), but those have massive brand recognition and content libraries.

The New Price: $4.99/month (Prism)

Why $4.99:

  • Psychological barrier is much lower — impulse-buy territory
  • Significantly undercuts major competitors while offering AI personalization they don't have
  • At $0.63/month cost per Prism subscriber, the margin is still 87%
  • Annual option: $39.99/year ($3.33/month) — strong incentive to commit
  • Free tier remains generous enough to demonstrate value (3 Turns/day, 2 Mirror/week)

Revised projections at $4.99:

Phase Paying Users Monthly Revenue Monthly Cost Margin
Phase 1 15 (higher conversion at lower price) $74.85 ~$28 +$47 (62%)
Phase 2 60 $299.40 ~$100 +$200 (67%)
Phase 3 250 $1,247.50 ~$450 +$798 (64%)

The lower price likely drives higher conversion, so net revenue is similar or better. And the margin stays well above 60% at every stage.

Alternative: Tiered Pricing

Tier Price What You Get
Free $0 3 Turns/day, 2 Mirror/week, basic Lens, 30-day Gallery
Prism $4.99/mo Unlimited Turns + Mirror, advanced reframe styles, full Gallery, fragment tracking
Prism+ $9.99/mo Everything in Prism + full Spectrum dashboard, weekly/monthly AI insights, export, priority processing

This is smart because Spectrum is the most expensive feature (batch AI analysis of historical data) and the most valuable retention tool. Gating it behind a higher tier means only your most engaged (and willing-to-pay) users generate that cost, and they're paying for it.


6. Revenue Milestones & Sustainability

Break-Even Analysis

Monthly fixed costs (Phase 1): ~€16 (VPS + domain) Variable cost per active user: ~€0.25

Break-even on fixed costs alone: 4 Prism subscribers at $4.99 cover the infrastructure.

To cover Apple's annual fee ($99) and Google ($25 amortized): add ~$10/month → total of 6 subscribers to fully break even.

Path to Sustainability

Milestone Users Paying MRR Costs Profit
Month 3 100 5 $25 $28 -$3
Month 6 500 30 $150 $45 +$105
Month 9 1,500 80 $400 $85 +$315
Month 12 3,000 200 $1,000 $200 +$800
Month 18 8,000 600 $3,000 $500 +$2,500

The model becomes self-sustaining around month 45 with ~15 paying subscribers.


7. Technical Architecture Summary

┌─────────────────────────────────────────────────────┐
│                   CLIENTS                            │
│         React Native (iOS + Android)                 │
│              via Expo / EAS                          │
└──────────────────┬──────────────────────────────────┘
                   │ HTTPS
                   ▼
┌─────────────────────────────────────────────────────┐
│              CLOUDFLARE (Free Tier)                  │
│        DNS · CDN · DDoS Protection · SSL            │
└──────────────────┬──────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────────┐
│        NETCUP VPS 1000 G12 (€8.45/mo)              │
│                                                      │
│  ┌──────────┐  ┌───────────┐  ┌──────────────────┐  │
│  │  Nginx   │→ │  Node.js  │→ │   PostgreSQL 16  │  │
│  │ (proxy)  │  │  API      │  │   (all app data) │  │
│  └──────────┘  └─────┬─────┘  └──────────────────┘  │
│                      │         ┌──────────────────┐  │
│                      │         │     Redis        │  │
│                      │         │  (cache/sessions)│  │
│                      │         └──────────────────┘  │
└──────────────────────┼──────────────────────────────┘
                       │ API Calls
                       ▼
        ┌──────────────────────────────┐
        │     ANTHROPIC API            │
        │                              │
        │  Haiku 4.5 (primary)         │
        │  • Mirror fragment detection │
        │  • Kaleidoscope reframes     │
        │  • Lens affirmations         │
        │  • Crisis detection          │
        │                              │
        │  Haiku 4.5 Batch (50% off)   │
        │  • Spectrum weekly analysis   │
        │  • Monthly insights          │
        │  • Growth trajectory calc    │
        └──────────────────────────────┘
                       │
          (Future fallback if costs spike)
                       │
        ┌──────────────────────────────┐
        │     GROQ / OPENROUTER        │
        │  Qwen3-32B ($0.29/MTok)      │
        │  • Simple affirmations       │
        │  • Basic Lens content        │
        │  • Non-critical generation   │
        └──────────────────────────────┘

Key Technical Decisions

Auth: Custom JWT-based auth built into our Node.js API. Uses bcrypt for password hashing, short-lived access tokens (15 min) + long-lived refresh tokens stored in PostgreSQL. Social login (Apple Sign-In, Google) via their SDKs — free.

Database schema: PostgreSQL with Row Level Security policies. Tables for users, mirror_sessions, mirror_fragments, turns, lens_goals, spectrum_analyses. All user content encrypted at rest (PostgreSQL pgcrypto extension).

AI request pipeline:

  1. Client sends user text to our API
  2. API constructs prompt with cached system prompt + user context
  3. API calls Claude Haiku 4.5, streams response back to client
  4. API logs token usage for cost tracking
  5. Response stored in PostgreSQL for Spectrum analysis

Rate limiting: Redis-based. Free tier: 3 Turns/day, 2 Mirror/week enforced server-side. Prism: unlimited but soft-capped at 50 Turns/day to prevent abuse (99.9% of users will never hit this).

Prompt caching strategy: System prompts for each feature (Mirror, Kaleidoscope, Lens) are designed to be identical across users. Only the user's specific content changes. With Anthropic's prompt caching, the system prompt (600800 tokens) is cached and subsequent calls within a 5-minute window cost only 10% of the base input rate. This cuts effective input costs by ~4050%.


8. Cost Control Safeguards

These prevent a surprise API bill from killing the project:

  1. Hard spending cap on Anthropic API dashboard (start at $50/month, increase as revenue grows)
  2. Per-user daily token budget tracked in Redis. If a user somehow generates excessive requests, they get a "take a break" message (fits the wellness brand perfectly)
  3. Graceful degradation: If API budget is 80% consumed, route Lens affirmations to local template system (pre-written affirmations, no AI needed). Mirror and Kaleidoscope get priority for remaining budget.
  4. Batch everything possible: Spectrum analysis runs overnight via Batch API (50% off). Weekly insights generated in a single batch job Sunday night.
  5. Monitor daily: Simple Telegram bot alerts if daily API spend exceeds threshold

9. Startup Budget Allocation

With a maximum €2,000 to spend wisely:

Category Amount What It Covers
Apple Developer Account €99 Annual fee, required for App Store
Google Play Developer €25 One-time fee
Domain (kalei.ai, 1 year) ~€55 Annual registration
Netcup VPS (6 months prepaid) ~€51 Runway for half a year of hosting
Claude API credits (initial deposit) €100 Covers dev + testing + first ~300 active user-months
Design assets (fonts, if not free) €050 Inter + custom weight = free. Icon set if needed.
Contingency ~€120 Unexpected costs
Total startup spend ~€450500
Remaining reserve ~€1,500 10+ months of Phase 1 operating costs

This means the €2,000 budget gives us roughly 1214 months of runway before we need a single paying customer. That's extremely comfortable for finding product-market fit.


10. When to Scale (And What Changes)

Trigger Action Cost Impact
>200 concurrent connections Upgrade to VPS 2000 (€15.59) +€7/month
>500 DAU Add Redis Cluster or separate DB VPS +€58/month
>$200/month API spend Implement Groq fallback for Lens Saves ~20% on AI costs
>2,000 DAU Upgrade to VPS 4000 (€26.18) +€10/month
>$500/month API spend Evaluate self-hosted Qwen3-32B on netcup vGPU Only if cost-effective at that volume
>10,000 DAU Consider second VPS for API/DB separation Architecture change
>$2,000/month revenue Consider dedicated server or managed Postgres Comfort/reliability upgrade

The beauty of this architecture is that nothing changes architecturally as we scale — we just give the same VPS more resources, and the API costs scale linearly and predictably with users.


11. Competitive Cost Comparison

To put this in perspective — what would this cost on "standard" startup infrastructure?

Our Stack "Normal" Startup Stack Monthly Cost
Netcup VPS (€8.45) AWS EC2 t3.medium $3550
PostgreSQL on VPS ($0) Supabase Pro or RDS $2550
Redis on VPS ($0) Redis Cloud or ElastiCache $1530
Cloudflare free ($0) AWS CloudFront + ALB $2040
Claude Haiku 4.5 (same) Claude Haiku 4.5 (same) Same
Our total: ~$28/mo Their total: ~$120200/mo

We're running at 2025% of what a "typical" startup would spend by self-hosting on a European VPS instead of defaulting to AWS/GCP.


12. Final Pricing Recommendation

Free Prism Prism+ (Phase 2 launch)
Price $0 $4.99/month $9.99/month
$39.99/year $79.99/year
Turns/day 3 Unlimited Unlimited
Mirror/week 2 Unlimited Unlimited
Lens Basic Full Full
Reframe styles 1 (Compassionate) All 4 All 4
Gallery 30 days Full history Full history
Fragment tracking No Yes Yes
Spectrum No No Full dashboard
Weekly AI insights No No Yes
Growth trajectory No No Yes
Export No Basic Full
Our cost per user ~$0.15 ~$0.45 ~$0.70
Margin N/A (acquisition) 91% 93%

Why This Works

At $4.99, Kalei is:

  • Cheaper than Headspace ($12.99), Calm ($14.99), Woebot (free but limited)
  • More personalized than any of them (AI-powered, not pre-recorded content)
  • Profitable from subscriber #6
  • Self-sustaining from month ~5
  • Fully funded for 12+ months on a €2,000 budget even with zero revenue

The model scales cleanly because AI costs are the only meaningful variable cost, and they scale linearly with usage at a rate that our pricing covers with 90%+ margins.


Last updated: February 2026 All prices include VAT where applicable. USD/EUR conversions at approximate current rates.