# Kalei System Architecture Plan

Version: 1.0
Date: 2026-02-10
Status: Proposed canonical architecture for implementation

## 1. Purpose and Scope

This document consolidates the existing Kalei docs into one implementation-ready system architecture plan.

In scope:
- Core features: Mirror, Kaleidoscope (Turn), Lens, Gallery, Spectrum analytics, subscriptions (all ship in v1).
- Mobile-first architecture (iOS/Android via Expo) with optional web support.
- Production operations for safety, privacy, reliability, and cost control.

Out of scope:
- Pixel-level UI specs and brand copy details.
- Provider contract/legal details.
- Full threat model artifacts (to be produced separately).

## 2. Inputs Reviewed

- `docs/app-blueprint.md`
- `docs/kalei-infrastructure-plan.md`
- `docs/kalei-ai-model-comparison.md`
- `docs/kalei-mirror-feature.md`
- `docs/kalei-spectrum-phase2.md`
- `docs/kalei-complete-design.md`
- `docs/kalei-brand-guidelines.md`

## 3. Architecture Drivers

### 3.1 Product drivers

- Core loop quality: Mirror fragment detection and Turn reframes must feel high quality and emotionally calibrated.
- Daily habit loop: low friction, fast response, strong retention mechanics.
- Over time: longitudinal Spectrum insights from accumulated usage data.

### 3.2 Non-functional drivers

- Safety first: crisis language must bypass reframing and trigger the support flow.
- Privacy first: personal reflective writing is highly sensitive.
- Cost discipline: launch target under ~EUR 30/month fixed infrastructure.
- Operability: architecture must be maintainable by a small team.
- Gradual scale: support ~50 DAU at launch and scale to ~10k DAU without a full rewrite.

## 4. Canonical Decisions

This plan resolves conflicting guidance across current docs.

| Topic | Decision | Rationale |
|---|---|---|
| Backend platform | Self-hosted API-first modular monolith on Node.js (Fastify preferred) | Matches budget constraints and keeps full control of safety, rate limits, and AI routing. |
| Data layer | PostgreSQL 16 + Redis | Postgres for source-of-truth relational + analytics tables; Redis for counters, rate limits, caching, idempotency. |
| Auth | JWT auth service in API + refresh token rotation + social login (Apple/Google) | Aligns with self-hosted stack while preserving mobile auth UX. |
| Mobile | React Native + Expo (local/native builds) | Fastest path for iOS/Android while keeping build pipeline under direct control. |
| AI integration | AI Gateway abstraction via OpenRouter with provider pinning | Single API, automatic failover, no vendor lock-in, and deterministic routing to non-Chinese providers for data privacy. |
| AI default | DeepSeek V3.2 via OpenRouter, hosted on DeepInfra/Fireworks (US/EU infrastructure) | 85–90% cheaper than Claude Haiku with comparable emotional intelligence benchmarks. Provider pinning ensures no data flows through Chinese servers. |
| AI fallback | Claude Haiku 4.5 via OpenRouter (automatic failover on provider outage) | Highest-quality safety net activated transparently when primary provider is unavailable. |
| Billing | Self-hosted entitlement authority (direct App Store + Google Play server APIs) | Keeps billing logic in-house and avoids closed SaaS dependency in core authorization path. |
| Analytics/monitoring | PostHog self-hosted + GlitchTip + centralized app logs + cost telemetry | Open-source-first observability stack with lower vendor lock-in. |

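The AI integration, default, and fallback rows can be read as an ordered list of routing "lanes". The following is a minimal sketch, not the gateway's actual implementation. It assumes OpenRouter's provider-routing request options (`provider.order`, `provider.allow_fallbacks`); the model slugs, the provider display names, and the helpers `buildAttempts` and `completeWithFailover` are hypothetical and should be checked against the OpenRouter model catalogue before use. Failover is modelled here inside the gateway; whether it is done there or delegated to OpenRouter is left open.

```typescript
// Hypothetical types for illustration; only `model`, `messages`, and
// `provider.{order,allow_fallbacks}` are meant to mirror OpenRouter fields.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface RoutedRequest {
  model: string;
  messages: ChatMessage[];
  provider?: { order: string[]; allow_fallbacks: boolean };
}

interface Lane {
  model: string;        // assumed model slug, verify before use
  providers?: string[]; // hosts the request is pinned to, if any
}

// Primary lane pinned to US/EU hosts, then the Claude Haiku fallback lane.
const LANES: Lane[] = [
  { model: "deepseek/deepseek-v3.2", providers: ["DeepInfra", "Fireworks"] },
  { model: "anthropic/claude-haiku-4.5" },
];

// Build one request body per lane, in priority order.
function buildAttempts(messages: ChatMessage[], lanes: Lane[] = LANES): RoutedRequest[] {
  return lanes.map((lane) => {
    const req: RoutedRequest = { model: lane.model, messages };
    if (lane.providers) {
      // allow_fallbacks: false keeps the request on the pinned hosts only.
      req.provider = { order: lane.providers, allow_fallbacks: false };
    }
    return req;
  });
}

// Try each lane in order; `send` is injected so the sketch needs no HTTP client.
async function completeWithFailover<T>(
  attempts: RoutedRequest[],
  send: (req: RoutedRequest) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no attempts configured");
  for (const req of attempts) {
    try {
      return await send(req);
    } catch (err) {
      lastError = err; // fall through to the next lane
    }
  }
  throw lastError;
}
```

Keeping the lanes as data means a later move to tiered model routing changes the `LANES` table rather than the call sites.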
## 5. System Context

```mermaid
flowchart LR
  user[User] --> app[Expo App]
  app --> edge[Edge Proxy]
  edge --> api[Kalei API]
  api --> db[(PostgreSQL)]
  api --> redis[(Redis)]
  api --> ai[AI Providers]
  api --> billing[Store Entitlements]
  api --> push[Push Gateway]
  api --> obs[Observability]
  app --> analytics[Product Analytics]
```

## 6. Container Architecture

```mermaid
flowchart TB
  subgraph Client
    turn[Turn Screen]
    mirror[Mirror Screen]
    lens[Lens Screen]
    spectrum_ui[Spectrum Dashboard]
    profile_ui[Gallery and Profile]
  end

  subgraph Platform
    gateway[API Gateway and Auth]
    turn_service[Turn Service]
    mirror_service[Mirror Service]
    lens_service[Lens Service]
    spectrum_service[Spectrum Service]
    safety_service[Safety Service]
    entitlement_service[Entitlement Service]
    jobs[Job Scheduler and Workers]
    ai_gateway[AI Gateway]
    cost_guard[Usage Meter and Cost Guard]
  end

  subgraph Data
    postgres[(PostgreSQL)]
    redis[(Redis)]
    object_storage[(Object Storage)]
  end

  subgraph External
    ai_provider[DeepSeek V3.2 via OpenRouter + DeepInfra/Fireworks + Claude Haiku fallback]
    store_billing[App Store and Play Billing APIs]
    push_provider[APNs and FCM]
    glitchtip[GlitchTip]
    posthog[PostHog self-hosted]
  end

  turn --> gateway
  mirror --> gateway
  lens --> gateway
  spectrum_ui --> gateway
  profile_ui --> gateway

  gateway --> turn_service
  gateway --> mirror_service
  gateway --> lens_service
  gateway --> spectrum_service
  gateway --> entitlement_service

  mirror_service --> safety_service
  turn_service --> safety_service
  lens_service --> safety_service
  spectrum_service --> safety_service

  turn_service --> ai_gateway
  mirror_service --> ai_gateway
  lens_service --> ai_gateway
  spectrum_service --> ai_gateway
  ai_gateway --> ai_provider

  turn_service --> cost_guard
  mirror_service --> cost_guard
  lens_service --> cost_guard
  spectrum_service --> cost_guard

  turn_service --> postgres
  mirror_service --> postgres
  lens_service --> postgres
  spectrum_service --> postgres
  entitlement_service --> postgres
  jobs --> postgres

  turn_service --> redis
  mirror_service --> redis
  lens_service --> redis
  spectrum_service --> redis
  cost_guard --> redis
  jobs --> redis

  entitlement_service --> store_billing
  jobs --> push_provider
  gateway --> glitchtip
  gateway --> posthog
  spectrum_service --> object_storage
```

## 7. Domain and Service Boundaries

### 7.1 Runtime modules

- `auth`: sign-up/sign-in, token issuance/rotation, device session management.
- `entitlements`: direct App Store + Google Play sync, plan gating (`free`, `prism`, `prism_plus`).
- `mirror`: session lifecycle, message ingestion, fragment detection, inline reframe, reflection.
- `turn`: structured reframing workflow and saved patterns.
- `lens`: goals, actions, daily focus generation, check-ins.
- `spectrum`: analytics feature store, weekly/monthly aggregation, insight generation.
- `safety`: crisis detection, escalation, crisis response policy.
- `ai_gateway`: prompt templates, OpenRouter API integration with provider pinning (DeepInfra/Fireworks primary, Claude Haiku fallback), retries/timeouts, structured output validation.
- `usage_cost`: token telemetry, per-user budgets, global spend controls.
- `notifications`: push scheduling, reminders, weekly summaries.

### 7.2 Why modular monolith first

- Lowest operational overhead at launch.
- Strong transaction boundaries in one codebase.
- Easy extraction path later for `spectrum` workers or `ai_gateway` if load increases.

## 8. Core Data Architecture

### 8.1 Data domains

- Identity: users, profiles, auth_sessions, refresh_tokens.
- Product interactions: turns, mirror_sessions, mirror_messages, mirror_fragments, lens_goals, lens_actions.
- Analytics: spectrum_session_analysis, spectrum_turn_analysis, spectrum_weekly, spectrum_monthly.
- Commerce: subscriptions, entitlement_snapshots, billing_events.
- Safety and operations: safety_events, ai_usage_events, request_logs, audit_events.

### 8.2 Entity relationship view

```mermaid
flowchart LR
  users[USERS] --> profiles[PROFILES]
  users --> auth_sessions[AUTH_SESSIONS]
  users --> refresh_tokens[REFRESH_TOKENS]
  users --> turns[TURNS]
  users --> mirror_sessions[MIRROR_SESSIONS]
  mirror_sessions --> mirror_messages[MIRROR_MESSAGES]
  mirror_messages --> mirror_fragments[MIRROR_FRAGMENTS]
  users --> lens_goals[LENS_GOALS]
  lens_goals --> lens_actions[LENS_ACTIONS]
  users --> spectrum_session[SPECTRUM_SESSION_ANALYSIS]
  users --> spectrum_turn[SPECTRUM_TURN_ANALYSIS]
  users --> spectrum_weekly[SPECTRUM_WEEKLY]
  users --> spectrum_monthly[SPECTRUM_MONTHLY]
  users --> subscriptions[SUBSCRIPTIONS]
  users --> entitlement[ENTITLEMENT_SNAPSHOTS]
  users --> safety_events[SAFETY_EVENTS]
  users --> ai_usage[AI_USAGE_EVENTS]
```

### 8.3 Storage policy

- Raw reflective content remains in transactional tables, encrypted at rest.
- Spectrum dashboard reads aggregated tables only by default.
- Per-session exclusion flags allow users to opt entries out of analytics.
- Hard delete workflow removes both raw data and derived analytics for the requested windows.

## 9. Key Runtime Sequences

### 9.1 Mirror message processing with safety gate

```mermaid
sequenceDiagram
  participant App as Mobile App
  participant API as Kalei API
  participant Safety as Safety Service
  participant Ent as Entitlement Service
  participant AI as AI Gateway
  participant Model as AI Provider
  participant DB as PostgreSQL
  participant Redis as Redis

  App->>API: POST /mirror/messages
  API->>Ent: Check plan/quota
  Ent->>Redis: Read counters
  Ent-->>API: Allowed
  API->>Safety: Crisis precheck
  alt Crisis detected
    Safety->>DB: Insert safety_event
    API-->>App: Crisis resources response
  else Not crisis
    API->>AI: Detect fragments prompt
    AI->>Model: Inference request
    Model-->>AI: Fragments with confidence
    AI-->>API: Validated structured result
    API->>DB: Save message + fragments
    API->>Redis: Increment usage counters
    API-->>App: Highlight payload
  end
```

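The ordering in the Mirror sequence (entitlement check, then crisis precheck, then AI, then persistence) can be sketched as a single handler. This is an illustrative outline only: the `MirrorDeps` interface, the `handleMirrorMessage` name, and the `quota_denied` result are assumptions introduced here, and every collaborator is injected so the sketch does not presume a particular database or Redis client.

```typescript
// Hypothetical shapes; none of these names come from the Kalei docs.
interface Fragment { text: string; confidence: number }

type MirrorResult =
  | { kind: "quota_denied" }
  | { kind: "crisis"; resources: string[] }
  | { kind: "highlights"; fragments: Fragment[] };

interface MirrorDeps {
  checkQuota(userId: string): Promise<boolean>;        // Entitlement Service + Redis counters
  isCrisis(text: string): Promise<boolean>;            // Safety Service precheck
  recordSafetyEvent(userId: string): Promise<void>;    // insert into safety_events
  detectFragments(text: string): Promise<Fragment[]>;  // AI Gateway, validated output
  saveMessage(userId: string, text: string, fragments: Fragment[]): Promise<void>;
  incrementUsage(userId: string): Promise<void>;       // Redis usage counters
  crisisResources: string[];                           // hardcoded, never model-generated
}

async function handleMirrorMessage(
  deps: MirrorDeps,
  userId: string,
  text: string,
): Promise<MirrorResult> {
  // 1. Plan/quota check comes first, as in the diagram.
  if (!(await deps.checkQuota(userId))) return { kind: "quota_denied" };

  // 2. Safety gate: on crisis, log the event and return without calling the model.
  if (await deps.isCrisis(text)) {
    await deps.recordSafetyEvent(userId);
    return { kind: "crisis", resources: deps.crisisResources };
  }

  // 3. Non-crisis path: detect, persist, meter, respond.
  const fragments = await deps.detectFragments(text);
  await deps.saveMessage(userId, text, fragments);
  await deps.incrementUsage(userId);
  return { kind: "highlights", fragments };
}
```

Because the crisis branch returns before `detectFragments` is reached, the "never reframed" policy is enforced by control flow rather than by prompt instructions.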
### 9.2 Turn (Kaleidoscope) request

```mermaid
sequenceDiagram
  participant App as Mobile App
  participant API as Kalei API
  participant Ent as Entitlement Service
  participant Safety as Safety Service
  participant AI as AI Gateway
  participant Model as AI Provider
  participant DB as PostgreSQL
  participant Cost as Cost Guard

  App->>API: POST /turns
  API->>Ent: Validate tier + daily cap
  API->>Safety: Crisis precheck
  alt Crisis detected
    API-->>App: Crisis resources response
  else Safe
    API->>AI: Generate 3 reframes + micro-action
    AI->>Model: Inference stream
    Model-->>AI: Structured reframes
    AI-->>API: Response + token usage
    API->>Cost: Record token usage + budget check
    API->>DB: Save turn + metadata
    API-->>App: Stream final turn result
  end
```

### 9.3 Weekly Spectrum aggregation (background)

```mermaid
sequenceDiagram
  participant Cron as Scheduler
  participant Worker as Spectrum Worker
  participant DB as PostgreSQL
  participant AI as AI Gateway
  participant Model as Batch Provider
  participant Push as Notification Service

  Cron->>Worker: Trigger weekly job
  Worker->>DB: Load eligible users + raw events
  Worker->>DB: Compute vectors and weekly aggregates
  Worker->>AI: Generate insight narratives from aggregates
  AI->>Model: Batch request
  Model-->>AI: Insight text
  AI-->>Worker: Validated summaries
  Worker->>DB: Upsert spectrum_weekly and monthly deltas
  Worker->>Push: Enqueue spectrum updated notifications
```

## 10. API Surface (v1)

### 10.1 Auth and profile

- `POST /auth/register`
- `POST /auth/login`
- `POST /auth/refresh`
- `POST /auth/logout`
- `GET /me`
- `PATCH /me/profile`

### 10.2 Mirror

- `POST /mirror/sessions`
- `POST /mirror/messages`
- `POST /mirror/fragments/{id}/reframe`
- `POST /mirror/sessions/{id}/close`
- `GET /mirror/sessions`
- `DELETE /mirror/sessions/{id}`

### 10.3 Turn

- `POST /turns`
- `GET /turns`
- `GET /turns/{id}`
- `POST /turns/{id}/save`

### 10.4 Lens

- `POST /lens/goals`
- `GET /lens/goals`
- `POST /lens/goals/{id}/actions`
- `POST /lens/actions/{id}/complete`
- `GET /lens/affirmation/today`

### 10.5 Spectrum

- `GET /spectrum/weekly`
- `GET /spectrum/monthly`
- `POST /spectrum/reset`
- `POST /spectrum/exclusions`

### 10.6 Billing and entitlements

- `POST /billing/webhooks/apple`
- `POST /billing/webhooks/google`
- `GET /billing/entitlements`

## 11. Security, Safety, and Compliance Architecture

### 11.1 Security controls

- TLS everywhere (edge proxy to API origin and service egress).
- JWT access tokens (short TTL) + rotating refresh tokens.
- Password hashing with Argon2id (preferred) or bcrypt with strong cost factor.
- Row ownership checks enforced in the API, with optional database row-level security (RLS) for defense in depth.
- Secrets in environment vault; never in client bundle.
- Audit logging for auth events, entitlement changes, deletes, and safety events.

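Refresh token rotation is listed above without detail. One common way to implement it is single-use refresh tokens with reuse detection, sketched below with an in-memory map. The reuse-detection behaviour, the `RefreshTokenStore` class, and the injected `newToken` generator are assumptions made for illustration; a real implementation would store token hashes in the `refresh_tokens` table and generate tokens from a cryptographically secure source.

```typescript
// Hypothetical in-memory sketch; production code would persist hashed tokens.
interface RefreshRecord { sessionId: string; used: boolean }

class RefreshTokenStore {
  private records = new Map<string, RefreshRecord>();
  private revokedSessions = new Set<string>();
  private newToken: () => string;

  // The token generator is injected; it must be a CSPRNG-backed function in practice.
  constructor(newToken: () => string) {
    this.newToken = newToken;
  }

  // Issue the first refresh token for a device session.
  issue(sessionId: string): string {
    const token = this.newToken();
    this.records.set(token, { sessionId, used: false });
    return token;
  }

  // Exchange a refresh token for a new one. Returns null when the token is
  // unknown, belongs to a revoked session, or has already been used.
  rotate(presented: string): string | null {
    const rec = this.records.get(presented);
    if (!rec || this.revokedSessions.has(rec.sessionId)) return null;
    if (rec.used) {
      // A used token showing up again suggests theft: revoke the whole session.
      this.revokedSessions.add(rec.sessionId);
      return null;
    }
    rec.used = true;
    return this.issue(rec.sessionId);
  }
}
```

Revoking the whole session on reuse forces both the legitimate device and any attacker back through login, which is the usual trade-off accepted for this pattern.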
### 11.2 Data protection

- Encryption at rest for disk volumes and database backups.
- Column-level encryption for highly sensitive text fields (Mirror message content).
- Data minimization for analytics: Spectrum reads vectors and aggregates by default.
- User rights flows: export, per-item delete, account delete, Spectrum reset.

### 11.3 Safety architecture

- Multi-stage crisis filter:
  1. Deterministic keyword and pattern pass.
  2. Low-latency model confirmation where needed.
  3. Hardcoded crisis response templates and hotline resources.
- Crisis-level content is never reframed.
- Safety events are logged and monitored for false-positive/false-negative tuning.

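The three stages of the crisis filter can be sketched as follows. The pattern lists are placeholders rather than a vetted lexicon, and the names `crisisPrecheck`, `respond`, and `CRISIS_TEMPLATE` are hypothetical. The sketch also assumes the filter fails closed (treats the message as a crisis) when the model confirmation call errors, which the plan does not state explicitly.

```typescript
type SafetyVerdict = "crisis" | "safe";

// Placeholder patterns for illustration only; a real lexicon needs clinical review.
const HARD_PATTERNS: RegExp[] = [/\bkill myself\b/i, /\bend my life\b/i];
const SOFT_PATTERNS: RegExp[] = [/\bhopeless\b/i, /\bcan't go on\b/i];

// Stage 3 content is static text, never model output.
const CRISIS_TEMPLATE = "PLACEHOLDER: hardcoded support message and hotline resources";

async function crisisPrecheck(
  text: string,
  confirm: (text: string) => Promise<boolean>, // stage 2: low-latency model call, injected
): Promise<SafetyVerdict> {
  // Stage 1: deterministic pass. Unambiguous matches skip the model entirely.
  if (HARD_PATTERNS.some((p) => p.test(text))) return "crisis";

  // Stage 2: ambiguous language is escalated to a model for confirmation.
  if (SOFT_PATTERNS.some((p) => p.test(text))) {
    try {
      return (await confirm(text)) ? "crisis" : "safe";
    } catch {
      return "crisis"; // assumption: fail closed if confirmation is unavailable
    }
  }
  return "safe";
}

// Stage 3: crisis verdicts get the hardcoded template; reframing is never invoked.
function respond(verdict: SafetyVerdict, reframe: () => string): string {
  return verdict === "crisis" ? CRISIS_TEMPLATE : reframe();
}
```

Passing `reframe` as a thunk means the reframing code is not even evaluated on the crisis path.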
## 12. Reliability and Performance

### 12.1 Initial SLO targets

- API availability: 99.5% monthly at launch, 99.9% target at scale.
- Turn and Mirror response latency:
  - p50 < 1.8s
  - p95 < 3.5s
- Weekly Spectrum jobs completed within 2 hours of scheduled run.

### 12.2 Resilience patterns

- Idempotency keys on write endpoints.
- AI provider timeout + retry policy with circuit breaker.
- Graceful degradation hierarchy when budget/latency pressure occurs:
  1. Degrade Lens generation first (template fallback).
  2. Keep Turn and Mirror available.
  3. Pause non-critical Spectrum generation if needed.
- Dead-letter queue for failed async jobs.

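The degradation hierarchy can be expressed as a small lookup from pressure level to per-feature mode. This is a sketch under assumptions: the three numeric pressure levels, the `Mode` names, and the `degradationPlan` function are introduced here for illustration, and how pressure is measured is left to the cost guard and latency monitoring.

```typescript
type Feature = "lens" | "turn" | "mirror" | "spectrum";
type Mode = "full" | "template_fallback" | "paused";

// Pressure levels are hypothetical: 0 = normal, 1 = elevated, 2 = severe.
type Pressure = 0 | 1 | 2;

function degradationPlan(pressure: Pressure): Record<Feature, Mode> {
  return {
    // Step 1: Lens is the first feature to fall back to templates.
    lens: pressure >= 1 ? "template_fallback" : "full",
    // Step 2: Turn and Mirror stay fully available at every level.
    turn: "full",
    mirror: "full",
    // Step 3: Spectrum generation is paused only under severe pressure.
    spectrum: pressure >= 2 ? "paused" : "full",
  };
}
```

For example, `degradationPlan(1)` leaves Spectrum running while Lens serves templates, and `degradationPlan(2)` additionally pauses Spectrum.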
## 13. Observability and FinOps

### 13.1 Telemetry

- Structured logs with request ID, user ID hash, feature, model, token usage, cost.
- Metrics:
  - request rate/error rate/latency by endpoint
  - AI token usage and cost by feature
  - quota denials and safety escalations
- Tracing across API -> AI Gateway -> provider call.

### 13.2 Cost controls

- Global monthly AI spend cap and alert thresholds (50%, 80%, 95%).
- Per-user daily token budget in Redis.
- Feature-level cost envelope with OpenRouter provider routing:
  - All features: DeepSeek V3.2 via DeepInfra/Fireworks (US/EU, $0.26/$0.38 per MTok)
  - Automatic failover: Claude Haiku 4.5 on provider outage ($1.00/$5.00 per MTok)
  - Future: introduce tiered model routing at 5,000+ DAU when usage data justifies complexity
- Prompt caching for stable system prompts (DeepInfra ~20% cache hit discount).

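The per-call cost and the spend-cap alerts reduce to simple arithmetic. The sketch below assumes the quoted price pairs are input/output prices in USD per million tokens, which the list does not state explicitly; the names `callCostUsd` and `crossedThresholds` are hypothetical.

```typescript
interface Price { inputPerMTok: number; outputPerMTok: number }

// Assumption: the "$a/$b per MTok" pairs above mean input/output prices in USD.
const DEEPSEEK_V32: Price = { inputPerMTok: 0.26, outputPerMTok: 0.38 };
const CLAUDE_HAIKU_45: Price = { inputPerMTok: 1.0, outputPerMTok: 5.0 };

// Cost of a single call in USD, from token counts reported by the provider.
function callCostUsd(price: Price, inputTokens: number, outputTokens: number): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1e6;
}

// Alert thresholds as fractions of the global monthly cap (50%, 80%, 95%).
const ALERT_THRESHOLDS: number[] = [0.5, 0.8, 0.95];

// Thresholds newly crossed when month-to-date spend moves from
// spentBefore to spentAfter, so each alert fires only once.
function crossedThresholds(spentBefore: number, spentAfter: number, cap: number): number[] {
  return ALERT_THRESHOLDS.filter((t) => spentBefore < t * cap && spentAfter >= t * cap);
}
```

Under these assumptions a call with 1,000 input and 500 output tokens costs about $0.00045 on the primary lane and about $0.0035 on the fallback lane, roughly an eightfold difference, which is why time spent on the fallback is worth tracking in cost telemetry.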
## 14. Deployment Topology and Scaling Path

### 14.1 Launch deployment (single-node)

```mermaid
flowchart LR
  EDGE[Caddy or Nginx Edge] --> NX[Nginx]
  NX --> API[API + Workers]
  API --> PG[(PostgreSQL)]
  API --> R[(Redis)]
  API --> AIP[AI Providers]
```

### 14.2 Scaling evolution

```mermaid
flowchart LR
  launch[Launch single VPS API DB Redis] --> traction[Traction split DB keep API monolith]
  traction --> growth[Growth separate workers and scale API]
  growth --> scale[Scale optional service extraction]
```

### 14.3 Trigger-based scaling

- Move DB off app node when p95 query latency > 120ms sustained or storage > 70%.
- Add API replica when CPU > 70% sustained at peak and p95 latency breaches SLO.
- Split workers when Spectrum jobs impact interactive endpoints.

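The scaling triggers can be written as a pure function over already-aggregated metrics, which keeps them easy to test and review. This is a sketch under assumptions: the `NodeMetrics` field names are hypothetical, each value is assumed to be a sustained-window aggregate (the plan does not define "sustained"), and the default latency limit reuses the 3.5s p95 target from the SLO section.

```typescript
// Hypothetical metric snapshot; values are assumed to be sustained-window aggregates.
interface NodeMetrics {
  dbP95QueryMs: number;
  dbStorageUsedRatio: number;   // 0..1
  apiPeakCpuRatio: number;      // 0..1
  apiP95LatencyMs: number;
  spectrumJobsImpactInteractive: boolean;
}

type ScalingAction = "move_db_off_app_node" | "add_api_replica" | "split_workers";

function scalingActions(m: NodeMetrics, sloP95LatencyMs: number = 3500): ScalingAction[] {
  const actions: ScalingAction[] = [];
  // DB trigger: either condition alone is sufficient.
  if (m.dbP95QueryMs > 120 || m.dbStorageUsedRatio > 0.7) {
    actions.push("move_db_off_app_node");
  }
  // API trigger: both conditions must hold.
  if (m.apiPeakCpuRatio > 0.7 && m.apiP95LatencyMs > sloP95LatencyMs) {
    actions.push("add_api_replica");
  }
  // Worker trigger: Spectrum jobs are degrading interactive endpoints.
  if (m.spectrumJobsImpactInteractive) {
    actions.push("split_workers");
  }
  return actions;
}
```

Note the asymmetry between the first two rules: high CPU alone does not justify a replica unless latency also breaches the SLO.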
## 15. Delivery Plan

All features ship in a single unified v1 release. The build is a continuous 12-week effort:

### 15.1 Weeks 1–4: Platform Foundation

- API skeleton, auth, profile, entitlements integration.
- Postgres schema v1 and migrations.
- Mirror + Turn endpoints with safety pre-check.
- Usage metering and rate limiting.

### 15.2 Weeks 5–8: Core Experience

- Lens flows, Rehearsal, Ritual, Evidence Wall, and Gallery history.
- Push notifications and daily reminders.
- Full observability, alerting, and incident runbooks.
- Beta load testing and security hardening.

### 15.3 Weeks 9–12: Spectrum & Launch Readiness

- Spectrum: vector extraction pipeline, aggregated tables, weekly batch jobs, dashboard endpoints.
- Data exclusion controls and reset workflow.
- Cost optimization pass on AI routing.
- Final QA, store submission, beta launch.

## 16. Risks and Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| Reframe quality variance by provider/model | Core UX degradation | Keep AI Gateway abstraction + blind quality harness + model canary rollout. |
| Safety false negatives | High trust and user harm risk | Defense-in-depth crisis filter + explicit no-reframe crisis policy + monitoring and review loop. |
| AI cost spikes | Margin compression | Hard spend caps, per-feature budgets, degradation order, model fallback lanes. |
| Single-node bottlenecks | Latency and availability issues | Trigger-based scaling plan and early instrumentation. |
| Sensitive data handling errors | Compliance and trust risk | Encryption, strict retention controls, deletion workflows, audit logs. |

## 17. Decision Log and Open Items

### 17.1 Decided in this plan

- Self-hosted API + Postgres + Redis is the canonical launch architecture.
- AI provider routing is built in from day one.
- Safety is an explicit service and gate on all AI-facing paths.
- Spectrum runs asynchronously over aggregated data.

### 17.2 Resolved: AI Provider Strategy (February 2026)

- **Decided:** DeepSeek V3.2 via OpenRouter, pinned to non-Chinese providers (DeepInfra/Fireworks). Single model for all features at launch. Claude Haiku 4.5 as automatic fallback.
- **Rationale:** 85–90% cost reduction vs Claude Haiku. A 2025 Nature study reports comparable emotional intelligence scores. Non-Chinese hosting avoids data sovereignty concerns. A single-model approach minimizes complexity for a solo founder.
- **Revisit at:** 600+ DAU (evaluate self-hosting), 5,000+ DAU (evaluate tiered model routing).

### 17.3 Remaining open decisions

- Exact hosting target for DB scaling at traction stage (dedicated VPS vs managed Postgres).
- Regional crisis resource strategy (US-first or multi-region at launch).

---

If approved, this document should become the architecture source of truth and supersede conflicting details in older planning docs.