pn-new-crm/docs/superpowers/specs/2026-04-29-gws-inbox-triage-design.md
Matt Ciaccio d9557edfc5 docs(spec): GWS inbox-triage exploratory design (not approved for build)
Surveys what it actually takes to ship the AI inbox-triage feature
gated on Google Workspace integration. Walks through three deployment
models with their real costs:

- Model A (Marketplace OAuth app): 4-6 months calendar, $15k-$75k for
  the required CASA security assessment, recurring re-verification
- Model B (per-customer Internal OAuth app): ~5 weeks engineering, $0
  Google-side, scoped to one workspace per customer
- Model C (forward-to-CRM mailbox): ~1 week, receive-only, no reply
  drafts possible

Recommends Model B for the current customer profile, with B → A
promotion only if 3+ customers ask unprompted.

Documents what's already scaffolded (email_accounts/threads/messages
tables, syncInbox stub, BullMQ email queue, ai_usage_ledger, per-port
aiEnabled flag, withRateLimit('ai')) vs what's new (OAuth flow, Pub/
Sub push receiver, gws_user_tokens + email_triage tables, /inbox UI).

End-to-end flow, schema additions, AI cost interaction with the
Phase 3b token budgets, 5-phase build plan (G1-G5), and 5 open
decisions to resolve before scheduling the build. Explicitly out of
scope: M365, sentiment analysis, smart-drafts, cross-staff triage
queue.

No code changes — this is a design doc to drive a go/no-go decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 01:18:15 +02:00


Google Workspace inbox-triage integration (exploratory)

Status: Exploratory (not approved for build)
Date: 2026-04-29
Tracks: AI inbox-triage, Google Workspace email connection

What this spec is for

The user has flagged inbox-triage as the most valuable AI surface left to build, but conditioned email integration on it being via Google Workspace specifically (not generic IMAP), with a per-port toggle so clients who don't use GWS aren't billed for capability they can't reach.

This document captures what that build actually costs — especially on the Google side, which is where most teams underestimate the work — so we can decide whether to commit before writing any code. Nothing in this spec is approved for implementation. The deliverable is a go / no-go decision and, if go, a scope choice between three deployment models that cost wildly different amounts of calendar time.

What inbox-triage actually does for the user

Concretely, on the staff member's desktop:

  1. Linked-inbox panel on the client detail page. When you open /[port]/clients/<id> you see the last N email threads with that client, pulled from the staff member's own Gmail. Each thread has the latest message preview, an "open in Gmail" deep-link, and a "draft reply" button (Phase 2+).
  2. Inbox triage queue. A new top-level page /[port]/inbox that lists unread/unanswered threads ranked by AI-assessed importance (high-value client, contractual urgency, chase-overdue). Each row has one-click actions: "log this as a note on the client", "create a follow-up reminder", "draft reply".
  3. Email-driven alerts. When a high-value client emails and no one responds within X hours, the existing alerts engine fires an inbox.unanswered_high_value rule (slots into the alert framework from Phase B without schema change).
  4. Reply drafts (Phase 3). AI generates a reply draft grounded in the client's CRM record (open interests, pending reservations, recent invoices). Staff edit and send through Gmail.

The value is selective: a port with three staff members fielding 50 client emails a day saves maybe an hour a day collectively if the ranking is right. Below that volume the build doesn't pay back.

What already exists in the codebase

The CRM is roughly halfway scaffolded for this:

| Surface | Status | Notes |
| --- | --- | --- |
| email_accounts table | Exists | Has provider: 'google' \| 'outlook' \| 'custom' discriminator and imap_* / smtp_* cols. Built for IMAP, not OAuth. |
| email_threads / email_messages tables | Exists | Already linked to clientId. Schema is good as-is for Gmail. |
| email-threads.service.ts syncInbox() | ⚠ Stub-ish | IMAP-flow only. Won't reach Gmail without OAuth + Gmail API rewrite. |
| email BullMQ queue + inbox-sync job name | Exists | Worker dispatches on the job name; new sync impl drops in. |
| google_calendar_tokens table | Exists | OAuth token storage shape we can mirror for Gmail. |
| Per-port email override (port email_settings) | Exists | Used for outbound only today; Gmail integration is per-staff-user, not per-port. |
| ai_usage_ledger + per-port aiEnabled flag | Exists (Phase 3a/3b) | Triage AI calls book against the same ledger. |
| withRateLimit('ai', ...) wrapper | Exists (Phase 3c) | Caps triage AI traffic at 60/min/user out of the box. |

Net: schemas are mostly right. The OAuth flow, Gmail API client, push notification receiver, and triage classifier are the new builds.

Why Google Workspace specifically

The user's stated constraint: "I don't think we need email integration unless we connect it to Google Workspace." Reasons that hold up:

  • No password storage. OAuth tokens are revocable, scoped, and rotate. Password-based IMAP means storing app passwords, and Google has been phasing out password-only sign-in for Workspace accounts since 2024, so that path is likely to keep narrowing for the workspace plans this product targets.
  • Push notifications, not polling. Gmail's users.watch API plus Google Pub/Sub means we get an HTTP callback within seconds of a new message landing. IMAP requires polling on a 30-60 second cadence, which costs more and lags worse.
  • Search and labels. The Gmail API exposes label management and full-text search natively; IMAP search is much weaker.
  • Threading. Gmail's threadId is canonical. Reconstructing threads over IMAP from In-Reply-To / References headers is reliable in theory, painful in practice.

Microsoft 365 is the obvious peer integration but is out of scope here. The Graph API model is similar enough that a future M365 path can reuse most of the storage shape.

Three deployment models — pick one before building

This is the most important decision in the spec. Each model has different OAuth-verification consequences, which dominate everything else.

Model A — Marketplace-published OAuth app

A single OAuth client owned by Port Nimara, listed in the Google Workspace Marketplace, that any GWS customer can install. Each staff member clicks "Connect Gmail," consents to the scopes, and the CRM stores their refresh token.

Google-side work:

  1. Build the OAuth flow in CRM (~1 week).
  2. Submit for OAuth verification. Gmail's gmail.readonly / gmail.modify scopes are restricted scopes — they require:
    • Domain-verified production URLs
    • A homepage with a privacy policy that explicitly enumerates which scopes are used and why
    • A demo video (literally a screen recording) showing the consent screen and what happens next
    • A third-party security assessment from a Google-approved vendor ($15k-$75k, 6-12 weeks)
    • A Cloud Application Security Assessment (CASA) report
  3. Marketplace listing review (~2 weeks after CASA passes).

Calendar time: 4-6 months. Money: $15k-$75k for the security assessment alone. Recurring: Re-verification every 12 months.

Right answer if Port Nimara wants to be the marina-CRM that ships GWS out of the box for any customer. Wrong answer if there are <5 customers who'd use it.

Model B — Per-customer "Internal" OAuth app

Each customer's GWS admin creates an OAuth client inside their own workspace and gives Port Nimara the client ID + secret. Because the app is "Internal," Google skips verification entirely — the consent screen is unverified-but-permitted. Tokens never cross workspace boundaries.

Google-side work per customer:

  1. Customer's GWS admin enables the Gmail API in their Cloud project.
  2. Sets the OAuth consent screen's user type to "Internal" and creates an OAuth 2.0 client ID with the CRM's redirect URI.
  3. Hands the client ID + secret to Port Nimara out-of-band.
  4. Staff connect their Gmail through that client.

Calendar time per customer: ~1 hour of admin work. Money: $0. Limit: Doesn't span across GWS workspaces. A user with two GWS accounts (e.g. the marina + a personal workspace) can only connect the one matching the OAuth client.

This is the clear winner for the current customer base: small number of customers, each with their own GWS workspace, and each buying the integration as part of an onboarding conversation.

Model C — Forward-to-CRM mailbox

The CRM exposes a per-port email alias (e.g. port-nimara-NN@inbox.portnimara.com). Customers configure a Gmail filter or mailing rule that BCCs that alias on relevant threads. The CRM ingests via SMTP and runs the same triage pipeline.

Google-side work: None. Customer does it as a Gmail filter. Calendar time: ~1 week of CRM-side build. Limit: Receive-only — no reply drafts, no thread state changes, no labels. The "draft reply" feature in Phase 3 above is impossible under this model.

Model C is the right answer if the user wants to ship inbox-triage now and decide on bidirectional Gmail integration later. The schema is designed so the model can be upgraded to A or B without data migration.

Recommendation

Build Model B first. It costs nothing on the Google side, takes ~3 weeks of CRM work to reach triage (phases G1-G3 below; ~5 weeks for all five phases), and matches the actual customer profile. Promote to Model A only after 3+ paying customers ask for it unprompted. Until then, the security-assessment cost can't justify itself.

Keep Model C as a fallback for customers who refuse to set up an Internal OAuth app. Build it last and lazily; the schema accommodates it.

End-to-end flow (Model B)

1. Per-port OAuth-app config

New admin page /[port]/admin/google-workspace:

  • Field: "OAuth client ID" (their internal client ID)
  • Field: "OAuth client secret" (encrypted at rest using ENCRYPTION_KEY)
  • Field: "Authorized redirect URI" (read-only; we display the value they need to paste into their Google Cloud Console)
  • Toggle: "Enable Gmail integration for this port"

Stored in system_settings under key gws.config, port-scoped. Resolution mirrors the existing OCR config service.
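
As a rough illustration of "encrypted at rest using ENCRYPTION_KEY", the sketch below uses AES-256-GCM from Node's built-in crypto module. The spec does not say which cipher or key encoding the codebase uses, so the 32-byte raw key, the function names, and the `iv.tag.ciphertext` storage format are all assumptions made for the example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";
import { Buffer } from "node:buffer";

// Assumption: `key` is 32 raw bytes (AES-256). How ENCRYPTION_KEY is actually
// encoded in this codebase is not stated in the spec.
export function encryptSecret(plaintext: string, key: Uint8Array): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Hypothetical storage format: base64 parts joined by "." (not a base64 character).
  return [iv, tag, body].map((part) => part.toString("base64")).join(".");
}

export function decryptSecret(encoded: string, key: Uint8Array): string {
  const [iv, tag, body] = encoded.split(".").map((part) => Buffer.from(part, "base64"));
  if (!iv || !tag || !body) throw new Error("malformed ciphertext");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the ciphertext was tampered with.
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```

GCM is chosen here because it authenticates as well as encrypts, so a corrupted or swapped secret fails loudly at decrypt time instead of producing garbage that is then sent to Google.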

2. Per-staff connect flow

Staff member visits /[port]/me/integrations, clicks "Connect Gmail."

GET /api/v1/auth/gws/start
  → looks up port's gws.config
  → builds Google authorize URL with port's client_id + state token
  → 302 to Google
[ user consents ]
  → 302 back to /api/v1/auth/gws/callback?code=…&state=…
  → exchanges code for tokens via port's client_secret
  → stores in new `gws_user_tokens` table (encrypted)
  → schedules an `inbox-watch` job
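
The first half of that flow (building the authorize URL with the port's client_id and a state token) might look roughly like the sketch below. The endpoint and query parameters are Google's standard OAuth 2.0 web-server ones; the HMAC-signed `userId.nonce.mac` state format, the `stateSecret` parameter, and all function names are assumptions for illustration, and the sketch assumes user IDs contain no "." character.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

const GOOGLE_AUTHORIZE_URL = "https://accounts.google.com/o/oauth2/v2/auth";

interface GwsConfig {
  clientId: string;
  redirectUri: string;
}

// Hypothetical state format: "<userId>.<nonce>.<hmac>".
export function signState(userId: string, stateSecret: string): string {
  const payload = `${userId}.${randomBytes(16).toString("hex")}`;
  const mac = createHmac("sha256", stateSecret).update(payload).digest("hex");
  return `${payload}.${mac}`;
}

// Returns the userId when the signature checks out, otherwise null.
export function verifyState(state: string, stateSecret: string): string | null {
  const parts = state.split(".");
  if (parts.length !== 3) return null;
  const [userId, nonce, mac] = parts as [string, string, string];
  const expected = createHmac("sha256", stateSecret).update(`${userId}.${nonce}`).digest("hex");
  const a = new TextEncoder().encode(mac);
  const b = new TextEncoder().encode(expected);
  return a.length === b.length && timingSafeEqual(a, b) ? userId : null;
}

export function buildAuthorizeUrl(config: GwsConfig, scopes: string[], state: string): string {
  const url = new URL(GOOGLE_AUTHORIZE_URL);
  url.searchParams.set("client_id", config.clientId);
  url.searchParams.set("redirect_uri", config.redirectUri);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("scope", scopes.join(" "));
  url.searchParams.set("access_type", "offline"); // request a refresh token
  url.searchParams.set("prompt", "consent"); // re-prompt so a refresh token is issued
  url.searchParams.set("state", state);
  return url.toString();
}
```

A signed state only proves the callback was started by this CRM; a fuller implementation would probably also store the nonce server-side and reject reuse.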

3. Push notification subscription

After tokens are stored, the worker calls gmail.users.watch({ topicName: <Pub/Sub topic>, labelIds: ['INBOX'] }). Gmail then posts to a Pub/Sub topic on every inbox change. The CRM exposes a Pub/Sub push subscription endpoint at /api/webhooks/gmail-push which fetches the changed messages via the delta historyId and writes them into email_messages.
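
A minimal sketch of what the /api/webhooks/gmail-push handler has to do first: unwrap the Pub/Sub push envelope and pull out the mailbox address and historyId. The envelope shape (`message.data`, `message.messageId`, `subscription`) is Pub/Sub's push format; the type and function names are hypothetical, and the decoder accepts both base64 alphabets and both string and numeric historyId values because the spec does not pin these down.

```typescript
// Shape of a Pub/Sub push delivery body.
interface PubSubPushBody {
  message: { data: string; messageId: string; publishTime?: string };
  subscription: string;
}

interface GmailNotification {
  pubsubMessageId: string;
  emailAddress: string;
  historyId: string;
}

export function decodeGmailPush(body: PubSubPushBody): GmailNotification {
  // Normalise URL-safe base64 to the standard alphabet before decoding.
  const b64 = body.message.data.replace(/-/g, "+").replace(/_/g, "/");
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  const parsed: unknown = JSON.parse(new TextDecoder().decode(bytes));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Gmail push payload is not an object");
  }
  const record = parsed as Record<string, unknown>;
  const emailAddress = record["emailAddress"];
  const historyId = record["historyId"];
  if (typeof emailAddress !== "string") throw new Error("missing emailAddress");
  if (typeof historyId !== "string" && typeof historyId !== "number") {
    throw new Error("missing historyId");
  }
  // Keep historyId as a string so large values are never rounded.
  return { pubsubMessageId: body.message.messageId, emailAddress, historyId: String(historyId) };
}
```

The notification carries no message content. The handler would then look up the stored watchHistoryId for that mailbox and ask Gmail for everything that changed since then.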

Watch subscriptions expire every 7 days. A maintenance job re-establishes them daily.
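
One way the daily maintenance job could pick which watches to renew is sketched below. The `watchExpiresAt` and `syncEnabled` field names come from the gws_user_tokens schema in this spec; the two-day renewal window and the function names are assumptions. Simply re-calling watch for every connected user each day, as described above, is an equally valid and simpler choice.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface WatchState {
  userId: string;
  watchExpiresAt: Date | null; // null: no watch has been established yet
  syncEnabled: boolean;
}

// Gmail's watch response reports `expiration` as epoch milliseconds in a string.
export function parseWatchExpiration(expiration: string): Date {
  return new Date(Number(expiration));
}

// Select rows whose watch is missing, expired, or expiring within the window.
export function watchesToRenew(
  rows: WatchState[],
  now: Date,
  renewWithinMs: number = 2 * DAY_MS,
): WatchState[] {
  return rows.filter((row) => {
    if (!row.syncEnabled) return false;
    if (row.watchExpiresAt === null) return true;
    return row.watchExpiresAt.getTime() - now.getTime() <= renewWithinMs;
  });
}
```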

4. Triage pipeline

For each new inbound message:

  1. Match the message's from_address against client_contacts (email channel) to find the client or company. Persist a thread→client link if found.
  2. If port has aiEnabled AND gws.triageEnabled, queue an ai job that classifies the thread:
    • urgency: low / medium / high
    • category: invoice-question / availability / contract / other
    • requires_response: boolean
  3. AI call records into ai_usage_ledger with feature='inbox_triage'. The existing per-port budget gates apply automatically.
  4. Triage output written to a new email_triage table keyed on email_messages.id.
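
Because step 4 writes the classifier's answer straight into a table, the output should be validated before it is stored. The sketch below assumes the model is prompted to return a JSON object with the three fields listed in step 2; the exact output format, the key names, and the function name are assumptions rather than anything the spec fixes.

```typescript
const URGENCIES = ["low", "medium", "high"] as const;
const CATEGORIES = ["invoice-question", "availability", "contract", "other"] as const;

type Urgency = (typeof URGENCIES)[number];
type Category = (typeof CATEGORIES)[number];

interface TriageResult {
  urgency: Urgency;
  category: Category;
  requiresResponse: boolean;
}

// Returns null for anything that is not exactly the expected shape, so the
// caller can skip the email_triage write instead of storing garbage.
export function parseTriageOutput(raw: string): TriageResult | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;
  const urgency = URGENCIES.find((u) => u === record["urgency"]);
  const category = CATEGORIES.find((c) => c === record["category"]);
  const requiresResponse = record["requires_response"];
  if (!urgency || !category || typeof requiresResponse !== "boolean") return null;
  return { urgency, category, requiresResponse };
}
```

Rejecting unknown values outright, rather than mapping them to "other" or "low", keeps bad classifications visible as missing rows instead of hiding them in the ranking.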

5. UI surfaces

  • /[port]/inbox — sorted by triage rank, port-wide view.
  • Linked-inbox panel on client-tabs.tsx — adds a new "Email" tab pulling from email_threads filtered to that client.
  • Alert rule inbox.unanswered_high_value slots into Phase B's alert engine; no schema change.
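
The condition behind inbox.unanswered_high_value could be sketched as a pure predicate like the one below. How a client is marked high-value, how the Phase B engine invokes rules, and every field name here are assumptions; only the rule's intent (a high-value client emailed and nobody replied within X hours) comes from this spec.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical per-thread snapshot assembled from email_threads, email_triage
// and the client record.
interface ThreadSnapshot {
  clientIsHighValue: boolean;
  requiresResponse: boolean;
  lastInboundAt: Date;
  lastOutboundAt: Date | null;
}

export function isUnansweredHighValue(
  thread: ThreadSnapshot,
  now: Date,
  thresholdHours: number,
): boolean {
  if (!thread.clientIsHighValue || !thread.requiresResponse) return false;
  // A staff reply sent at or after the latest inbound message counts as an answer.
  const answered =
    thread.lastOutboundAt !== null &&
    thread.lastOutboundAt.getTime() >= thread.lastInboundAt.getTime();
  if (answered) return false;
  return now.getTime() - thread.lastInboundAt.getTime() >= thresholdHours * HOUR_MS;
}
```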

Schema additions

Three new tables, all port-scoped where it matters:

// Per-staff Gmail tokens. Mirror of google_calendar_tokens.
gws_user_tokens {
  id, userId (UNIQUE), portId, emailAddress,
  accessTokenEnc, refreshTokenEnc, tokenExpiry,
  scope, watchExpiresAt, watchHistoryId,
  connectedAt, lastSyncAt, syncEnabled, createdAt, updatedAt
}

// Triage classifications keyed to messages.
email_triage {
  messageId (PK, FK → email_messages.id ON DELETE CASCADE),
  urgency, category, requiresResponse,
  modelVersion, tokensUsed, classifiedAt
}

// Pub/Sub idempotency log. Gmail re-delivers; we dedupe.
gws_push_log {
  messageId (Pub/Sub message id, PK),
  historyId, receivedAt
}

Plus extensions to email_messages:

  • googleMessageId (text, indexed) — Gmail's own ID for thread ops.
  • googleThreadId (text, indexed).
  • gmailLabels (text[]) — for "is unread" checks without hitting Gmail.

The existing emailAccounts.provider='google' value is reused unchanged; the IMAP fields become nullable since OAuth-flow accounts won't populate them.
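
The role of gws_push_log is easiest to see in code. In the sketch below an in-memory map stands in for the table, and the interface and function names are hypothetical; against a real database the `recordIfNew` step would typically be an insert that ignores primary-key conflicts.

```typescript
interface PushLog {
  // Returns true when the id was newly recorded, false when it was already present.
  recordIfNew(pubsubMessageId: string, historyId: string): boolean;
}

// Stand-in for the gws_push_log table, keyed on the Pub/Sub message id.
class InMemoryPushLog implements PushLog {
  private readonly seen = new Map<string, { historyId: string; receivedAt: Date }>();

  recordIfNew(pubsubMessageId: string, historyId: string): boolean {
    if (this.seen.has(pubsubMessageId)) return false;
    this.seen.set(pubsubMessageId, { historyId, receivedAt: new Date() });
    return true;
  }
}

// Runs the delta sync at most once per Pub/Sub message id.
export function handlePush(
  log: PushLog,
  pubsubMessageId: string,
  historyId: string,
  sync: (historyId: string) => void,
): "processed" | "duplicate" {
  if (!log.recordIfNew(pubsubMessageId, historyId)) return "duplicate";
  sync(historyId);
  return "processed";
}
```

Recording the id before syncing means a crash mid-sync drops that delivery. The next notification for the same mailbox would still pick up the missed changes, since the sync reads forward from the stored history position.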

AI cost interaction

Triage AI is opt-in twice: the port admin must turn on aiEnabled (Phase 3a flag, default off) and gws.triageEnabled (this spec, default off). Either toggle off and the inbox sync still runs but skips classification, so staff can manually scan threads without burning tokens.

Per-message token cost on a current Haiku-class model is roughly 1500-2500 tokens including the system prompt. A port doing 200 inbound emails a day at the upper bound is ~500k tokens/day. The default hard-cap is 500k/month, so triage will trip it inside a day. Two mitigations baked in:

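  • The system prompt is short (<500 tokens), so fixed per-call overhead is small and most tokens are the message content itself. Prompt caching on the Anthropic side only applies above a minimum prefix length, so confirm a prompt this short is cacheable before counting on that saving.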
  • Triage runs only on threads not already classified — re-syncs from the watch loop don't re-bill.
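
The budget arithmetic above, worked through for both ends of the per-message range. The figures are the ones quoted in this section; the 30-day month and the function name are assumptions.

```typescript
interface TriageBudget {
  tokensPerDay: number;
  tokensPerMonth: number; // assumes a 30-day month
  daysUntilCap: number;
}

export function triageBudget(
  emailsPerDay: number,
  tokensPerMessage: number,
  monthlyCap: number,
): TriageBudget {
  const tokensPerDay = emailsPerDay * tokensPerMessage;
  return {
    tokensPerDay,
    tokensPerMonth: tokensPerDay * 30,
    daysUntilCap: monthlyCap / tokensPerDay,
  };
}

// Upper bound: 200 emails/day at 2500 tokens each against the 500k/month cap.
const upper = triageBudget(200, 2500, 500_000);
// upper.tokensPerDay === 500000, upper.daysUntilCap === 1, upper.tokensPerMonth === 15000000

// Lower bound: 200 emails/day at 1500 tokens each.
const lower = triageBudget(200, 1500, 500_000);
// lower.tokensPerDay === 300000, lower.daysUntilCap is about 1.67, lower.tokensPerMonth === 9000000
```

Even at the lower bound the default cap lasts under two days, so a port at this volume would need a cap roughly 18 to 30 times the default to run triage for a full month.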

The admin UI shows triage as its own line in the per-feature breakdown so customers can see how much their inbox is costing them and tune caps accordingly.

Phased build (assuming Model B)

| Phase | Scope | Effort | Ships when |
| --- | --- | --- | --- |
| G1 Connect | OAuth flow + per-port config + per-user token storage. No sync yet. Staff can connect; nothing happens. | 1 week | Standalone |
| G2 Read-only sync | Pub/Sub push receiver + delta sync into email_messages. Linked-inbox tab on client detail. No AI. | 1 week | After G1 |
| G3 Triage classification | AI classifier, email_triage writes, /inbox page sorting. Per-port toggle. | 1 week | After G2; depends on Phase 3b budgets being live (they are) |
| G4 Reply drafts | Gmail API send + draft creation. "Draft reply" button on the client detail Email tab. | 1 week | After G3 |
| G5 Alerts | New inbox.unanswered_high_value rule. Hooks into Phase B alert engine. | 2 days | After G3 |

Total: ~5 weeks for a single engineer, assuming the user provides one real GWS workspace to test against during G1.

Open decisions for the user

These are the questions to resolve before scheduling the build, in priority order:

  1. Deployment model — A, B, or C? Default recommendation B.
  2. Single user or domain-wide delegation? Per-staff connect (one token per user) is simpler. Domain-wide delegation lets the port admin connect once on behalf of every staff member but requires the customer to grant a service account broader access. Default recommendation: per-staff.
  3. Scope set. Minimal viable scope is gmail.readonly. To send replies (G4) we need gmail.send. To manage labels (e.g. mark "triaged-by-CRM") we need gmail.modify. Each scope expansion widens the consent screen scariness but doesn't add new verification steps under Model B.
  4. Pub/Sub topic ownership. Pub/Sub topics live in some GCP project. Under Model B the customer's project owns the topic — they pay for Pub/Sub (cents/month) and grant our service account subscriber access. Alternative: Port Nimara owns the topic and the customer's Gmail publishes cross-project (allowed, slightly more setup). Default: customer-owned topic, fewer moving parts.
  5. Triage model. Haiku 4.5 is right for cost; Sonnet 4.6 is right if the ranking quality on Haiku turns out to be poor. Defer this until G3 has real-world tuning data.
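
Decision 3 can be made concrete by deriving the requested scope list from the features a port has enabled, so the consent screen only grows when a feature needs it. The scope URLs are Google's published Gmail scopes; the feature names and the function are hypothetical.

```typescript
const GMAIL_SCOPES = {
  readonly: "https://www.googleapis.com/auth/gmail.readonly",
  send: "https://www.googleapis.com/auth/gmail.send",
  modify: "https://www.googleapis.com/auth/gmail.modify",
} as const;

// Hypothetical feature switches: sync (G2), reply drafts (G4), label management.
type GmailFeature = "sync" | "replyDrafts" | "labelManagement";

export function scopesFor(features: GmailFeature[]): string[] {
  const enabled = new Set(features);
  const scopes: string[] = [];
  if (enabled.has("sync")) scopes.push(GMAIL_SCOPES.readonly);
  if (enabled.has("replyDrafts")) scopes.push(GMAIL_SCOPES.send);
  if (enabled.has("labelManagement")) scopes.push(GMAIL_SCOPES.modify);
  return scopes;
}
```

Adding a scope later means each connected staff member has to consent again, which is worth weighing when choosing between a minimal G1 scope set and asking for everything up front.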

Things that are NOT in this spec

  • Microsoft 365 / Outlook integration. Same shape, different API. Once Model B is proven on GWS, Graph API takes another ~3 weeks.
  • Reply drafts grounded in CRM context. That's G4 and depends on the work in this spec, but the prompt engineering for "good replies citing this client's open interests + reservations + invoices" deserves its own design pass before building.
  • Cross-staff triage queue (i.e. "show me all unanswered emails across the team"). That requires either domain-wide delegation (decision #2 above) or per-staff opt-in to a shared view. Punt until staff actually ask for it.
  • Sentiment / urgency tone analysis. Tempting; almost always wrong; skip in v1.
  • "Smart drafts" using the recipient's past replies as context. Every customer asks for this and almost no one uses it once built. Skip.

Cost summary at a glance

| Item | Model A | Model B | Model C |
| --- | --- | --- | --- |
| Build effort | 3-4 weeks | ~5 weeks (over G1-G5) | ~1 week (receive-only) |
| Calendar time to first customer | 4-6 months | 1 hour of customer admin work | 1 hour of customer Gmail-filter work |
| Up-front cash | $15k-$75k (CASA) | $0 | $0 |
| Recurring | Re-verification annually | None | None |
| Best for | 50+ customers, Marketplace play | 1-10 customers, white-glove onboarding | Customers who refuse OAuth setup |

The recommendation stands: build Model B for G1 + G2 + G3, ship that, and let real customer demand decide whether G4/G5 and Model A promotion are worth the calendar time.