docs(spec): GWS inbox-triage exploratory design (not approved for build)
Surveys what it actually takes to ship the AI inbox-triage feature
gated on Google Workspace integration. Walks through three deployment
models with their real costs:
- Model A (Marketplace OAuth app): 4-6 months calendar, $15k-$75k for
the required CASA security assessment, recurring re-verification
- Model B (per-customer Internal OAuth app): ~5 weeks engineering, $0
Google-side, scoped to one workspace per customer
- Model C (forward-to-CRM mailbox): ~1 week, receive-only, no reply
drafts possible
Recommends Model B for the current customer profile, with B → A
promotion only if 3+ customers ask unprompted.
Documents what's already scaffolded (email_accounts/threads/messages
tables, syncInbox stub, BullMQ email queue, ai_usage_ledger, per-port
aiEnabled flag, withRateLimit('ai')) vs what's new (OAuth flow, Pub/
Sub push receiver, gws_user_tokens + email_triage tables, /inbox UI).
End-to-end flow, schema additions, AI cost interaction with the
Phase 3b token budgets, 5-phase build plan (G1-G5), and 5 open
decisions to resolve before scheduling the build. Explicitly out of
scope: M365, sentiment analysis, smart-drafts, cross-staff triage
queue.
No code changes — this is a design doc to drive a go/no-go decision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
376
docs/superpowers/specs/2026-04-29-gws-inbox-triage-design.md
Normal file
376
docs/superpowers/specs/2026-04-29-gws-inbox-triage-design.md
Normal file
@@ -0,0 +1,376 @@
|
||||
# Google Workspace inbox-triage integration (exploratory)
|
||||
|
||||
**Status:** Exploratory — not approved for build
|
||||
**Date:** 2026-04-29
|
||||
**Tracks:** AI inbox-triage, Google Workspace email connection
|
||||
|
||||
## What this spec is for
|
||||
|
||||
The user has flagged inbox-triage as the most valuable AI surface left to
|
||||
build, but conditioned email integration on it being via Google Workspace
|
||||
specifically (not generic IMAP), with a per-port toggle so clients who
|
||||
don't use GWS aren't billed for capability they can't reach.
|
||||
|
||||
This document captures what that build actually costs — especially on
|
||||
the Google side, which is where most teams underestimate the work — so
|
||||
we can decide whether to commit before writing any code. **Nothing in
|
||||
this spec is approved for implementation.** The deliverable is a go /
|
||||
no-go decision and, if go, a scope choice between three deployment
|
||||
models that cost wildly different amounts of calendar time.
|
||||
|
||||
## What inbox-triage actually does for the user
|
||||
|
||||
Concretely, on the staff member's desktop:
|
||||
|
||||
1. **Linked-inbox panel on the client detail page.** When you open
|
||||
`/[port]/clients/<id>` you see the last N email threads with that
|
||||
client, pulled from the staff member's own Gmail. Each thread has
|
||||
the latest message preview, an "open in Gmail" deep-link, and a
|
||||
"draft reply" button (Phase 2+).
|
||||
2. **Inbox triage queue.** A new top-level page `/[port]/inbox` that
|
||||
lists unread/unanswered threads ranked by AI-assessed importance
|
||||
(high-value client, contractual urgency, chase-overdue). Each row
|
||||
has one-click actions: "log this as a note on the client",
|
||||
"create a follow-up reminder", "draft reply".
|
||||
3. **Email-driven alerts.** When a high-value client emails and no one
|
||||
responds within X hours, the existing alerts engine fires a
|
||||
`inbox.unanswered_high_value` rule (slots into the alert framework
|
||||
from Phase B without schema change).
|
||||
4. **Reply drafts (Phase 3).** AI generates a reply draft grounded in
|
||||
the client's CRM record (open interests, pending reservations,
|
||||
recent invoices). Staff edit and send through Gmail.
|
||||
|
||||
The value is selective: a port with three staff members fielding 50
|
||||
client emails a day saves maybe an hour a day collectively if the
|
||||
ranking is right. Below that volume the build doesn't pay back.
|
||||
|
||||
## What already exists in the codebase
|
||||
|
||||
The CRM is roughly halfway scaffolded for this:
|
||||
|
||||
| Surface | Status | Notes |
|
||||
| ----------------------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `email_accounts` table | ✅ Exists | Has `provider: 'google' \| 'outlook' \| 'custom'` discriminator and `imap_*` / `smtp_*` cols. Built for IMAP, not OAuth. |
|
||||
| `email_threads` / `email_messages` tables | ✅ Exists | Already linked to `clientId`. Schema is good as-is for Gmail. |
|
||||
| `email-threads.service.ts` `syncInbox()` | ⚠ Stub-ish | IMAP-flow only. Won't reach Gmail without OAuth + Gmail API rewrite. |
|
||||
| `email` BullMQ queue + `inbox-sync` job name | ✅ Exists | Worker dispatches on the job name; new sync impl drops in. |
|
||||
| `google_calendar_tokens` table | ✅ Exists | OAuth token storage shape we can mirror for Gmail. |
|
||||
| Per-port email override (port `email_settings`) | ✅ Exists | Used for outbound only today; Gmail integration is per-staff-user, not per-port. |
|
||||
| `ai_usage_ledger` + per-port `aiEnabled` flag | ✅ Exists (Phase 3a/3b) | Triage AI calls book against the same ledger. |
|
||||
| `withRateLimit('ai', ...)` wrapper | ✅ Exists (Phase 3c) | Caps triage AI traffic at 60/min/user out of the box. |
|
||||
|
||||
Net: schemas are mostly right. The OAuth flow, Gmail API client, push
|
||||
notification receiver, and triage classifier are the new builds.
|
||||
|
||||
## Why Google Workspace specifically
|
||||
|
||||
The user's stated constraint: "I don't think we need email integration
|
||||
unless we connect it to Google Workspace." Reasons that hold up:
|
||||
|
||||
- **No password storage.** OAuth tokens are revocable, scoped, and
|
||||
rotate. IMAP requires app passwords, which Google has been actively
|
||||
deprecating since 2024 — they'll be gone for the workspace plans
|
||||
this product targets.
|
||||
- **Push notifications, not polling.** Gmail's `users.watch` API plus
|
||||
Google Pub/Sub means we get an HTTP callback within seconds of a new
|
||||
message landing. IMAP requires polling on a 30-60 second cadence,
|
||||
which costs more and lags worse.
|
||||
- **Search and labels.** The Gmail API exposes label management and
|
||||
full-text search natively; IMAP search is much weaker.
|
||||
- **Threading.** Gmail's `threadId` is canonical. Reconstructing
|
||||
threads over IMAP from `In-Reply-To` / `References` headers is
|
||||
reliable in theory, painful in practice.
|
||||
|
||||
Microsoft 365 is the obvious peer integration but is out of scope here.
|
||||
The Graph API model is similar enough that a future M365 path can reuse
|
||||
most of the storage shape.
|
||||
|
||||
## Three deployment models — pick one before building
|
||||
|
||||
This is the most important decision in the spec. Each model has
|
||||
different OAuth-verification consequences, which dominate everything
|
||||
else.
|
||||
|
||||
### Model A — Marketplace-published OAuth app
|
||||
|
||||
A single OAuth client owned by Port Nimara, listed in the Google
|
||||
Workspace Marketplace, that any GWS customer can install. Each staff
|
||||
member clicks "Connect Gmail," consents to the scopes, and the CRM
|
||||
stores their refresh token.
|
||||
|
||||
**Google-side work:**
|
||||
|
||||
1. Build the OAuth flow in CRM (~1 week).
|
||||
2. Submit for OAuth verification. Gmail's `gmail.readonly` /
|
||||
`gmail.modify` scopes are **restricted scopes** — they require:
|
||||
- Domain-verified production URLs
|
||||
- A homepage with a privacy policy that explicitly enumerates which
|
||||
scopes are used and why
|
||||
- A demo video (literally a screen recording) showing the consent
|
||||
screen and what happens next
|
||||
- **A third-party security assessment from a Google-approved
|
||||
vendor** ($15k–$75k, 6–12 weeks)
|
||||
- A Cloud Application Security Assessment (CASA) report
|
||||
3. Marketplace listing review (~2 weeks after CASA passes).
|
||||
|
||||
**Calendar time:** 4–6 months.
|
||||
**Money:** $15k–$75k for the security assessment alone.
|
||||
**Recurring:** Re-verification every 12 months.
|
||||
|
||||
Right answer if Port Nimara wants to be the marina-CRM that ships GWS
|
||||
out of the box for _any_ customer. Wrong answer if there are <5
|
||||
customers who'd use it.
|
||||
|
||||
### Model B — Per-customer "Internal" OAuth app
|
||||
|
||||
Each customer's GWS admin creates an OAuth client _inside their own
|
||||
workspace_ and gives Port Nimara the client ID + secret. Because the
|
||||
app is "Internal," Google skips verification entirely — the consent
|
||||
screen is unverified-but-permitted. Tokens never cross workspace
|
||||
boundaries.
|
||||
|
||||
**Google-side work per customer:**
|
||||
|
||||
1. Customer's GWS admin enables the Gmail API in their Cloud project.
|
||||
2. Creates an OAuth 2.0 client ID with type "Internal" + your CRM's
|
||||
redirect URI.
|
||||
3. Hands the client ID + secret to Port Nimara out-of-band.
|
||||
4. Staff connect their Gmail through that client.
|
||||
|
||||
**Calendar time per customer:** ~1 hour of admin work.
|
||||
**Money:** $0.
|
||||
**Limit:** Doesn't span across GWS workspaces. A user with two GWS
|
||||
accounts (e.g. the marina + a personal workspace) can only connect the
|
||||
one matching the OAuth client.
|
||||
|
||||
This is the **clear winner for the current customer base**: small
|
||||
number of customers, each with their own GWS workspace, and each
|
||||
buying the integration as part of an onboarding conversation.
|
||||
|
||||
### Model C — Forward-to-CRM mailbox
|
||||
|
||||
The CRM exposes a per-port email alias (e.g.
|
||||
`port-nimara-NN@inbox.portnimara.com`). Customers configure a Gmail
|
||||
filter or mailing rule that BCCs that alias on relevant threads. The
|
||||
CRM ingests via SMTP and runs the same triage pipeline.
|
||||
|
||||
**Google-side work:** None. Customer does it as a Gmail filter.
|
||||
**Calendar time:** ~1 week of CRM-side build.
|
||||
**Limit:** Receive-only — no reply drafts, no thread state changes,
|
||||
no labels. The "draft reply" feature in Phase 3 above is impossible
|
||||
under this model.
|
||||
|
||||
Model C is the right answer if the user wants to ship inbox-triage
|
||||
_now_ and decide on bidirectional Gmail integration later. The schema
|
||||
is designed so the model can be upgraded to A or B without data
|
||||
migration.
|
||||
|
||||
### Recommendation
|
||||
|
||||
**Build Model B first.** It costs nothing on the Google side, takes
|
||||
~3 weeks of CRM work, and matches the actual customer profile.
|
||||
**Promote to Model A only after 3+ paying customers ask for it
|
||||
unprompted.** Until then, the security-assessment cost can't justify
|
||||
itself.
|
||||
|
||||
Model C as a fallback for customers who refuse to set up an Internal
|
||||
OAuth app. Build it last, lazily — the schema accommodates it.
|
||||
|
||||
## End-to-end flow (Model B)
|
||||
|
||||
### 1. Per-port OAuth-app config
|
||||
|
||||
New admin page `/[port]/admin/google-workspace`:
|
||||
|
||||
- Field: "OAuth client ID" (their internal client ID)
|
||||
- Field: "OAuth client secret" (encrypted at rest using `ENCRYPTION_KEY`)
|
||||
- Field: "Authorized redirect URI" (read-only; we display the value
|
||||
they need to paste into their Google Cloud Console)
|
||||
- Toggle: "Enable Gmail integration for this port"
|
||||
|
||||
Stored in `system_settings` under key `gws.config`, port-scoped.
|
||||
Resolution mirrors the existing OCR config service.
|
||||
|
||||
### 2. Per-staff connect flow
|
||||
|
||||
Staff member visits `/[port]/me/integrations`, clicks "Connect Gmail."
|
||||
|
||||
```
|
||||
GET /api/v1/auth/gws/start
|
||||
→ looks up port's gws.config
|
||||
→ builds Google authorize URL with port's client_id + state token
|
||||
→ 302 to Google
|
||||
[ user consents ]
|
||||
→ 302 back to /api/v1/auth/gws/callback?code=…&state=…
|
||||
→ exchanges code for tokens via port's client_secret
|
||||
→ stores in new `gws_user_tokens` table (encrypted)
|
||||
→ schedules an `inbox-watch` job
|
||||
```
|
||||
|
||||
### 3. Push notification subscription
|
||||
|
||||
After tokens are stored, the worker calls
|
||||
`gmail.users.watch({ topicName: <Pub/Sub topic>, labelIds: ['INBOX'] })`.
|
||||
Gmail then posts to a Pub/Sub topic on every inbox change. The CRM
|
||||
exposes a Pub/Sub push subscription endpoint at
|
||||
`/api/webhooks/gmail-push` which fetches the changed messages via the
|
||||
delta `historyId` and writes them into `email_messages`.
|
||||
|
||||
Watch subscriptions expire every 7 days. A maintenance job
|
||||
re-establishes them daily.
|
||||
|
||||
### 4. Triage pipeline
|
||||
|
||||
For each new inbound message:
|
||||
|
||||
1. Match against `clients` and `companies` by `from_address` against
|
||||
`client_contacts` (email channel). Persist a thread→client link if
|
||||
found.
|
||||
2. If port has `aiEnabled` AND `gws.triageEnabled`, queue an `ai`
|
||||
job that classifies the thread:
|
||||
- `urgency`: low / medium / high
|
||||
- `category`: invoice-question / availability / contract / other
|
||||
- `requires_response`: boolean
|
||||
3. AI call records into `ai_usage_ledger` with `feature='inbox_triage'`.
|
||||
The existing per-port budget gates apply automatically.
|
||||
4. Triage output written to a new `email_triage` table keyed on
|
||||
`email_messages.id`.
|
||||
|
||||
### 5. UI surfaces
|
||||
|
||||
- `/[port]/inbox` — sorted by triage rank, port-wide view.
|
||||
- Linked-inbox panel on `client-tabs.tsx` — adds a new "Email" tab
|
||||
pulling from `email_threads` filtered to that client.
|
||||
- Alert rule `inbox.unanswered_high_value` slots into Phase B's
|
||||
alert engine; no schema change.
|
||||
|
||||
## Schema additions
|
||||
|
||||
Three new tables, all port-scoped where it matters:
|
||||
|
||||
```ts
|
||||
// Per-staff Gmail tokens. Mirror of google_calendar_tokens.
|
||||
gws_user_tokens {
|
||||
id, userId (UNIQUE), portId, emailAddress,
|
||||
accessTokenEnc, refreshTokenEnc, tokenExpiry,
|
||||
scope, watchExpiresAt, watchHistoryId,
|
||||
connectedAt, lastSyncAt, syncEnabled, createdAt, updatedAt
|
||||
}
|
||||
|
||||
// Triage classifications keyed to messages.
|
||||
email_triage {
|
||||
messageId (PK, FK → email_messages.id ON DELETE CASCADE),
|
||||
urgency, category, requiresResponse,
|
||||
modelVersion, tokensUsed, classifiedAt
|
||||
}
|
||||
|
||||
// Pub/Sub idempotency log. Gmail re-delivers; we dedupe.
|
||||
gws_push_log {
|
||||
messageId (Pub/Sub message id, PK),
|
||||
historyId, receivedAt
|
||||
}
|
||||
```
|
||||
|
||||
Plus extensions to `email_messages`:
|
||||
|
||||
- `googleMessageId` (text, indexed) — Gmail's own ID for thread ops.
|
||||
- `googleThreadId` (text, indexed).
|
||||
- `gmailLabels` (text[]) — for "is unread" checks without hitting Gmail.
|
||||
|
||||
The existing `emailAccounts.provider='google'` column repurposes
|
||||
unchanged; the IMAP fields go nullable since OAuth-flow accounts won't
|
||||
populate them.
|
||||
|
||||
## AI cost interaction
|
||||
|
||||
Triage AI is opt-in **twice**: the port admin must turn on
|
||||
`aiEnabled` (Phase 3a flag, default off) **and** `gws.triageEnabled`
|
||||
(this spec, default off). Either toggle off and the inbox sync still
|
||||
runs but skips classification, so staff can manually scan threads
|
||||
without burning tokens.
|
||||
|
||||
Per-message token cost on a current Haiku-class model is roughly
|
||||
1500–2500 tokens including the system prompt. A port doing 200 inbound
|
||||
emails a day at the upper bound is ~500k tokens/day. The default
|
||||
hard-cap is 500k/month, so triage will trip it inside a day. Two
|
||||
mitigations baked in:
|
||||
|
||||
- The system prompt is short (<500 tokens) and prompt-cached on the
|
||||
Anthropic side, so most tokens are output.
|
||||
- Triage runs only on threads not already classified — re-syncs from
|
||||
the watch loop don't re-bill.
|
||||
|
||||
The admin UI shows triage as its own line in the per-feature breakdown
|
||||
so customers can see how much their inbox is costing them and tune
|
||||
caps accordingly.
|
||||
|
||||
## Phased build (assuming Model B)
|
||||
|
||||
| Phase | Scope | Effort | Ships when |
|
||||
| ---------------------------- | ------------------------------------------------------------------------------------------------------- | ------ | ----------------------------------------------------------- |
|
||||
| **G1** Connect | OAuth flow + per-port config + per-user token storage. No sync yet. Staff can connect; nothing happens. | 1 week | Standalone |
|
||||
| **G2** Read-only sync | Pub/Sub push receiver + delta sync into `email_messages`. Linked-inbox tab on client detail. No AI. | 1 week | After G1 |
|
||||
| **G3** Triage classification | AI classifier, `email_triage` writes, `/inbox` page sorting. Per-port toggle. | 1 week | After G2; depends on Phase 3b budgets being live (they are) |
|
||||
| **G4** Reply drafts | Gmail API send + draft creation. "Draft reply" button on the client detail Email tab. | 1 week | After G3 |
|
||||
| **G5** Alerts | New `inbox.unanswered_high_value` rule. Hooks into Phase B alert engine. | 2 days | After G3 |
|
||||
|
||||
Total: ~5 weeks for a single engineer, assuming the user provides one
|
||||
real GWS workspace to test against during G1.
|
||||
|
||||
## Open decisions for the user
|
||||
|
||||
These are the questions to resolve before scheduling the build, in
|
||||
priority order:
|
||||
|
||||
1. **Deployment model — A, B, or C?** Default recommendation B.
|
||||
2. **Single user or domain-wide delegation?** Per-staff connect (one
|
||||
token per user) is simpler. Domain-wide delegation lets the port
|
||||
admin connect once on behalf of every staff member but requires
|
||||
the customer to grant a service account broader access. Default
|
||||
recommendation: per-staff.
|
||||
3. **Scope set.** Minimal viable scope is `gmail.readonly`. To send
|
||||
replies (G4) we need `gmail.send`. To manage labels (e.g. mark
|
||||
"triaged-by-CRM") we need `gmail.modify`. Each scope expansion
|
||||
widens the consent screen scariness but doesn't add new
|
||||
verification steps under Model B.
|
||||
4. **Pub/Sub topic ownership.** Pub/Sub topics live in _some_ GCP
|
||||
project. Under Model B the customer's project owns the topic —
|
||||
they pay for Pub/Sub (cents/month) and grant our service account
|
||||
subscriber access. Alternative: Port Nimara owns the topic and
|
||||
the customer's Gmail publishes cross-project (allowed, slightly
|
||||
more setup). Default: customer-owned topic, fewer moving parts.
|
||||
5. **Triage model.** Haiku 4.5 is right for cost; Sonnet 4.6 is
|
||||
right if the ranking quality on Haiku turns out to be poor.
|
||||
Defer this until G3 has real-world tuning data.
|
||||
|
||||
## Things that are NOT in this spec
|
||||
|
||||
- **Microsoft 365 / Outlook integration.** Same shape, different API.
|
||||
Once Model B is proven on GWS, Graph API takes another ~3 weeks.
|
||||
- **Reply drafts grounded in CRM context.** That's G4 and depends on
|
||||
the work in this spec, but the prompt engineering for "good replies
|
||||
citing this client's open interests + reservations + invoices"
|
||||
deserves its own design pass before building.
|
||||
- **Cross-staff triage queue (i.e. "show me all unanswered emails
|
||||
across the team").** That requires either domain-wide delegation
|
||||
(decision #2 above) or per-staff opt-in to a shared view. Punt
|
||||
until staff actually ask for it.
|
||||
- **Sentiment / urgency tone analysis.** Tempting; almost always
|
||||
wrong; skip in v1.
|
||||
- **"Smart drafts" using the recipient's past replies as context.**
|
||||
Every customer asks for this and almost no one uses it once
|
||||
built. Skip.
|
||||
|
||||
## Cost summary at a glance
|
||||
|
||||
| Item | Model A | Model B | Model C |
|
||||
| ------------------------------- | ------------------------------- | -------------------------------------- | ------------------------------------ |
|
||||
| Build effort | 3–4 weeks | ~5 weeks (over G1–G5) | ~1 week (receive-only) |
|
||||
| Calendar time to first customer | 4–6 months | 1 hour of customer admin work | 1 hour of customer Gmail-filter work |
|
||||
| Up-front cash | $15k–$75k (CASA) | $0 | $0 |
|
||||
| Recurring | Re-verification annually | None | None |
|
||||
| Best for | 50+ customers, Marketplace play | 1–10 customers, white-glove onboarding | Customers who refuse OAuth setup |
|
||||
|
||||
The recommendation stands: build Model B for G1 + G2 + G3, ship that,
|
||||
and let real customer demand decide whether G4/G5 and Model A
|
||||
promotion are worth the calendar time.
|
||||
Reference in New Issue
Block a user