# Google Workspace inbox-triage integration (exploratory)
**Status:** Exploratory — not approved for build
**Date:** 2026-04-29
**Tracks:** AI inbox-triage, Google Workspace email connection
## What this spec is for
The user has flagged inbox-triage as the most valuable AI surface left to
build, but conditioned email integration on it being via Google Workspace
specifically (not generic IMAP), with a per-port toggle so clients who
don't use GWS aren't billed for capability they can't reach.
This document captures what that build actually costs — especially on
the Google side, which is where most teams underestimate the work — so
we can decide whether to commit before writing any code. **Nothing in
this spec is approved for implementation.** The deliverable is a go /
no-go decision and, if go, a scope choice between three deployment
models that cost wildly different amounts of calendar time.
## What inbox-triage actually does for the user
Concretely, on the staff member's desktop:
1. **Linked-inbox panel on the client detail page.** When you open
`/[port]/clients/<id>` you see the last N email threads with that
client, pulled from the staff member's own Gmail. Each thread has
the latest message preview, an "open in Gmail" deep-link, and a
"draft reply" button (Phase 2+).
2. **Inbox triage queue.** A new top-level page `/[port]/inbox` that
lists unread/unanswered threads ranked by AI-assessed importance
(high-value client, contractual urgency, chase-overdue). Each row
has one-click actions: "log this as a note on the client",
"create a follow-up reminder", "draft reply".
3. **Email-driven alerts.** When a high-value client emails and no one
responds within X hours, the existing alerts engine fires a
`inbox.unanswered_high_value` rule (slots into the alert framework
from Phase B without schema change).
4. **Reply drafts (Phase 3).** AI generates a reply draft grounded in
the client's CRM record (open interests, pending reservations,
recent invoices). Staff edit and send through Gmail.
The value is selective: a port with three staff members fielding 50
client emails a day saves maybe an hour a day collectively if the
ranking is right. Below that volume the build doesn't pay back.
## What already exists in the codebase
The CRM is roughly halfway scaffolded for this:
| Surface | Status | Notes |
| ----------------------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `email_accounts` table | ✅ Exists | Has `provider: 'google' \| 'outlook' \| 'custom'` discriminator and `imap_*` / `smtp_*` cols. Built for IMAP, not OAuth. |
| `email_threads` / `email_messages` tables | ✅ Exists | Already linked to `clientId`. Schema is good as-is for Gmail. |
| `email-threads.service.ts` `syncInbox()` | ⚠ Stub-ish | IMAP-flow only. Won't reach Gmail without OAuth + Gmail API rewrite. |
| `email` BullMQ queue + `inbox-sync` job name | ✅ Exists | Worker dispatches on the job name; new sync impl drops in. |
| `google_calendar_tokens` table | ✅ Exists | OAuth token storage shape we can mirror for Gmail. |
| Per-port email override (port `email_settings`) | ✅ Exists | Used for outbound only today; Gmail integration is per-staff-user, not per-port. |
| `ai_usage_ledger` + per-port `aiEnabled` flag | ✅ Exists (Phase 3a/3b) | Triage AI calls book against the same ledger. |
| `withRateLimit('ai', ...)` wrapper | ✅ Exists (Phase 3c) | Caps triage AI traffic at 60/min/user out of the box. |
Net: schemas are mostly right. The OAuth flow, Gmail API client, push
notification receiver, and triage classifier are the new builds.
## Why Google Workspace specifically
The user's stated constraint: "I don't think we need email integration
unless we connect it to Google Workspace." Reasons that hold up:
- **No password storage.** OAuth tokens are revocable, scoped, and
rotate. IMAP requires app passwords, which Google has been actively
deprecating since 2024 — they'll be gone for the workspace plans
this product targets.
- **Push notifications, not polling.** Gmail's `users.watch` API plus
Google Pub/Sub means we get an HTTP callback within seconds of a new
message landing. IMAP requires polling on a 30-60 second cadence,
which costs more and lags worse.
- **Search and labels.** The Gmail API exposes label management and
full-text search natively; IMAP search is much weaker.
- **Threading.** Gmail's `threadId` is canonical. Reconstructing
threads over IMAP from `In-Reply-To` / `References` headers is
reliable in theory, painful in practice.
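
To make the threading point concrete, here is a minimal sketch of what a header-based client has to do to group messages. `MailHeaders` and `threadKeyFromHeaders` are hypothetical names, not existing code. The key depends on every mail client preserving `References`, so a client that truncates the header puts the same conversation under a different key. Gmail's `threadId` avoids the guesswork.

```typescript
// Hypothetical header shape for illustration; not part of the existing codebase.
interface MailHeaders {
  messageId: string;    // e.g. "<abc@mail.example>"
  inReplyTo?: string;   // a single message id, when present
  references?: string;  // whitespace-separated ids, oldest first
}

// Prefer the oldest entry in References, fall back to In-Reply-To,
// otherwise treat the message as the root of a new thread.
function threadKeyFromHeaders(h: MailHeaders): string {
  const refs = (h.references ?? "").split(/\s+/).filter((id) => id.length > 0);
  const [oldest] = refs;
  if (oldest !== undefined) return oldest;
  const parent = (h.inReplyTo ?? "").trim();
  if (parent.length > 0) return parent;
  return h.messageId;
}
```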
Microsoft 365 is the obvious peer integration but is out of scope here.
The Graph API model is similar enough that a future M365 path can reuse
most of the storage shape.
## Three deployment models — pick one before building
This is the most important decision in the spec. Each model has
different OAuth-verification consequences, which dominate everything
else.
### Model A — Marketplace-published OAuth app
A single OAuth client owned by Port Nimara, listed in the Google
Workspace Marketplace, that any GWS customer can install. Each staff
member clicks "Connect Gmail," consents to the scopes, and the CRM
stores their refresh token.
**Google-side work:**
1. Build the OAuth flow in CRM (~1 week).
2. Submit for OAuth verification. Gmail's `gmail.readonly` /
`gmail.modify` scopes are **restricted scopes** — they require:
- Domain-verified production URLs
- A homepage with a privacy policy that explicitly enumerates which
scopes are used and why
- A demo video (literally a screen recording) showing the consent
screen and what happens next
- **A third-party security assessment from a Google-approved
vendor** ($15k-$75k, 6-12 weeks)
- A Cloud Application Security Assessment (CASA) report
3. Marketplace listing review (~2 weeks after CASA passes).
**Calendar time:** 4-6 months.
**Money:** $15k-$75k for the security assessment alone.
**Recurring:** Re-verification every 12 months.
Right answer if Port Nimara wants to be the marina-CRM that ships GWS
out of the box for _any_ customer. Wrong answer if there are <5
customers who'd use it.
### Model B — Per-customer "Internal" OAuth app
Each customer's GWS admin creates an OAuth client _inside their own
workspace_ and gives Port Nimara the client ID + secret. Because the
app is "Internal," Google skips verification entirely — the consent
screen is unverified-but-permitted. Tokens never cross workspace
boundaries.
**Google-side work per customer:**
1. Customer's GWS admin enables the Gmail API in their Cloud project.
2. Sets the OAuth consent screen's user type to "Internal" and creates
an OAuth 2.0 client ID (web application) with the CRM's redirect URI.
3. Hands the client ID + secret to Port Nimara out-of-band.
4. Staff connect their Gmail through that client.
**Calendar time per customer:** ~1 hour of admin work.
**Money:** $0.
**Limit:** Doesn't span across GWS workspaces. A user with two GWS
accounts (e.g. the marina + a personal workspace) can only connect the
one matching the OAuth client.
This is the **clear winner for the current customer base**: small
number of customers, each with their own GWS workspace, and each
buying the integration as part of an onboarding conversation.
### Model C — Forward-to-CRM mailbox
The CRM exposes a per-port email alias (e.g.
`port-nimara-NN@inbox.portnimara.com`). Customers configure a Gmail
filter or mail-routing rule that forwards relevant threads to that
alias. The CRM ingests via SMTP and runs the same triage pipeline.
**Google-side work:** None. Customer does it as a Gmail filter.
**Calendar time:** ~1 week of CRM-side build.
**Limit:** Receive-only — no reply drafts, no thread state changes,
no labels. The "draft reply" feature in Phase 3 above is impossible
under this model.
Model C is the right answer if the user wants to ship inbox-triage
_now_ and decide on bidirectional Gmail integration later. The schema
is designed so the model can be upgraded to A or B without data
migration.
### Recommendation
**Build Model B first.** It costs nothing on the Google side, takes
~3 weeks of CRM work for G1-G3 (~5 weeks with G4 and G5), and matches
the actual customer profile.
**Promote to Model A only after 3+ paying customers ask for it
unprompted.** Until then, the security-assessment cost can't justify
itself.
Keep Model C as a fallback for customers who refuse to set up an
Internal OAuth app. Build it last, lazily; the schema accommodates it.
## End-to-end flow (Model B)
### 1. Per-port OAuth-app config
New admin page `/[port]/admin/google-workspace`:
- Field: "OAuth client ID" (their internal client ID)
- Field: "OAuth client secret" (encrypted at rest using `ENCRYPTION_KEY`)
- Field: "Authorized redirect URI" (read-only; we display the value
they need to paste into their Google Cloud Console)
- Toggle: "Enable Gmail integration for this port"
Stored in `system_settings` under key `gws.config`, port-scoped.
Resolution mirrors the existing OCR config service.
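
A minimal sketch of what the stored `gws.config` value and its validation could look like. The field names (`clientId`, `clientSecretEnc`, `enabled`, `triageEnabled`) and both helper functions are assumptions for illustration; the real shape should follow whatever the OCR config service already does. The callback path is the one used in the connect flow in this spec.

```typescript
// Assumed shape of the port-scoped `gws.config` value; field names are illustrative.
interface GwsPortConfig {
  clientId: string;
  clientSecretEnc: string; // ciphertext; encryption with ENCRYPTION_KEY happens elsewhere
  enabled: boolean;        // "Enable Gmail integration for this port"
  triageEnabled: boolean;  // second opt-in described under "AI cost interaction"
}

// Returns human-readable problems; an empty list means the config is usable.
function validateGwsConfig(cfg: Partial<GwsPortConfig>): string[] {
  const problems: string[] = [];
  if (!cfg.clientId || !cfg.clientId.endsWith(".apps.googleusercontent.com")) {
    problems.push("clientId should look like <id>.apps.googleusercontent.com");
  }
  if (!cfg.clientSecretEnc) {
    problems.push("client secret is missing");
  }
  return problems;
}

// Value shown read-only in the admin page for pasting into Google Cloud Console.
function redirectUriFor(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/api/v1/auth/gws/callback`;
}
```

Returning a list of problems rather than throwing lets the admin page show every issue at once.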
### 2. Per-staff connect flow
Staff member visits `/[port]/me/integrations`, clicks "Connect Gmail."
```
GET /api/v1/auth/gws/start
→ looks up port's gws.config
→ builds Google authorize URL with port's client_id + state token
→ 302 to Google
[ user consents ]
→ 302 back to /api/v1/auth/gws/callback?code=…&state=…
→ exchanges code for tokens via port's client_secret
→ stores in new `gws_user_tokens` table (encrypted)
→ schedules an `inbox-watch` job
```
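
The "builds Google authorize URL" step above can be sketched as follows. The endpoint and parameter names are Google's standard OAuth 2.0 web-server flow; `buildGoogleAuthorizeUrl` and `AuthorizeUrlInput` are hypothetical names, and generating and verifying the `state` token is left out.

```typescript
interface AuthorizeUrlInput {
  clientId: string;    // from the port's gws.config
  redirectUri: string; // must match the URI registered in Google Cloud Console
  scopes: string[];
  state: string;       // opaque anti-CSRF token tied to the staff member's session
}

function buildGoogleAuthorizeUrl(input: AuthorizeUrlInput): string {
  const params: Array<[string, string]> = [
    ["client_id", input.clientId],
    ["redirect_uri", input.redirectUri],
    ["response_type", "code"],
    ["scope", input.scopes.join(" ")],
    ["access_type", "offline"], // ask for a refresh token, not just an access token
    ["prompt", "consent"],      // show consent again so a refresh token is reissued on reconnect
    ["state", input.state],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://accounts.google.com/o/oauth2/v2/auth?${query}`;
}
```

The callback handler should reject any request whose `state` does not match the value issued here before exchanging the code.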
### 3. Push notification subscription
After tokens are stored, the worker calls
`gmail.users.watch({ topicName: <Pub/Sub topic>, labelIds: ['INBOX'] })`.
Gmail then publishes to that Pub/Sub topic on every inbox change. The CRM
exposes a Pub/Sub push subscription endpoint at
`/api/webhooks/gmail-push`, which fetches the messages changed since the
stored `historyId` (via `users.history.list`) and writes them into
`email_messages`.
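
A minimal sketch of decoding what the push endpoint receives. A Pub/Sub push delivery wraps the payload in `message.data` as base64, and Gmail's payload is a small JSON object with `emailAddress` and `historyId`. The function and interface names are hypothetical. The sketch uses the global `atob`, which is adequate for the ASCII JSON Gmail sends, and does not cover verifying that the request really came from Pub/Sub.

```typescript
// Body of a Pub/Sub push delivery (only the fields used here).
interface PubSubPushBody {
  message: { data: string; messageId: string; publishTime?: string };
  subscription: string;
}

interface GmailPushNotification {
  pubsubMessageId: string; // used for deduplication
  emailAddress: string;    // which mailbox changed
  historyId: string;       // sync cursor for fetching the delta
}

function decodeGmailPush(body: PubSubPushBody): GmailPushNotification {
  // Accept both standard and URL-safe base64 alphabets.
  const base64 = body.message.data.replace(/-/g, "+").replace(/_/g, "/");
  const parsed = JSON.parse(atob(base64)) as { emailAddress?: unknown; historyId?: unknown };
  if (typeof parsed.emailAddress !== "string" || parsed.historyId === undefined) {
    throw new Error("not a Gmail push notification");
  }
  return {
    pubsubMessageId: body.message.messageId,
    emailAddress: parsed.emailAddress,
    historyId: String(parsed.historyId), // normalise in case it arrives as a number
  };
}
```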
Watch subscriptions expire every 7 days. A maintenance job
re-establishes them daily.
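
One way the daily maintenance job could pick which watches to renew, as a sketch. The row fields come from the `gws_user_tokens` table proposed in this spec. The two-day safety margin and the function name are assumptions; renewing every enabled watch unconditionally each day is an equally valid choice.

```typescript
// Subset of the proposed gws_user_tokens row that the job needs.
interface WatchRow {
  userId: string;
  watchExpiresAt: Date | null; // null means no watch has been established yet
  syncEnabled: boolean;
}

// Assumption: renew once fewer than two days remain on the 7-day watch.
const WATCH_RENEWAL_MARGIN_MS = 2 * 24 * 60 * 60 * 1000;

function selectWatchesToRenew(rows: WatchRow[], now: Date): string[] {
  return rows
    .filter((row) => row.syncEnabled)
    .filter(
      (row) =>
        row.watchExpiresAt === null ||
        row.watchExpiresAt.getTime() - now.getTime() < WATCH_RENEWAL_MARGIN_MS,
    )
    .map((row) => row.userId);
}
```

With a daily cadence and a two-day margin, a single failed run still leaves one more attempt before the watch lapses.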
### 4. Triage pipeline
For each new inbound message:
1. Match the sender's `from_address` against `client_contacts` (email
channel) to resolve the `clients` / `companies` record. Persist a
thread→client link if found.
2. If port has `aiEnabled` AND `gws.triageEnabled`, queue an `ai`
job that classifies the thread:
- `urgency`: low / medium / high
- `category`: invoice-question / availability / contract / other
- `requires_response`: boolean
3. AI call records into `ai_usage_ledger` with `feature='inbox_triage'`.
The existing per-port budget gates apply automatically.
4. Triage output written to a new `email_triage` table keyed on
`email_messages.id`.
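
Model output should be validated before it is written to `email_triage`. A minimal sketch, assuming the classifier is prompted to return a JSON object with the three fields listed in step 2; `parseTriageOutput` is a hypothetical name and the exact output format is still to be decided.

```typescript
type Urgency = "low" | "medium" | "high";
type Category = "invoice-question" | "availability" | "contract" | "other";

interface TriageResult {
  urgency: Urgency;
  category: Category;
  requiresResponse: boolean;
}

const URGENCIES: readonly Urgency[] = ["low", "medium", "high"];
const CATEGORIES: readonly Category[] = ["invoice-question", "availability", "contract", "other"];

// Returns null for anything that is not a well-formed classification,
// so the caller can retry or leave the thread unclassified.
function parseTriageOutput(raw: string): TriageResult | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const fields = value as Record<string, unknown>;
  const urgency = URGENCIES.find((u) => u === fields.urgency);
  const category = CATEGORIES.find((c) => c === fields.category);
  if (!urgency || !category || typeof fields.requires_response !== "boolean") return null;
  return { urgency, category, requiresResponse: fields.requires_response };
}
```

Rejecting unknown enum values keeps a model that invents a new category from polluting the `/inbox` sort order.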
### 5. UI surfaces
- `/[port]/inbox` — sorted by triage rank, port-wide view.
- Linked-inbox panel on `client-tabs.tsx` — adds a new "Email" tab
pulling from `email_threads` filtered to that client.
- Alert rule `inbox.unanswered_high_value` slots into Phase B's
alert engine; no schema change.
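
The condition behind `inbox.unanswered_high_value` can be sketched as a pure predicate. How the alert engine registers rules is not covered here, and `ThreadSnapshot` with its `clientIsHighValue` flag is a hypothetical input shape; what counts as "high value" needs to be defined elsewhere.

```typescript
// Hypothetical per-thread input assembled by the rule evaluator.
interface ThreadSnapshot {
  clientIsHighValue: boolean;
  lastInboundAt: Date;         // latest message from the client
  lastOutboundAt: Date | null; // latest staff reply, if any
}

function isUnansweredHighValue(thread: ThreadSnapshot, thresholdHours: number, now: Date): boolean {
  if (!thread.clientIsHighValue) return false;
  const answered =
    thread.lastOutboundAt !== null &&
    thread.lastOutboundAt.getTime() >= thread.lastInboundAt.getTime();
  if (answered) return false;
  const waitedMs = now.getTime() - thread.lastInboundAt.getTime();
  return waitedMs >= thresholdHours * 60 * 60 * 1000;
}
```

Passing `now` in rather than reading the clock inside the function keeps the rule easy to test.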
## Schema additions
Three new tables, all port-scoped where it matters:
```ts
// Per-staff Gmail tokens. Mirror of google_calendar_tokens.
gws_user_tokens {
id, userId (UNIQUE), portId, emailAddress,
accessTokenEnc, refreshTokenEnc, tokenExpiry,
scope, watchExpiresAt, watchHistoryId,
connectedAt, lastSyncAt, syncEnabled, createdAt, updatedAt
}
// Triage classifications keyed to messages.
email_triage {
messageId (PK, FK → email_messages.id ON DELETE CASCADE),
urgency, category, requiresResponse,
modelVersion, tokensUsed, classifiedAt
}
// Pub/Sub idempotency log. Gmail re-delivers; we dedupe.
gws_push_log {
messageId (Pub/Sub message id, PK),
historyId, receivedAt
}
```
Plus extensions to `email_messages`:
- `googleMessageId` (text, indexed) — Gmail's own ID for thread ops.
- `googleThreadId` (text, indexed).
- `gmailLabels` (text[]) — for "is unread" checks without hitting Gmail.
The existing `emailAccounts.provider='google'` value is reused
unchanged; the IMAP fields become nullable since OAuth-flow accounts
won't populate them.
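
The role of `gws_push_log` is easiest to see in code. The sketch below uses an in-memory stand-in to show the contract: the first delivery of a Pub/Sub message id is processed and redeliveries are acknowledged and skipped. `PushLog` is a hypothetical name. In the database the same effect would come from an insert that ignores primary-key conflicts.

```typescript
// In-memory stand-in for the gws_push_log table, for illustration only.
class PushLog {
  private readonly seen = new Set<string>();

  // Returns true the first time an id is recorded and false on redelivery.
  recordOnce(pubsubMessageId: string): boolean {
    if (this.seen.has(pubsubMessageId)) return false;
    this.seen.add(pubsubMessageId);
    return true;
  }
}

// Handler outline: always acknowledge, but only sync on first sight.
function handlePush(log: PushLog, pubsubMessageId: string, sync: () => void): "processed" | "duplicate" {
  if (!log.recordOnce(pubsubMessageId)) return "duplicate";
  sync();
  return "processed";
}
```

Duplicates still get a success response, because a non-2xx reply makes Pub/Sub retry the delivery.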
## AI cost interaction
Triage AI is opt-in **twice**: the port admin must turn on
`aiEnabled` (Phase 3a flag, default off) **and** `gws.triageEnabled`
(this spec, default off). With either toggle off, the inbox sync still
runs but skips classification, so staff can manually scan threads
without burning tokens.
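
The double opt-in reduces to a small decision function. This is a sketch: `PortFlags` and `planForPort` are hypothetical names, and `gwsEnabled` stands for the "Enable Gmail integration for this port" toggle.

```typescript
interface PortFlags {
  gwsEnabled: boolean;       // Gmail integration toggle on the admin page
  aiEnabled: boolean;        // Phase 3a flag
  gwsTriageEnabled: boolean; // gws.triageEnabled from this spec
}

interface SyncPlan {
  sync: boolean;     // pull messages into email_messages
  classify: boolean; // spend AI tokens on triage
}

function planForPort(flags: PortFlags): SyncPlan {
  const sync = flags.gwsEnabled;
  // Classification needs the sync to be running and both AI opt-ins.
  return { sync, classify: sync && flags.aiEnabled && flags.gwsTriageEnabled };
}
```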
Per-message token cost on a current Haiku-class model is roughly
1500-2500 tokens including the system prompt. A port doing 200 inbound
emails a day at the upper bound is ~500k tokens/day. The default
hard-cap is 500k/month, so triage will trip it inside a day. Two
mitigations baked in:
- The system prompt is short (<500 tokens) and prompt-cached on the
Anthropic side, so most tokens are output.
- Triage runs only on threads not already classified — re-syncs from
the watch loop don't re-bill.
The admin UI shows triage as its own line in the per-feature breakdown
so customers can see how much their inbox is costing them and tune
caps accordingly.
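
The budget arithmetic above, written out so the numbers can be replayed with a port's own volumes. The function names are illustrative, and the 30-day month is an assumption.

```typescript
// Tokens spent per day if every inbound email is classified exactly once.
function dailyTriageTokens(emailsPerDay: number, tokensPerMessage: number): number {
  return emailsPerDay * tokensPerMessage;
}

// Days of triage a monthly cap buys at a given daily burn.
function daysUntilCap(monthlyCap: number, dailyTokens: number): number {
  return monthlyCap / dailyTokens;
}

// Largest whole number of emails per day that stays under the cap all month.
function sustainableEmailsPerDay(monthlyCap: number, tokensPerMessage: number, daysInMonth = 30): number {
  return Math.floor(monthlyCap / (tokensPerMessage * daysInMonth));
}

const upperBoundPerDay = dailyTriageTokens(200, 2500);           // 500,000 tokens/day
const daysAtUpperBound = daysUntilCap(500_000, upperBoundPerDay); // 1 day
const sustainableVolume = sustainableEmailsPerDay(500_000, 2500); // 6 emails/day
```

At the upper bound, the default 500k/month cap covers about six classified emails a day, so any port with real volume has to raise the cap before turning triage on.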
## Phased build (assuming Model B)
| Phase | Scope | Effort | Ships when |
| ---------------------------- | ------------------------------------------------------------------------------------------------------- | ------ | ----------------------------------------------------------- |
| **G1** Connect | OAuth flow + per-port config + per-user token storage. No sync yet. Staff can connect; nothing happens. | 1 week | Standalone |
| **G2** Read-only sync | Pub/Sub push receiver + delta sync into `email_messages`. Linked-inbox tab on client detail. No AI. | 1 week | After G1 |
| **G3** Triage classification | AI classifier, `email_triage` writes, `/inbox` page sorting. Per-port toggle. | 1 week | After G2; depends on Phase 3b budgets being live (they are) |
| **G4** Reply drafts | Gmail API send + draft creation. "Draft reply" button on the client detail Email tab. | 1 week | After G3 |
| **G5** Alerts | New `inbox.unanswered_high_value` rule. Hooks into Phase B alert engine. | 2 days | After G3 |
Total: ~5 weeks for a single engineer, assuming the user provides one
real GWS workspace to test against during G1.
## Open decisions for the user
These are the questions to resolve before scheduling the build, in
priority order:
1. **Deployment model — A, B, or C?** Default recommendation B.
2. **Single user or domain-wide delegation?** Per-staff connect (one
token per user) is simpler. Domain-wide delegation lets the port
admin connect once on behalf of every staff member but requires
the customer to grant a service account broader access. Default
recommendation: per-staff.
3. **Scope set.** Minimal viable scope is `gmail.readonly`. To send
replies (G4) we need `gmail.send`. To manage labels (e.g. mark
"triaged-by-CRM") we need `gmail.modify`. Each scope expansion
makes the consent screen look more alarming but doesn't add new
verification steps under Model B.
4. **Pub/Sub topic ownership.** Pub/Sub topics live in _some_ GCP
project. Under Model B the customer's project owns the topic —
they pay for Pub/Sub (cents/month) and grant our service account
subscriber access. Alternative: Port Nimara owns the topic and
the customer's Gmail publishes cross-project (allowed, slightly
more setup). Default: customer-owned topic, fewer moving parts.
5. **Triage model.** Haiku 4.5 is right for cost; Sonnet 4.6 is
right if the ranking quality on Haiku turns out to be poor.
Defer this until G3 has real-world tuning data.
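
Decision 3 can be expressed as a function from enabled features to requested scopes, which keeps the consent screen as narrow as the port's configuration allows. The scope URLs are Google's published Gmail scopes; `scopesFor` and its option names are hypothetical.

```typescript
const GMAIL_READONLY = "https://www.googleapis.com/auth/gmail.readonly";
const GMAIL_SEND = "https://www.googleapis.com/auth/gmail.send";
const GMAIL_MODIFY = "https://www.googleapis.com/auth/gmail.modify";

interface ScopeOptions {
  sendReplies: boolean;  // G4: send replies through Gmail
  manageLabels: boolean; // e.g. apply a "triaged-by-CRM" label
}

// Start from the minimal viable scope and add only what enabled features need.
function scopesFor(options: ScopeOptions): string[] {
  const scopes = [GMAIL_READONLY];
  if (options.sendReplies) scopes.push(GMAIL_SEND);
  if (options.manageLabels) scopes.push(GMAIL_MODIFY);
  return scopes;
}
```

Staff who connected with a narrower set have to go through the consent screen again when a port later enables a feature that needs a new scope.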
## Things that are NOT in this spec
- **Microsoft 365 / Outlook integration.** Same shape, different API.
Once Model B is proven on GWS, Graph API takes another ~3 weeks.
- **Reply drafts grounded in CRM context.** That's G4 and depends on
the work in this spec, but the prompt engineering for "good replies
citing this client's open interests + reservations + invoices"
deserves its own design pass before building.
- **Cross-staff triage queue (i.e. "show me all unanswered emails
across the team").** That requires either domain-wide delegation
(decision #2 above) or per-staff opt-in to a shared view. Punt
until staff actually ask for it.
- **Sentiment / urgency tone analysis.** Tempting; almost always
wrong; skip in v1.
- **"Smart drafts" using the recipient's past replies as context.**
Every customer asks for this and almost no one uses it once
built. Skip.
## Cost summary at a glance
| Item | Model A | Model B | Model C |
| ------------------------------- | ------------------------------- | -------------------------------------- | ------------------------------------ |
| Build effort                    | 3-4 weeks                       | ~5 weeks (over G1-G5)                  | ~1 week (receive-only)               |
| Calendar time to first customer | 4-6 months                      | 1 hour of customer admin work          | 1 hour of customer Gmail-filter work |
| Up-front cash                   | $15k-$75k (CASA)                | $0                                     | $0                                   |
| Recurring                       | Re-verification annually        | None                                   | None                                 |
| Best for                        | 50+ customers, Marketplace play | 1-10 customers, white-glove onboarding | Customers who refuse OAuth setup     |
The recommendation stands: build Model B for G1 + G2 + G3, ship that,
and let real customer demand decide whether G4/G5 and Model A
promotion are worth the calendar time.