Lands the one-shot migration pipeline from the legacy NocoDB Interests
base into the new client/interest schema. Dry-run mode is fully
operational: it pulls the live snapshot, runs the dedup library, and
writes a CSV + Markdown report under .migration/<timestamp>/. The
--apply phase is stubbed for a follow-up PR, per the design's P3
implementation sequence.
Schema additions
================
- `client_merge_candidates` — pairs flagged by the background scoring
job for the /admin/duplicates review queue. Status enum: pending /
dismissed / merged. A unique (portId, clientAId, clientBId) constraint
keeps the same pair from surfacing twice. Empty until P2 lands the cron.
- `migration_source_links` — idempotency ledger. Maps source-system
rows (NocoDB Interest #624 → new client UUID) so re-running --apply
against the same dry-run report skips already-imported entities.
Both tables ship with the migration `0020_unusual_azazel.sql` —
already applied to the local dev DB during this commit's preparation.
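The ledger lookup the --apply phase would run could be sketched like this (a minimal, hypothetical shape — `SourceLink`, `PlannedOp`, and the key format are illustrative, not the real schema):

```typescript
// Hypothetical shapes; the real tables live in the drizzle schema.
interface SourceLink {
  sourceTable: string;
  sourceRowId: number;
  targetClientId: string; // UUID minted on a previous --apply run
}

interface PlannedOp {
  sourceTable: string;
  sourceRowId: number;
}

// Keep only the ops whose source row has no ledger entry yet,
// so re-running --apply skips already-imported entities.
function filterAlreadyImported(ops: PlannedOp[], ledger: SourceLink[]): PlannedOp[] {
  const seen = new Set(ledger.map((l) => `${l.sourceTable}:${l.sourceRowId}`));
  return ops.filter((op) => !seen.has(`${op.sourceTable}:${op.sourceRowId}`));
}
```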
Library
=======
src/lib/dedup/nocodb-source.ts
Read-only adapter for the legacy NocoDB v2 API. xc-token auth,
auto-paginates until isLastPage, captures the table IDs from the
2026-05-03 audit. `fetchSnapshot()` pulls every relevant table in
parallel into one in-memory object the transform layer consumes.
src/lib/dedup/migration-transform.ts
Pure function: NocoDB snapshot in, MigrationPlan out. Per row:
- normalizes name / email / phone / country via the dedup library
- parses the legacy DD-MM-YYYY / DD/MM/YYYY / ISO date formats
- maps the 8-stage `Sales Process Level` enum to the new 9-stage
pipelineStage
- filters yacht-name placeholders ('TBC', 'Na', etc.)
- merges Internal Notes + Extra Comments + Berth Size Desired into
a single notes blob
Then runs `findClientMatches` pairwise (with blocking) and
union-finds clusters of rows whose score crosses the auto-link
threshold (90). Lower-scoring pairs (50–89) become 'needs review'.
Each cluster's "lead" row is picked by completeness score with
recency tie-break.
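The clustering step can be sketched with a minimal union-find over scored pairs. The thresholds (90 auto-link, 50–89 review) follow the text; everything else here is illustrative:

```typescript
// pairs: [rowA, rowB, matchScore]. Rows are 0..n-1.
function clusterPairs(n: number, pairs: Array<[number, number, number]>): number[] {
  const parent = Array.from({ length: n }, (_, i) => i);
  // Find with path compression.
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));
  for (const [a, b, score] of pairs) {
    if (score >= 90) parent[find(a)] = find(b); // auto-link threshold
    // 50-89 would be flagged 'needs review' instead of merged.
  }
  // Canonicalize so rows in the same cluster share a root id.
  return parent.map((_, i) => find(i));
}
```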
src/lib/dedup/migration-report.ts
Writes three artifacts to .migration/<timestamp>/:
- report.csv — one row per planned op, RFC-4180 escaped
- summary.md — human-skimmable overview
- plan.json — full structured plan for the --apply phase
CSV cells with comma / quote / newline are quoted; internal quotes
are doubled. No external CSV dep.
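The quoting rule above amounts to a one-function RFC 4180 escaper — a sketch of the idea, not the actual report code:

```typescript
// Quote a CSV cell only when it contains a comma, quote, or newline;
// internal quotes are doubled per RFC 4180.
function escapeCsvCell(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`;
  }
  return value;
}
```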
src/lib/dedup/phone-parse.ts
Script-safe wrapper around libphonenumber-js's `core` entry that
loads `metadata.min.json` directly. The default `index.cjs.js`
bundled by libphonenumber hits a metadata-shape interop bug under
Node 25 + tsx (`{ default }` wrapping); core+JSON sidesteps it.
The dedup `normalizePhone` and `find-matches` both use this wrapper
now so the same code path runs in vitest, Next.js, and the migration
CLI without surprises.
src/lib/dedup/normalize.ts
Tightened country resolution: added Caribbean short-form aliases
('antigua' → AG, 'st kitts' → KN, etc.) and a city map covering the
US locations seen in the NocoDB dump (Boston, Tampa, Fort
Lauderdale, Port Jefferson, Nantucket). Also relaxed phone parsing
to drop the `isValid()` strict check — the libphonenumber min build
rejects many real NANP-territory numbers, and dedup only needs a
canonical E.164 to compare.
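The alias lookup described above could take a shape like this (entries taken from the text; the real map and its full contents live in normalize.ts):

```typescript
// Short-form country aliases -> ISO 3166-1 alpha-2.
const COUNTRY_ALIASES: Record<string, string> = {
  antigua: "AG",
  "st kitts": "KN",
};

// City names seen in the legacy dump -> country.
const CITY_TO_COUNTRY: Record<string, string> = {
  boston: "US",
  tampa: "US",
  "fort lauderdale": "US",
};

function resolveCountry(raw: string): string | undefined {
  const key = raw.trim().toLowerCase();
  return COUNTRY_ALIASES[key] ?? CITY_TO_COUNTRY[key];
}
```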
CLI
===
scripts/migrate-from-nocodb.ts
pnpm tsx scripts/migrate-from-nocodb.ts --dry-run
→ Pulls the live NocoDB base (NOCODB_URL + NOCODB_TOKEN env vars),
runs the transform, writes report. No DB writes.
pnpm tsx scripts/migrate-from-nocodb.ts --apply --report .migration/<dir>/
→ Stubbed; exits with `not yet implemented` and a pointer to the
design doc. Apply phase ships in a follow-up.
Tests
=====
tests/unit/dedup/migration-transform.test.ts (7 cases)
Fixture-based regression. A frozen 12-row NocoDB snapshot covers
every duplicate pattern in the design (§1.2). The test asserts:
- 12 input rows → 7 unique clients (cluster math is right)
- Patterns A / B / C / E auto-link
- Pattern F (Etiennette Clamouze) does NOT auto-link
- Every interest preserved as its own row even when clients merge
- 8-stage → 9-stage enum mapping is correct per spec
- Multi-yacht merge (Constanzo CALYPSO + Costanzo GEMINI under one
client) — the design's signature win
- Output is deterministic (run twice, identical)
Validation against real data
============================
Ran `pnpm tsx scripts/migrate-from-nocodb.ts --dry-run` against the
live NocoDB. Result on 252 Interests rows:
- 237 clients (15 merged into 13 clusters)
- 252 interests (one per source row)
- 406 contacts, 52 addresses
- 13 auto-linked clusters (every confirmed cluster from §1.2 audit)
- 3 pairs flagged for review (Camazou, Zasso, one new)
- 1 phone placeholder flagged
Total dedup test count: 57 (50 from P1 + 7 fixture tests).
Lint: clean. Tsc: clean for new files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a full GDPR Article 15 (right of access) workflow. Staff trigger
an export from the client detail page; a BullMQ worker assembles every row
keyed to that client (profile, contacts, addresses, notes, tags,
yachts, company memberships, interests, reservations, invoices,
documents, last 500 audit events) into JSON + a self-contained HTML
report, ZIPs them, uploads to MinIO, and optionally emails the client
a 7-day signed download link.
- New table gdpr_exports tracks lifecycle (pending → building → ready
→ sent / failed) with a 30-day cleanup target
- Bundle builder (gdpr-bundle-builder.ts) — pure read-side, tenant-
scoped, with HTML escaping to block injection from rogue field values
- Worker hook in export queue dispatches on job name 'gdpr-export'
- New audit actions: 'request_gdpr_export', 'send_gdpr_export'
- API: POST/GET /api/v1/clients/:id/gdpr-export (admin-gated, exports
rate-limit, Article-15 audit on POST); GET /:exportId returns a
fresh signed URL
- UI: <GdprExportButton> dialog on client detail header — admin-only,
shows recent exports, supports email-to-client + override recipient,
polls every 5s while open
- Validation: refuses email-to-client when no primary email + no
override (rather than silently dropping the send)
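The HTML escaping the bundle builder relies on can be sketched as a single entity-replacement pass — illustrative only; the real implementation lives in gdpr-bundle-builder.ts:

```typescript
// Escape the five characters that matter for HTML injection.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so later entities aren't double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```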
Tests: 778/778 vitest (was 771) — +7 covering builder happy path,
HTML escaping, tenant isolation, empty client, request-flow validation,
and audit / queue interaction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a token-denominated guardrail in front of every server-side AI call
so a misconfigured port can't run up an unbounded bill. Soft caps surface
a banner; hard caps refuse new requests until the period rolls over.
Usage flows into a feature-typed ledger so future AI surfaces (summary,
embeddings, reply-draft) can drop in without schema changes.
- New table ai_usage_ledger (port, user, feature, provider, model,
input/output/total tokens, request id) with two indexes for rollup
- New service ai-budget.service.ts: getAiBudget/setAiBudget,
checkBudget (pre-flight gate), recordAiUsage, currentPeriodTokens,
periodBreakdown — all token-based, period boundaries in UTC
- runOcr now returns provider usage so the route can record the actual
spend instead of estimating
- Scan-receipt route gates on checkBudget before invoking AI; returns
source: manual / reason: budget-exceeded when blocked, surfaces
softCapWarning on the success path
- Admin UI: new AiBudgetCard on the OCR settings page — shows current
spend, per-feature breakdown, soft/hard cap inputs, period selector
- Permission: admin.manage_settings on both routes
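The UTC period arithmetic could look like the following for a monthly period — a sketch under the assumption of half-open month boundaries; the real service supports a period selector and may differ:

```typescript
// Current UTC calendar month as a half-open interval [start, end).
function currentMonthPeriod(now: Date): { start: Date; end: Date } {
  const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
  const end = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
  return { start, end };
}
```

Date.UTC rolls month 12 over to January of the next year, so the December boundary needs no special case.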
Tests: 766/766 vitest (was 756) — +10 budget tests covering
enforcement, the disabled state, cap-exceeded and estimate-exceeded
blocking, soft-cap warnings, period boundaries, cross-port isolation,
and silent ledger failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR1 of Phase B per docs/superpowers/specs/2026-04-28-phase-b-insights-alerts-design.md.
Lays the foundation that PRs 2-10 will fill in with behaviour.
Schema (migration 0014):
- alerts table with rule-engine fields (rule_id, severity, link,
entity_type/id, fingerprint, fired/dismissed/acknowledged/resolved
timestamps, jsonb metadata). Partial-unique fingerprint index keeps
one open row per (port, rule, entity); separate indexes power
severity-filtered and time-ordered queries.
- analytics_snapshots (port_id, metric_id) -> jsonb cache + computedAt
for the 15-min recurring refresh.
- expenses: duplicate_of self-FK, dedup_scanned_at, ocr_status/raw/
confidence; partial index on (port, vendor, amount, date) where
duplicate_of IS NULL drives the dedup heuristic.
- audit_logs.search_text: GENERATED ALWAYS tsvector over
action+entity_type+entity_id+user_id, GIN-indexed (drizzle can't
model GENERATED ALWAYS in TS yet, so the migration appends manual
ALTER + the GIN index).
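The one-open-row-per-fingerprint invariant the partial-unique index enforces can be sketched in memory. The fingerprint format and function names here are assumptions; only the (port, rule, entity) uniqueness rule comes from the text:

```typescript
// Deterministic fingerprint for one (rule, entity) pair.
function fingerprintFor(ruleId: string, entityType: string, entityId: string): string {
  return `${ruleId}:${entityType}:${entityId}`;
}

// Upsert-style guard against a set of currently-open fingerprints;
// the partial-unique index plays this role per port in the DB.
function shouldInsert(open: Set<string>, fp: string): boolean {
  if (open.has(fp)) return false; // a second open row would be rejected
  open.add(fp);
  return true;
}
```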
Service skeletons in src/lib/services/:
- alerts.service.ts: fingerprintFor, reconcileAlertsForPort (upsert +
auto-resolve), dismiss, acknowledge, listAlertsForPort.
- alert-rules.ts: RULE_REGISTRY of 10 rule evaluators (currently no-op);
PR2 fills in the bodies.
- analytics.service.ts: readSnapshot/writeSnapshot with 15-min TTL +
  no-op compute* stubs for the four chart series; PR3 fills behaviour.
- expense-dedup.service.ts: scanForDuplicates + markBestDuplicate
using the partial dedup index. PR8 wires the BullMQ trigger.
- expense-ocr.service.ts: OcrResult/OcrLineItem types + ocrReceipt
stub. PR9 wires Claude Vision (Haiku 4.5 + ephemeral system-prompt
cache).
- audit-search.service.ts: tsvector @@ plainto_tsquery + cursor
pagination on (createdAt, id). PR10 wires the admin UI.
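The keyset-pagination predicate on (createdAt, id) reduces to a tuple comparison — an illustrative sketch for newest-first ordering, not the actual drizzle query:

```typescript
interface AuditRow {
  createdAt: number; // epoch ms, for simplicity
  id: string;
}

// True when `row` comes strictly after `cursor` in
// (createdAt DESC, id DESC) order, i.e. belongs on the next page.
function afterCursor(row: AuditRow, cursor: AuditRow): boolean {
  return (
    row.createdAt < cursor.createdAt ||
    (row.createdAt === cursor.createdAt && row.id < cursor.id)
  );
}
```

Breaking ties on id keeps pagination stable when many audit rows share a timestamp.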
tsc clean, lint clean, vitest 675/675 (one unrelated AES random-output
flake; it passes when run solo).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The client portal no longer uses passwordless / magic-link sign-in. Each
client now has a `portal_users` row with a scrypt-hashed password,
created by an admin from the client detail page; the admin's invite
mails an activation link that the client uses to set their own password.
Forgot-password is wired through the same token mechanism.
Schema (migration `0009_outgoing_rumiko_fujikawa.sql`):
- `portal_users` — one per client account, separate from the CRM
`users` table (better-auth) so the auth realms stay isolated. Email
is globally unique, password is null until activation.
- `portal_auth_tokens` — single-use activation / reset tokens. Stores
only the SHA-256 hash so a DB compromise never leaks live tokens.
Services:
- `src/lib/portal/passwords.ts` — scrypt hash/verify (no new deps;
uses node:crypto), token mint+hash helpers.
- `src/lib/services/portal-auth.service.ts` — createPortalUser,
resendActivation, activateAccount, signIn (timing-safe),
requestPasswordReset, resetPassword. Auth failures throw the new
UnauthorizedError (401); enumeration-safe behaviour everywhere.
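The two node:crypto primitives described above (store only the SHA-256 of a token; scrypt hash/verify for passwords) could be sketched as follows. Parameter choices here are assumptions, not the real implementation:

```typescript
import { createHash, randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Mint a single-use token; only its SHA-256 hash would be persisted,
// so a DB compromise never leaks live tokens.
function mintToken(): { token: string; tokenHash: string } {
  const token = randomBytes(32).toString("hex");
  const tokenHash = createHash("sha256").update(token).digest("hex");
  return { token, tokenHash };
}

// scrypt with a per-password random salt, stored as "salt:hash".
function hashPassword(password: string): string {
  const salt = randomBytes(16).toString("hex");
  const hash = scryptSync(password, salt, 64).toString("hex");
  return `${salt}:${hash}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [salt, hash] = stored.split(":");
  const candidate = scryptSync(password, salt, 64).toString("hex");
  // Constant-time comparison to avoid leaking prefix information.
  return timingSafeEqual(Buffer.from(candidate, "hex"), Buffer.from(hash, "hex"));
}
```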
Routes:
- POST /api/portal/auth/sign-in — sets the existing portal JWT cookie.
- POST /api/portal/auth/forgot-password — always 200.
- POST /api/portal/auth/reset-password — token + new password.
- POST /api/portal/auth/activate — token + initial password.
- POST /api/v1/clients/:id/portal-user — admin invite (and `?action=resend`).
- Removed: /api/portal/auth/request, /api/portal/auth/verify (magic link).
UI:
- /portal/login — replaced email-only magic-link form with email +
password + "forgot password" link.
- /portal/forgot-password, /portal/reset-password, /portal/activate — new.
- New shared `PasswordSetForm` component used by activate + reset.
- New `PortalInviteButton` rendered on the client detail header.
Email send:
- `createTransporter` now wires SMTP auth when SMTP_USER+SMTP_PASS are
  set (Gmail app-password or marina-server creds, configured via env).
- `SMTP_FROM` env var lets the sender address be overridden without
pinning it to `noreply@${SMTP_HOST}`.
Tests:
- Smoke spec 17 (client-portal) updated to the new flow: 7/7 green.
- Smoke specs 02-crud-spine, 05-invoices, 20-critical-path updated to
match the post-refactor client + invoice forms (drop companyName,
use OwnerPicker + billingEmail).
- Vitest 652/652 still green; type-check clean.
Drops the dead `requestMagicLink` from portal.service.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>