# Port Nimara CRM — Comprehensive Platform Audit
**Generated:** 2026-05-12 (session run)
**Branch:** `feat/documents-folders`
**Method:** 19 parallel audit agents on Claude Opus 4.7, read-only static analysis. Each agent owned a single domain and wrote a CRITICAL/HIGH/MEDIUM-grouped report. This document consolidates the reports and overlays the fixes already shipped during the session.
---
## How to read this document
1. **Executive summary** lists every CRITICAL finding (must address before production), per domain.
2. **Already fixed in this session** is a manifest of the changes I shipped while the audit was running. Don't re-fix these.
3. **Cross-cutting priority queue** is the top ~15 highest-impact findings across the entire codebase, ordered. Tackle these first.
4. **Per-domain reports** below contain the full text of every agent's report verbatim — useful when you sit down to actually fix a specific area.
5. **Methodology + agent roster** appendix at the bottom lists who looked at what.
Severity is the auditor's judgment, not mine — I have not re-graded findings. Treat anything tagged CRITICAL as a real block on shipping.
---
## Executive summary
### CRITICAL findings (must address)
| # | Domain | File | Issue | Status |
| --- | ------------- | --------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- |
| 1 | Security | `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts` | Admins could grant themselves every permission leaf via self-target | **FIXED this session** |
| 2 | Security | `src/app/api/auth/resolve-identifier/route.ts` | Username enumeration via hit/miss response shape + no rate limit | **FIXED this session** |
| 3 | Services | `src/lib/services/users.service.ts` (admin email-change) | `account.accountId` not updated → user can't sign in with either old or new email after admin rotation; sessions also not revoked | **FIXED this session** |
| 4 | Observability | `src/lib/services/search-nav-catalog.ts` | 10 NAV_CATALOG entries pointed at routes that don't exist (`/admin/audit-log`, `/admin/error-events`, `/user-settings`, 7×`/settings/<x>`) | **FIXED this session** |
| 5 | Auth flow | `src/middleware.ts` | Token-gated email confirm/cancel routes blocked by session 401 | **FIXED this session** |
| 6 | Email | `src/lib/env.ts` + `src/lib/email/index.ts` | `EMAIL_REDIRECT_TO` has no `NODE_ENV=production` guard — a stray prod env value silently funnels every email to one inbox | Open |
| 7 | Email | every template | URL interpolations into `href="…"` and link text are unescaped — a `"` in any URL breaks out, no scheme rejection | Open |
| 8 | Data model | `src/lib/db/migrations/0052_audit_critical_fixes.sql` | `CREATE INDEX CONCURRENTLY` silently never runs because there's no real `db:migrate` runner — six composite indexes missing in prod | Open |
| 9 | Data model | `db:push` flow | Two structural constraints (berths.current_pdf_version_id circular FK, system_settings NULLS NOT DISTINCT) not in `db:push`; fresh-deploy diverges from prod | Open |
| 10 | Services | `documents.service.ts: handleDocumentCompleted` | Orphan-blob window — failure between `storage.put` and `documents.update` leaves the blob and marks status='completed' with no `signedFileId` | Open |
| 11 | GDPR | `src/lib/services/gdpr-bundle-builder.ts` | Article-15 export missing portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions | Open |
| 12 | GDPR | `src/lib/services/client-hard-delete.service.ts` | "Right to be forgotten" doesn't actually erase — verbatim PII survives in email_messages.body_html, files, document_sends.recipient_email forever | Open |
| 13 | GDPR | `src/app/api/auth/resolve-identifier/route.ts` (post-fix) | Still echoes the real canonical email on a successful username hit (rate-limited but enumerable) | Partial — see Open follow-ups |
| 14 | GDPR | `audit_logs.metadata` field | Not covered by `maskSensitiveFields`; raw PII (emails, IPs, names) accumulates unbounded with no retention cron | Open |
| 15 | Observability | `src/app/api/webhooks/documenso/route.ts` | Webhook handler bypasses the platform-error pipeline entirely — admin/errors silent on Documenso webhook crashes | Open |
| 16 | UI/UX | 16 sites use native `window.confirm()` | Bypasses `ConfirmationDialog` / `AlertDialog` for destructive flows (cancel signing, delete files, archive interest/company/yacht…) | Open |
| 17 | Documenso | `documenso-client.ts` v1↔v2 routing | (Pending full report) | In progress |
| 18 | Concurrency | (see report) | Various race windows on multi-rep edits + partial-unique-index inserts | Open |
### HIGH-priority queue
Listed after CRITICALs in the priority queue section below.
---
## Already fixed in this session
These changes are on the `feat/documents-folders` branch (post-commit `660553c` and onward). Do not re-fix.
### Security
- **Self-target privilege escalation block** — `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts` now refuses `PUT` when `targetUserId === ctx.userId`. Additionally, the body now sanitises against a canonical `ALLOWED_RESOURCE_ACTIONS` allow-list mirroring `RolePermissions`, so unknown resource/action keys are stripped before write. Cross-tenant pollution check added (refuses overrides for users without a `user_port_roles` row in the caller's port).
- **Username enumeration kill** — `src/app/api/auth/resolve-identifier/route.ts` now (a) shares the `auth` 5-per-15-min rate-limit bucket keyed by client IP, (b) returns a synthetic `@auth.invalid` email on miss so hit and miss are indistinguishable in shape. (Note: GDPR auditor flagged the hit-path still echoes a real canonical email — still an information leak that's worth a deeper redesign; see Open follow-ups.)
- **Email-change account/session rotation** — `src/lib/services/users.service.ts` now also updates `account.accountId` for the `credential` provider (Better Auth's actual login key) AND revokes every active `session` row when an admin rotates a user's email. Previously the user could not sign in with either old or new email after rotation.
- **Middleware unblocks token-gated email routes** — `src/middleware.ts` adds `/api/v1/me/email/confirm/` and `/api/v1/me/email/cancel/` to `PUBLIC_PATHS` so the confirm/cancel links work in a fresh browser without an existing session.
### Search + navigation
- **NAV_CATALOG dead-link sweep** — `src/lib/services/search-nav-catalog.ts` corrected 10 entries that pointed to non-existent routes. `/admin/audit-log` → `/admin/audit`, `/admin/error-events` → `/admin/errors`, `/user-settings` → `/settings/profile`, and the 7 phantom `/settings/<x>` entries redirected to their real `/admin/<x>` homes.
- **Topbar global search extended** — every admin sub-card now indexed in `NAV_CATALOG` with curated `keywords` (client portal, ai scoring, pipeline weights, recommender heat weights, etc.). Results sort to the bottom of the cmd-K dropdown, beneath entity hits.
- **Admin sections page search** — `src/components/admin/admin-sections-browser.tsx` `AdminSection` gained a `keywords?: string[]` field, populated for System Settings (mirrors `KNOWN_SETTINGS`), AI configuration, OCR, Users, and Website analytics. `filteredMatches` haystack now includes those keywords.
### User management
- **Disable / enable button** — third Power/PowerOff action button on the desktop user list + matching dropdown item on the mobile card. Backed by `userProfiles.isActive` (already enforced by `withAuth` → 403 on disabled accounts).
- **UserForm tabs + permissions matrix** — UserForm now wraps Profile & role + Permissions in tabs. New `UserPermissionMatrix` component renders the full `RolePermissions` shape with three-state per-leaf toggle (Inherit / Grant / Deny). The matrix is `role="radiogroup"` + `aria-checked` per option, and shows an amber callout explaining that overrides save on their own button. Dirty-state tracked via originalOverrides comparison.
- **First/last name + admin email change** — UserForm collects first + last name (canonical) alongside displayName. Email change behind an AlertDialog confirmation; on confirm sends an automated notice to the prior address (new template `src/lib/email/templates/admin-email-change.ts`).
- **Phone formatting** — UserForm swaps the bare tel input for the shared `PhoneInput` (country combobox + AsYouType + E.164 storage).
### Optional username sign-in
- Migration `0054_user_profiles_username.sql` adds `username` column (2..30 chars, regex `^[a-z0-9._-]{2,30}$`, partial unique index on `LOWER(username)`).
- Login page now accepts email OR username via `/api/auth/resolve-identifier`.
- Self-service username card on `src/components/settings/user-settings.tsx`.
- `/api/v1/me` PATCH now accepts username with allow-list + reserved-name check + uniqueness check before write.
### Per-user permission overrides
- Migration `0055_user_permission_overrides.sql` adds the table.
- Effective-permissions resolver in `src/lib/api/helpers.ts` now layers user overrides on top of role + port-role overrides + residential toggle.
- `GET / PUT /api/v1/admin/users/[id]/permission-overrides` endpoints.
### Role + enum normalization
- `formatRole()` + `ROLE_LABELS` in `src/lib/constants.ts` — replaces the ad-hoc `humanizeRole` in `sidebar.tsx` and `prettifyRoleName` in `role-list.tsx`. user-list, user-card, role-list, user-form now render "Sales Agent" instead of "sales_agent".
- `formatOutcome()` + `OUTCOME_LABELS` for interest outcomes. Updated `client-columns.tsx`, `realtime-toasts.tsx`, `interest-detail-header.tsx`, `command-search.tsx`.
- Pipeline stage normalization extended to: `next-in-line-notify.service.ts`, `command-search.tsx` (interest + residential interest bucket), `yacht-tabs.tsx`, `interest-picker.tsx`, `ai.ts` worker email body, `pipeline-report.ts` + `revenue-report.ts` PDF generators.
### Auto-memory
- Saved feedback memory: "Be thorough — audit everything that ends in a user-facing notification". (Memory subsystem is /Users/matt/.claude/projects/...)
---
## Cross-cutting priority queue
Tackle in this order. C-prefix = CRITICAL still open; H-prefix = HIGH.
1. **[C] Wire a real `db:migrate` runner** — without it, `0052_audit_critical_fixes.sql` silently never creates 6 composite indexes (data-model C1). Recommended: a tsx script that reads migrations in order, splits on `--> statement-breakpoint`, runs `CREATE INDEX CONCURRENTLY` outside a tx, and tracks state in a `__drizzle_migrations` table. Same script gives you `db:migrate:status` for prod readiness.
2. **[C] Add `EMAIL_REDIRECT_TO` prod guard** — `src/lib/env.ts` should refine to reject when `NODE_ENV === 'production'`, and `src/lib/email/index.ts` should `logger.warn` at boot when set (not debug). 5 minutes of work, prevents an extremely-bad-day class of incident.
3. **[C] Fix orphan-blob window in `handleDocumentCompleted`** — `src/lib/services/documents.service.ts:1100-1253`. Wrap the storage.put + files.insert + documents.update sequence in a transaction or a saga with a compensating delete. The current catch-block path also incorrectly marks `status='completed'` with no `signedFileId`, hiding the failure from reps.
4. **[C] Escape URLs in email templates** — every template in `src/lib/email/templates/*` inlines `${data.link}` etc. into href/text without escaping. Move all template rendering through a shared `escapeUrl` helper and add scheme allow-listing (http(s) only).
5. **[C] Eliminate the 16 native `window.confirm()` calls** — each one is a destructive flow that bypasses `ConfirmationDialog` / `AlertDialog`. ui-ux-auditor lists the sites; high-leverage UX fix.
6. **[C] GDPR export completeness** — `gdpr-bundle-builder.ts` must include portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions. This is a regulator-finding-level gap.
7. **[C] Right-to-be-forgotten actually erase** — `client-hard-delete.service.ts` currently nullifies FKs but leaves verbatim PII in email_messages.body_html, files, document_sends.recipient_email. Add a true wipe path (or document the limitation in the legal text and gate the feature behind a "we cannot fully erase X" warning).
8. **[C] Add `user_permission_overrides.user_id` FK + onDelete='set null' on nullable client refs** — data-model H1+H2. Migration 0056.
9. **[C] Resolve-identifier hit-path still leaks email** — replace the API entirely with a server-side signIn proxy that takes `{identifier, password}` and never returns the canonical email at all. Current rate-limited hit still echoes real emails to anyone with a guessable username.
10. **[H] Re-audit `audit_logs.metadata` masking** — extend `maskSensitiveFields` to cover `audit_logs.metadata`; add a 90-day retention cron (mirroring `error_events`).
11. **[H] Webhook → error pipeline** — `documenso/route.ts` should `captureErrorEvent` on handler crash. Apply the same to every other webhook route.
12. **[H] Wire admin email-template subject editor** — 5 of 8 templates ignore `overrides.subject`; admins see "Saved" with zero effect. `email-auditor` H1+H2.
13. **[H] Wire admin signature/footer fields** — `/admin/email` writes `email_signature_html` + `email_footer_html` which the shell never reads. Either delete or wire.
14. **[H] PII redaction in audit/error pipeline** — `error_events.request_body_excerpt` sanitizer redacts password/token but not email/phone/name/dob/address.
15. **[H] Notification email worker XSS** — `workers/notifications.ts:65-71` interpolates `notif.description` and `notif.link` into HTML unescaped. Apply `escapeHtml` + URL allow-list.
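
Items 4 and 15 both call for escaping plus a scheme allow-list. A minimal sketch of what such helpers could look like follows; `escapeHtml`, `safeHref`, and `renderLink` are hypothetical names for illustration, not the repo's existing functions, and the exact fallback for a rejected URL is an assumption.

```typescript
// Hypothetical helpers; names and fallback behaviour are illustrative only.
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Escape the five characters that can break out of HTML text or attributes.
function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

const ALLOWED_SCHEMES = new Set(['http:', 'https:']);

// Returns an attribute-safe URL, or null when the input is not an
// absolute http(s) URL (e.g. `javascript:` or a relative path).
function safeHref(raw: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return null;
  }
  if (!ALLOWED_SCHEMES.has(parsed.protocol)) return null;
  return escapeHtml(parsed.toString());
}

// Renders a link, falling back to plain escaped text when the URL is rejected.
function renderLink(rawUrl: string, label: string): string {
  const href = safeHref(rawUrl);
  const text = escapeHtml(label);
  return href === null ? text : `<a href="${href}">${text}</a>`;
}
```

Dropping the anchor entirely on a rejected URL is one possible policy; a template could equally refuse to render and log the event.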
---
## Per-domain reports
Each section below is the agent's report verbatim. File:line refs reference the repo as it stands at the start of the audit session — some have already been addressed (see "Already fixed in this session" above).
---
## 1. Security + API + auth audit (security-auditor + early api-security run)
Two reports — the team-spawned `security-auditor` and an earlier standalone run. Both included verbatim.
### Report A: security-auditor (team)
# Security / API / Auth Audit — `feat/documents-folders` branch
Read-only audit of the `pn-crm` repo. Scope: auth wrappers, tenant scoping,
public/webhook endpoints, the just-shipped username-resolve + permission-
overrides + admin email-change flows, CSRF posture, audit-log coverage.
No **CRITICAL** issues found — auth helpers (`withAuth` / `withPermission` /
`requireSuperAdmin`) are applied consistently across `src/app/api/v1/**`,
public endpoints all use timing-safe secret compares + per-IP rate limits,
and the Documenso webhook idempotency + per-port secret resolution is sound.
The findings below are HIGH / MEDIUM.
---
## HIGH
### H1. `resolve-identifier` leaks username→email mapping AND has no rate limit
**File:** `src/app/api/auth/resolve-identifier/route.ts` (lines 25-58)
The route's own docstring claims it "pairs with the global login-attempt
limiter" — but no `enforcePublicRateLimit` / `checkRateLimit` is actually
called in the handler. Unauthenticated attackers can POST `{identifier:"matt"}`
at unbounded volume; on a hit the response is `{email:"matt@letsbe.solutions"}`,
on a miss the response echoes the raw input. That makes existence
trivially decidable (response contains `@` ↔ hit), and on a hit the caller
_also_ learns the actual email address. Usernames are typically far more
guessable than emails (first names, social handles), so this becomes a one-
way `username → email` harvester usable for downstream phishing / password
spraying. **Fix:** wrap with `enforcePublicRateLimit(req, 'portalSignIn',
identifier.toLowerCase())` (or a new `loginIdentifier` bucket) AND stop
echoing the resolved email — either return `{ok:true}` and require the
caller to POST `(username,password)` together to a single sign-in endpoint
that does the lookup server-side, or return an opaque short-lived token that
Better Auth's sign-in step can redeem internally.
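
The "single sign-in endpoint that does the lookup server-side" option can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: `SignInDeps`, `signInWithIdentifier`, and the placeholder address are hypothetical, and the real flow would delegate password verification to Better Auth rather than a callback.

```typescript
// Hypothetical shapes; the real lookup and password check live elsewhere.
interface Account {
  email: string;
  username?: string;
}

interface SignInDeps {
  findByIdentifier(identifier: string): Account | undefined;
  verifyPassword(email: string, password: string): boolean;
}

type SignInResult = { ok: true } | { ok: false; error: 'invalid_credentials' };

function signInWithIdentifier(
  deps: SignInDeps,
  input: { identifier: string; password: string },
): SignInResult {
  const normalized = input.identifier.trim().toLowerCase();
  const account = deps.findByIdentifier(normalized);

  // Verify against a placeholder address on a miss so both paths do similar
  // work; real timing parity would need to be measured, not assumed.
  const passwordOk = deps.verifyPassword(
    account?.email ?? 'nobody@auth.invalid',
    input.password,
  );

  // One failure shape for "unknown identifier" and "wrong password";
  // the canonical email never appears in any response.
  if (!account || !passwordOk) {
    return { ok: false, error: 'invalid_credentials' };
  }
  return { ok: true };
}
```

Because the response carries no email on either path, the endpoint stops being usable as a `username → email` harvester; the rate limit is still needed to bound password guessing.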
### H2. Admin email-change leaves `emailVerified` true → account takeover via reset
**File:** `src/lib/services/users.service.ts` (lines 233-262, 355-387)
`updateUser` rotates `user.email` directly when an admin edits the address
(lines 246-247) but never resets `emailVerified`. A hostile or compromised
admin can point any victim's account at an attacker-controlled mailbox, then
trigger the existing "forgot password" flow on the new address and silently
hijack the account; the existing `notifyAdminEmailChange` notice fires to
the _old_ address fire-and-forget and is documented as non-blocking
("failure to send doesn't roll back"). There is _also_ no `createAuditLog`
specifically for the email-change — the generic update audit at line 287
buries the change inside `newValue: data` rather than emitting a dedicated
`email_change` action that monitoring can alert on. **Fix:** when
`wantsEmailChange`, set `emailVerified: false` in the Better Auth user
update, write a dedicated `severity: 'warning'` audit row with
`{oldEmail, newEmail, changedBy}`, and require the recipient to click the
existing `/api/v1/me/email/confirm/[token]` flow before the rotation
applies — i.e. mint a `user_email_changes` row rather than direct-UPDATE.
### H3. Permission-overrides PUT accepts arbitrary keys → JSONB pollution + deep-merge surprise
**File:** `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts`
(lines 31-35, 97-141)
`updateOverridesSchema` is `z.record(z.string(), z.record(z.string(), z.boolean()))` — no allow-list against the known `RolePermissions` resource/action keys. An admin (or a stolen admin session) can persist arbitrary keys into `user_permission_overrides.permission_overrides`. Two concrete impacts: (a) future deep-merge logic that maps unknown keys into newly added resources promotes the rogue keys silently (silent privilege creep when new permissions ship); (b) the JSONB can be bloated to harm downstream readers. **Fix:** validate against `KNOWN_PERMISSION_LEAVES` derived from `RolePermissions` (resource → action set), reject unknown keys with `ValidationError`, and bound the merged blob size as `/api/v1/me/route.ts` already does for `preferences`. The GET handler is fine — it only reads what was already persisted.
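
A minimal sketch of the allow-list check described in the fix, written without Zod so it stands alone. The `ALLOWED` map is a placeholder (the real one would be derived from `RolePermissions`), and `findUnknownLeaves`, `validateOverrides`, and the size cap are hypothetical.

```typescript
type Overrides = Record<string, Record<string, boolean>>;
type AllowList = Record<string, readonly string[]>;

// Placeholder allow-list for illustration; not the real RolePermissions shape.
const ALLOWED: AllowList = {
  clients: ['view', 'edit'],
  admin: ['manage_users', 'manage_settings'],
};

// Lists every resource or resource.action key that is not in the allow-list.
function findUnknownLeaves(input: Overrides, allowed: AllowList): string[] {
  const unknown: string[] = [];
  for (const [resource, actions] of Object.entries(input)) {
    const known = Object.prototype.hasOwnProperty.call(allowed, resource)
      ? allowed[resource]
      : undefined;
    if (!known) {
      unknown.push(resource);
      continue;
    }
    for (const action of Object.keys(actions)) {
      if (!known.includes(action)) unknown.push(`${resource}.${action}`);
    }
  }
  return unknown;
}

// Rejects (rather than silently strips) unknown keys and oversized payloads.
function validateOverrides(
  input: Overrides,
  allowed: AllowList,
  maxChars = 8192, // assumed cap on the serialized payload
): Overrides {
  const unknown = findUnknownLeaves(input, allowed);
  if (unknown.length > 0) {
    throw new Error(`Unknown permission keys: ${unknown.join(', ')}`);
  }
  if (JSON.stringify(input).length > maxChars) {
    throw new Error('Override payload too large');
  }
  return input;
}
```

Rejecting with an error surfaces typos to the admin; stripping unknown keys before write is the alternative and trades that feedback for leniency.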
### H4. `/api/v1/me/email/confirm|cancel/[token]` is unreachable for logged-out users (middleware 401)
**File:** `src/app/api/v1/me/email/cancel/[token]/route.ts`,
`src/app/api/v1/me/email/confirm/[token]/route.ts`,
`src/middleware.ts` (PUBLIC_PATHS list, lines 8-20)
The handlers correctly skip `withAuth` ("the token IS the proof") but
`/api/v1/me/email/...` is not in `PUBLIC_PATHS`, so `middleware.ts` returns
a 401 JSON for any unauthenticated request — exactly the case a user
clicking the confirm link from email on a different device will hit. End
result: every confirm/cancel click from a logged-out browser fails with
"Authentication required". Also, the GET request applies an irreversible
state mutation with no CSRF guard (the origin-check in middleware only fires
for `STATE_CHANGING_METHODS`). **Fix:** move these handlers under
`/api/auth/email-change/{confirm,cancel}/[token]` so they're covered by the
`/api/auth/` PUBLIC_PATHS prefix, OR add `/api/v1/me/email/` to
`PUBLIC_PATHS`. Convert the GET mutation to a POST landing page (one-click
confirm form) so cross-site image/prefetch tags can't silently flip state.
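
The `PUBLIC_PATHS` option can be illustrated with a small prefix matcher. This sketch assumes the middleware matches by path prefix (as the `/api/auth/` entry suggests); the list contents and `isPublicPath` are illustrative, not copied from `src/middleware.ts`.

```typescript
// Illustrative subset; the real PUBLIC_PATHS list has more entries.
const PUBLIC_PATHS: readonly string[] = [
  '/api/auth/',
  '/api/v1/me/email/confirm/',
  '/api/v1/me/email/cancel/',
];

// True when the request path falls under any public prefix.
function isPublicPath(pathname: string): boolean {
  return PUBLIC_PATHS.some((prefix) => pathname.startsWith(prefix));
}
```

Listing the two token routes with trailing slashes, rather than the broader `/api/v1/me/email/` prefix, keeps any other route under that path behind the session check.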
---
## MEDIUM
### M1. Direct `Schema.parse(body)` instead of `parseBody(req, schema)`
**Files:** `src/app/api/v1/admin/custom-fields/[fieldId]/route.ts:18-19`,
`src/app/api/v1/search/route.ts:11`,
`src/app/api/v1/files/upload/route.ts:21`,
`src/app/api/v1/companies/[id]/members/[mid]/handlers.ts:29`,
`src/app/api/public/website-inquiries/route.ts:97-98`,
`src/app/api/public/residential-inquiries/route.ts:51-52`,
`src/app/api/public/interests/route.ts:47-48`,
`src/app/api/portal/auth/{sign-in,forgot-password,reset-password,activate,change-password}/route.ts`,
`src/app/api/auth/{set-password,resolve-identifier}/route.ts`.
CLAUDE.md explicitly requires `parseBody` so the 400 envelope + field-
errors shape stays uniform (the frontend's `toastError` hook depends on
it). Most of these are caught by an outer try/catch that routes ZodError
into `errorResponse`, which masks the issue — but the response shape
diverges (a thrown ZodError becomes a generic 500 unless `errorResponse`
maps it). Admin route `custom-fields/[fieldId]` is the worst case: a
malformed PATCH body 500s instead of 400-with-field-errors. **Fix:** swap
to `parseBody(req, schema)` in the admin/internal routes; the portal /
public auth routes intentionally use `safeParse` + manual `ValidationError`
mapping and can be left as-is.
### M2. CSRF origin check disabled in development
**File:** `src/middleware.ts` (line 80)
`process.env.NODE_ENV !== 'development'` gates the origin check. If a
production deployment is ever booted with `NODE_ENV=development`
accidentally (shell export leakage, container override, "debug deploy"),
all CSRF defense-in-depth is silently off — SameSite=Lax still helps but
isn't enough for legacy browsers / extension contexts. **Fix:** key the
bypass on an explicit `DISABLE_CSRF_FOR_LAN=1` env var that's defaulted to
unset and refused in `lib/env.ts` when `NODE_ENV==='production'`.
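
A sketch of the explicit opt-out, taking the environment as a parameter so it stays self-contained. `resolveCsrfBypass` and `EnvLike` are hypothetical names; the real check would live in `lib/env.ts` and the middleware.

```typescript
interface EnvLike {
  NODE_ENV?: string;
  DISABLE_CSRF_FOR_LAN?: string;
}

// Returns true only when the bypass was explicitly requested outside
// production. Throws at boot if the flag is set in production, so a stray
// value fails loudly instead of silently disabling the origin check.
function resolveCsrfBypass(env: EnvLike): boolean {
  const requested = env.DISABLE_CSRF_FOR_LAN === '1';
  if (requested && env.NODE_ENV === 'production') {
    throw new Error('DISABLE_CSRF_FOR_LAN must not be set in production');
  }
  return requested;
}
```

With this shape, a production container accidentally started with `NODE_ENV=development` still keeps the origin check, because the bypass no longer keys on `NODE_ENV` alone.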
### M3. Permission-override audit log lacks severity escalation
**File:** `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:124-134`
Changing user permission grants is exactly the action an attacker would
take after compromising an admin; the audit row should be emitted with
`severity:'warning'` (matching the `email_change_cancelled` precedent in
`src/app/api/v1/me/email/cancel/[token]/route.ts:46`) so the audit UI's
default filter surfaces it. Today it's a vanilla `action:'update'` lost in
the noise.
### M4. `/api/public/interests` audit row stores client phone in `metadata`
**File:** `src/app/api/public/interests/route.ts:254-271`
The audit row's `newValue` and surrounding `metadata` capture `ip` plus
foreign keys, which is fine, but `data.phone` is held in scope and could
easily slip in during a future edit. Today the row is OK; flag as a place
to add a regression test. (Not a finding to act on, just a watch-list item
for the broader audit team.)
### M5. Filesystem storage proxy: token leak via Referer
**File:** `src/app/api/storage/[token]/route.ts:42-119`
`Cache-Control: private, no-store` is set on the response, but the URL
itself (with the HMAC token in the path) leaks via the `Referer` header
when the downloaded asset is opened inside a browser tab that then
navigates to a third-party link. Single-use replay protection mitigates
reuse, but a token still-in-window is good for one stolen download. **Fix:**
either rotate to a POST-with-token-in-body form (breaks `<a download>`),
or set `Referrer-Policy: no-referrer` on the response and document that
issuers should mint with the shortest possible expiry. Lower-impact
because filesystem mode is single-tenant per the boot guard.
### M6. `/api/v1/clients/bulk-hard-delete` lacks per-IP rate-limit
**File:** `src/app/api/v1/clients/bulk-hard-delete/route.ts` (no `withRateLimit`)
The sibling `bulk-hard-delete-request/route.ts` is wrapped in `withRateLimit`
but the actual delete endpoint is not. A compromised admin session could
fan out hundreds of irrevocable hard-deletes in a tight loop with no
limiter to slow it down. **Fix:** add `withRateLimit('destructiveBulk', ...)`
or similar with a 5/minute cap; the existing audit row will still be
emitted, but the limiter caps the blast radius.
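
To show what a 5/minute cap means in practice, here is a minimal in-memory fixed-window limiter. It is a single-process sketch only: `FixedWindowLimiter` is a hypothetical class, and the repo's `withRateLimit` wrapper may use a different algorithm and backing store.

```typescript
interface WindowState {
  count: number;
  resetAt: number;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the call is allowed, false when the caller should
  // receive a 429. `now` is injectable so the behaviour is testable.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const state = this.windows.get(key);
    if (!state || now >= state.resetAt) {
      this.windows.set(key, { count: 1, resetAt: now + this.windowMs });
      return true;
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}

// Example: cap destructive bulk calls at 5 per minute per caller key.
const destructiveBulk = new FixedWindowLimiter(5, 60_000);
```

Keying on the admin's user id instead of the client IP would cap a stolen session regardless of how many addresses it is replayed from; which key fits best is a design decision for the fix.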
---
## Verified clean (no finding)
- `withAuth` / `withPermission` / `requireSuperAdmin` applied uniformly:
every `route.ts` under `src/app/api/v1/**` was checked; the only files
without the wrappers are `me/email/{confirm,cancel}/[token]/route.ts`
(covered by H4) which intentionally use bearer-token auth.
- `withAuth` enforces port-context via `X-Port-Id` header / preferences,
never from body (helpers.ts:160-168).
- Documenso webhook: timing-safe per-port secret resolution, replay guard
via `signatureHash` unique index, per-handler `portScope` forwarded so a
documensoId reused across ports can't cross-mutate.
- Public website-intake: timing-safe `verifySecret` with length-equal
buffer pad, refusal-by-default when `WEBSITE_INTAKE_SECRET` unset, per-IP
rate-limit, unknown port slug → generic 400 (no input echo).
- Raw `` sql`...` `` usage scanned across `src/lib/services` and `src/app/api`: every interpolation is via Drizzle's parameter binding
(`` sql`... ${foo} ...` ``); no string concatenation gaps found.
- Storage proxy upload (PUT) does HMAC verify + single-use replay + size cap + PDF magic-byte enforcement before disk write.
— security-auditor (read-only audit; no source files edited)
### Report B: api-security (standalone earlier run)
# API + Auth + Security Audit: Port Nimara CRM
Scope: `src/app/api/**`, `src/lib/api/helpers.ts`, `src/lib/auth/**`, `src/middleware.ts`,
plus the newly-added permission-overrides and resolve-identifier flows.
## CRITICAL
### 1. Privilege escalation via `PUT /api/v1/admin/users/[id]/permission-overrides`
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:97-141`
The PUT handler gates only on `withPermission('admin', 'manage_users', …)` and never
verifies that `params.id !== ctx.userId`. Any user who holds `admin.manage_users` can
target their own userId and write a `userPermissionOverrides` row that grants every
leaf (`{ admin: { manage_users: true, manage_settings: true, … }, … }`). Because
`withAuth` deep-merges `userOverride.permissionOverrides` last in the chain
(`src/lib/api/helpers.ts:227-238`), the row wins over the base role and instantly
escalates the caller to admin-of-everything on the next request. The companion
`removeUserFromPort` service in `src/lib/services/users.service.ts:319` does have a
self-target guard — the same guard is missing here. Fix: in the PUT handler, throw
`ForbiddenError` when `targetUserId === ctx.userId && !ctx.isSuperAdmin`, and require
super-admin to flip `admin.*` leaves (or any leaf that the calling user cannot already
grant). Tier-2 fix: rotate this row to require super-admin outright; admin-of-port
shouldn't be able to mint persistent overrides for peers anyway.
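
The self-target guard proposed above can be sketched as a small assertion. `CallerContext` and `assertCanEditOverrides` are hypothetical, and the local `ForbiddenError` is a stand-in for the repo's error class of the same name.

```typescript
// Stand-in for the repo's ForbiddenError; defined here so the sketch runs alone.
class ForbiddenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ForbiddenError';
  }
}

interface CallerContext {
  userId: string;
  isSuperAdmin: boolean;
}

// Blocks an admin from writing overrides for their own account unless
// they are already a super-admin.
function assertCanEditOverrides(ctx: CallerContext, targetUserId: string): void {
  if (targetUserId === ctx.userId && !ctx.isSuperAdmin) {
    throw new ForbiddenError('Cannot modify your own permission overrides');
  }
}
```

This follows the recommendation as written in this report; a stricter variant would drop the super-admin exemption and refuse every self-target write.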
### 2. `/api/auth/resolve-identifier` has no rate-limit — username enumeration
`src/app/api/auth/resolve-identifier/route.ts:25-59`
The endpoint is unauthenticated, sits behind `/api/auth/*` (so the middleware
origin check is skipped per `src/middleware.ts:46-49`), and does NO rate-limit /
throttling. The header comment claims it "pairs with the global login-attempt
limiter" but that limiter is only triggered when the _subsequent_ sign-in call
runs — an attacker hitting just this endpoint with a wordlist is unconstrained.
While the response shape is the same on hit and miss (`{ email: <string> }`),
the _content_ differs: a hit returns an `@`-bearing email, a miss returns the
unchanged raw input. So with one HTTP call per candidate an attacker
deterministically learns which usernames map to real accounts; they then funnel
only the validated emails into the rate-limited sign-in flow, defeating the
per-account brute-force ceiling. Fix: wrap in `enforcePublicRateLimit(req,
'portalSignIn', normalized)` (or a new bucket like `usernameResolve` with ~10/15min
per-IP), and consider returning a constant fake-email when the username doesn't
resolve so hit/miss are indistinguishable at the response-body level too.
## HIGH
### 3. `permission-overrides` PUT does not validate the override shape
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:31-34, 97-141`
`updateOverridesSchema` is `z.record(z.string(), z.record(z.string(), z.boolean()))`:
any resource name and any action key is accepted. This stores garbage in
`user_permission_overrides.permission_overrides` forever, and silently typo'd
keys (`'clien_ts.view'`) won't take effect but won't 400 either. More
importantly, there is no allow-list against the `RolePermissions` shape defined
in `src/lib/db/schema/users.ts:6`, so a future code path that does
`Object.keys(permissions).forEach(…)` could be surprised by a foreign resource
appearing in the merged map. Fix: derive a Zod allow-list at module load from
the canonical `RolePermissions` shape (the same `VALID_MERGE_TOKENS` pattern the
templates code uses) and reject unknown resource/action keys with 400.
### 4. `permission-overrides` PUT writes for users not assigned to the current port
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:97-122`
The PUT inserts/updates a `(userId, portId)` row without first verifying that
`targetUserId` actually has a `user_port_roles` row for `ctx.portId`. An admin at
port A can mint override rows for users belonging only to port B (the row is keyed
on the admin's portId, so it's a "future override that would activate if the user
ever joins this port"). Functionally inert today, but pollutes the override table
across tenants and breaks the implicit "you can only manage users in your port"
invariant the rest of the admin/users routes enforce. The GET path does the
implicit validation by failing the port-role lookup; the PUT should mirror it.
Fix: `findFirst` on `userPortRoles` with `(targetUserId, ctx.portId)` first; 404
if missing, mirroring `updateUser` at `src/lib/services/users.service.ts:216-219`.
### 5. Email-change confirm endpoint cannot be aborted after compromise window
`src/app/api/v1/me/email/confirm/[token]/route.ts:42-57`
Token-based unauthenticated swap. The flow looks otherwise correct (sha256-
hashed token, expiry, single-use via `appliedAt`, race-checked uniqueness). What's
missing: when a confirmation completes, all _other_ outstanding `userEmailChanges`
rows for the same `userId` should be cancelled, and all existing Better Auth
sessions for that user should be revoked. Today, if an attacker compromises the
account, requests an email change to attacker-owned address, and the victim
spots the cancel email but races against the attacker — once the attacker
confirms, the victim's cancel link still works on the _other_ pending row but
not on the now-applied change, and the attacker's existing CRM session
(`pn-crm.session_token`) survives the swap. Fix: in the confirm handler, after
the email UPDATE, also `db.delete(sessions).where(eq(sessions.userId,
pending.userId))` (or whatever the Better-Auth session table is called) and
mark all other open `userEmailChanges` rows for that user as cancelled. Mirror
the cancel-handler behaviour. Severity is HIGH not CRITICAL because the
attacker needs the session in the first place.
### 6. Public `/api/auth/[...all]` audits the attempted email but doesn't bound brute-force timing
`src/app/api/auth/[...all]/route.ts:100-146`
Better Auth handles sign-in rate-limiting internally (it has a built-in limiter
when configured), but I see no explicit `enforcePublicRateLimit` wrapper around
this catch-all. The `loginAttempt` bucket I expected in `src/lib/rate-limit.ts`
isn't present in the listing; the closest is `portalSignIn`, which is wired only
to the _portal_ sign-in handler, not the CRM sign-in. If Better Auth's default
limiter isn't actively configured in `src/lib/auth/index.ts:55-113` (and I don't
see a `rateLimit:` block there), the CRM login endpoint is effectively
unrate-limited and the resolve-identifier finding compounds into a real
brute-force window. Fix: add an `enforcePublicRateLimit(req, 'crmSignIn',
attemptedEmail)` call inside `withAuthAudit` before forwarding to
`upstream.POST(forwardReq)` when `isSignIn`, keyed per-email; declare the bucket
in `rate-limit.ts` mirroring `portalSignIn`'s shape.
## MEDIUM
### 7. CRM `updateUser` cross-tenant email change has no notification when target is super-admin
`src/lib/services/users.service.ts:236-262`
When an admin at port A updates a user (including a super-admin who happens to
have a port-role row at port A), the email-change flow flips Better Auth's
identity instantly with only a courtesy email to the prior address. There's no
challenge / token round-trip — the admin acts unilaterally. Self-service email
change (`/api/v1/me/email`) DOES require token confirmation; admin-initiated
should at least block when the target is a super-admin or require the change to
go through the same confirm-token flow. Fix: gate `wantsEmailChange` on
`!profile.isSuperAdmin || ctx.isSuperAdmin` and/or always use the token flow
even for admin-initiated changes.
### 8. `permission-overrides` PUT does not write audit log atomically with the DB write
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:111-134`
The `existing` row is read, then conditionally update-or-insert, but two
concurrent PUTs against the same `(userId, portId)` race: both see `existing`
as the same value, both call `update`, second writer wins silently with a
last-write audit log that's missing the intermediate state. Severity is medium
because the audit log still captures both writers' new values and there's no
correctness invariant broken — just a forensic gap. Fix: wrap the read +
update/insert in `withTransaction` with `FOR UPDATE` (or use an upsert with
`returning('old')`-equivalent semantics) and log `oldValue` from the locked row.
### 9. Documenso webhook returns 200 on every failure including dedup, which masks crashes
`src/app/api/webhooks/documenso/route.ts:264-268`
The handler's outermost `try/catch` logs `err` but always returns 200. That's
the correct posture for _signature_-invalid traffic (don't leak signal), but
also masks downstream handler crashes — Documenso will never retry a 5xx
because it never sees one. The handlers are documented as idempotent
(`handleDocumentCompleted` early-returns on duplicate completion), so a retry
storm wouldn't double-write, but the missing retry signal turns one transient
DB failure into a permanently dropped event. Fix: return 500 on the catch
branch so Documenso retries; keep 200 for _secret_-invalid (line 100) and
dedup (line 123) since those are intentional no-ops.
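The status policy in this fix can be written down as a small mapping. The outcome labels and `statusFor` are hypothetical names for illustration; the sketch only encodes which branches stay at 200 and which one should become retryable.

```typescript
// Hypothetical outcome labels for one webhook delivery.
type WebhookOutcome = "invalid_secret" | "duplicate" | "handled" | "handler_error";

// Maps an outcome to the HTTP status the receiver returns.
function statusFor(outcome: WebhookOutcome): number {
  switch (outcome) {
    case "invalid_secret": // do not leak whether the secret matched
    case "duplicate": // intentional no-op, nothing to retry
    case "handled":
      return 200;
    case "handler_error": // transient failure: let the sender retry
      return 500;
  }
}
```

Under this assumption the outermost `catch` would classify the delivery as `handler_error` instead of falling through to 200.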
### 10. `withAuth` deep-merge: permission overrides only ADD permissions, never EXPLICITLY DENY
`src/lib/api/helpers.ts:73-98, 233-238`
`deepMerge` does a recursive shallow assignment — `userOverride.permissionOverrides`
overwrites leaves wholesale. So `{clients: {view: false}}` works as a deny.
However the override is keyed by _resource → action map_, and the override row
stores `Partial<RolePermissions>`. There's no "tri-state" (inherit/grant/deny)
expressed at the DB layer — the comment in the route says "use null at a leaf
to clear an override" but the Zod schema only accepts `z.boolean()` per leaf,
not null. So the UI cannot actually clear an override leaf via this endpoint
without removing the resource key entirely from the JSON. Worth aligning the
schema with the documented contract. Fix: accept `z.union([z.boolean(),
z.null()])` and strip null leaves server-side before writing.
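A dependency-free sketch of the "strip null leaves before writing" step follows. It uses plain types rather than Zod, and `stripNullLeaves` / `applyOverrides` are hypothetical helpers, not the project's `deepMerge`; the point is only that a `null` leaf never reaches storage, so an absent leaf means "inherit".

```typescript
// Tri-state leaf: true = grant, false = deny, null = clear the override (inherit).
type Leaf = boolean | null;
type OverrideInput = Record<string, Record<string, Leaf>>;
type StoredOverrides = Record<string, Record<string, boolean>>;

// Drops null leaves (and resources left empty) before the row is written.
function stripNullLeaves(input: OverrideInput): StoredOverrides {
  const out: StoredOverrides = {};
  for (const [resource, actions] of Object.entries(input)) {
    const kept: Record<string, boolean> = {};
    for (const [action, value] of Object.entries(actions)) {
      if (value !== null) kept[action] = value;
    }
    if (Object.keys(kept).length > 0) out[resource] = kept;
  }
  return out;
}

// Overlays stored overrides on role permissions; absent leaves inherit from the role.
function applyOverrides(role: StoredOverrides, overrides: StoredOverrides): StoredOverrides {
  const merged: StoredOverrides = {};
  for (const [resource, actions] of Object.entries(role)) {
    merged[resource] = { ...actions, ...(overrides[resource] ?? {}) };
  }
  for (const [resource, actions] of Object.entries(overrides)) {
    if (!(resource in merged)) merged[resource] = { ...actions };
  }
  return merged;
}
```

For example, submitting `{ clients: { view: false, edit: null } }` stores only the `view` deny, and `edit` falls back to whatever the role grants.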
### 11. Origin check disabled in dev — but `process.env.NODE_ENV` check is per-process
`src/middleware.ts:79-89`
CSRF defense-in-depth is skipped when `NODE_ENV !== 'production'`. The
dev/staging boundary is correct in principle, but `staging` deployments
typically run with `NODE_ENV=production`, while CI / preview-builds may not.
Worth confirming the Dockerfile (`Dockerfile`) sets `NODE_ENV=production` on
any environment that's reachable from the internet. Note also that the
fallback at `src/middleware.ts:68-69` allows a request with neither Origin nor
Referer through — this is correct for server-side fetches but means any HTTP
client that strips both headers (curl with `-H "Origin:"`) bypasses the check.
Combined with SameSite=strict cookies the residual risk is low.
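The header-less fallback is easiest to see in a reduced model of the check. This is a reconstruction under assumptions, not the code in `src/middleware.ts`: `HeaderView` and `passesOriginCheck` are hypothetical, and the real middleware may compare origins differently.

```typescript
// Hypothetical request view: only the two headers the check reads.
type HeaderView = { origin?: string; referer?: string };

// Returns true when the request passes the same-origin check.
function passesOriginCheck(h: HeaderView, allowedOrigin: string): boolean {
  if (h.origin) return h.origin === allowedOrigin;
  if (h.referer) {
    // Accept the bare origin or any URL under it; the "/" guards against
    // look-alike hosts such as https://crm.example.com.evil.test.
    return h.referer === allowedOrigin || h.referer.startsWith(allowedOrigin + "/");
  }
  // Neither header present: allowed, which is the fallback the finding describes.
  return true;
}
```

A client that sends neither header reaches the final `return true`, so the check only constrains browsers, which is why the SameSite cookie setting carries the remaining weight.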
### 12. `me/email` confirm/cancel tokens are URL-only — referer leakage risk
`src/app/api/v1/me/email/route.ts:88-89, src/app/api/v1/me/email/confirm/[token]/route.ts:24-35`
The confirm/cancel URLs are emailed as `${baseUrl}/api/v1/me/email/confirm/${rawToken}`.
The user clicks from their inbox; the email client opens the URL in a browser
which then renders `/settings?emailChange=confirmed` (a redirect). If
`/settings` makes any third-party request before navigating away, the Referer
header carries the full confirm URL including the token. The token is
single-use and short-lived, so the post-redirect exposure window is small, but
defensively the route should set `Referrer-Policy: no-referrer` on the redirect
response. Fix: `res.headers.set('Referrer-Policy', 'no-referrer')` on the
`NextResponse.redirect(...)` call.
## Summary
Two CRITICAL findings: self-targetable permission-overrides escalation
(finding 1) and unlimited username harvesting at `/api/auth/resolve-identifier`
(finding 2). Both are direct consequences of the recently-added routes that
prompted this audit. The remainder are mostly hardening — the v1/\* surface
overall is well-disciplined: nearly every route under `/api/v1/**` flows
through `withAuth(withPermission(...))`, body parsing consistently uses
`parseBody` (only public/auth handlers use raw `req.json()` for documented
reasons), and the few raw ``sql`…` `` usages I sampled
(`admin/website-submissions`, `admin/document-sends`, `search/recently-viewed`)
all interpolate via the parameterized tag form rather than string concat.
Multi-tenant scoping looks consistent — services accept `ctx.portId` and the
defense-in-depth pattern is well-applied (e.g. the berth-recommender note in
CLAUDE.md). The Documenso webhook receiver has solid replay/dedup/secret
discipline.
---
## 2. UI/UX consistency + accessibility audit (ui-ux-auditor)
# UI/UX Consistency + Accessibility Audit
Scope: Form patterns, dialog/sheet/drawer choices, mobile parity, enum leakage, empty/loading states, badge tones, a11y, plus the recently added surfaces (UserForm tabs, UserList Power toggle, UserPermissionMatrix, Login identifier field, user settings username card).
---
## CRITICAL
### C1 — `window.confirm()` / `confirm()` used for destructive flows (>=15 sites)
Files using native browser confirm instead of `ConfirmationDialog` (which wraps `AlertDialog`):
- `src/components/clients/contacts-editor.tsx:115` — remove contact
- `src/components/clients/client-files-tab.tsx:50` — delete file
- `src/components/yachts/yacht-list.tsx:187` — archive yacht (bulk)
- `src/components/admin/document-templates/template-version-history.tsx:54` — restore older version
- `src/components/shared/addresses-editor.tsx:77` — remove address
- `src/components/documents/document-detail.tsx:160` — cancel/void signing envelope
- `src/components/interests/interest-list.tsx:314` — archive interest
- `src/components/interests/interest-tabs.tsx:483` — outcome/archival flow
- `src/components/interests/interest-eoi-tab.tsx:299` — cancel EOI
- `src/components/interests/interest-reservation-tab.tsx:313` — cancel contract
- `src/components/interests/interest-contact-log-tab.tsx:222` — delete contact log
- `src/components/interests/interest-contract-tab.tsx:310` — cancel contract
- `src/components/interests/interest-documents-tab.tsx:80` — delete file
- `src/components/companies/company-files-tab.tsx:50` — delete file
- `src/components/companies/company-list.tsx:201` — archive company
- `src/components/documents/document-list.tsx:136` — delete document
**Why it matters:** native confirm cannot be styled, bypasses our `<AlertDialog>` keyboard semantics, no focus trap, no destructive-action red styling, fails focus-return after dismiss; inconsistent with the rest of the app which uses `ConfirmationDialog`. Several of these are catastrophic (cancel signing envelope, hard-delete file, archive company).
**Fix:** replace each with `<ConfirmationDialog destructive title=… description=… onConfirm={…}>` matching the pattern in `user-list.tsx`.
### C2 — UserForm "Permissions" tab silently drops unsaved overrides
`src/components/admin/users/user-form.tsx:204-212` and `user-permission-matrix.tsx:175-191`.
The matrix has its own "Save overrides" button; the parent Sheet's "Save changes" only persists Profile-tab fields. `onSaveStateChange` is declared in the matrix props but **never passed** by `user-form.tsx` (line 206), so the parent has no idea overrides are dirty. A user who toggles Inherit/Grant/Deny then clicks "Save changes" loses everything when the Sheet closes — no warning, no toast.
**Fix:** lift `overrides` state to `user-form.tsx`, persist both endpoints inside `persist()`, or track dirty state via `onSaveStateChange` and block Sheet close with an AlertDialog.
---
## HIGH
### H1 — Raw enum render via `.replace(/_/g, ' ')` outside `constants.ts` (40+ sites)
Examples (not exhaustive):
- `src/components/documents/documents-hub.tsx:292`, `document-detail.tsx:204,210,386`, `entity-folder-view.tsx:63`, `hub-root-view.tsx:69`, `signing-details-dialog.tsx:123`: `status`, `eventType`, `documentType`
- `src/components/reservations/reservation-detail.tsx:230,285,339`: `tenureType`, agreement status
- `src/components/berths/berth-status-suggestion-dialog.tsx:61,65`
- `src/components/expenses/expense-detail.tsx:229,233`, `expense-card.tsx:71`, `expense-columns.tsx:121`, `expense-form-dialog.tsx:257,278`, `expense-filters.tsx:16`
- `src/components/admin/audit/audit-log-list.tsx:234-235`, `roles/role-list.tsx:223,239`, `roles/role-form.tsx:123`
- `src/components/admin/users/user-permission-matrix.tsx:101` — local `formatAction` duplicates pattern
- `src/components/dashboard/source-conversion-chart.tsx:60`, `activity-feed.tsx:34,44`
- `src/components/scan/scan-shell.tsx:227,242`
- `src/components/interests/linked-berths-list.tsx:94`, `interest-tabs.tsx:40`
- `src/app/(portal)/portal/{my-yachts,documents,interests}/page.tsx` — portal-side enum leakage
- `src/components/search/command-search.tsx:939,965` — fallback after `STAGE_LABELS`
**Fix:** route through `stageLabel`, `formatRole`, `formatOutcome`, `formatSource` (already in `constants.ts`); add `formatDocumentStatus`, `formatTenureType`, `formatEventType`, `formatExpenseCategory`, `formatPaymentMethod`, `formatBerthStatus`, `formatPermissionAction` to `constants.ts` and replace call-sites. Removes "manage memberships" / "Eoi Signed" inconsistencies.
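One possible shape for the new formatters is a label map with a generic fallback. `formatDocumentStatus` is the name proposed above; the enum values in `DOCUMENT_STATUS_LABELS` and the `titleCaseEnum` helper are assumptions for illustration only.

```typescript
// Explicit labels win; unknown values fall back to a generic title-case transform.
// The keys below are hypothetical enum values.
const DOCUMENT_STATUS_LABELS: Record<string, string> = {
  eoi_signed: "EOI Signed",
  pending_signature: "Pending Signature",
};

// Turns "under_offer" into "Under Offer".
function titleCaseEnum(value: string): string {
  return value
    .split("_")
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join(" ");
}

function formatDocumentStatus(value: string): string {
  return DOCUMENT_STATUS_LABELS[value] ?? titleCaseEnum(value);
}
```

The explicit map is what keeps acronyms correct ("EOI Signed" rather than "Eoi Signed"); the fallback keeps new enum values readable until a label is added.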
### H2 — Mobile parity: 18 list components have no `cardRender`
DataTable already supports `cardRender`; without it the mobile view falls back to a raw horizontal-scroll table (bad UX on iOS):
- `src/components/reservations/reservation-list.tsx`, `berth-reservations-list.tsx`
- `src/components/website-analytics/top-list.tsx`
- `src/components/shared/notes-list.tsx`
- `src/components/residential/residential-clients-list.tsx`, `residential-interests-list.tsx`
- `src/components/documents/document-list.tsx`
- `src/components/interests/linked-berths-list.tsx`, `recommendation-list.tsx`
- `src/components/email/email-accounts-list.tsx`, `email-threads-list.tsx`
- `src/components/reports/reports-list.tsx`
- `src/components/admin/document-templates/template-list.tsx`, `forms/form-template-list.tsx`, `roles/role-list.tsx`, `tags/tag-list.tsx`, `ports/port-list.tsx`
**Fix:** add cardRender mirroring desktop columns. `UserCard`/`ClientCard`/`InterestCard` are good templates.
### H3 — User settings phone field is unbound on load
`src/components/settings/user-settings.tsx:69-92`: `loadProfile()` reads `firstName`, `lastName`, `email`, etc., but **never reads `phone`** into state. Yet `saveProfile()` at line 143 sends `phone: phone || null`, which **clears the user's stored phone on every save**. Also, the `country as never` cast at line 298 is unsound: when no country is selected the PhoneInput shows a US flag even for European users.
**Fix:** add `phone` to MeResponse + `setPhone(res.data.profile?.phone ?? '')`. Store country alongside phone (the PhoneInput value is `{e164, country}` — persist the parsed country).
### H4 — UserPermissionMatrix three-state toggle has no a11y semantics
`user-permission-matrix.tsx:247-267` — three sibling `<button>` elements with no `role="radiogroup"`/`role="radio"`/`aria-checked`. Screen readers announce "button, Grant" with no indication which is selected, what the options are, or that they're mutually exclusive. Also no focus ring on the active option.
**Fix:** wrap in ``<div role="radiogroup" aria-label={`${action} permission`}>`` and set `role="radio" aria-checked={state === opt}` on each. Or use Radix `RadioGroup` for keyboard arrow navigation.
### H5 — Login page form errors not associated with inputs
`src/app/(auth)/login/page.tsx:84-119`: `<p className="text-sm text-destructive">{errors.identifier.message}</p>` is rendered after the input but the `<Input>` has no `aria-describedby` pointing at it, and no `aria-invalid={!!errors.identifier}`. Same for password. Screen readers won't read the error message when focus lands on the input.
**Fix:** give each error `<p id="identifier-error">`, add `aria-describedby={errors.identifier ? 'identifier-error' : undefined}` and `aria-invalid={!!errors.identifier}` on the Input.
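The wiring in this fix can be centralized in a small helper so every field gets the same attributes. `errorA11yProps` and `FieldError` are hypothetical names; the sketch returns plain objects and leaves the JSX spread to the caller.

```typescript
// Minimal view of a form-library error object (hypothetical shape).
type FieldError = { message?: string } | undefined;

// Builds the error element id and the ARIA attributes for the matching input.
function errorA11yProps(fieldId: string, error: FieldError) {
  const errorId = `${fieldId}-error`;
  const hasError = error !== undefined;
  return {
    errorId, // use as the id of the <p> that renders the message
    inputProps: {
      "aria-invalid": hasError,
      "aria-describedby": hasError ? errorId : undefined,
    },
  };
}
```

The caller would spread `inputProps` onto the `<Input>` and set `id={errorId}` on the error paragraph, so the two can never drift apart.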
### H6 — Desktop sidebar nav lacks `aria-current="page"`
`src/components/layout/sidebar.tsx:177-201` (`NavItemLink`) — uses `active` for visual styling but doesn't set `aria-current` on the `<Link>`. Mobile bottom tabs already do this (`mobile-bottom-tabs.tsx:85`). Screen-reader users cannot identify the current page in the desktop sidebar.
**Fix:** `aria-current={active ? 'page' : undefined}` on the `<Link>`.
---
## MEDIUM
### M1 — Berth status pills use ad-hoc Tailwind colors instead of `StatusPill`
`src/components/berths/berth-columns.tsx:117-119`, `berth-card.tsx:21-23`, `berth-detail-header.tsx:90`: `bg-green-100 text-green-800`, `bg-yellow-100`, `bg-red-100`. The codebase has `StatusPill` (`src/components/ui/status-pill.tsx`) with semantic tokens (success-bg, warning-bg, error-bg) already used by docs/reservations. Berth statuses (available/under_offer/sold) map cleanly to active/expired/rejected pill states.
**Fix:** replace ad-hoc badges with `<StatusPill status={…}>` and extend `statusPillVariants` if a new tone is needed.
### M2 — UserList "Active"/"Disabled" badge inconsistent with StatusPill convention
`src/components/admin/users/user-list.tsx:104-115` — uses `<Badge variant="default" className="bg-green-600">` with inline green override and `<Badge variant="destructive">`. The Power/PowerOff icons and ShieldCheck/ShieldOff icons (also row 107/112) lack `aria-hidden` — but text is present so it's not blocking, just inconsistent.
**Fix:** use `<StatusPill status="active">Active</StatusPill>` / `<StatusPill status="archived">Disabled</StatusPill>`; add `aria-hidden` to all decorative `lucide-react` icons in the table.
### M3 — Only 5 of 73 dashboard routes have a `loading.tsx`
Only `clients/[clientId]`, `invoices`, `expenses`, `admin/errors`, `admin/errors/[requestId]` have route-level loading skeletons. The rest fall back to a blank flash. Lists/details that fetch via React Query show a skeleton inside the component, but full-page navigations show nothing.
**Fix:** add `loading.tsx` per route segment that returns a `<Skeleton>` matching the page chrome (sidebar/topbar already render via the layout).
### M4 — UserPermissionMatrix loading state uses text, not Skeleton
`user-permission-matrix.tsx:193-197` renders `"Loading permissions…"` text. Other list/detail loaders in the app use `<Skeleton>` from `@/components/ui/skeleton`. Adds inconsistency.
**Fix:** replace with a Skeleton grid mirroring the accordion shape.
### M5 — Settings transient messages persist forever instead of toasting
`user-settings.tsx` lines 167, 184, 197 (`usernameMsg`, `emailMsg`, `resetMsg`) — these `useState` strings stay rendered as a `<span>` next to their button indefinitely. Login uses `toast.error()`; reset-password and other auth surfaces also use sonner.
**Fix:** swap to `toast.success()` / `toast.error()`. Removes stale messages and the inconsistency between auth and settings.
### M6 — Email-or-username Login input: visible placeholder collides with sr-only space
`src/app/(auth)/login/page.tsx:93`: `placeholder="you@example.com or yourname"` with two literal spaces. Mac VoiceOver reads "you at example dot com or yourname", which is fine, but the double space is just sloppy formatting. Also the placeholder duplicates the Label "Email or username"; placeholder text is unreliable for instructions (it clears on focus).
**Fix:** single-space the placeholder, or move the format hint into a `<p id="identifier-hint" className="text-xs text-muted-foreground">` and wire `aria-describedby`.
### M7 — User settings username card: client-side pattern validation never surfaces inline
`user-settings.tsx:359-386`: `pattern="^[a-z0-9._-]{2,30}$"` is set on the input. HTML5 validation only fires on form submit (this isn't inside a `<form>`); the Save button is a plain `<Button onClick>`. So invalid input only fails server-side with a generic 400. No `aria-describedby` points at the helper text (lines 382-385).
**Fix:** add a zod-resolved react-hook-form mini-form OR validate on blur and show inline error; wire `aria-describedby="username-help"`.
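For the validate-on-blur option, a plain function is enough to produce an inline message. The regular expression is the one quoted in this finding; `validateUsername` and the message wording are assumptions.

```typescript
// Same pattern as the input's `pattern` attribute quoted in the finding.
const USERNAME_PATTERN = /^[a-z0-9._-]{2,30}$/;

// Returns an inline error message, or null when the value is acceptable.
function validateUsername(value: string): string | null {
  if (value.length < 2) return "Username must be at least 2 characters.";
  if (value.length > 30) return "Username must be at most 30 characters.";
  if (!USERNAME_PATTERN.test(value)) {
    return "Use lowercase letters, digits, dots, underscores, or hyphens only.";
  }
  return null;
}
```

Calling this from an `onBlur` handler and rendering the result under the input gives the user feedback before the request is sent, with the server-side check remaining the source of truth.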
### M8 — UserForm Tabs: focus does not follow tab switch & no dirty-tab indicator
`user-form.tsx:194-212` — switching from Profile to Permissions doesn't move keyboard focus to the matrix; switching back loses scroll position. The Permissions tab trigger is disabled for new-user mode (correct) but has no tooltip explaining why.
**Fix:** Radix Tabs handles focus by default; verify and add a `title` / `aria-describedby` on the disabled trigger with explanation. Add a small "•" dot on the trigger label when overrides are dirty (depends on C2 fix).
### M9 — Email confirmation AlertDialog in UserForm: default focus + return focus
`user-form.tsx:362-387` — opens on submit. Radix returns focus to the submit button after close (good), but the dialog's `<AlertDialogAction>` triggers `persist()` without disabling itself during the network call; rapid double-click can fire two PATCHes. Also `disabled={loading}` is set on action but not on `<AlertDialogCancel>` re-enable timing.
**Fix:** add a `submitting` guard or rely on existing `loading` state for both buttons; close dialog only after `persist()` resolves.
### M10 — Decorative icons missing `aria-hidden`
Across `user-list.tsx`, `user-card.tsx`, `documents-hub.tsx`, `berth-status-suggestion-dialog.tsx`, status pills with `<ShieldCheck>`, `<Power>`, `<PowerOff>`, `<Globe>`, etc., the icons supplement text — they should carry `aria-hidden="true"` so screen readers don't double-announce. Mixed across the codebase; some lucide imports get it, most don't.
### M11 — Drawer vs Sheet usage drift
`src/components/clients/client-interests-tab.tsx:217` uses Vaul `<Drawer>` for an interest preview, while every other detail-preview surface (yacht preview, company preview, reservation preview) uses `<Sheet>`. Vaul drawers are intended for mobile bottom-sheets; using it for an inline preview on desktop is inconsistent.
**Fix:** standardize on `<Sheet side="right">` for desktop right-rail previews; reserve `<Drawer>` for the mobile More menu (`more-sheet.tsx`).
---
## LOW
### L1 — `user-permission-matrix.tsx:264` button label cosmetic uppercase done inline
`{opt[0]!.toUpperCase() + opt.slice(1)}` works, but the non-null `!` and inline transform inside JSX are brittle. Consider an `OPTION_LABELS` constant.
### L2 — UserList action column uses `title="…"` instead of accessible tooltip
`user-list.tsx:135,147,180` — relies on native browser tooltips. They don't appear on touch and don't surface to screen readers; the `<span className="sr-only">` carries the label which is correct, but consider Radix `Tooltip` for parity with the rest of the app.
### L3 — Login page brand color hardcoded
`src/app/(auth)/login/page.tsx:106,123`: `#007bff` / `#0069d9` hex values are hardcoded instead of using `brand-500` / `brand-600` design tokens. Same issue in `sidebar.tsx:190,196,379` (`#3a7bc8`).
### L4 — `formatAction` duplicated locally in matrix instead of in `constants.ts`
`user-permission-matrix.tsx:100-102` re-implements the title-case replace. Move to `constants.ts` as `formatPermissionAction` (used in 3+ files: role-list.tsx, role-form.tsx, matrix).
### L5 — Hard-coded "border-amber-300 bg-amber-50" warning callouts (15+ sites)
Across `bulk-archive-wizard.tsx`, `hard-delete-dialog.tsx`, `smart-archive-dialog.tsx`, `smart-restore-dialog.tsx`, `pdf-reconcile-dialog.tsx`, `user-settings.tsx:321`, etc. Need a shared `<Callout tone="warning|info|danger|success">` primitive that reads from design tokens.
---
## Verified OK
- Form helper coverage: `react-hook-form + zodResolver`, `PhoneInput`, `CountryCombobox`, `TimezoneCombobox`, `InlineEditableField`, `InlineTagEditor` are present and used consistently in client/yacht/company/interest forms.
- `parseBody` + `errorResponse` envelope convention holding for new endpoints checked.
- `ConfirmationDialog` correctly returns focus and traps focus via Radix `AlertDialog`.
- `StatusPill` is the right primitive; just under-adopted (M1, M2).
- Mobile bottom tabs handle `aria-current` correctly (template for H6).
- UserCard already adds `aria-label="Actions for ${displayName}"` on the icon-only `MoreHorizontal` trigger.
---
## 3. Data model + migrations + relations audit (data-model-auditor)
# Data Model + Migrations + Relations Audit
Scope: `src/lib/db/schema/*.ts` (24 files) and migrations 0000-0055.
~92 tables, multi-tenant on `port_id`. Drizzle ORM + `postgres-js`.
---
## CRITICAL
### C1 — No prod migration runner; 0052 uses `CREATE INDEX CONCURRENTLY`
`package.json` exposes only `db:generate` / `db:push` / `db:studio`. There is no
`db:migrate` script, no usage of `drizzle-orm/postgres-js/migrator`, and no
in-repo SQL replay loop. The numbered SQL files are applied by hand via psql
(implicitly). `0052_audit_critical_fixes.sql` runs `CREATE INDEX CONCURRENTLY`
for six composite indexes and its header explicitly forbids wrapping in
`BEGIN/COMMIT` — anyone running it via Drizzle's default migrator (which wraps
each file in a single tx) or `psql -1` will see it abort silently. The
aggregated-projection queries on `files`/`documents` then fall back to seq scans
in prod. **Action:** ship a real prod migrator that respects per-file transaction
hints, or split 0052 into pre/post files, and document the runbook in
CLAUDE.md.
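A runner that "respects per-file transaction hints" needs some way to decide which files must not be wrapped. One simple heuristic, sketched here with hypothetical helper names, is to scan each file for `CONCURRENTLY` index statements, which PostgreSQL refuses to run inside a transaction block. The comment stripping is deliberately naive and would misfire on `--` inside string literals.

```typescript
// True when the file contains an index statement that cannot run in a transaction.
function mustRunOutsideTransaction(sql: string): boolean {
  const withoutComments = sql
    .replace(/--.*$/gm, "") // strip line comments
    .replace(/\/\*[\s\S]*?\*\//g, ""); // strip block comments
  return /\b(CREATE|DROP)\s+(UNIQUE\s+)?INDEX\s+CONCURRENTLY\b/i.test(withoutComments);
}

type Plan = { file: string; transactional: boolean };

// Produces a per-file plan in filename order (0051 before 0052, and so on).
function planMigrations(files: Record<string, string>): Plan[] {
  return Object.keys(files)
    .sort()
    .map((file) => ({
      file,
      transactional: !mustRunOutsideTransaction(files[file] ?? ""),
    }));
}
```

An explicit marker comment in the file header would be a more robust hint than pattern matching; the scan is mainly useful as a safety net that fails loudly when a `CONCURRENTLY` file is about to be wrapped.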
### C2 — `db:push` skips two structural constraints
Both are flagged in source comments:
1. `berths.current_pdf_version_id` → `berth_pdf_versions.id` FK (circular dep, set up by 0030).
2. `system_settings_key_port_idx` `NULLS NOT DISTINCT` flag (0047) — required so global settings with `port_id IS NULL` are unique by `key` alone.
A fresh-deploy or developer onboarding via `db:push` produces a structurally
divergent DB: dangling pointers on the active berth column, and silent duplicate
global `(key, NULL)` settings accumulating over time. **Action:** post-push
reconciler, or kill `db:push` for prod and rely solely on the SQL files.
---
## HIGH
### H1 — New `user_permission_overrides.user_id` lacks any FK
Migration 0055 declares `user_id TEXT NOT NULL` with no `REFERENCES "user"(id)`.
Compare `portRoleOverrides` (cascades on both `port_id` and `role_id`). Deleting
a user leaves orphaned override rows; a future `user.id` collision (e.g.
re-creating a user with the same id via fixture seed) re-applies them. Same
pattern on `userPortRoles.userId`. The broader codebase treats better-auth user
IDs as opaque strings deliberately (~17 columns), but **this is a brand-new
CRM-owned table** where a real FK was straightforward. **Action:** ship 0056
adding `FOREIGN KEY (user_id) REFERENCES "user"(id) ON DELETE CASCADE` to
`user_permission_overrides` and `user_port_roles`.
### H2 — Nullable client FKs without `set null` block hard-delete
`documents.clientId` (line 72), `files.clientId` (line 30), `email_threads.clientId`,
`formSubmissions.clientId`, `documentTemplates.sourceFileId`,
`generatedReports.fileId` — nullable, declared `.references(...)`, no
`onDelete`. The new `admin.permanently_delete_clients` permission will fail with
FK violation on any client with attached files/documents. The aggregated
projection already preserves history via FK snapshots, so `ON DELETE SET NULL`
is the documented intent. **Action:** add `onDelete: 'set null'` + a 0056
migration. Same shape applies to `berthReservations` notNull parents
(`berth_id`, `port_id`, `client_id`, `yacht_id`) which have no `onDelete`
declared in Drizzle (Drizzle emits `NO ACTION` — correct behavior but
inconsistent with the explicit audit pattern in 0042).
### H3 — `yachts.current_owner_id` (and friends) are polymorphic, unconstrained
The `current_owner_type` discriminator has the 0036 CHECK; the paired
`current_owner_id` has no guarantee the referenced client/company row exists.
Same hole on `yacht_ownership_history.owner_id`, `invoices.billing_entity_id`,
`audit_logs.entityId`, `notifications.entityId`. The owner-resolver returns
`null` for missing rows, but direct reads (audit dossier, ownership history
rendering) trust the id. **Action:** daily reconciler reading
`(owner_type, owner_id)` pairs against the discriminator's target table,
surfacing orphan counts in the admin inspector.
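The core of such a reconciler is a set-membership check per discriminator value. The sketch below works on in-memory data with hypothetical names (`OwnerRef`, `TargetIds`, `findOrphans`, `orphanCounts`); a real job would load the id sets with port-scoped queries.

```typescript
// Hypothetical shapes: a polymorphic reference and the id sets of its target tables.
type OwnerRef = { rowId: string; ownerType: "client" | "company"; ownerId: string };
type TargetIds = { client: Set<string>; company: Set<string> };

// Returns the references whose (ownerType, ownerId) pair points at no existing row.
function findOrphans(refs: OwnerRef[], targets: TargetIds): OwnerRef[] {
  return refs.filter((ref) => !targets[ref.ownerType].has(ref.ownerId));
}

// Counts orphans per discriminator value, e.g. for an admin inspector tile.
function orphanCounts(refs: OwnerRef[], targets: TargetIds): Record<string, number> {
  const counts: Record<string, number> = { client: 0, company: 0 };
  for (const ref of findOrphans(refs, targets)) {
    counts[ref.ownerType] = (counts[ref.ownerType] ?? 0) + 1;
  }
  return counts;
}
```

Note that an id which exists in the wrong table still counts as an orphan, since the lookup is driven by the discriminator.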
### H4 — Migration 0042's `billing_entity_id` backfill is a tombstone
`UPDATE invoices SET billing_entity_id = COALESCE(NULLIF(client_name, ''), id)
WHERE billing_entity_id = ''` writes a clientName string as if it were an
entity id. The CHECK `billing_entity_id <> ''` passes, but downstream
`billing_entity_type='client'` resolution returns null forever for these rows.
The fix is right (won't fail the migration) but no follow-up tooling logs the
tombstones. **Action:** count post-0042 rows where resolver returns null and
expose in the admin inspector.
### H5 — System-folder write protection is service-only
`assertNotSystemManaged` lives in the folders service. Nothing at the DB level
rejects `UPDATE document_folders SET name='x' WHERE system_managed=true`. The
0052-tightened `chk_system_folder_shape` constrains shape but not write-access.
One careless `db.update` away from breaking the system roots invariant.
---
## MEDIUM
### M1 — Missing partial indexes on `archived_at`
0046 partial-archived indexes covered `clients`, `interests`, `yachts`,
`residential_clients`, `residential_interests`. **Missing:** `companies.archivedAt`
(filtered in companies.service), `document_folders.archived_at` (filtered in
hub list queries). Volume is low so it's M, not H.
### M2 — `userPortRoles` allows multiple roles per `(user, port)`
Unique index is on `(userId, portId, roleId)` — two role rows for the same
`(user, port)` are permitted. `getEffectivePermissions` reads `findFirst` without
an `ORDER BY` and silently picks one. Either tighten to `(userId, portId)` or
union-OR the permissions across all assigned roles.
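If the union-OR option is chosen, the merge is order-independent, which removes the dependence on which row `findFirst` happens to return. This is a sketch with a hypothetical `unionPermissions` helper over plain objects, not the project's `getEffectivePermissions`.

```typescript
type Permissions = Record<string, Record<string, boolean>>;

// Union-ORs permission maps: a leaf is granted if any assigned role grants it.
function unionPermissions(roles: Permissions[]): Permissions {
  const result: Permissions = {};
  for (const role of roles) {
    for (const [resource, actions] of Object.entries(role)) {
      const target = result[resource] ?? {};
      result[resource] = target;
      for (const [action, allowed] of Object.entries(actions)) {
        target[action] = (target[action] ?? false) || allowed;
      }
    }
  }
  return result;
}
```

Because a grant from any role wins, per-user overrides remain the only place where an explicit deny can be expressed under this scheme.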
### M3 — `interest_berths.berth_id` is `restrict` with no UI escape hatch
`onDelete: 'restrict'` is the right protective behaviour, but admins hard-deleting
a berth hit a raw FK error message. Offer a "detach this berth from N interests"
admin button before delete, or soften to `set null` with a service-side warning.
### M4 — `audit_logs.searchText` (tsvector) lacks a GIN index in Drizzle
The column is declared but only btree indexes appear in the Drizzle table
definition. Confirm 0044 (or earlier) ships `USING gin (search_text)` — if
absent, FTS scans linearly. **Action:** verify and add GIN if missing.
### M5 — Username docstring drift
`user_profiles.username` CHECK is `^[a-z0-9._-]{2,30}$` (matches the validator),
but the TS docstring (`src/lib/db/schema/users.ts:249`) says "3-30 chars".
Cosmetic.
### M6 — Polymorphic CHECK coverage gap on `document_folders.entity_type`
The CHECK round (0036+0042) covered `yachts.current_owner_type`,
`invoices.billing_entity_type`, `yacht_ownership_history.owner_type`,
`document_sends.document_kind`. Missing: a constraint that
`document_folders.entity_type IN ('root','client','company','yacht')` for
user-created folders. `chk_system_folder_shape` only fires when
`system_managed = true`.
### M7 — JSONB blobs without DB-level validators
`system_settings.value`, `audit_logs.metadata`/`oldValue`/`newValue`,
`notifications.metadata`, `savedViews.filters`/`sortConfig`/`columnConfig`,
`berth_pdf_versions.parseResults`. The `permission-overrides` PUT route is well
sanitized (`ALLOWED_RESOURCE_ACTIONS` allow-list before write). `userProfiles.preferences` is validated and 8KB-capped at the API. The others rely on per-caller validators only.
### M8 — `scratchpadNotes.linkedClientId` crosses ports without enforcement
Notes are user-scoped (no `portId`), but the linked client lives in a port. A
user reassigned between ports could open stale notes pointing at clients in a
port they no longer access. UI port-scoped queries hide them, but raw API
exposure does not.
### M9 — 0027 nationality-ISO backfill is non-idempotent on dirty data
Re-running after manual edits overwrites the nationality_iso column. CLAUDE.md
notes the `last_imported_at` guard for berths (0024/0034 mooring normalization)
but 0027 has no such guard.
### M10 — `currency_rates` has no retention
`(base, target)` is the only unique index; daily polling accumulates rows
forever. Low-priority (daily volume is small).
---
## Migration replayability — verdict
Idempotency is **strong** across 0036+: `DO $$ … EXCEPTION WHEN duplicate_object`
blocks, `IF NOT EXISTS` on every CREATE INDEX, `NOT VALID + VALIDATE` pattern
in 0042/0044/0052. The 0028→0029 split (data move then DROP `interests.berth_id`)
is correct. 0046 `DROP IF EXISTS` + `CREATE IF NOT EXISTS` is correct. The
0050/0051/0052 folder-lifecycle chain forms a clean migration sequence with the
shape CHECK tightened in the right order.
The single replayability cliff is C1 above: 0052 + the absent migrator.
---
## Partial-unique indexes — all verified present
| Constraint | Index | Source |
| ----------------------------------------------- | ----------------------------------------------------------------------------- | ------------------- |
| one primary berth per interest | `idx_ib_one_primary WHERE is_primary` | interests.ts |
| one default brochure per port (non-archived) | `idx_brochures_one_default_per_port WHERE is_default AND archived_at IS NULL` | brochures.ts |
| username case-insensitive | `idx_user_profiles_username_unique ON LOWER(username) WHERE NOT NULL` | 0054 |
| one open alert per fingerprint | `idx_alerts_fingerprint_open WHERE resolved_at IS NULL` | insights.ts |
| one active yacht owner | `idx_yoh_active WHERE end_date IS NULL` | yachts.ts |
| one primary contact per (client, channel) | `idx_cc_one_primary_per_channel WHERE is_primary` | clients.ts |
| one active reservation per berth | `idx_br_active WHERE status='active'` | reservations.ts |
| one subfolder per entity per port | `uniq_document_folders_entity WHERE entity_id IS NOT NULL` | documents.ts (0051) |
| one global setting per key (NULLS NOT DISTINCT) | `system_settings_key_port_idx` | 0047 (see C2) |
| one primary client address | `idx_ca_primary WHERE is_primary` | clients.ts |
| one primary company address | `idx_compa_primary WHERE is_primary` | companies.ts |
---
## Summary
- **CRITICAL (2):** no prod migration runner for 0052's CONCURRENTLY indexes; `db:push` skips two structural constraints (circular FK, NULLS NOT DISTINCT).
- **HIGH (5):** missing FK on new `user_permission_overrides.user_id`; nullable client FKs without `set null` block hard-delete; polymorphic owner_id un-validated; 0042 billing_entity_id tombstones invisible; system-folder write-protection is service-only.
- **MEDIUM (10):** missing partial indexes on companies + document_folders; userPortRoles allows duplicate roles; interest_berths `restrict` has no UI escape; audit_logs FTS GIN to verify; misc docstring drift, polymorphic CHECK gap on folders, JSONB writes without DB validators, scratchpad cross-port, 0027 idempotency, currency_rates retention.
Recommended sequencing: ship a real prod migration runner (C1), then a 0056
follow-up that closes H1 + H3 (FKs on user_permission_overrides.user_id and on
nullable client FKs).
---
## 4. Services + realtime + queue + storage audit (services-auditor)
# Services + Realtime + Queue + Storage Audit
Scope: business-logic correctness, webhook idempotency, BullMQ workers, Socket.IO fan-out, storage backend, cross-entity port isolation, the just-added `notifyAdminEmailChange` helper.
Repo: `new-pn-crm` @ `feat/documents-folders`. Audit window: ~22 min. Read-only.
---
## CRITICAL
### C1. `updateUserInPort` email-change bypasses Better Auth account row
**Files:** `src/lib/services/users.service.ts:236-262, 355-387`
`db.update(user).set({ email: ... })` writes the new email directly to the Better Auth `user` table. The Better Auth `account` table (`src/lib/db/schema/users.ts:194-210`, `providerId='credential'`) carries an `accountId` column that is typically the user's email — used by Better Auth's password-login flow to resolve a credential row. The update does NOT touch `account.accountId`, does NOT invalidate active sessions, does NOT update `account.updatedAt`, and does NOT use Better Auth's admin API (`auth.api.updateUser` / `setEmail`). Failure modes:
- After cutover the user cannot sign in with the new email (Better Auth resolves the credential by old `accountId`).
- Existing sessions (cookie keyed to userId) continue to work with the _new_ email already showing in profile — confusing UX, no forced re-auth.
- The whole flow runs **outside any transaction**: the `userProfiles` update (line 230), `user` update (line 247), `userPortRoles` update (line 281), audit log, and notification-fire are five independent writes. A failure between them leaves partial state with no rollback.
- **No idempotency under retry**: there is no guard that the email actually differs from the current `account.accountId`, and the email-change notification is fire-and-forget — a retried admin request re-fires the courtesy email and rewrites all rows.
Fix: route through `auth.api.updateUser` (or write `account.accountId` + bump session invalidation) and wrap in a transaction.
### C2. `handleDocumentCompleted` orphan blobs on mid-flight failure
**File:** `src/lib/services/documents.service.ts:1100-1253`
The idempotency early-return (`doc.status === 'completed' && doc.signedFileId`) only fires when **both** flags are set. The sequence is:
1. `downloadSignedPdf` (line 1120) — may throw.
2. `storage.put(storagePath, signedPdfBuffer)` (line 1134) — succeeds → blob exists.
3. `ensureEntityFolder` (line 1148) — best-effort.
4. `db.insert(files)` (line 1166) — succeeds → file row exists pointing at blob.
5. `db.update(documents).set({status:'completed', signedFileId})` (line 1185) — **if this fails** (e.g. transient connection loss after `files` insert), the document keeps `signedFileId = NULL`.
On the retry from Documenso, the early-return short-circuit is bypassed (signedFileId still NULL). The function re-downloads, **re-generates a new UUID** (`crypto.randomUUID()` at line 1131), re-puts to a new key, inserts a second `files` row, and only then updates the document. The first blob from step 2 + the first `files` row are now orphaned (unreachable via document, but the file row still exists and may surface in aggregated listings with no docs link).
Additionally, the `catch` block (line 1244) marks `status='completed'` with no `signedFileId`, so the document is presented to the rep as "complete" even though the signed PDF was never persisted. Later webhook deliveries will re-run the handler (no early-return), but if Documenso stops retrying after the Nth attempt, the document is permanently stuck "completed with no file."
Fix options: (a) wrap `files.insert` + `documents.update` in one transaction; (b) delete the blob in the catch when the file row insert succeeds but the document update fails; (c) refuse to mark `status='completed'` in the catch — leave as-is so the next retry / cron poll succeeds.
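Options (a) through (c) can be combined. The following is a minimal sketch, not the service's actual code: `BlobStore`, `DocRow`, and the `commit` callback are hypothetical stand-ins, `commit` represents one transaction covering the `files` insert and the `documents` update, and everything is synchronous only to keep the example short.

```typescript
// Hypothetical dependency shapes; the real service has its own storage and db modules.
interface BlobStore {
  put(key: string, data: Uint8Array): void;
  delete(key: string): void;
}

interface DocRow {
  status: string;
  signedFileId: string | null;
}

// Persist a signed PDF so that a failed database step never leaves an orphan blob.
// `commit` stands in for ONE transaction: files.insert + documents.update.
// It returns the new file id and throws on failure.
function persistSignedPdf(
  store: BlobStore,
  doc: DocRow,
  key: string,
  pdf: Uint8Array,
  commit: (key: string) => string,
): boolean {
  // Idempotency gate: a fully completed document is left untouched.
  if (doc.status === "completed" && doc.signedFileId) return true;

  store.put(key, pdf);
  try {
    const fileId = commit(key);
    doc.status = "completed";
    doc.signedFileId = fileId;
    return true;
  } catch {
    // Option (b): remove the blob written above.
    store.delete(key);
    // Option (c): leave doc.status unchanged so a later retry can succeed.
    return false;
  }
}
```

With this shape a failed attempt leaves neither a blob nor a misleading "completed" status behind, so the next webhook delivery starts from a clean state.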
---
## HIGH
### H1. Notification-email worker HTML injection via `notif.link`
**File:** `src/lib/queue/workers/notifications.ts:65-71`
```ts
`<p>${notif.description ?? notif.title}</p>${
notif.link ? `<p><a href="${process.env.APP_URL}${notif.link}">View in CRM</a></p>` : ''
}`;
```
`notif.description`, `notif.title`, and `notif.link` are interpolated into HTML with no escaping. `notif.link` is mostly internal-generated (`/documents/{id}`) but several call sites push user-derived values into `description` (filenames, client names, custom alert text). A `description` of `<img src=x onerror=...>` ships as live HTML to the recipient's inbox. Lower-severity than C1 because most notifications are admin-only and the recipient is internal staff, but still an XSS-via-email primitive. Use the same `renderEmailBody` (allowlist) helper the send-out flow uses.
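As a rough illustration of the escaping that is missing, here is a minimal sketch. `escapeHtml` and `renderNotificationHtml` are hypothetical names, not the project's `renderEmailBody` helper, and the rule that only app-relative links are rendered is an added assumption.

```typescript
// Hypothetical stand-in for an allowlist renderer: escapes the five HTML-significant characters.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

interface Notif {
  title: string;
  description?: string | null;
  link?: string | null;
}

function renderNotificationHtml(notif: Notif, appUrl: string): string {
  const body = `<p>${escapeHtml(notif.description ?? notif.title)}</p>`;
  // Assumption: only app-relative paths are linked; anything else is dropped.
  const link =
    notif.link && notif.link.startsWith("/") && !notif.link.startsWith("//")
      ? `<p><a href="${escapeHtml(appUrl + notif.link)}">View in CRM</a></p>`
      : "";
  return body + link;
}
```

A description such as `<img src=x onerror=...>` then arrives in the inbox as inert text rather than markup.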
### H2. `expense-dedup.markBestDuplicate` lost-update race
**File:** `src/lib/services/expense-dedup.service.ts:58-73`
`scanForDuplicates` returns candidates, then `markBestDuplicate` writes `duplicateOf`. Two concurrent dedup-engine runs on a pair `(A,B)` can each mark the other as the duplicate → mutual `duplicateOf` cycle, both archived later by `mergeDuplicate`. No advisory lock, no transaction encompassing scan + update. Also: `scanForDuplicates` does not filter `archived_at IS NULL`, so already-merged sources can resurface as candidates.
### H3. `notes.service` dead-code dispatch helper
**File:** `src/lib/services/notes.service.ts:80-98`
`tableForEntity` is defined and immediately `void`-discarded — every CRUD branch inlines its own switch. New entity types (e.g. `residential_clients`) added to the type union are silently missed by inlined branches because the exhaustive-switch compiler check is absent. This is the actual drift-vector for the polymorphic dispatch CLAUDE.md called out. Either delete the helper or refactor every CRUD operation to go through it.
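If the helper is kept, an exhaustive switch gives the compiler check described above. This is a sketch only: the entity union and table names below are hypothetical, not the ones in `notes.service.ts`.

```typescript
// Hypothetical entity union and table names, for illustration only.
type NoteEntity = "clients" | "interests" | "yachts" | "companies";

// Calling this with anything other than `never` is a compile error,
// so a new union member without a case fails the build.
function assertNever(value: never): never {
  throw new Error(`Unhandled entity type: ${String(value)}`);
}

function tableForEntity(entity: NoteEntity): string {
  switch (entity) {
    case "clients":
      return "client_notes";
    case "interests":
      return "interest_notes";
    case "yachts":
      return "yacht_notes";
    case "companies":
      return "company_notes";
    default:
      return assertNever(entity);
  }
}
```

Adding a member such as `"residential_clients"` to `NoteEntity` without a matching `case` makes the `default` branch fail to type-check, which is the drift guard the inlined switches currently lack.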
### H4. Socket-server max-connections race
**File:** `src/lib/socket/server.ts:103-106`
```ts
const userSockets = await io!.in(`user:${session.user.id}`).fetchSockets();
if (userSockets.length >= 10) return next(new Error('Maximum connections reached'));
```
Between `fetchSockets()` and the eventual ``socket.join(`user:${userId}`)`` at line 132, another concurrent handshake can pass the same check. Under burst reconnect (e.g. flaky network across many tabs), users get 11+ sockets. The Redis adapter's `fetchSockets` is multi-pod-aware, but the gating is not atomic. Use a Redis `INCR` keyed by `user:${id}:conn_count` with TTL fallback, decrement on disconnect.
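The reserve-then-check pattern can be sketched as follows. `AtomicCounter`, `MemoryCounter`, and the two slot functions are hypothetical. In production `incr` and `decr` would map onto Redis `INCR` and `DECR`, which are atomic; the in-memory class exists only so the example runs on its own, and the TTL fallback is left out.

```typescript
// Minimal counter contract; with Redis this maps onto INCR / DECR.
interface AtomicCounter {
  incr(key: string): number;
  decr(key: string): number;
}

// In-memory stand-in so the sketch runs without Redis.
class MemoryCounter implements AtomicCounter {
  private counts = new Map<string, number>();

  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  decr(key: string): number {
    const next = (this.counts.get(key) ?? 0) - 1;
    this.counts.set(key, next);
    return next;
  }
}

const MAX_SOCKETS_PER_USER = 10;

// Reserve a slot first, then inspect the returned value; roll back when over the cap.
function tryAcquireSocketSlot(counter: AtomicCounter, userId: string): boolean {
  const key = `user:${userId}:conn_count`;
  const count = counter.incr(key);
  if (count > MAX_SOCKETS_PER_USER) {
    counter.decr(key);
    return false;
  }
  return true;
}

// Call from the disconnect handler.
function releaseSocketSlot(counter: AtomicCounter, userId: string): void {
  counter.decr(`user:${userId}:conn_count`);
}
```

Because each handshake acts on the value returned by its own increment, two concurrent handshakes can never both observe "9 sockets" the way two `fetchSockets()` calls can.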
### H5. Documenso webhook timing side-channel
**File:** `route.ts:60-68` + `documenso-webhook.ts:13-21`
`verifyDocumensoSecret` short-circuits on `length !== expected.length` before `timingSafeEqual`. Combined with the linear scan across all per-port secrets, response-time deltas leak the number of ports and the length of each secret. Marginal but easy fix: pad to fixed size.
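One way to remove the length check is sketched below. It hashes both values to a fixed 32 bytes before comparing, which is a substitute for the padding suggested above rather than the same technique. `SecretEntry`, `secretsMatch`, and `findMatchingEntry` are hypothetical names.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

interface SecretEntry {
  portId: string | null;
  secret: string;
}

// SHA-256 yields 32 bytes for any input, so the comparison length never depends on the secret.
function digest(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

function secretsMatch(provided: string, expected: string): boolean {
  return timingSafeEqual(digest(provided), digest(expected));
}

// Compare against every entry without breaking early, so the elapsed time
// does not reveal which position in the list matched.
function findMatchingEntry(provided: string, entries: SecretEntry[]): SecretEntry | undefined {
  let match: SecretEntry | undefined;
  for (const entry of entries) {
    const ok = secretsMatch(provided, entry.secret);
    if (ok && match === undefined) match = entry;
  }
  return match;
}
```

Total time still grows with the number of configured secrets; the sketch only removes the per-secret length signal and the early exit.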
### H6. Global Documenso secret silently drops events under multi-tenant ambiguity
**File:** `src/lib/services/documents.service.ts:967-996`
`resolveWebhookDocument` correctly refuses to mutate when documensoId matches multiple ports AND no portId was passed. The webhook route now resolves portId from the matched secret (good — see comment at line 138-143). But the global `env.DOCUMENSO_WEBHOOK_SECRET` fallback entry returns `portId: null` (`port-config.ts:370`), and any port still using the global secret falls back to the "ambiguous → refuse" path. **Result:** if two ports share the global secret, valid completion events get silently dropped instead of routed. The dedupe + dead-letter on the inbound side doesn't surface this — it just looks like Documenso never delivered. Recommend: require per-port secrets for production and warn loudly when more than one port resolves to `portId: null`.
---
## MEDIUM
### M1. Storage migration loads each blob fully into Node memory
**File:** `src/lib/storage/migrate.ts:170-204` (`copyAndVerify`)
`for await (chunk of stream) { chunks.push(chunk) }` materializes the full blob in memory twice (source read + verify re-read) per file. A 200MB signed PDF or GDPR export blows the worker. Consider piping through `crypto.createHash('sha256')` + tee to the target backend instead of `Buffer.concat`. The pre-flight free-disk check (line 298-310) does `Promise.all(refs.map(head))` for every blob in the table — for large `files` tables that's thousands of round-trips before any copy starts.
### M2. `archiveInterest` next-in-line dossier outside transaction
**File:** `src/lib/services/interests.service.ts:1067-1112`
The IIFE that builds next-in-line notifications fires after `softDelete(interests, ...)` and `evaluateRule` — both already queued via `void`. If the IIFE throws after the interest is archived but before notifications send, only a `logger.error` lands; the archived interest stays archived with no rep notification. Acceptable as best-effort, but the dossier doesn't run inside the same audit-context request (the `createAuditLog` call happens earlier), so an operator reading the audit trail sees "archived" without seeing what notifications were attempted. Consider attaching the dossier result to the audit metadata.
### M3. `attachWorkerAudit` always records `portId: null`
**File:** `src/lib/queue/audit-helpers.ts:50-86`
Every job-failure audit row is written with `portId: null`. Multi-port operators querying their port-scoped audit log will not see worker failures that affected their port (e.g. a `documenso-void` job carrying portId in `job.data`). The worker has access to `job.data.portId` for most queues — extract it where present.
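A defensive extraction might look like the sketch below. `portIdFromJobData` is a hypothetical helper, and it assumes the port id is carried as a string under `portId` in the job payload.

```typescript
// Pull a port id out of an arbitrary job payload; null when absent or malformed.
function portIdFromJobData(data: unknown): string | null {
  if (typeof data !== "object" || data === null) return null;
  const portId = (data as Record<string, unknown>).portId;
  return typeof portId === "string" && portId.length > 0 ? portId : null;
}
```

The audit helper could then write `portId: portIdFromJobData(job.data)` in place of the hardcoded `null`, which keeps queues without a port id working unchanged.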
### M4. `RECURRING_JOB_NAMES` drift
**File:** `src/lib/queue/audit-helpers.ts:27-48`
Hardcoded `Set` requires manual sync against `scheduler.ts`. Typos silently demote cron heartbeats to regular completion logs. Either co-locate or compute from the scheduler module at boot.
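Computing the set from the schedule definition could look like this sketch. The job names and cron strings are made up for illustration; the real entries live in `scheduler.ts`.

```typescript
// Hypothetical schedule entries; the real ones live in scheduler.ts.
const RECURRING_JOBS = [
  { name: "currency-rates-refresh", cron: "0 * * * *" },
  { name: "reminder-sweep", cron: "*/5 * * * *" },
] as const;

// Union of the literal job names, derived from the list above.
type RecurringJobName = (typeof RECURRING_JOBS)[number]["name"];

// Single source of truth: the set is built from the schedule, never typed by hand.
const RECURRING_JOB_NAMES: ReadonlySet<string> = new Set<string>(
  RECURRING_JOBS.map((job) => job.name),
);

function isRecurringJob(name: string): name is RecurringJobName {
  return RECURRING_JOB_NAMES.has(name);
}
```

A typo in a job name now changes both the scheduler and the audit classification together, so the two cannot drift apart.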
### M5. Aggregated workflow listing surfaces `draft` workflows
**File:** `src/lib/services/documents.service.ts:1888`
`INFLIGHT_STATUSES = ['draft', 'sent', 'partially_signed']` includes `draft`. CLAUDE.md describes the UI section as "Signing-in-progress" — drafts have not been sent. Confirm intent.
### M6. Documenso secrets stored plaintext in `system_settings`
**File:** `src/lib/services/port-config.ts:351-373`
`listDocumensoWebhookSecrets` reads `systemSettings.value` directly — no decryption. SMTP/IMAP passwords are AES-256-GCM-encrypted per CLAUDE.md; the Documenso webhook secret should be too.
### M7. `import` worker is a no-op
**File:** `src/lib/queue/workers/import.ts:13-17`
`process()` body is `// TODO(L2)`. Any job pushed to the `import` queue silently completes with no work — every CSV import is a silent success if the producer side ships first.
---
## Observations on what is solid
- **`handleDocumentCompleted` idempotency gate** (line 1110) is correct _when reached_. The hazard is the partial-write window above (C2), not the gate itself.
- **`resolveWebhookDocument`** correctly refuses to mutate on multi-port ambiguity.
- **Socket auth middleware** (`server.ts:91-124`) cross-checks the client-supplied `auth.portId` against `userPortRoles` — closes the prior tenant-room hijack.
- **Storage filesystem backend** correctly refuses to start when `MULTI_NODE_DEPLOYMENT=true` (`filesystem.ts:218`) using the zod-validated env, not raw `process.env`.
- **Magic-byte verification** is enforced both for brochures (`brochures.service.ts:241-263`) and berth PDFs (`berth-pdf.service.ts:234-262`) with delete-on-mismatch cleanup.
- **File-aggregation projection** (`files.ts:316-379, 526-579`) applies `port_id` at the entry-point assert, on `companies.port_id` / `clients.port_id` / `yachts.port_id` joins, on `files.port_id` in the predicate, and on the `documents` LEFT JOIN's residual (line 567). Defense-in-depth is consistent.
- **Webhook worker** has DNS-rebinding SSRF re-resolution at dispatch (`webhooks.ts:18-45`) and dead-letter handling with operator notifications.
---
**Headline asks:** C1 (Better Auth identity rotation), C2 (orphan-blob window), H1 (notification email XSS), H6 (global webhook secret ambiguity drops events silently).
---
## 5. Performance + code-trim + render-smoothness audit (perf-test-auditor)
# Performance + Testing-Coverage Audit
**Branch:** `feat/documents-folders` · **Scope:** static analysis only.
Numbers: 116 vitest files / 1293 tests · 33 smoke specs · 68 services
files (15 with a unit-test file → **78 % of services have zero unit tests**).
---
## CRITICAL
### C1. Zero test coverage for the user-mgmt + permission-override slice just shipped (commit `660553c`)
`git diff main` adds: username sign-in, identifier resolver, per-user
permission-override matrix, role-label rendering, search keyword index,
user disable/email-change paths, dashboard widget toggles.
`grep -rn 'username\|resolve-identifier\|permission-overrides' tests/`
returns **no matches**. Not one smoke spec, not one integration test, not one
unit test. The feature ships dark.
Highest-risk slices:
- `POST /api/auth/resolve-identifier` — public, unauthenticated, rate-limited
via a shared `auth` bucket. Anti-enumeration relies on a synthetic
`@auth.invalid` fall-through. A wrong shape regression here silently
re-enables username enumeration. Needs a vitest test with hit/miss/
empty/error paths.
- `PUT /api/v1/admin/users/[id]/permission-overrides`: the schema
allow-list (lines 47-80 of the route) is hand-maintained against
`RolePermissions`. A drift here lets an admin grant themselves
unlisted leaves. There's already an `if (targetUserId === ctx.userId)`
self-target check; no test ensures it stays.
- The `UserPermissionMatrix` is the only UI for the new overrides table
and is **not** rendered by any spec.
→ Fix: add at minimum one smoke spec under `tests/e2e/smoke/24-admin-features.spec.ts`
that logs in with username, opens an admin user, toggles a grant/deny,
and reloads. Add a vitest test against `resolve-identifier` covering the
four branches.
### C2. Documents-hub aggregated projection runs 2 × (N companies + N yachts + N clients) sequential queries
`src/lib/services/documents.service.ts:1923-1956` (workflow groups) and
the file-aggregation cousin do this:
```ts
for (const {id, name} of related.companies) {
const g = await fetchWorkflowGroupRows(portId, eq(documents.companyId, id));
}
```
`fetchWorkflowGroupRows` itself issues a SELECT + a separate COUNT
(2 round-trips). For a client with 5 companies + 5 yachts + 3 sibling
clients, opening the Documents tab fires **(5+5+3)×2 = 26** sequential
queries on the inflight projection alone, plus another ~26 on the
files-aggregated cousin (mentioned in CLAUDE.md), so ~50 sequential
round-trips for a single tab open.
→ Fix: switch to a single SQL `WHERE … IN (UNNEST(:companyIds)) GROUP BY
:source_kind` returning grouped rows + a count window, or at minimum
`Promise.all` the per-id calls so latency is parallel.
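For the single-query option, the per-owner groups can be rebuilt in memory from one batched result. The sketch below assumes a hypothetical row shape in which every row is tagged with its owner kind and id; `WorkflowRow`, `WorkflowGroup`, and `groupWorkflowRows` are illustrative names, not existing code.

```typescript
type OwnerKind = "company" | "yacht" | "client";

// Hypothetical shape of one row from a single batched query over all related owner ids.
interface WorkflowRow {
  ownerKind: OwnerKind;
  ownerId: string;
  documentId: string;
}

interface WorkflowGroup {
  ownerKind: OwnerKind;
  ownerId: string;
  documentIds: string[];
  total: number;
}

// Rebuild per-owner groups in one pass, preserving first-seen order.
function groupWorkflowRows(rows: WorkflowRow[]): WorkflowGroup[] {
  const groups = new Map<string, WorkflowGroup>();
  for (const row of rows) {
    const key = `${row.ownerKind}:${row.ownerId}`;
    let group = groups.get(key);
    if (!group) {
      group = { ownerKind: row.ownerKind, ownerId: row.ownerId, documentIds: [], total: 0 };
      groups.set(key, group);
    }
    group.documentIds.push(row.documentId);
    group.total += 1;
  }
  return Array.from(groups.values());
}
```

This replaces the 26 sequential round-trips in the example above with one query plus an in-memory pass over its rows.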
---
## HIGH
### H1. `listUsers` is sequential and unbounded (no pagination)
`src/lib/services/users.service.ts:16-104` — two sequential SELECTs
(port-role rows then super-admin rows). Should be one query with a
UNION/LEFT JOIN, or at minimum `Promise.all`. No `limit`/`offset`. For
the multi-tenant install where a port could grow to thousands of users
this becomes O(N) memory + payload per admin page open. `GET /api/v1/admin/users`
also lacks pagination.
→ Fix: collapse to one SQL with `LEFT JOIN userPortRoles … OR userProfiles.isSuperAdmin`,
add `limit`/`offset`, surface `{ data, total, hasMore }`.
### H2. DataTable rebuilds the columns array on every render
`src/components/shared/data-table.tsx:109-137` constructs `allColumns`
on every render with no `useMemo`. TanStack Table's docs explicitly warn
this resets internal state (sorting, column resizing, virtual scrolling
indices) every render. For the clients/interests lists with 50+ rows
and 10+ columns this stalls every parent state change.
→ Fix: `useMemo(() => […selectColumn?, …columns], [bulkActions, columns])`.
### H3. Recharts is statically imported in `widget-registry.tsx` — every dashboard chart ships in the initial bundle
`src/components/dashboard/widget-registry.tsx:15-25` static-imports 7
chart files which in turn pull recharts (~80-150 KB gzipped). The
registry is the only entry point for the dashboard so the first
dashboard load pays the entire recharts cost even for users whose
widgets are all hidden.
→ Fix: `const PipelineFunnelChart = dynamic(() => import('./pipeline-funnel-chart').then(m => m.PipelineFunnelChart), { ssr:false, loading: () => <WidgetSkeleton/> })` per chart. Same fix for `website-analytics/pageviews-chart.tsx`.
### H4. `tiptap-to-pdfme.ts` (571-line module) ships to the client just for `TEMPLATE_VARIABLES`
`src/components/admin/document-templates/{template-form,template-preview}.tsx`
import `TEMPLATE_VARIABLES` from `@/lib/pdf/tiptap-to-pdfme`. The named
import drags the whole module (~570 lines of TipTap→pdfme transform
logic) into the client bundle even though only ~60 lines of constant
data are used. The `@pdfme/common` import is type-only so that part is
stripped, but the runtime code still ships.
→ Fix: split `TEMPLATE_VARIABLES` into a leaf file (`@/lib/pdf/template-variables.ts`)
that has no other imports; have `tiptap-to-pdfme.ts` re-export it for
server-side callers.
### H5. `notifications.service.ts:updatePreferences` runs N sequential upserts in a loop
`src/lib/services/notifications.service.ts:368-385` — one INSERT … ON
CONFLICT per preference row. For ~30 notification types that's 30 round
trips per "Save preferences" click. Trivially batchable as a single
`db.insert().values(rows).onConflictDoUpdate(…)`.
### H6. `GET .../permission-overrides` chains 5 sequential round-trips
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:99-138`
goes profile → portRole → role → portOverride → userOverride
sequentially. Each is independent on the userId once `profile` is
loaded; collapse the trailing four into `Promise.all`.
### H7. `command-search.tsx` invalidates two query keys every time the dropdown opens
`src/components/search/command-search.tsx:142-146`:
```tsx
useEffect(() => {
if (!showDropdown) return;
queryClient.invalidateQueries({ queryKey: ['search', 'recently-viewed'] });
queryClient.invalidateQueries({ queryKey: ['search', 'recent-terms'] });
}, [showDropdown, queryClient]);
```
Each time the user clicks the search box, two queries refire. The
`useSearch` hook already sets `staleTime: 30_000` for these. Invalidating
on every open defeats the staleTime entirely. Use the existing staleTime
or `refetchOnMount: 'always'` for a single trigger.
---
## MEDIUM
### M1. `UserPermissionMatrix` re-creates `setOverrides` closures every render
`src/components/admin/users/user-permission-matrix.tsx:158-169` defines
`setState` (non-memoized) and passes it down inside `.map()` rows. The
component itself is small (~180 leaves) so the impact is modest, but
the 3-state buttons render 540 closures every save. Wrap `setState`/`getState`
in `useCallback`, or pull them out as module-scope pure helpers
taking `(overrides, setOverrides)`.
### M2. `dashboard.service.ts:getPipelineForecast` scans every active interest into memory
Lines 119-156 fetch every non-archived interest + primary berth and
reduce in JS. Push to SQL with `SUM(price * CASE WHEN stage = … END)
GROUP BY pipeline_stage`.
### M3. `documents.service.ts:listDocuments` LEFT-JOINs documents on signed_file_id, but no pagination on folder views
`grep` indicates `listDocuments` has no `limit`/`offset` when `folderId`
is set. A folder with 1000+ files would dump the whole set. Verify
with a quick read; if missing, add the same `{ limit, offset }` pattern
used by `/api/v1/clients`.
### M4. `service` tests gap: 53 of 68 service files have zero unit-test files
Service files **with no test** include several high-risk surfaces:
- `interests.service.ts` (multi-berth, primary-flag invariants)
- `documents.service.ts` (folder soft-rescue, owner-wins chain, system-folder lock)
- `document-folders.service.ts` (cycle prevention, sibling uniqueness)
- `notifications.service.ts` (preference dedupe key, watcher fan-out)
- `users.service.ts` (createUser, role assignment, deactivate path)
- `dashboard.service.ts` (forecast math, hot-deal rank)
- `client-merge.service.ts`, `client-hard-delete.service.ts`,
`client-archive.service.ts` (destructive paths — exactly the surfaces
most worth testing)
- `interest-berths.service.ts` (the "never query from outside this service"
rule has no integration test enforcing the partial unique index logic)
- `email-accounts.service.ts` (AES-256-GCM round-trip, no test ensures
the decrypt path stays sound after a key rotation)
- `recently-viewed.service.ts`, `search.service.ts` (search bucket
expansion already partly tested but service-level branches missing)
Even one happy-path + one edge-case test per file would lift the
coverage floor enormously.
### M5. Playwright coverage gap matches the new flows verbatim
`grep -rn 'username\|permission-overrides\|disable.*user\|email[-_]change\|widget.*toggle' tests/e2e/` returns **no hits**.
Specs that should exist but don't:
- `tests/e2e/smoke/01-auth.spec.ts`: currently only tests email login;
needs a username-only path.
- `tests/e2e/smoke/24-admin-features.spec.ts`: needs a permission-override
three-state toggle path and a user disable path.
- `tests/e2e/smoke/10-dashboard.spec.ts`: needs a widget visibility toggle
+ reload assertion (the persistence path in user_profiles.preferences
has only the strict-allow-list cap, no UI-level integration test).
### M6. `realtime-toasts.tsx` was modified without a Playwright spec or vitest unit
Modified in this branch; no spec covers toast deduplication / port-id
filtering. Realtime fan-out is a textbook noisy-neighbor surface — a
regression here floods users with toast spam.
### M7. `interest-detail-header.tsx`, `yacht-tabs.tsx`, `user-card.tsx` modifications
These three changed in this branch with no corresponding spec change.
The smoke `02-crud-spine.spec.ts` exercises the underlying CRUD but
doesn't assert the new inline-edit visuals shipped in commit `04a5949`.
### M8. `clients.service.ts:986 db.query.clients.findMany` with no `limit`
The function looks like a "find all matching" helper. If it's reached
by any non-internal call path it could dump every client. Worth a
direct read and a `limit` arg.
### M9. `command-search.tsx` paste handler awaits `apiFetch` inside `onPaste` synchronously
Lines 189-206 — `onPaste` is `async` and awaits `apiFetch` before
`e.preventDefault()`. By the time the fetch resolves, the paste event
has already been processed and the text dropped into the input. The
`preventDefault` inside the `if (res.found && res.href)` block
silently no-ops on most pastes. Either preventDefault unconditionally
up-front, or read `e.clipboardData` and treat as plain "lookup +
navigate" without trying to cancel.
### M10. `audit-search.service.ts:80` and `gdpr-export.service.ts:266` use `findMany` without `limit`
Both are admin-only but a long-running port renders the audit page in
tens of seconds.
---
## Good news / verified safe
- `serverExternalPackages` in `next.config.ts` keeps pino/bullmq/ioredis/
minio/postgres/better-auth/nodemailer off the client.
- NAV_CATALOG (175 entries) is only reached via dynamic `import()` from
`search.service.ts`. **Server-only**, not in any client bundle.
- `lucide-react`, `pdfme/generator`, `pdf-lib`, `recharts` are absent
from RSC client boundaries except the dashboard widgets (H3).
- `lucide-react` imports are all named (tree-shake safe).
- No sync crypto, no sync PDF rendering in request handlers. `JSON.parse`
only on cheap surfaces.
- `useSearch` debounce + `keepPreviousData` + `staleTime: 30s` is correct.
---
## 6. Observability + i18n + docs-drift audit (obs-i18n-docs-auditor)
# Observability + i18n + Docs-Drift Audit
**Auditor:** obs-i18n-docs-auditor • Branch: `feat/documents-folders` • Date: 2026-05-12
**Scope:** A) `createAuditLog` coverage, pino discipline, error-event pipeline; B) timezone / currency / country / date-picker; C) CLAUDE.md, BACKLOG.md, numbered specs, admin-search keywords, NAV_CATALOG hrefs.
---
## CRITICAL
### C1. NAV_CATALOG has dead links — global topbar search jumps to 404s
`src/lib/services/search-nav-catalog.ts` — three confirmed-dead entries; cmd-K search routes users to non-existent routes:
| Catalog href | Actual route | Notes |
| ------------------------------- | ----------------------------- | --------------------------- |
| `/:portSlug/admin/audit-log` | `/:portSlug/admin/audit` | Audit log card link |
| `/:portSlug/admin/error-events` | `/:portSlug/admin/errors` | Super-admin platform errors |
| `/:portSlug/user-settings` | `/:portSlug/settings/profile` | User-menu uses correct path |
Also dead — these `:portSlug/settings/<X>` paths have **no folder** under `src/app/(dashboard)/[portSlug]/settings/`; the only subroute that exists is `profile/`:
- `/:portSlug/settings/email`
- `/:portSlug/settings/branding`
- `/:portSlug/settings/templates`
- `/:portSlug/settings/storage`
- `/:portSlug/settings/recommender`
- `/:portSlug/settings/tags`
- `/:portSlug/settings/notifications`
These look like aliases that were intended to deep-link inside `/settings` tabs but never wired up. Either redirect them to `/admin/<x>` (which all exist) or render real `settings/<x>` pages.
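A small drift check could catch dead hrefs before they ship. The sketch below assumes the catalog hrefs and the known route patterns are both available as plain string arrays; how the route list is produced (for example by walking the app directory) is left open, and `findDeadHrefs` is a hypothetical helper.

```typescript
// Return every catalog href that has no matching route pattern.
function findDeadHrefs(catalogHrefs: string[], routes: string[]): string[] {
  const known = new Set(routes);
  return catalogHrefs.filter((href) => !known.has(href));
}

// Example using entries from the table above.
const dead = findDeadHrefs(
  ["/:portSlug/admin/audit-log", "/:portSlug/settings/profile"],
  ["/:portSlug/admin/audit", "/:portSlug/admin/errors", "/:portSlug/settings/profile"],
);
```

Run as a unit test, a non-empty result would fail the build rather than surfacing as a 404 in cmd-K search.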
### C2. Webhooks bypass the platform-error pipeline
`src/app/api/webhooks/documenso/route.ts` is the only webhook route in the repo and it does **NOT** call `errorResponse(...)` / `captureErrorEvent(...)`. The handler always returns 200 with `logger.error(...)` only, so admin/errors never sees Documenso webhook crashes — the CLAUDE.md/docs imply errors flow into `error_events` universally but webhooks are silently outside that flow. Recommended: wrap the handler in a try/catch that calls `captureErrorEvent({ statusCode: 500, error, metadata: { source: 'webhook', event } })` before returning 200.
---
## HIGH
### H1. PDF templates hard-code `en-GB` date locale (ignores user prefs)
Every numbered PDF template hardcodes `toLocaleString('en-GB', …)` / `toLocaleDateString('en-GB')` regardless of the rendering user's locale/timezone:
- `src/lib/pdf/templates/interest-summary-template.ts:85,162`
- `src/lib/pdf/templates/client-summary-template.ts:97,133,143,156`
- `src/lib/pdf/templates/berth-spec-template.ts:172,187`
- `src/lib/pdf/templates/invoice-template.ts:116`
- `src/lib/pdf/templates/reports/{activity,occupancy,pipeline,revenue}-report.ts` — all use `'en-GB'`
- `src/lib/email/templates/document-signing.ts:141`: `completedAt.toLocaleString('en-GB', …)`
CLAUDE.md and the new dashboard greeting / timezone-drift banner suggest the rep's locale + timezone is honoured end-to-end. It isn't — at the PDF / signing-email surface we silently revert to `en-GB`. User-preference `timezone`/`locale` from `user_profiles` is plumbed nowhere into these templates.
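Threading the preference through could be as small as the sketch below. `RenderPrefs` and `formatDateForUser` are hypothetical, and the exact field choices (short month, 2-digit time) are an assumption about the desired output.

```typescript
// Hypothetical preference shape; the audit names `timezone` and `locale` on user_profiles.
interface RenderPrefs {
  locale: string; // BCP 47 tag, e.g. "en-GB"
  timeZone: string; // IANA zone, e.g. "Europe/Paris"
}

// Format an instant in the rendering user's locale and zone instead of a hardcoded 'en-GB'.
function formatDateForUser(date: Date, prefs: RenderPrefs): string {
  return new Intl.DateTimeFormat(prefs.locale, {
    year: "numeric",
    month: "short",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    timeZone: prefs.timeZone,
  }).format(date);
}
```

The same instant then renders as a different calendar day for a rep in Tokyo than for one in London, which the hardcoded call cannot express.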
### H2. PDF templates hard-code `USD` price formatting & build raw `Number().toLocaleString()` strings
- `interest-summary-template.ts:112`, `berth-spec-template.ts:127,172`: `berth.priceCurrency ?? 'USD'` followed by `Number(price).toLocaleString()` (no Intl currency formatter, no grouping conventions per locale).
- `reports/pipeline-report.ts:93`, `reports/revenue-report.ts:78,86`: `Number(...).toLocaleString()` with no currency code at all in revenue report.
Single-source `formatCurrency()` exists at `src/lib/utils/currency.ts` and is used everywhere else — these templates should call it.
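For reference, the difference between the two approaches is sketched below. `formatMoney` is a hypothetical stand-in built on `Intl.NumberFormat`; the real `formatCurrency()` signature is not shown in this audit.

```typescript
// Stand-in for the project's formatCurrency helper, whose real signature is not shown here.
function formatMoney(amount: number, currency: string, locale: string): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// The currency code drives the symbol and decimal places; the locale drives grouping.
const eurLabel = formatMoney(1234.5, "EUR", "en-US"); // "€1,234.50"
const usdLabel = formatMoney(1234.5, "USD", "en-US"); // "$1,234.50"
```

A bare `Number(price).toLocaleString()` carries neither the currency symbol nor a fixed number of decimal places, so the same amount can print differently depending on where the PDF is rendered.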
### H3. Dashboard widgets hard-code `'USD'` despite per-port `berths_default_currency`
`berths_default_currency` is a `system_settings` key (admin/settings/settings-manager.tsx:223). But:
- `src/components/dashboard/kpi-cards.tsx:19`: `formatCurrency(value, 'USD', …)`
- `src/components/dashboard/revenue-forecast.tsx:25` — same
- `src/components/dashboard/pipeline-value-tile.tsx:45,47`: same (the inner data field is `pipelineValueUsd`; the backend converts to USD before sending). The pipeline value tile, whose comment describes it as "USD-denominated", is fine, but the KPI / forecast tiles silently render Euro/GBP ports as USD.
### H4. CLAUDE.md missing two new auth surfaces
Migrations 0054 (`user_profiles.username`) and 0055 (`user_permission_overrides`) shipped in this branch. CLAUDE.md has **zero** mention of:
- Username sign-in alternative (login form + `resolve-identifier` endpoint + `src/lib/validators/username.ts`).
- Per-user permission overrides (effective-permission chain is now: `role → port_role_overrides → user_permission_overrides`).
The "Conventions / Auth" section currently implies `user_port_roles.role` is the leaf authority. New developers won't know to apply user-level overrides when reasoning about effective permissions.
### H5. `feedback_pwa_assets_pending` memory is stale
User memory says PWA assets (`icon-192.png`, `icon-512.png`, `icon-512-maskable.png`) must be added before shipping Phase B scanner. **All three exist in `public/`** plus `apple-touch-icon.png`. Memory should be cleared.
---
## MEDIUM
### M1. `archiveBrochure` has no `createAuditLog` call
`src/lib/services/brochures.service.ts:191` — service-level archive (`archivedAt` + `isDefault: false`) commits without an audit row. Every other archive/delete in this branch (yachts, clients, companies, interests, berths, documents, document-folders, files, invoices, document-templates, email-accounts, users, roles, portal-auth, custom-fields) creates audit logs. Brochures is the outlier — same UX risk as the others (admin can swap default brochure with no trail).
### M2. PII risk: portal-auth logs the email address of unknown / disabled-portal users
`src/lib/services/portal-auth.service.ts:356,373,423` log `email`, `user.email`. Logger redact paths cover passwords / tokens / encrypted blobs but **not** `email` / `*.email`. For most CRM logging this is fine (emails are not secret in this app), but the portal-reset paths specifically log emails of users **outside** the active session — a quiet PII surface in log aggregators. Recommend either (a) hash-prefix the email (`hash6(email)`) before logging, or (b) accept-but-document.
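Option (a) could be sketched as below. `hash6` here is a hypothetical implementation: an unsalted SHA-256 prefix, normalised by trimming and lowercasing. Six hex characters is an assumption taken from the helper's name.

```typescript
import { createHash } from "node:crypto";

// Short tag for correlating log lines without writing the address itself.
// Caveat: unsalted, so anyone holding a candidate address can confirm a match.
function hash6(email: string): string {
  return createHash("sha256")
    .update(email.trim().toLowerCase(), "utf8")
    .digest("hex")
    .slice(0, 6);
}
```

A log line would then carry `emailHash: hash6(email)` in place of `email`, which still lets an operator group repeated attempts for the same address.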
### M3. Pino `logger.info` discipline — Documenso & IMAP chatter
- `src/app/api/webhooks/documenso/route.ts:122,156,187,243,258,262` — six `logger.info(…)` per webhook fire (duplicate skip, lifecycle event, unhandled event type). At realistic Documenso traffic + retry pressure this is noisy. Consider downgrading the `'Documenso lifecycle event'` line at L258 (fires on **every** valid event) to `logger.debug`.
- `src/lib/services/email-threads.service.ts:290,298,358` — IMAP sync logs `mailbox.exists`, `messageCount`, `'No new messages to sync'` at info on **every** poll. At 5-min poll cadence × 24h × N accounts this floods info-level logs. Should be `logger.debug`.
### M4. Timezone-aware reminder `dueAt` storage looks correct but UI hands off naïve strings
`reminders.dueAt` is stored as `TIMESTAMPTZ` (`reminders.service.ts:179`: `new Date(data.dueAt)`). The validator accepts an ISO string. `<DateTimePicker>` in date-time inputs reads `new Date(input)` from the browser; interpretation is **local-TZ** for `YYYY-MM-DDTHH:mm` strings, **UTC** for full ISO with `Z`. Worth a focused look on the picker component to confirm it emits `Z`-terminated ISO (else "reminder at 9 AM" means 9 AM browser-local on creation, but server's `formatInTimeZone` against the rep's chosen TZ will misalign). I did **not** open `<DateTimePicker>` itself in this audit; flagging this as a 30-minute follow-up.
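Until the picker is confirmed, the validator could refuse zone-less strings outright. The sketch below is illustrative; `hasExplicitZone` and `parseDueAt` are hypothetical helpers, not existing validator code.

```typescript
// True when an ISO-like string ends in Z or an explicit offset such as +02:00.
function hasExplicitZone(value: string): boolean {
  return /(Z|[+-]\d{2}:?\d{2})$/i.test(value.trim());
}

// Reject zone-less strings instead of letting `new Date` assume the local zone.
function parseDueAt(value: string): Date {
  if (!hasExplicitZone(value)) {
    throw new Error(`dueAt must include Z or a UTC offset: ${value}`);
  }
  const parsed = new Date(value);
  if (isNaN(parsed.getTime())) {
    throw new Error(`dueAt is not a valid date: ${value}`);
  }
  return parsed;
}
```

With this guard, `2026-05-12T09:00` is rejected at the API boundary, while `2026-05-12T09:00:00+02:00` is stored as 07:00 UTC regardless of where the server runs.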
### M5. CLAUDE.md numbered-spec section frames `01-15` as authoritative
`CLAUDE.md` says:
> "Numbered spec files in repo root (01-…through 15-…) contain detailed architecture decisions, feature specs, DB schema docs, API catalog, and implementation sequence."
These specs document the **pre-rebuild Nuxt 3 / NocoDB system being migrated FROM**. `01-CONSOLIDATED-SYSTEM-SPEC.md` header reads "Compiled: 2026-03-11" and the stack tables describe Nuxt 3 SPA, NocoDB, Keycloak OIDC, etc. — none of which are the live Next.js + Drizzle + better-auth stack. New contributors reading CLAUDE.md will be sent down the wrong path. Recommend reframing to "legacy reference for the rebuild target" or moving them to `docs/legacy/`.
### M6. BACKLOG.md doc-folders entry stale vs reality
`docs/BACKLOG.md` E. "Hidden / stubbed UI tabs" still lists the Company Documents tab as ✅ landed 2026-05-08, and in the same section says **Berth Waiting List + Maintenance Log tabs** are "Removed entirely; revisit if/when product asks" — yet `src/components/berths/berth-tabs.tsx` still imports and renders the tabs strip (the comment in CLAUDE.md is silent on this). Not a blocker, just doc/code drift.
### M7. Admin sections browser missing two real admin routes
`src/components/admin/admin-sections-browser.tsx` registers 30 hrefs. Two `/[portSlug]/admin/*` routes exist but are **NOT** surfaced in the browser:
- `/admin/brochures` (full UI exists at `src/app/(dashboard)/[portSlug]/admin/brochures/page.tsx`)
- `/admin/errors` (super-admin platform-errors inspector, real route)
The `NAV_CATALOG` (Cmd-K) covers "Platform errors" via the (dead — see C1) `error-events` href but has no entry for `/admin/brochures`. Reps cannot discover the brochure admin surface from either the section card grid or global search.
### M8. Settings-manager keyword catalog drift
admin-sections-browser keywords list (settings card) is in sync with `settings-manager.tsx` KNOWN_SETTINGS keys today (21 keys, 39 aliases). **However**, two settings exist in production that are NOT in either list:
- `documenso_signing_order` (CLAUDE.md L46) — typeable via Documenso admin page, not the generic Settings card; reasonable to omit but flag if you want unified search.
- `documenso_redirect_url` — same.
Not a bug — just confirming the drift surface is intentional.
---
## What was NOT in scope but worth a quick note
- **Audit-coverage spot-check** for sensitive mutations: clients/yachts/companies/interests/berths/documents/document-folders/document-templates/invoices/users/roles/portal-auth/files/custom-fields/document-sends/email-accounts ALL call `createAuditLog`. Only `brochures.archiveBrochure` is missing (M1). No other gaps spotted in services that I sampled.
- **Pino redact paths** cover passwords, tokens, secrets, encrypted blobs, Authorization headers, cookie headers, two-level nesting — comprehensive. Only soft gap is `email` field (M2).
- **Error pipeline**: `errors.ts → captureErrorEvent` is invoked on every 5xx route response; the `error_events` table is read by admin/errors. Looks complete for API routes — gap is webhooks (C2).
- **Country/nationality**: consistently stored as ISO-3166-1 alpha-2 across clients, companies, residential — validators centralized in `src/lib/validators/i18n.ts`. Good.
---
## Recommended fix order
1. (C1) Fix the 10 dead `search-nav-catalog.ts` hrefs — pure typo fixes, very high user-visible impact.
2. (C2) Wrap webhook handler in `captureErrorEvent`.
3. (H4) Add CLAUDE.md sections for username + user permission overrides.
4. (H1, H2, H3) PDF & dashboard locale/currency consistency pass — plumb user prefs through, kill remaining `'en-GB'` / `'USD'` hardcodes.
5. (M1) `createAuditLog` in `archiveBrochure`.
6. (M5) Reframe or relocate numbered specs `01-15`.
7. (M3) Demote 2-3 chatty `logger.info` lines to `logger.debug`.
8. (H5) Clear stale `pwa_assets_pending` user memory.
---
## 7. Concurrency + race conditions audit (concurrency-auditor)
# Concurrency & Race-Condition Audit — pn-crm
Scope: 22-minute read-only sweep of services, queue workers, webhook handlers, and
schema invariants. Findings grouped by severity. Line references against
`feat/documents-folders`.
---
## CRITICAL
### C-1. `handleDocumentCompleted` TOCTTOU lets two concurrent webhook retries both download + persist the signed PDF
`src/lib/services/documents.service.ts:1100-1190`
The idempotency gate `if (doc.status === 'completed' && doc.signedFileId) return;`
is read _outside_ any row lock. The CLAUDE.md note ("idempotent — early-returns
when …") is true for _sequential_ retries but not for concurrent ones.
Real-world hit: Documenso retries DOCUMENT_COMPLETED on a 5xx, and the local poll
worker also reconciles. If both arrive within milliseconds (e.g. the receiver was
slow once then retried while the poll worker also fires), both pass the gate,
both call `downloadSignedPdf` + `storage.put`, both `db.insert(files)`, and both
UPDATE `documents.signed_file_id`. The losing file row stays in `files`, but its
blob has no `documents` row pointing at it → **permanent orphan blob plus a
duplicate file in the entity folder.**
Fix: wrap the whole block in `db.transaction` + `SELECT … FROM documents WHERE
id = $1 FOR UPDATE` before re-checking the gate. (Or add a partial unique index
`(document_id) WHERE document_type='signed_pdf'` on `files` so the DB rejects the
second insert.)
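The "lock first, then re-check the gate" ordering can be sketched without a database. In the sketch below, `RowLock` is a hypothetical in-process stand-in for `db.transaction` plus `SELECT … FOR UPDATE`; it only illustrates the ordering and would not protect against a second process, which is what the real row lock is for. `SignedDoc` and the `downloadAndStore` callback are likewise illustrative shapes, not the service's real types.

```typescript
interface SignedDoc {
  status: "pending" | "completed";
  signedFileId: string | null;
}

// Stand-in for a row lock: callers for the same document id run one at a time.
class RowLock {
  private tails = new Map<string, Promise<void>>();

  run<T>(id: string, work: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(id) ?? Promise.resolve();
    const result = previous.then(work);
    // Keep the chain alive even if `work` rejects.
    this.tails.set(id, result.then(() => undefined, () => undefined));
    return result;
  }
}

async function handleDocumentCompleted(
  lock: RowLock,
  docs: Map<string, SignedDoc>,
  docId: string,
  downloadAndStore: () => Promise<string>, // resolves to the new file id
): Promise<"stored" | "skipped"> {
  return lock.run<"stored" | "skipped">(docId, async () => {
    const doc = docs.get(docId);
    if (!doc) throw new Error(`unknown document ${docId}`);
    // The gate is evaluated only after the lock is held, so a concurrent
    // retry observes the winner's write instead of the pre-state.
    if (doc.status === "completed" && doc.signedFileId) return "skipped";
    const fileId = await downloadAndStore();
    docs.set(docId, { status: "completed", signedFileId: fileId });
    return "stored";
  });
}
```

With the gate inside the locked section, two simultaneous deliveries produce one download and one `"skipped"` result. The partial unique index mentioned above would remain a useful second line of defence.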
### C-2. BullMQ jobs have NO `jobId` — every webhook retry / duplicate enqueue creates a new job
`src/lib/queue/index.ts:24-39` (queue defaults), every call site of `queue.add(...)`
(see `inquiry-notifications.service.ts:51,118`, `webhook-dispatch.ts:59`,
`webhooks.service.ts:323,374`, `invoices.ts:661,790`, `gdpr-export.service.ts:101`,
`reports.service.ts:90`, `email-draft.service.ts:59`, `notifications.service.ts:165`,
`expenses.ts:190`, `documents.service.ts` send-out paths).
BullMQ deduplicates only when callers pass `{ jobId: stableKey }`. Nothing in the
repo does. Implications:
- A second Documenso 5xx retry that comes through `handleDocumentCompleted`
doesn't go through BullMQ, but webhook _outbound_ deliveries via
`webhook-dispatch.ts` go through `queue.add('deliver', …)`. If `dispatchEvent`
is called twice for the same event id, both rows get a delivery job and the
external endpoint sees the event twice.
- `notifications.service.ts:165` enqueues a notification-email job per insert;
the dedupeKey collapses _DB_ rows but not jobs. A `createNotification` that
fails after the dedupe collapse but before the job add can leave the queue
short; a successful dedupe still adds a fresh job each call.
- The maintenance queue (concurrency 1, attempts 3) backs off on failure, and
because `removeOnFail` has no count cap, a misbehaving job that errors thousands
of times can balloon Redis.
Fix: every "logically once-per-X" job needs ``jobId: `${name}:${entityId}` ``,
plus a `removeOnFail: { count: 1000 }` cap on the queue defaults.
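A stable job id is just a deterministic function of the job name and the entity it concerns. The sketch below shows one way to build it and, with a small in-memory stand-in, what the dedupe buys. `stableJobId` and `FakeQueue` are hypothetical names. Newer BullMQ releases are understood to restrict `:` inside custom job ids, so the sketch joins with `-` instead; that is worth verifying against the installed version before adopting either separator.

```typescript
// Hypothetical helper: derive a deterministic job id so that a repeated
// enqueue for the same entity maps to the same id.
function stableJobId(jobName: string, entityId: string): string {
  // Replace anything outside [A-Za-z0-9_-]. Two ids that differ only in the
  // replaced characters would collide, which is fine for UUID-style ids.
  const clean = (part: string): string => part.replace(/[^A-Za-z0-9_-]/g, "_");
  return `${clean(jobName)}-${clean(entityId)}`;
}

// In-memory stand-in for a queue that ignores an add() whose jobId already
// exists. This models the dedupe behaviour only, not BullMQ's API surface.
class FakeQueue {
  private jobs = new Map<string, unknown>();

  add(jobName: string, payload: unknown, opts: { jobId: string }): boolean {
    if (this.jobs.has(opts.jobId)) return false; // duplicate: no new job
    this.jobs.set(opts.jobId, { jobName, payload });
    return true;
  }

  get size(): number {
    return this.jobs.size;
  }
}
```

One caveat: id-based dedupe only holds while the earlier job still exists in Redis, so an aggressive `removeOnComplete` shortens the window in which a duplicate is rejected.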
### C-3. `advanceStageIfBehind` double-fires `evaluateRule` on parallel webhook deliveries
`src/lib/services/interests.service.ts:881-908`. The current-stage read is plain
`db.query.interests.findFirst`, no `FOR UPDATE`. Two concurrent calls (DOCUMENT_SIGNED +
DOCUMENT_COMPLETED, both routed to `eoi_signed`, arriving in parallel because
Documenso may send the RECIPIENT_SIGNED + DOCUMENT_COMPLETED pair near-simultaneously)
both observe `currentIdx < targetIdx`, both call `changeInterestStage`, both
trigger `evaluateRule('eoi_signed', …)`. The downstream berth-rule then auto-flips
the berth status twice — and if the rule has any side effect like queue.add
(it does — see berth-rules-engine), you get two of them.
The comment at line 1274 ("Guard against double-fire") shows the author noticed
the risk but only added an idempotency check on the _eoi-signed-and-beyond_
branch, not on the `advanceStageIfBehind` path itself.
Fix: pull the interest row with `FOR UPDATE` inside `advanceStageIfBehind`
and call `changeInterestStage` in the same transaction, OR move the rule-fire
side effect inside `changeInterestStage` and gate it on the actual UPDATE
returning a row (i.e. `.returning()` + check `updated.pipelineStage === target`).
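The second option (gate the rule on the UPDATE actually returning a row) is a compare-and-set. The sketch below models it in memory under stated assumptions: the stage list is hypothetical apart from `eoi_signed`, and `updateStageIfBehind` stands in for a conditional `UPDATE … WHERE <stage is behind target> RETURNING *`, which Postgres evaluates atomically per row.

```typescript
// Hypothetical stage order; only `eoi_signed` is taken from the report.
const STAGES = ["inquiry", "eoi_sent", "eoi_signed", "contract"] as const;
type Stage = (typeof STAGES)[number];

interface Interest {
  id: string;
  pipelineStage: Stage;
}

// Stand-in for the conditional UPDATE. Returns the updated row, or undefined
// when no row matched because the interest is already at or past the target.
function updateStageIfBehind(
  rows: Map<string, Interest>,
  id: string,
  target: Stage,
): Interest | undefined {
  const row = rows.get(id);
  if (!row) return undefined;
  if (STAGES.indexOf(row.pipelineStage) >= STAGES.indexOf(target)) return undefined;
  const updated: Interest = { ...row, pipelineStage: target };
  rows.set(id, updated);
  return updated;
}

function advanceStageIfBehind(
  rows: Map<string, Interest>,
  id: string,
  target: Stage,
  fireRule: (stage: Stage) => void,
): boolean {
  const updated = updateStageIfBehind(rows, id, target);
  if (!updated) return false; // another delivery already advanced it: no rule fire
  fireRule(target);
  return true;
}
```

Only the caller whose update matched a row fires the rule, so a second delivery for the same stage becomes a no-op rather than a duplicate side effect.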
---
## HIGH
### H-1. `moveFolder` cycle check is not under a row lock — concurrent moves can create a cycle
`src/lib/services/document-folders.service.ts:212-275`. The cycle check walks
ancestors of the destination outside any transaction; the actual UPDATE happens
in a separate statement. Two reps moving folders simultaneously can each pass
the cycle check against pre-state, then both commit, leaving folder A under B
under A. The system folders are protected by `assertNotSystemManaged`, but any
two user folders are vulnerable. Subsequent reads (the `cursor` walks in
`listDocuments(..., includeDescendants=true)`) would infinite-loop until the
`seen` guard bails — but the tree is now inconsistent.
Fix: open a transaction at the top, `SELECT FOR UPDATE` the moving folder, walk
the ancestor chain _inside_ the same transaction, then UPDATE. PostgreSQL's
default READ COMMITTED isolation doesn't see other in-flight updates without
the lock.
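The ancestor walk itself is small; what matters is that it runs on rows read inside the locking transaction. A minimal sketch, assuming the folder tree has been loaded into a `parentOf` map (a hypothetical shape, not the service's real query result):

```typescript
// parentOf maps folder id -> parent id (null for a root folder).
function wouldCreateCycle(
  parentOf: Map<string, string | null>,
  movingId: string,
  newParentId: string | null,
): boolean {
  const seen = new Set<string>();
  let cursor = newParentId;
  while (cursor !== null) {
    if (cursor === movingId) return true; // destination sits inside the moving subtree
    if (seen.has(cursor)) return true; // tree is already cyclic: refuse the move
    seen.add(cursor);
    cursor = parentOf.get(cursor) ?? null;
  }
  return false;
}
```

Note that locking only the moving folder may not be enough when two moves target each other's subtrees (A under B while B goes under A), because each transaction holds a different row. A per-port advisory lock, or locking the whole ancestor chain, is a more conservative option.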
### H-2. Berth-PDF upload writes the blob BEFORE acquiring the advisory lock
`src/lib/services/berth-pdf.service.ts:204-294`. Step 3 calls `backend.put(...)`;
step 4 takes the per-berth advisory lock + inserts the version row. If the
transaction in step 4 fails for any reason other than the unique-index conflict
(e.g. FK violation, network blip, statement timeout), the blob is already at
its UUID path with **no DB row pointing at it** → orphan blob. The author noted
the unique-index ⇒ orphan risk is mitigated by the UUID path (the second blob
gets its own path so no overwrite), but didn't address the "tx aborts, blob
stays" branch.
Fix: stage the blob in storage _after_ allocating the version row (or wrap both
in a saga that deletes the blob on tx rollback via a finalizer).
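The saga variant can be sketched as "put, try the insert, delete on failure". `BlobStore` and `putThenInsert` below are hypothetical names; the real storage backend's interface may differ.

```typescript
interface BlobStore {
  put(path: string, body: Uint8Array): Promise<void>;
  delete(path: string): Promise<void>;
}

async function putThenInsert<T>(
  store: BlobStore,
  path: string,
  body: Uint8Array,
  insertVersionRow: () => Promise<T>, // the locked transaction from step 4
): Promise<T> {
  await store.put(path, body);
  try {
    return await insertVersionRow();
  } catch (err) {
    // Compensate with a best-effort delete so a failed transaction leaves no
    // orphan. A failure of the delete itself is swallowed so that the
    // original error is the one the caller sees.
    await store.delete(path).catch(() => undefined);
    throw err;
  }
}
```

A process crash between the `put` and the insert would still strand the blob, so a periodic sweep for unreferenced paths is probably still worth having alongside this.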
### H-3. `upsertInterestBerth` `isPrimary=true` race demotes nothing then both inserts succeed
`src/lib/services/interest-berths.service.ts:181-265`. Inside the transaction it
`UPDATE …SET isPrimary=false WHERE interestId=$1 AND isPrimary=true` then
`INSERT … (isPrimary: true)`. At default READ COMMITTED, two concurrent
`setPrimaryBerth(X, A)` + `setPrimaryBerth(X, B)` calls both run the UPDATE: the
first matches no rows and so takes no lock, while the second may hit the row the
first has just inserted. The partial unique index on `(interestId) WHERE
isPrimary=true` catches the _second insert_, but only once the first tx
commits. If both txns interleave their UPDATE+INSERT before commit, postgres
serializes the unique-index check and one fails with 23505. Currently that
bubbles up as a generic 500, not as a friendly conflict, even though a fast retry
would succeed because the loser would then see the winner's row and simply demote
it. So the data invariant holds, but the UX surfaces a confusing error.
Fix: catch 23505 on `idx_interest_berths_one_primary` and either retry once or
map to a `ConflictError` so the toast says "another rep just changed the
primary berth, refreshing".
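Translating the unique violation is a small wrapper around the error object. The sketch below assumes the node-postgres error shape, where database errors carry `code` and `constraint`; other drivers may name these fields differently, and `ConflictError` here is a local stand-in for whatever class the repo already defines.

```typescript
// Local stand-in for the application's conflict error type.
class ConflictError extends Error {
  readonly kind = "conflict";

  constructor(message: string) {
    super(message);
    this.name = "ConflictError";
  }
}

// True when `err` looks like a Postgres unique violation (SQLSTATE 23505)
// raised by the named constraint or index.
function isUniqueViolation(err: unknown, constraint: string): boolean {
  if (typeof err !== "object" || err === null) return false;
  const candidate = err as { code?: unknown; constraint?: unknown };
  return candidate.code === "23505" && candidate.constraint === constraint;
}

// Returns a ConflictError for the expected violation and the original error
// otherwise, so unrelated failures keep surfacing unchanged.
function toConflict(err: unknown, constraint: string, message: string): unknown {
  return isUniqueViolation(err, constraint) ? new ConflictError(message) : err;
}
```

A caller would wrap the transaction in `try`/`catch` and rethrow `toConflict(err, "idx_interest_berths_one_primary", "...")`. The same helper would cover the username case in M-2 with a different index name.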
### H-4. Admin email-change leaves orphan sessions on the old email
`src/lib/services/users.service.ts:233-262`. The admin UI flips
`user.email` directly on the Better-Auth `user` row but never deletes the
target user's existing sessions. Concurrent sessions of the affected user
keep working under the new email (because Better-Auth indexes sessions by
`user_id`, not email) — that's fine. **But the _previous_ email is now free**
to be claimed by a fresh signup before the admin sends the "your email was
changed" notice. There's no unique constraint that prevents an attacker
from re-registering as `old@example.com` and taking over outgoing identity
artefacts (audit logs reference user_id not email, so this is just identity
hygiene; still, the surface exists). Worse — there's no `emailVerified =
false` reset on the swap, so the new email is auto-treated as verified
without ever receiving a confirmation.
Fix: in the same transaction, also revoke the user's sessions if the change
is admin-initiated (`db.delete(session).where(eq(session.userId, …))`), and
re-set `emailVerified = false` so the next sign-in goes through the
re-verify flow.
### H-5. `userEmailChanges` has no partial unique index on (userId) WHERE not applied/cancelled
`src/lib/db/schema/users.ts:360-379`. A user can spam the self-service
email-change endpoint to create unlimited pending rows. Each row mails the
NEW address. Anti-abuse is missing at the DB layer — only application-side
rate limit (which I didn't fully audit) stands between a user and unbounded
email send-out from your domain.
Fix: `CREATE UNIQUE INDEX user_email_changes_one_pending ON user_email_changes
(user_id) WHERE applied_at IS NULL AND cancelled_at IS NULL;`
### H-6. Email-confirm token isn't atomically consumed
`src/app/api/v1/me/email/confirm/[token]/route.ts:28-57`. Three separate
statements: SELECT pending, UPDATE user, UPDATE `pending.applied_at`. No
transaction wrapper. A user who double-clicks the email link (or a link
preview pre-fetcher like Outlook SafeLinks) fires two near-simultaneous
GETs. Both pass the `appliedAt IS NULL` check, both flip `user.email`
(idempotent — same value), both mark applied. Functional, but the second
audit-log entry is misleading. More importantly: if the second click
arrives 200ms later AND the user re-fired a _different_ change in between
that the first click happened to apply, you've stomped state.
Fix: single transaction, `SELECT … FOR UPDATE` the pending row, branch on
its post-lock state.
---
## MEDIUM
### M-1. Unbounded fan-out on `Promise.all` per recipient
- `src/lib/services/inquiry-notifications.service.ts:116-129` — fans out one
`emailQueue.add` per external recipient with no concurrency cap. With a
ports admin who lists 500 emails (no UI cap I saw), one inquiry submission
pushes 500 queue inserts concurrently. Redis survives it; the surge in
pipelined Redis commands can stall co-tenant queues.
- `src/lib/services/notifications.service.ts:344-358` — same shape for
document events: one `createNotification` per recipient, fully parallel.
Each `createNotification` does its own DB insert AND its own
`queue.add('send-notification-email', …)`. Big-port notifications can
fan out to dozens of users simultaneously per document event.
Fix: use `p-limit(10)` (already in similar shape elsewhere) or batch with
`queue.addBulk`. Not data-corrupting; tail-latency / resource concern.
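For reference, a concurrency cap needs very little code when pulling in `p-limit` is not desirable. The helper below is a dependency-free sketch with a hypothetical name (`mapWithLimit`), not the library's API.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index. Claiming is safe
  // without locks because JavaScript runs this code on a single thread.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Swapping `Promise.all(recipients.map(...))` for `mapWithLimit(recipients, 10, ...)` keeps the call sites almost unchanged while bounding the burst of Redis commands.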
### M-2. Username uniqueness TOCTTOU surfaces as 500 instead of ConflictError
`src/app/api/v1/me/route.ts:137-145`. The LOWER(username) SELECT runs outside
any lock, and the partial unique index `idx_user_profiles_username_unique` is
the actual guard. Two reps claiming `dm` concurrently: one succeeds, the other
gets a generic 500 (the 23505 is not caught and rewritten). The pre-check
shows "available" right before the failed write, which is a worse UX than a
clean "already taken" message.
Fix: catch 23505 on the unique index name and translate to ConflictError.
### M-3. `ensureSystemRoots` self-heal recursion not bounded
`src/lib/services/document-folders.service.ts:512-516`. If `ensureSystemRoots`
throws transiently (e.g. DB hiccup), `ensureEntityFolder` recurses into itself
with no depth guard. In normal operation the second pass will find the root
and return; in a pathological case (root is created but the post-insert SELECT
fails repeatedly), this can stack. Low-likelihood but trivial to fix with a
"called from self-heal already" flag.
### M-4. Optimistic UI rollback drops the user's pending edits
`src/components/interests/pipeline-board.tsx:120-143`. The optimistic update
overwrites query data without snapshotting the prior value — on error, the
rollback path just `invalidateQueries`, which refetches from the server. If
two reps drag the same card, the second's drop happens after the first's
server commit; the server then accepts the second drag, but the first rep's
view briefly shows their own change before the next invalidation pulls in
the truth. Last-write-wins semantics with no warning. Acceptable today for
single-rep ports; will get reported as a bug when teams scale. No version
header (`If-Unmodified-Since` / `ETag`) anywhere in the API.
### M-5. Filesystem storage backend not multi-node — silent corruption if `MULTI_NODE_DEPLOYMENT` mis-set
CLAUDE.md and `src/lib/storage/` say the filesystem backend refuses to start
when `MULTI_NODE_DEPLOYMENT=true`. If that env var defaults to unset and the
operator forgets to flip it, two app nodes both write to their own local FS,
each thinking they own the only copy. Not a code bug but a configuration
cliff edge — worth re-stating in the deploy runbook.
### M-6. BullMQ default `removeOnFail` retains failed jobs for 7 days without count cap
`src/lib/queue/index.ts:33`. Same volume risk as C-2: a noisy worker that
fails 10k times per day fills Redis. Add `count: 1000`.
---
## Notes on what looks GOOD
- `archiveClientWithDecisions` (`client-archive.service.ts:165-300`) — proper
`FOR UPDATE` on client + berth rows inside one transaction. Exemplary.
- Berth-PDF version-number allocation under `pg_advisory_xact_lock` — correct
pattern, modulo H-2 above.
- NocoDB import (`scripts/import-berths-from-nocodb.ts:137`) — stable 64-bit
advisory key, scoped to transaction.
- Document-folder backfill (`scripts/backfill-document-folders.ts:77`) —
per-port advisory key via `hashtext(portId)`, idempotent.
- `ensureSystemRoots` ON CONFLICT DO NOTHING pattern is race-safe.
- `ensureEntityFolder` correctly distinguishes entity-id race vs sibling-name
race and re-SELECTs the winner.
- `interestBerths` partial unique index on `is_primary` enforces the invariant
at the DB layer (H-3 is only a UX gripe, not a data integrity issue).
- `brochures.is_default` partial unique index works the same way.
- Documenso `verifyDocumensoSecret` uses timing-safe equality (good).
---
## Recommended sequencing
1. C-1 (signed-PDF orphan) — add row lock or unique index. Highest data-loss risk.
2. C-2 (jobIds everywhere) — broad blast radius, mechanical fix.
3. H-4/H-5 (admin email change + pending-change unique index) — security-adjacent.
4. C-3, H-1, H-2, H-3, H-6 — correctness in real-world retry/burst scenarios.
5. Medium tier as time allows.
---
## 8. GDPR + privacy + PII audit (gdpr-auditor)
# GDPR + Privacy + PII Audit
Repo: `new-pn-crm` @ `feat/documents-folders`. Read-only audit. Findings are grouped by severity. Line numbers are approximate.
---
## CRITICAL
### C1. GDPR export bundle is materially incomplete — Article 15 violation
`src/lib/services/gdpr-bundle-builder.ts` enumerates only a subset of tables that hold the data subject's PII. The following tables reference the client (`client_id` FK) but are **NOT** included in the bundle:
- `portal_users` — the portal account itself (email, name, `lastLoginAt`, `isActive`, `createdBy`). Strictly required: a copy of the account record is core "data we hold about you."
- `email_threads` / `email_messages` — full inbound/outbound correspondence including `bodyText`, `bodyHtml`, attachment IDs. This is the most PII-dense table in the system.
- `document_sends` — brochure / send-out audit with `recipient_email` (`brochures.ts`).
- `reminders` — operations table with `clientId` FK.
- `formSubmissions` (public form intake) — already collected via `documents` for the linked path, but rows where `client_id` is set directly are missed.
- `files` — files attached directly to a client (`files.client_id` ≠ via `documents`). The builder pulls `documents` only.
- `scratchpadNotes.linkedClientId` — rep-side free-text notes that reference the client.
- `clientMergeLog` — historical merge records that survived earlier deduplications.
- `contact_log` (referenced from `operations.ts`).
- `website_submissions` (raw inbound inquiries before they were promoted to interests).
The bundle currently advertises itself as the Article-15 dump. Hand-delivering it would expose the controller to a regulator finding of incomplete disclosure. **Fix:** widen `buildClientBundle` to cover every `client_id`-referencing table (the schema grep below produces the complete list); the audit-log limit of 500 events should also be lifted or paginated for long-tenured clients.
### C2. "Right to be forgotten" leaves email correspondence bodies intact
`client-hard-delete.service.ts` nullifies `emailThreads.clientId` so the thread (with `bodyHtml` / `bodyText` of every inbound + outbound message + the subject's address in `from_address` / `to_addresses`) survives the delete in perpetuity. Same pattern for `files.clientId`, `documents.clientId`, `formSubmissions.clientId`, `reminders.clientId`, `documentSends.clientId`. The justification in the file ("keep their audit history") is reasonable for _audit metadata_ but the actual PII content (email body, file contents, form answers, recipient emails on brochure sends) is preserved verbatim. A subject who exercised their right to erasure has not, in practice, been erased.
**Fix options:** (a) cascade-delete `email_threads` / `email_messages` on client hard-delete; (b) blank `body_text` / `body_html` / address columns inline; (c) require a separate "destructive erasure" mode that the smart-archive flow ladders into.
### C3. Username → email enumeration on public endpoint
`src/app/api/auth/resolve-identifier/route.ts` returns the canonical email when a known username is supplied (line 88: `return NextResponse.json({ email: rows[0]!.email })`). The miss-path returns a synthetic `.invalid` address, which protects hit/miss equality, but **any successful hit leaks the linked email** to an anonymous caller. Rate-limit is 5/15min/IP — sufficient to thwart wordlist brute-force but trivial to walk known/leaked usernames. This is also the entire point a malicious actor would call this endpoint (compromised-credentials stuffing).
**Fix:** don't echo the resolved email back to the client. Instead, set a short-lived signed cookie / Redis key keyed by the IP+identifier that the subsequent `signIn` call consumes, or hand the resolved email straight to Better Auth server-side and return only `{ ok: true }`.
### C4. Audit-log metadata is unmasked and stores raw PII forever
`src/lib/audit.ts` `maskSensitiveFields` covers `oldValue`/`newValue` only — `metadata` is written raw. Multiple call-sites stuff full email addresses into metadata:
- `client-hard-delete.service.ts:135` (`metadata: { sentTo: u.email }`)
- `client-hard-delete.service.ts:350` (bulk variant)
- `portal-auth.service.ts:123 / 187 / 336 / 363 / 380 / 403` (every portal lifecycle event)
- `crm-invite.service.ts:206 / 264` (`metadata: { email: invite.email }`)
- `email-accounts.service.ts:76 / 145` (`emailAddress`)
Compounded with **H1** (no audit-log retention), every staff member's, invitee's, and portal user's email lives in `audit_logs.metadata` indefinitely with `ip_address` + `user_agent` next to it. This is GDPR data-minimisation/storage-limitation breach territory.
**Fix:** extend `maskSensitiveFields` to walk `metadata` recursively, or stop emitting full emails into metadata (use user IDs + a join on demand). The masking set also needs `emailAddress` and `sentTo` aliases.
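A recursive walk over `metadata` could look like the sketch below. It is not the existing `maskSensitiveFields`; `maskDeep` and the key set are hypothetical, and the function assumes plain JSON values (objects, arrays, primitives), which is what audit metadata normally holds.

```typescript
// Keys to redact, compared case-insensitively. `emailAddress` and `sentTo`
// are the aliases called out in this finding.
const PII_KEYS = new Set(["email", "emailaddress", "sentto"]);

function maskDeep(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((entry) => maskDeep(entry));
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const masked: Record<string, unknown> = {};
    Object.keys(source).forEach((key) => {
      masked[key] = PII_KEYS.has(key.toLowerCase()) ? "[redacted]" : maskDeep(source[key]);
    });
    return masked;
  }
  return value; // primitives pass through unchanged
}
```

Masking by key name misses emails embedded in free-text values, so switching call sites to user IDs remains the stronger fix.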
---
## HIGH
### H1. No retention policy on `audit_logs`
`src/lib/queue/scheduler.ts` registers retention crons for `ai_usage`, `error_events`, `website_submissions`, but **not** `audit_logs`. The schema docs describe the table as kept indefinitely (no pruning worker exists). With IP + user-agent on every row, plus PII in metadata (C4), the table grows unbounded and retains PII forever.
**Fix:** add an `audit-log-retention` maintenance job. Recommended split: keep `severity ∈ {warning, error, critical}` and `source = auth` for 2 years (legal/security), prune everything else after 12 months. Make the window admin-configurable.
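The recommended split reduces to a per-row predicate that the maintenance job could translate into a `DELETE … WHERE`. A sketch under stated assumptions: the `AuditRow` shape is illustrative, the day counts approximate "2 years" and "12 months", and `info` in the usage note is an assumed name for the routine severity.

```typescript
interface AuditRow {
  severity: string;
  source: string;
  createdAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const LONG_RETENTION_DAYS = 730; // roughly 2 years
const DEFAULT_RETENTION_DAYS = 365; // roughly 12 months
const LONG_LIVED_SEVERITIES = ["warning", "error", "critical"];

function retentionDays(row: AuditRow): number {
  const longLived =
    LONG_LIVED_SEVERITIES.indexOf(row.severity) !== -1 || row.source === "auth";
  return longLived ? LONG_RETENTION_DAYS : DEFAULT_RETENTION_DAYS;
}

// True when the row is older than the window that applies to it.
function shouldPrune(row: AuditRow, now: Date): boolean {
  const ageMs = now.getTime() - row.createdAt.getTime();
  return ageMs > retentionDays(row) * DAY_MS;
}
```

For example, a routine `info` row from a `user` source is pruned after a year, while an `auth` row of the same age is kept. Making the two constants settings-driven would cover the "admin-configurable" part.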
### H2. `error_events` request-body excerpts redact only secret-shaped keys
`error-events.service.ts` `SENSITIVE_KEYS` redacts `password`/`token`/`apiKey`/`creditCard`/`ssn` etc. — but NOT `email`, `phone`, `name`, `dob`, `address`. Any 5xx on `POST /api/v1/clients`, `POST /api/v1/portal-users`, `POST /api/v1/clients/[id]/contacts`, `POST /api/v1/admin/users` lands the requester's full client-create payload in `error_events.request_body_excerpt`. Retention is 90 days (good) but the captured rows are visible to every super-admin via the inspector. **Fix:** add PII keys to `SENSITIVE_KEYS` or whitelist-only the body schema per route.
### H3. Email recipient address logged at `debug` level
`src/lib/email/index.ts:154` — every outbound email logs `{ to, originalTo, subject }`. In prod this line is emitted only if `LOG_LEVEL=debug`, so it is usually safe, but the `originalTo` field also leaks the redirect-target's real address when `EMAIL_REDIRECT_TO` is set in dev. Tighten to `messageId + portId + bool` once the redirect path is exercised.
### H4. Portal `lastLoginAt` & email kept after client hard-delete
On client hard-delete the `portal_users.client_id` cascade fires, so the portal user is removed — good. But `portal_users.email` has a global unique index (`idx_portal_users_email_unique`) with no `port_id`. A previously-deleted portal user blocks a new portal account at a different port from re-using that email until/unless the cascade fires. More importantly, if cascade ever doesn't run (e.g. archive-only, no hard delete), the portal account row survives with the email. Verify the archive path also disables/erases the portal user, or document the asymmetry.
### H5. Encryption-key rotation is non-incremental for SMTP/IMAP creds
`src/lib/utils/encryption.ts` hard-codes a single env var (`EMAIL_CREDENTIAL_KEY`) with no key-version or KID stored on the ciphertext. Rotating the key requires an offline mass re-encrypt; there is no migration path. The same applies to the S3 secret key (`storage/s3.ts:74`), webhook secret (`workers/webhooks.ts:116`), and storage_proxy_hmac (`storage/filesystem.ts:415`). Each decrypt-failure path falls through silently. **Fix:** prefix ciphertexts with a `kid` field, support 2 active keys at once, and ship a rotation script that re-wraps ciphertexts to the new key.
### H6. Activation/reset tokens travel in URL query strings
`portal-auth.service.ts:147 / 408` and `crm-invite.service.ts:71 / 233` ship `?token=…` in the activation/reset links. The token hash is stored server-side (good), but URL-borne tokens land in browser history, reverse-proxy access logs, Cloudflare logs, and `Referer` headers if the activation page links anywhere external (e.g. terms-of-service). Common pattern but worth flagging — consider hash fragments (`#token=…`) which browsers never put in `Referer`.
### H7. IP address recorded on every audit event without a lawful-basis note
`audit_logs.ip_address` (system.ts:38) and the legacy second copy (line 305) are populated unconditionally. Storing IPs is lawful under "legitimate interest" for security-relevant events, but for routine `update`/`view`/`create` of a client record the lawful-basis argument is much thinner under recent EU regulator guidance. **Fix:** only retain `ip_address` on `source ∈ {auth, webhook}` and on `severity ∈ {warning, error, critical}`; null it on routine `user`-source events at write time.
---
## MEDIUM
### M1. Recent-search Redis key holds verbatim free-text queries
`search.service.ts:2147` saves the raw search term to Redis under `recent-search:<userId>:<portId>`. If a rep types a client's email/phone/SSN to find them, that string lives in Redis with a 7-day TTL (per the constant) and is not in the GDPR bundle. Low-volume, but document and add to the bundle.
### M2. GDPR-export confirmation email contains client name verbatim
`client-hard-delete.service.ts` sends a confirmation code email to the requester with the deleted client's `fullName` in subject + body. Reasonable for human verification, but it means the operator's mailbox (often Gmail/Outlook) holds the to-be-erased client's name after deletion. Document this in the privacy notice or strip to initials.
### M3. GDPR export ZIP retention overlaps with subject-erasure
The bundle expires 30 days after generation (`EXPIRY_DAYS = 30`) — but if a subject requests _both_ export and erasure inside that window, the staged ZIP in MinIO will outlive the database row. The cleanup cron only checks `expires_at`. **Fix:** when `hardDeleteClient` runs, delete any non-expired `gdpr_exports` blobs for that client immediately.
### M4. Documenso webhook & document_sends body leak via audit metadata
The Documenso webhook handler logs `signatureHash` only (good), but `document_sends` rows store full `recipient_email` on `set null` cascade — so when the linked client is hard-deleted, the recipient email survives on the send row. Same pattern as C2 but for the brochure channel.
### M5. `portal_auth_tokens.tokenHash` is SHA-256, not constant-time-compared
Tokens are hashed before storage (good) but the lookup `where idx_portal_tokens_hash_unique` uses normal equality. Since the index lookup is `O(1)` indexed-equality, timing attacks are not viable here — flagged for documentation only.
### M6. `error_events.error_stack` may contain user-supplied strings
Stack traces are 4 KB capped and only fire on 5xx, but PG-driver errors include the offending statement / parameter values in `.message` (e.g. duplicate-key violations expose the conflicting email). Already mitigated by error-coding most known cases through `CodedError`, but a defensive scrub on `errorMessage` for `@`-shaped or `+\d{6,}` substrings would harden the inspector.
### M7. `EMAIL_REDIRECT_TO` is enforced only by env, not by a build assertion
The README warns it must be unset in production but there's no runtime guard. A misconfigured prod could silently redirect ALL outbound client mail to a single inbox. **Fix:** in `src/lib/env.ts`, refuse `EMAIL_REDIRECT_TO` when `NODE_ENV === 'production'`.
### M8. Cross-port `portal_users.email` unique index leaks tenancy
Multi-tenant model says ports are isolated, but the global-unique email index means port A can probe whether email X is already a portal user at any other port by attempting to invite them and reading the conflict error. Tiny enumeration vector, fix by scoping the unique index to `(port_id, lower(email))`.
---
## Notes / good practices observed
- `audit.ts` `maskSensitiveFields` exists and is applied to `old/new` JSON.
- `logger.ts` ships a thorough `redact.paths` list covering auth headers, encrypted credentials, cookies.
- GDPR-export uses presigned 7-day URL (acceptable, behind email auth).
- Hard-delete is two-factor (permission + email code + typed name) and gated on prior archive.
- `error_events` has a 90-day retention cron (migration 0040).
- AES-256-GCM for encryption at rest is correctly implemented with random IV + auth tag.
## Top-3 fix priority
1. **C1 + C2** — complete the GDPR export bundle and make hard-delete actually erase email/file/document content (not just sever FKs).
2. **C3** — stop echoing real emails out of the public `resolve-identifier` endpoint.
3. **C4 + H1 + H2** — mask PII in `audit_logs.metadata` and `error_events.request_body_excerpt`; add an `audit_logs` retention cron.
---
## 9. Email deliverability + template quality audit (email-auditor)
# Audit #9 — Email deliverability + template quality
Scope: `src/lib/email/**`, `src/components/admin/email-*`, `src/app/(dashboard)/[portSlug]/admin/email/page.tsx`. Read-only.
Severity legend: CRITICAL = production-breaking / security; HIGH = silent feature bug or data leak; MEDIUM = rendering / brand / spam risk; LOW = polish.
---
## CRITICAL
### C1. `EMAIL_REDIRECT_TO` has no production guard
`src/lib/env.ts:41` declares it `z.string().email().optional()` with no `NODE_ENV` constraint. `src/lib/email/index.ts:131-133` silently rewrites every recipient when it is set. CLAUDE.md says "**must be unset in production**" but nothing enforces it: a stray prod `.env` value would funnel every client/portal/EOI invitation to one address with zero alarms. The only signal is a `logger.debug` line (`index.ts:154-156`) — debug, not warn, and no startup banner. The companion `documenso-client.ts`, `email-compose.service.ts`, and `webhooks.ts` paths also silently honour it. Add a refinement in `env.ts` rejecting it (or at least `logger.fatal`-ing) when `NODE_ENV === 'production'`, and emit a startup `logger.warn` whenever it is set so it shows up in container logs.
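The guard itself is a two-branch check. In `env.ts` it would most naturally live in a zod refinement on the schema; the sketch below is written dependency-free so the logic is visible, and `checkEmailRedirect` and `MailEnv` are hypothetical names.

```typescript
interface MailEnv {
  NODE_ENV?: string;
  EMAIL_REDIRECT_TO?: string;
}

type RedirectCheck =
  | { ok: true; warning?: string }
  | { ok: false; error: string };

function checkEmailRedirect(env: MailEnv): RedirectCheck {
  if (!env.EMAIL_REDIRECT_TO) return { ok: true };
  if (env.NODE_ENV === "production") {
    // Refuse to boot rather than silently funnel all client mail to one inbox.
    return { ok: false, error: "EMAIL_REDIRECT_TO must be unset when NODE_ENV=production" };
  }
  // Outside production the redirect is allowed but should be loud at startup.
  return { ok: true, warning: "EMAIL_REDIRECT_TO is set: all outbound mail is being redirected" };
}
```

At startup the caller would throw on `ok: false` and pass any `warning` to `logger.warn`, which gives the container-log signal asked for above.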
### C2. Unescaped URL interpolation into `href` attributes (XSS-able in browser previews)
Every template inlines `${data.link}` / `${data.signingUrl}` / `${data.loginUrl}` / `${item.link}` / `${data.inboxLink}` / `${data.crmDeepLink}` / `${crmUrl}` directly into `href="…"` and into the visible link text; the only exception is `${crmUrl}`, which is escaped in `inquiry-sales-notification.ts:34`. `escapeHtml` is applied to every recipient name and copy string but **never to URLs**:
- `crm-invite.ts:42, 48`
- `portal-auth.ts:50, 56, 106, 110`
- `admin-email-change.ts:48`
- `document-signing.ts:92, 98, 211, 216`
- `notification-digest.ts:58, 74`
- `residential-inquiry.ts:76`
Most URLs come from server-built strings (token endpoint + base URL), but `notification-digest.items[].link` is sourced from notification rows whose deep links can include user-typed entity titles / search queries depending on the producer. A single `"` in any of those will break out of the attribute. Email clients (Apple Mail, Outlook Web) render the resulting HTML and attribute injection becomes click-jacking / open-redirect. Cheapest fix: pass every URL through `encodeURI` or an `escapeAttribute` helper before interpolation, and reject `javascript:` / `data:` schemes at the helper level. None of the templates currently verify `https://` prefix.
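One possible shape for the helper suggested above, as a sketch only. `escapeAttribute` and `safeHref` are hypothetical names; the choice to return `'#'` for a rejected URL is an assumption (a real implementation might throw or log instead).

```ts
// Escapes the characters that can terminate or corrupt a quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Returns a value safe to place inside a quoted href attribute, or '#' when
// the URL does not start with an http(s) scheme (javascript:, data:, relative
// paths and so on are all rejected).
function safeHref(url: string): string {
  const trimmed = url.trim();
  if (!/^https?:\/\//i.test(trimmed)) return '#';
  return escapeAttribute(trimmed);
}
```

Templates would then interpolate `${safeHref(data.link)}` instead of `${data.link}`, which covers both the attribute-breakout case and the scheme check in one call.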
---
## HIGH
### H1. Template-subject override mechanism is silently disconnected for ~half the catalog
`src/lib/email/template-catalog.ts:39-98` advertises 8 templates as customisable, and `src/components/admin/email-templates-admin.tsx` exposes a subject editor. But several templates **don't accept `overrides.subject`** and so the admin's edit is silently ignored:
- `inquiry-client-confirmation.ts:23-26` — no `overrides.subject` path
- `inquiry-sales-notification.ts:24` — same
- `residential-inquiry.ts:19, 61` — both functions, no override path
- `crm-invite.ts:26` — no override path
Only `portal-auth.ts` (activation/reset) and `document-signing.ts` honour `overrides.subject`. Admins editing the "Inquiry — client confirmation" subject in `/admin/email-templates` see the green "Saved" toast and nothing changes. This is the kind of bug users don't report; they assume their override worked.
### H2. Catalog `defaultSubject` strings DO NOT MATCH the literal subjects the code emits
The admin UI shows `Default: <catalog string>` so users can tell whether they have customised, but every comparison is broken because the strings diverge from the actual templates:
| Template | Catalog default | Code-emitted subject |
| ----------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------- |
| `crm_invite` | `You have been invited to {{portName}} CRM` | `You're invited to the ${portName} CRM` |
| `inquiry_client_confirmation` | `We received your inquiry — {{portName}}` | `Thank You for Your Interest in Berth ${mooringNumber}` (or `…in a ${portName} Berth`) |
| `inquiry_sales_notification` | `New berth inquiry — {{clientName}}` | `New Interest - ${portName}` |
| `residential_inquiry_client_confirmation` | `We received your residential inquiry — {{portName}}` | `Thank You for Your Interest - ${portName} Residences` |
| `residential_inquiry_sales_alert` | `New residential inquiry — {{clientName}}` | `New Residential Inquiry - ${data.fullName}` |
| `portal_activation` | `Activate your {{portName}} client portal account` | matches |
| `portal_reset` | `Reset your {{portName}} client portal password` | matches |
Combined with H1 this means the entire admin-customisation surface is half wired. Pick one of: (a) wire `overrides.subject` through every template and remove the divergence, or (b) drop the catalog rows for templates that can't be customised yet.
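Option (a) could be sketched as a single resolver that every template calls, so the catalog default and the emitted subject come from the same string. `interpolate` and `resolveSubject` are hypothetical names, and the `{{name}}` placeholder syntax is taken from the catalog column in the table above.

```ts
// Replaces {{name}} placeholders. Unknown placeholders are left untouched so
// a typo in an admin override stays visible instead of silently vanishing.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match,
  );
}

// Picks the admin override when it is non-blank, else the catalog default.
function resolveSubject(
  defaultSubject: string,
  overrideSubject: string | null | undefined,
  vars: Record<string, string>,
): string {
  const hasOverride = typeof overrideSubject === 'string' && overrideSubject.trim() !== '';
  return interpolate(hasOverride ? (overrideSubject as string) : defaultSubject, vars);
}
```

With this in place the template body would no longer carry its own literal subject; it would pass the catalog `defaultSubject` and whatever `overrides.subject` the admin saved.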
### H3. Email Settings page exposes dead form fields
`src/app/(dashboard)/[portSlug]/admin/email/page.tsx:34-48` lets admins set `email_signature_html` and `email_footer_html`. `getPortEmailConfig` (`port-config.ts:153-154,179-180`) reads them into `PortEmailConfig.signatureHtml` / `footerHtml`. But **`sendEmail` (`src/lib/email/index.ts`) never reads or injects them**, and `shell.ts` only reads the unrelated `branding_email_footer_html` / `branding_email_header_html` keys (via `getBrandingShell` → `getPortBrandingConfig`, lines 39-40 of `shell.ts`).
Result: the "Default signature (HTML)" and "Email footer (HTML)" controls on `/admin/email` are write-only sinks. Admins customise the footer; outbound emails never include it. There is a real customer-confidence hit here — a port admin will set a legal disclaimer expecting it on every send. Either (a) wire `cfg.footerHtml`/`signatureHtml` into `renderShell` or `sendEmail`, or (b) delete the fields from the admin page and consolidate on the Branding-page keys.
### H4. `residential-inquiry.ts` returns NO plaintext fallback
`residentialClientConfirmation` (line 41) and `residentialSalesAlert` (line 78) return `{ subject, html }` only, with no `text`. Every other template returns `text`. Lack of a `text` part materially hurts spam scoring and breaks plain-text-only readers (some BlackBerry, screen-reader bridges, legacy MTAs). `sendEmail` (`index.ts:144-152`) honours `text` when present, so the consequence is that no plain-text MIME part is attached: Gmail will still render the message, but SpamAssassin's `MIME_HTML_ONLY` rule adds points.
### H5. `inquiry-sales-notification` includes `crmUrl` (which the admin may set) without scheme validation
`inquiry-sales-notification.ts:34` does `escapeHtml(crmUrl)` then drops it into both `href` and visible text. Escaping prevents attribute breakout but a `javascript:` or `data:text/html` scheme survives entity-encoding. Validate the scheme is `https?:` server-side before passing in (the producer probably already does; defence-in-depth here is one regex).
---
## MEDIUM
### M1. Admin-authored `emailHeaderHtml` / `emailFooterHtml` injected raw
`shell.ts:39-40, 64-66` interpolates branding HTML directly: `${headerHtml ? \`<div>${headerHtml}</div>\` : ''}`. Source is `system_settings.branding_email_header_html` (admin-only write). An admin account compromise → arbitrary HTML in every outbound email (`<a href="javascript:…">`, tracking pixels exfiltrating recipient IPs, phishing forms in some clients). Email clients largely strip script, but `<form action=…>`, `<meta http-equiv="refresh">`, and CSS-position overlays still work in Apple Mail / Outlook desktop. Mitigation: run admin input through a server-side sanitiser (DOMPurify / sanitize-html with an email-safe allowlist).
### M2. No dark-mode safety
`shell.ts:42-74` ships no `<meta name="color-scheme" content="light dark">` / `<meta name="supported-color-schemes">` and no `@media (prefers-color-scheme: dark)` rules. Apple Mail and Gmail auto-invert backgrounds: the `#ffffff` card stays white but body text (`#333333`) gets pseudo-darkened in some clients, and the `#666` muted copy can drop below contrast threshold. Cards with `box-shadow: 0 2px 4px rgba(0,0,0,0.1)` render as halos in dark mode.
### M3. No MSO / Outlook fallbacks
The shell has no `<!--[if mso]>` conditionals. The CTA buttons are CSS-padded `<a>` — Outlook 2016/2019 on Windows renders them as tiny underlined text (the link works but the button shape is gone). Recommended: VML rect fallback inside `<!--[if mso]>` for every CTA, or switch to bulletproof-button pattern (`mso-padding-alt`, `text-decoration:none` etc.).
### M4. Background image won't render where it matters most
`shell.ts:55` sets `background-image: url('…Overhead_1_blur.png')` on the outer `<table>`. Outlook strips background-image entirely (no VML fallback supplied); Gmail mobile sometimes too. The `background-color:#f2f2f2` fallback works but the brand impression is lost in the highest-volume client. Either drop the bg image (CLAUDE.md flags moving the asset off `s3.portnimara.com` anyway) or add VML rect for Outlook.
### M5. No preheader
No hidden inbox-preview text. Currently the first visible line ("Welcome to {portName} CRM" or "Just a quick reminder") leaks into the preview pane after the subject. A 1px hidden preheader (`<div style="display:none;max-height:0;overflow:hidden;">…</div>`) is one of the highest-ROI deliverability tweaks; missing here.
### M6. Logo `width="100"` without 2x source / explicit height
`shell.ts:62`. Apple Mail on retina renders this scaled — acceptable since the source PNG is 250px wide. But `height` is unset which forces some clients to recompute mid-render (jank). Add `height="100"` (assuming square) and the asset is fine.
---
## LOW
- **L1. Hardcoded `en-GB` date locale.** `document-signing.ts:141` calls `toLocaleString('en-GB', …)`. CRM is positioned as multi-port; once a non-UK port arrives this is wrong. Read locale from port-config.
- **L2. No `List-Unsubscribe` / `List-Unsubscribe-Post` headers.** `sendEmail` adds none. Gmail's Feb-2024 bulk-sender requirements made `List-Unsubscribe: <mailto:>` table-stakes for any sender exceeding 5k/day. Transactional senders technically exempt, but with notification digests + inquiry confirmations flowing through the same SMTP path, hitting that threshold is plausible. One-line fix.
- **L3. No `Message-ID` / `In-Reply-To` threading for digest-style mails.** Each digest is a new thread; users will hate this once volume rises.
- **L4. `logger.debug` in `sendEmail` (`index.ts:154`) emits recipient address.** PII in log lines. At debug level so prod typically masks, but worth pino-redacting `to` and `originalTo`.
- **L5. `crmInviteEmail`, `adminEmailChangeEmail`, `notificationDigestEmail` not in `TEMPLATE_KEYS`.** Means the admin can't customise their subjects at all — inconsistent with the rest. Either add them or document the omission.
- **L6. Hardcoded English copy.** Every template — buttons ("Sign in", "Activate account", "Set up your account"), greetings ("Dear …", "Hi …"), legal-ish boilerplate ("If you didn't request this …"). No i18n hook. Out-of-scope for v1 but flag for the Phase-7 cutover note in `project_email_ownership_at_cutover.md`.
- **L7. `SMTP_FROM` fallback in `sendEmail` builds `noreply@${env.SMTP_HOST}`.** If `SMTP_HOST` is `smtp.gmail.com` the From becomes `noreply@smtp.gmail.com` — invalid sender, instant SPF fail. Acceptable because production sets `SMTP_FROM` explicitly, but worth a `logger.warn` when this fallback is hit.
- **L8. Subject prefix when redirecting** (`index.ts:132-134`) — `[redirected from x@y]` appears verbatim and is fine in dev, but if `EMAIL_REDIRECT_TO` ever slips into prod (see C1) this is the only forensic trail.
---
## What's good
- All admin-supplied content (names, descriptions, custom messages, notes) is consistently `escapeHtml`-ed before interpolation. The only escape gaps are URLs (C2).
- Per-port branding shell is well-isolated; `getBrandingShell` falls back to defaults cleanly.
- `resolveAttachments` enforces `portId` cross-tenant isolation (`index.ts:94-96`).
- SMTP timeouts are explicit (`SMTP_TIMEOUTS`, `index.ts:20-24`) — averts the BullMQ-slot starvation the comments warn about.
- `EMAIL_REDIRECT_TO` plumbing is consistent across `sendEmail`, `documenso-client`, `email-compose`, `webhooks` workers — when set, every outbound channel honours it.
---
## Suggested fix order
1. C1 (production guard for `EMAIL_REDIRECT_TO`)
2. C2 (URL escaping / scheme allowlist)
3. H3 (delete or wire up the dead Email Settings fields — fastest unblocker for admins)
4. H1 + H2 (fix catalog / wire override paths)
5. H4 (add plaintext to residential templates)
6. M1 (sanitise admin HTML)
7. M2 / M3 / M5 (dark mode + MSO + preheader)
8. The Lows whenever convenient.
---
## 10. Error UX + failure-mode resilience audit (error-ux-auditor)
# Error UX + Failure-Mode Resilience Audit
Repo: `/Users/matt/Repos/new-pn-crm` — branch `feat/documents-folders`.
Scope: route-segment error/not-found/loading coverage, error-boundary placement,
toast quality, leak surface, degradation when Redis / SMTP / Documenso / MinIO
are down.
---
## CRITICAL
### C1. Only ONE `error.tsx` for the entire app, no `not-found.tsx` per group, no `global-error.tsx`
`find src/app -name "error.tsx"` returns exactly `src/app/(dashboard)/error.tsx`.
Plus `src/app/not-found.tsx` (root) and a single `loading.tsx` under
`clients/[clientId]`. 73 dashboard pages, 0 portal/auth/scanner error files.
Consequences:
1. A throw inside `/portal/*` (dashboard, invoices, documents, profile)
has no boundary — Next default unstyled page, no branding, no requestId.
2. `(auth)` (login, reset-password, set-password) same exposure.
3. `(scanner)/[portSlug]/scan` — receipt scanner on a phone, throws on
Tesseract/OpenAI failure, no fallback.
4. `notFound()` calls inside portal routes fall through to the root
`not-found.tsx`, which links to `/dashboard` — wrong destination for
portal users (lands them at CRM login).
5. No `global-error.tsx` — if `RootLayout` throws (it reads cookies + ALS),
user gets Next default.
Add at minimum:
- `src/app/(portal)/error.tsx` + `src/app/(portal)/portal/not-found.tsx`
- `src/app/(auth)/error.tsx` (wrapped in `BrandedAuthShell`)
- `src/app/(scanner)/[portSlug]/scan/error.tsx`
- `src/app/(dashboard)/[portSlug]/not-found.tsx` (port-aware link target)
- `src/app/global-error.tsx`
### C2. 14+ naked `toast.error(err.message)` call sites bypass `toastError()`
`grep "toast.error.*err.message"` returns 14 hits: `client-list.tsx`,
`hard-delete-dialog.tsx`, `bulk-archive-wizard.tsx`, `smart-restore-dialog.tsx`,
`smart-archive-dialog.tsx`, `bulk-hard-delete-dialog.tsx`,
`portal/change-password-form.tsx`, more. These drop:
- the stable error code line
- the Reference ID line
- the "Copy ID" action button
On a 500 they show `"Internal server error"` with nothing copyable. On a
network failure they show `"Failed to fetch"` (raw TypeError). Several of
these files already import `toastError` for other call sites — the swap is
mechanical: `toast.error(err instanceof Error ? err.message : 'X failed')` →
`toastError(err, 'X failed')`.
### C3. `apiFetch` collapses 502/504 with non-JSON body to "Bad Gateway", no requestId
`src/lib/api/client.ts:75`:
```ts
const error = await res.json().catch(() => ({ error: res.statusText }));
```
Reverse-proxy error pages (nginx, Cloudflare) deliver HTML, not JSON. The
user gets `ApiError{ message: "Bad Gateway", code: null, requestId: null }`
— no "Copy ID" action. When proxy fails, the user has nothing to paste to
support. Synthesize a client-side correlation ID + a "The server is
unreachable. Please try again." string when `status >= 500` and the JSON
parse fails.
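A sketch of that fallback, operating on the raw response text so it can be read in isolation. `ApiErrorInfo`, `parseErrorBody`, and the `client-` ID format are hypothetical; the `error` / `code` / `requestId` body keys are assumed from the fields the audit names, not confirmed against `client.ts`.

```ts
// Hypothetical error shape modelled on the fields named in this finding.
interface ApiErrorInfo {
  message: string;
  code: string | null;
  requestId: string | null;
}

// Builds error info from a raw response body. JSON bodies pass through;
// non-JSON 5xx bodies (proxy HTML pages) get a friendly message and a
// client-generated reference so the user still has something to copy.
function parseErrorBody(
  status: number,
  statusText: string,
  bodyText: string,
  makeId: () => string = () =>
    `client-${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 8)}`,
): ApiErrorInfo {
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (parsed !== null && typeof parsed === 'object') {
      const body = parsed as Record<string, unknown>;
      return {
        message: typeof body['error'] === 'string' ? body['error'] : statusText,
        code: typeof body['code'] === 'string' ? body['code'] : null,
        requestId: typeof body['requestId'] === 'string' ? body['requestId'] : null,
      };
    }
  } catch {
    // Not JSON: fall through to the synthesized error below.
  }
  if (status >= 500) {
    return {
      message: 'The server is unreachable. Please try again.',
      code: null,
      requestId: makeId(),
    };
  }
  return { message: statusText, code: null, requestId: null };
}
```

A client-generated ID cannot be looked up in server logs, so it is worth logging it client-side with the timestamp and URL as well.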
### C4. Redis outage wedges every rate-limited route — including login
`src/lib/redis.ts` uses `maxRetriesPerRequest: 3`. After exhaustion every
call from `checkRateLimit()` (`rate-limit.ts:44`) throws ioredis errors.
`withRateLimit` doesn't try/catch, so it bubbles to `errorResponse()` as 500. `/api/auth/*` is wrapped in `withRateLimit('auth')` — **a Redis
blip 500s login**. Same exposure for portal sign-in (`portalSignIn`
limiter on `/api/portal/auth/sign-in`).
Fix: in `checkRateLimit`, catch redis errors and fail-open for auth /
portal-signin (log "rate-limit subsystem unavailable, allowing request")
or fall back to a local in-memory limiter.
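The second option (local fallback) might look roughly like this. Everything here is a hypothetical sketch: `InMemoryLimiter`, `checkRateLimitSafe`, and the `primary` callback stand in for the real `checkRateLimit`, whose signature is not shown in this report.

```ts
interface RateLimitResult {
  allowed: boolean;
  degraded: boolean; // true when the primary (Redis-backed) check failed
}

// Fixed-window in-memory limiter, used only as a per-process fallback.
class InMemoryLimiter {
  private readonly hits = new Map<string, { count: number; windowStart: number }>();
  private readonly max: number;
  private readonly windowMs: number;

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  check(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}

// Runs the primary check; if it throws (for example, Redis is down), falls
// back to the local limiter instead of surfacing a 500 on login.
async function checkRateLimitSafe(
  primary: (key: string) => Promise<boolean>,
  fallback: InMemoryLimiter,
  key: string,
  onError: (err: unknown) => void = () => {},
): Promise<RateLimitResult> {
  try {
    return { allowed: await primary(key), degraded: false };
  } catch (err) {
    onError(err);
    return { allowed: fallback.check(key), degraded: true };
  }
}
```

The in-memory limiter is per process, so with several app instances the effective limit is multiplied; that is usually an acceptable trade during an outage, but it is a weaker guarantee than the Redis-backed path.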
Same audit needed for `BullMQ getQueue().add()` calls — confirm
user-blocking enqueues (sendForSigning, requestGdprExport) degrade to
"we'll process this later" instead of 500.
---
## HIGH
### H1. SMTP failure semantics differ across callers
`sendEmail()` has 10s/10s/30s timeouts (`email/index.ts:20`) — good. Callers
diverge:
- `users.service.ts:381` (admin email-change notify) — `logger.warn` + swallow. ✓
- `me/email/route.ts:93` → `Promise.allSettled`. ✓
- `document-signing-emails.service.ts:169,211,247` → **throws** → 500.
Documenso already sent; user sees a 500 even though the workflow
succeeded. Wrap in try/catch + mark `delivery_status: 'failed'` so
the inbox panel surfaces a retry button.
- `queue/workers/email.ts` — BullMQ retries 5× then permanent failure.
No DLQ admin surface (webhooks have one at `webhooks.ts:281`; mirror it).
### H2. Storage timeout error lacks semantic name → bad classifier hint
`src/lib/storage/s3.ts:52`:
```ts
throw new Error(`S3 ${label} timed out after ${ms}ms`);
```
`error-classifier.ts` `ERROR_NAME_HINTS` looks for `TimeoutError`; this
throws plain `Error`. The path-based classifier catches "Storage backend"
first but loses the timeout-vs-misconfig distinction. Define a
`class TimeoutError extends Error` and throw it from `withTimeout`.
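A sketch of what that could look like. The real `withTimeout` signature in `s3.ts` is not shown here, so the parameter order below is an assumption; the message format is copied from the `throw` quoted above.

```ts
// Named subclass so a name-based classifier can match on `TimeoutError`.
class TimeoutError extends Error {
  readonly label: string;
  readonly ms: number;

  constructor(label: string, ms: number) {
    super(`S3 ${label} timed out after ${ms}ms`);
    this.name = 'TimeoutError';
    this.label = label;
    this.ms = ms;
  }
}

// Rejects with TimeoutError if `work` has not settled within `ms`.
// The timer is cleared on either outcome so it cannot keep the process alive.
function withTimeout<T>(work: Promise<T>, label: string, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new TimeoutError(label, ms)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

Setting `this.name` explicitly matters: a classifier keyed on `err.name` would otherwise still see plain `Error`.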
### H3. Documenso outage: error codes good, UI feedback poor
`documenso-client.ts:42-60` maps to `DOCUMENSO_TIMEOUT`, `_AUTH_FAILURE`,
`_UPSTREAM_ERROR`. Toasts render cleanly. Missing:
- The signing page doesn't show "Documenso is unreachable, your draft is
saved." Users refresh and assume the draft is gone.
- The webhook receiver has no per-port rate-limit on 5xx. Documenso retry
storms can land if our handler regresses.
### H4. Heavy components have no error boundaries except `/dashboard/page.tsx` widgets
`WidgetErrorBoundary` is used in 4 dashboard widgets. NOT wrapped around:
- `command-search.tsx` (1177 lines) — mounted in header; one render
throw kills the entire shell.
- `invoice-pdf-preview.tsx` (pdfme — known to throw on malformed font/image).
- `pageviews-chart.tsx`, `pipeline-funnel-chart.tsx`, charts outside
`/dashboard/page.tsx`.
- The signed-PDF iframe inside `documents/[id]/page.tsx` — when MinIO is
down, chrome-internal error renders in-place with no retry.
Wrap each in `WidgetErrorBoundary` with a sensible fallback.
### H5. New + public routes bypass `errorResponse()`
`grep "errorResponse"` shows 691 hits. The exceptions don't propagate
X-Request-Id and produce inconsistent shapes:
- `src/app/api/storage/[token]/route.ts` — bare `NextResponse.json({error:'Invalid or expired token'})`, no requestId.
- `src/app/api/public/website-inquiries/route.ts:75,122` — bare `{error:'Unauthorized'}` / `{error:'Unknown port'}`.
- `src/app/api/webhooks/documenso/route.ts:100` → `{ok:false, error:'Invalid secret'}` with status 200. Returning 200 is correct (no Documenso retry storms), but the literal string "Invalid secret" confirms the endpoint expects a secret. Drop the string.
- `src/app/api/auth/resolve-identifier/route.ts:91` — defensive 200 returning synthetic email. By design — keep.
---
## MEDIUM
### M1. Root `not-found.tsx` link target wrong for non-CRM users
Links to `/dashboard`. Portal users hit `/portal/dashboard`, unauth users
need `/login` or `/portal/login`. Detect cookie/route prefix.
### M2. Suspense boundaries are sparse — 9 across `src/app`, 0 in components
Only `set-password`, `portal/activate`, `portal/reset-password` wrap
`useSearchParams` in Suspense. Every detail page (yacht, company,
interest, berth, document, invoice, expense, reservation) flashes empty
header on direct URL visits because there's no `loading.tsx`. Only
`clients/[clientId]/loading.tsx` exists — replicate the pattern across
detail routes.
### M3. `error_events` capture is fire-and-forget — DB write failure swallowed silently
`void captureErrorEvent({…})` (errors.ts:146, 170). If the DB is up but
the insert fails (FK to a deleted user, etc.) the row is lost forever and
the super-admin can't trace the original error. Add a fallback that
writes to pino with `tag:'error_event_capture_failed'` so the super-admin
can grep server logs as a last resort.
### M4. PG-error 23505 sometimes leaks as 500 instead of `ConflictError`
`berth-reservations.service.ts:29-43` and `document-folders.service.ts:18-31`
explicitly map 23505 → `ConflictError`. Confirm `clients.service.ts`,
`companies.service.ts`, `yachts.service.ts` write paths do the same — at
least one or two likely bubble a raw PG error and 500 on duplicate email
/ duplicate mooring instead of a 409 with a friendly "this name is
already in use" message.
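A shared helper would make the mapping hard to forget in new services. This is a sketch: the `ConflictError` below is a stand-in for the project's own class, and it assumes the database driver exposes the SQLSTATE on an `err.code` string, which should be checked against the driver in use.

```ts
// Hypothetical stand-in for the project's ConflictError.
class ConflictError extends Error {
  readonly status = 409;

  constructor(message: string) {
    super(message);
    this.name = 'ConflictError';
  }
}

// PostgreSQL reports unique-constraint violations with SQLSTATE 23505.
function isUniqueViolation(err: unknown): boolean {
  return typeof err === 'object' && err !== null && (err as { code?: unknown }).code === '23505';
}

// Re-throws unique violations as ConflictError; everything else passes through.
function rethrowUniqueViolation(err: unknown, friendlyMessage: string): never {
  if (isUniqueViolation(err)) throw new ConflictError(friendlyMessage);
  throw err;
}
```

A write path would call it from its `catch` block, for example `catch (err) { rethrowUniqueViolation(err, 'This name is already in use'); }`.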
### M5. `/api/ready` doesn't exist
`/api/health` (liveness) returns 200 unconditionally — correct. The
comment promises `/api/ready` for deep checks, but `find` shows no such
file. `/api/public/health` does deep checks gated by `WEBSITE_INTAKE_SECRET`
— wire k8s readiness probe to it or stub `/api/ready`.
### M6. `formatErrorBanner` (admin inline forms) doesn't render "Copy ID" action
Lives in `toast-error.ts` next to `toastError()`. The toast version has a
button; the banner is plain text. Admin users hitting a 500 from inline
forms get the reference ID printed but can't click-to-copy. Either build
`<ErrorBanner err={…}>` as a React component or accept the gap.
### M7. Worker BullMQ failures have no user-visible surface beyond webhooks
`logger.error({jobId, err}, '<queue> job failed')` is uniform across all
workers (email, documents, notifications, reports, export, ai, webhooks).
Only `webhooks.ts:281` plumbs a `dead_letter` notification on permanent
failure. Notification/email/export workers should follow suit — for
example, a stuck GDPR export should email the user "your export failed,
retry from /settings/data."
### M8. Portal auth pages would lose brand on render error
Portal-auth pages wrap content in `BrandedAuthShell`. A throw inside the
shell or form lands at Next default page (no `(portal)/error.tsx`). Add
`(portal)/portal/error.tsx` that renders `<BrandedAuthShell>` around the
error so the brand survives.
---
## Summary
| Severity | Count |
| -------- | ----- |
| CRITICAL | 4 |
| HIGH | 5 |
| MEDIUM | 8 |
Highest leverage: ship the missing route-segment files listed in C1, sweep the
14 bare `toast.error(err.message)` sites to `toastError()` (C2), make
`checkRateLimit` fail-open when Redis is down (C4). Together these mean
every user-visible degradation is branded + every 5xx surfaces a
copy-pasteable reference ID.
---
## 14. Documenso integration depth audit (documenso-auditor)
# Documenso Integration Depth Audit — Task #14
**Scope:** `documenso-client.ts`, `documenso-payload.ts`, `eoi-context.ts`, `app/api/webhooks/documenso/route.ts`.
Read-only. Severity: CRITICAL / HIGH / MEDIUM.
---
## CRITICAL
### C1. In-app EOI pathway bypasses per-port Documenso config
`generateAndSignViaInApp` in `document-templates.ts` calls `documensoCreate(...)` and `documensoSend(...)` **without `portId`** (lines 831-843). `resolveCreds` then returns the global env triple `(DOCUMENSO_API_URL, DOCUMENSO_API_KEY, DOCUMENSO_API_VERSION)`.
Consequences on multi-tenant deployments:
- Per-port `apiVersion` ignored → a v2 port silently hits v1 endpoint paths (or vice versa); `createDocument`/`sendDocument` pick the wrong branch.
- Per-port `apiKey` ignored → auth fails on tenants whose key is only in `system_settings.documenso_api_key_override`.
- `redirectUrl` and `signingOrder` SEQUENTIAL/PARALLEL settings never plumbed — the in-app pathway passes no `meta` arg. Signers always land on Documenso's default thank-you page and v2 ports always sign PARALLEL regardless of admin choice.
Fix: thread `portId` and a `CreateDocumentMeta` built from `getPortDocumensoConfig(portId)` into both calls, mirroring `generateAndSignViaDocumensoTemplate` at lines 894-910.
### C2. `handleDocumentCompleted` idempotency has a real cross-channel race
The early-return at `documents.service.ts:1110` (`if (doc.status === 'completed' && doc.signedFileId) return;`) is **necessary but not sufficient**. Two write paths can race:
1. Webhook receiver → `handleDocumentCompleted`.
2. Background poll worker `jobs/processors/documenso-poll.ts:63` → same call (same args).
The route-level `documentEvents.signatureHash` dedup only catches webhook→webhook repeats. It does **not** catch webhook + poll, because the poll worker bypasses the webhook entry point and has no signatureHash row. Both can:
1. Resolve `doc` (status=`sent`, `signedFileId=null`).
2. Pass the gate.
3. `downloadSignedPdf` → `storage.put` → `db.insert(files)` → `db.update(documents).set({ status:'completed', signedFileId })`.
Outcome: two `files` rows, two MinIO blobs; the second `UPDATE` overwrites `signedFileId`, orphaning the first row + blob (no DB pointer, never GC'd).
Fix: wrap gate-and-write in a transaction with `SELECT ... FOR UPDATE` on the documents row, or a pre-claim `UPDATE documents SET status='completing' WHERE id=? AND status != 'completed' RETURNING *` that atomically reserves the row.
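The pre-claim variant can be illustrated with an in-memory stand-in for the conditional `UPDATE`. This is only a model of the semantics, not database code: `DocRow` and `claimForCompletion` are hypothetical, and the sketch additionally refuses rows already in `completing`, which goes slightly beyond the `status != 'completed'` condition quoted above (an assumption on my part, since otherwise a second caller could claim a row that is mid-download).

```ts
type DocStatus = 'sent' | 'completing' | 'completed';

interface DocRow {
  id: string;
  status: DocStatus;
  signedFileId: string | null;
}

// In-memory stand-in for:
//   UPDATE documents SET status = 'completing'
//   WHERE id = $1 AND status NOT IN ('completing', 'completed')
//   RETURNING *
// Returns the claimed row, or null when another caller already holds it.
function claimForCompletion(table: Map<string, DocRow>, id: string): DocRow | null {
  const row = table.get(id);
  if (!row || row.status === 'completing' || row.status === 'completed') return null;
  row.status = 'completing';
  return row;
}

// Called by the winner once the signed PDF has been stored.
function finishCompletion(row: DocRow, signedFileId: string): void {
  row.status = 'completed';
  row.signedFileId = signedFileId;
}
```

Whichever of the webhook and the poll worker claims first proceeds to download and store; the other gets `null` and returns early. A real implementation would also need to reset `completing` back to `sent` if the download fails, or the row would stay claimed forever.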
### C3. Webhook silently swallows handler errors → permanent event loss
`route.ts:264-266` catches every handler throw, logs, returns 200 (intentional: "always 200"). But a transient storage/DB failure inside `handleDocumentCompleted` is lost forever: Documenso records the event as delivered and never retries. The poll worker is the only safety net.
Fix: on handler throw, return non-200 so Documenso retries (bounded budget); or push the raw body onto a BullMQ replay queue.
---
## HIGH
### H1. Multi-berth `Berth Range` Documenso template field still pending
`buildDocumensoPayload` writes `formValues['Berth Range']: context.eoiBerthRange` (line 157), and `eoi-context.ts:128-135` populates it from `interest_berths.is_in_eoi_bundle=true` via `formatBerthRange()`. The **live Documenso v1 template does not yet have this field** (CLAUDE.md confirms). Documenso v1's `templates/{id}/generate-document` silently drops unknown `formValues` keys, so multi-berth EOIs currently render with only the primary mooring in `Berth Number`.
The in-app pathway (`pdf/fill-eoi-form.ts:62-67`) fails loudly when its AcroForm field is missing; the Documenso pathway fails silently. Add a startup `GET /api/v1/templates/{id}` preflight that warns when `Berth Range` is absent.
### H2. `placeFields` v2 path is unverified against a live Documenso 2.x instance
`documenso-client.ts:636` has an explicit "must be confirmed against a live Documenso 2.x instance — top v2 risk" comment. Concerns:
- Body uses `recipientId: String(f.recipientId)`; v2 may want numeric ID or string token — unverified.
- Geometry name mapping (`positionX/positionY/width/height` vs v1 `pageX/...`) is correct in shape, unverified in field naming.
- `fieldMeta` shipped verbatim; v2's `create-many` schema unpinned.
Any port flipped to `apiVersion='v2'` using upload-and-place is rolling the dice until a realapi run is green.
### H3. v1 fallback for CHECKBOX/DROPDOWN/RADIO is broken — silently
`fieldTypeNeedsMeta` permits CHECKBOX/DROPDOWN/RADIO. On v1, `placeFields` strips `fieldMeta` (lines 663-671 omit it) and v1's `/documents/{id}/fields` doesn't accept option metadata. A CHECKBOX placed on a v1 port renders as an unconfigured input with no options.
Code comment acknowledges ("falls back to blank-input behaviour"), but the placement UI gives no signal. Add a v1-aware preflight that disables these field types when `apiVersion='v1'`.
### H4. `sendDocument` v2 `redistribute` recipient scoping is unverified
`sendReminder` v2 (lines 391-407) ships `{ envelopeId, recipientIds: [signerId] }` to `/api/v2/envelope/redistribute`. The leading comment **contradicts** the body: "redistributes to all pending recipients on the envelope. Single-recipient targeting requires admin-side filtering."
If v2 ignores `recipientIds`, every "remind one signer" click resends to **everyone**, including already-completed signers — embarrassment risk on multi-signer EOIs. Realapi verification needed; reconcile comment with implementation either way.
---
## MEDIUM
### M1. `apiVersion='v1'` template-flow caveat correct but locks out v2 features
`generateDocumentFromTemplate` is hard-coded to `/api/v1/templates/{id}/generate-document` regardless of `apiVersion`. v2 instances accept this via backward-compat. Risk: a v2-native admin who built a template in the v2 UI may have **field IDs** but no stable **field names**, so `formValues` keyed by name won't match. If Documenso drops v1 compat, every template-flow EOI breaks atomically. Plan now to capture per-template field-ID metadata in admin settings.
### M2. `getPageDimensions` cache + A4 assumption
`documenso-client.ts:597` returns `DEFAULT_PAGE_DIMENSIONS = { 595, 842 }` (A4 portrait, pt) unconditionally, so the cache is dead code. Fine for the A4 EOI source PDF; for admin-uploaded contracts in Letter/A3/landscape, percent→pixel conversion is wrong by 5-30%, placing fields off-page or in the wrong band. Capture real page size via `pdf-lib` at upload time.
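To make the size of the error concrete, here is a small sketch of the conversion with the page size passed in rather than assumed. `PageSize` and `percentToPoints` are hypothetical names; the A4 figures are the ones quoted above and US Letter is 612 × 792 pt.

```ts
// Page size in PDF points (1/72 inch).
interface PageSize {
  width: number;
  height: number;
}

const A4_PORTRAIT: PageSize = { width: 595, height: 842 };
const LETTER_PORTRAIT: PageSize = { width: 612, height: 792 };

// Converts a percentage-based field position into absolute points
// for the given page size.
function percentToPoints(xPct: number, yPct: number, page: PageSize): { x: number; y: number } {
  return { x: (xPct / 100) * page.width, y: (yPct / 100) * page.height };
}
```

A field placed 90% of the way down the page lands at about 757.8 pt when A4 is assumed but belongs at about 712.8 pt on a Letter page, a 45 pt gap, which is enough to push a signature box into the wrong band.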
### M3. `normalizeDocument` recipient id collapses to `''` on missing fields
Line 75: `id: String(rec.recipientId ?? rec.id ?? '')`. When both keys are absent (malformed response), id becomes `''`; downstream maps keyed by recipient id collapse all phantoms into one bucket. Throw or filter when id is empty.
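The "throw" option is a few lines. `requireRecipientId` is a hypothetical helper name; the two keys are the ones read on line 75.

```ts
// Returns the recipient id as a string, or throws when neither key is present
// so malformed recipients cannot collapse into a shared '' bucket.
function requireRecipientId(rec: { recipientId?: unknown; id?: unknown }): string {
  const raw = rec.recipientId ?? rec.id;
  if (raw === undefined || raw === null || String(raw).trim() === '') {
    throw new Error('Documenso recipient is missing both recipientId and id');
  }
  return String(raw);
}
```

If dropping the recipient is preferable to failing the whole normalisation, the caller can catch the error and filter instead.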
### M4. `applyPayloadRedirect` `/email$/i` regex is fragile
`documenso-client.ts:148` matches keys ending in `email`. A future field like `notificationEmailAddress` or `cc_email_2` would be missed and could leak past `EMAIL_REDIRECT_TO`. Either widen the heuristic, or declare email fields explicitly in `DocumensoTemplatePayload` and rewrite only those.
### M5. `voidDocument` 404-idempotency loses tenant signal
On 404, log + return silently. The local doc may still have `status='sent'`, so a retry re-attempts. Mostly benign — but set local `status='voided'` on 404 so DB converges with remote-not-found reality.
### M6. EOI hard-gate error code
`eoi-context.ts:206` produces `Cannot generate EOI - missing required client details: ...`. Labels are clean (good), but no structured code/field array — UI can't deep-link to the missing tab. Add `code: 'EOI_GATE'` + missing fields array.
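A structured error along those lines might look like this. `EoiGateError`, `assertEoiReady`, and the three field keys are hypothetical (the real gate may check differently named or nested properties); the message prefix is the one quoted above.

```ts
// Carries a stable code plus the list of missing fields for the UI.
class EoiGateError extends Error {
  readonly code = 'EOI_GATE';
  readonly missingFields: string[];

  constructor(missingFields: string[]) {
    super(`Cannot generate EOI - missing required client details: ${missingFields.join(', ')}`);
    this.name = 'EoiGateError';
    this.missingFields = missingFields;
  }
}

// Throws EoiGateError listing every required field that is absent or blank.
function assertEoiReady(client: { name?: string; address?: string; email?: string }): void {
  const missing = (['name', 'address', 'email'] as const).filter((key) => !client[key]?.trim());
  if (missing.length > 0) throw new EoiGateError([...missing]);
}
```

The UI can then switch on `code === 'EOI_GATE'` and use `missingFields[0]` to pick which client tab to open.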
### M7. Webhook `signatureHash` covers replay but not v2 timestamp drift
Confirmed body-sha256 dedups same-payload retries. If v2 ever varies `signedAt` on retry, the per-recipient `${signatureHash}:signed:${email}` keys differ → repeat processing. The per-recipient `document_events` index protects writes there, but `handleRecipientSigned` likely also advances interest stage — verify that side-effect is idempotent too.
---
## What's solid
- **`normalizeDocument` id↔documentId** symmetric; downstream consumes the legacy `id` form consistently — no stray reads of `documentId`/`recipientId`.
- **`canonicalizeEvent`** correctly maps `DOCUMENT_SIGNED` → `document.signed` and routes v2 aliases (`RECIPIENT_SIGNED`, `RECIPIENT_VIEWED`) to v1 equivalents with a telemetry log line.
- **`verifyDocumensoSecret`** timing-safe, iterates per-port + global env, rate-limits bad-secret IPs.
- **`handleDocumentCompleted` early-return** is the right shape for the common same-channel retry case. Cross-channel race (C2) is separate.
- **`eoi-context.eoiBerthRange`** plumbing correctly walks `interest_berths.is_in_eoi_bundle=true` and produces the compact range. Gap is template-side (H1).
- **SEQUENTIAL/PARALLEL `signingOrder`** correctly wired in `generateAndSignViaDocumensoTemplate` (`document-templates.ts:909`). Gap is the in-app pathway (C1).
- **`buildDocumensoPayload.meta.distributionMethod = 'NONE'`** — distribute invoked separately by `sendDocument`. Correct on both versions.
- **EOI hard-gate** matches Section 2 requirements (name/address/email); yacht + berth correctly optional.
---
## Pending — all complete
All 19 audit tasks finished. Every report is inlined above.
---
## Appendix: methodology + agent roster
Audit was run as a single `pn-crm-audit` Claude Code team. Each teammate was a separate Claude Opus 4.7 instance with read-only static-analysis scope (no file edits permitted by the brief). Time budget: 22 minutes per agent. Reports were written to `/tmp/audit-*.md` and consolidated here.
### Team members
| Agent | Task | Output |
| --------------------- | ----------------------------------- | --------------------------- |
| security-auditor | #1 Security + API + auth | /tmp/audit-security.md |
| ui-ux-auditor | #2 UI/UX + a11y | /tmp/audit-ui-ux.md |
| data-model-auditor | #3 Data model + migrations | /tmp/audit-data-model.md |
| services-auditor | #4 Services + realtime + storage | /tmp/audit-services.md |
| perf-test-auditor | #5 Performance + code-trim + render | /tmp/audit-perf-test.md |
| obs-i18n-docs-auditor | #6 Observability + i18n + docs | /tmp/audit-obs-i18n-docs.md |
| concurrency-auditor | #7 Concurrency + races | /tmp/audit-concurrency.md |
| gdpr-auditor | #8 GDPR + PII | /tmp/audit-gdpr.md |
| email-auditor | #9 Email deliverability | /tmp/audit-email.md |
| error-ux-auditor | #10 Error UX + failure modes | /tmp/audit-error-ux.md |
| reporting-auditor | #11 Reporting math | pending |
| onboarding-auditor | #12 Onboarding UX | pending |
| pdf-auditor | #13 PDF + brand assets | pending |
| documenso-auditor | #14 Documenso depth | /tmp/audit-documenso.md |
| copy-auditor | #15 Copy + terminology | pending |
| deps-auditor | #16 Deps + supply chain | pending |
| build-auditor | #17 Build + prod readiness | pending |
| recommender-auditor | #18 Berth recommender | pending |
| search-auditor | #19 Search relevance | pending |
---
## 11. Reporting + analytics math correctness (reporting-auditor)
# Task #11 — Reporting + Analytics Math Correctness
Scope: dashboard widgets, kanban "active deals", pipeline-report PDF, revenue-report PDF, `dashboard.service.ts`, `report-generators.ts`, `analytics.service.ts`. Read-only audit.
Canonical pipeline stages live in `src/lib/constants.ts` → `PIPELINE_STAGES`:
`open, details_sent, in_communication, eoi_sent, eoi_signed, deposit_10pct, contract_sent, contract_signed, completed`. `STAGE_WEIGHTS` matches.
---
## CRITICAL
### C1. Hot-deals card ranks/labels on **non-existent** stage names
`src/lib/services/dashboard.service.ts:198-208` (`getHotDeals`) builds a CASE that references `'in_comms'` and `'deposit_10'`. The DB column `interests.pipeline_stage` stores `'in_communication'` and `'deposit_10pct'`. Both real stages fall through to `ELSE 0`, collapsing the rank ladder so any `eoi_sent` deal outranks every `in_communication`/`deposit_10pct` deal, and ordering inside the top tier becomes "newest updatedAt wins" instead of "furthest along."
The frontend mirror in `src/components/dashboard/hot-deals-card.tsx:26-36` (`STAGE_LABELS`) uses the same wrong keys (`deposit_10`, `in_comms`), so the badge for those two stages renders the raw enum string `deposit_10pct` / `in_communication` instead of "Deposit 10%" / "In Comms." Fix both files; prefer importing `STAGE_LABELS` from `@/lib/constants` rather than re-declaring it.
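A minimal sketch of deriving the rank from the canonical list rather than from hand-typed strings. The stage list is copied from the top of this report; `stageRank` and `isPipelineStage` are hypothetical helper names, not existing code.

```ts
// Canonical stage order, copied from the list quoted at the top of this report.
const PIPELINE_STAGES = [
  'open',
  'details_sent',
  'in_communication',
  'eoi_sent',
  'eoi_signed',
  'deposit_10pct',
  'contract_sent',
  'contract_signed',
  'completed',
] as const;

type PipelineStage = (typeof PIPELINE_STAGES)[number];

// Hypothetical helper: rank is the 1-based position in the canonical list,
// so a misspelt key such as 'in_comms' is a compile error instead of rank 0.
function stageRank(stage: PipelineStage): number {
  return PIPELINE_STAGES.indexOf(stage) + 1;
}

// Hypothetical guard for raw strings read from the database.
function isPipelineStage(value: string): value is PipelineStage {
  return (PIPELINE_STAGES as readonly string[]).includes(value);
}
```

With the union type in place, a stale key in a label map or a CASE builder fails type-checking instead of silently falling through.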
### C2. Revenue PDF "TOTAL COMPLETED REVENUE" silently includes **lost & cancelled** deals
`setInterestOutcome` in `interests.service.ts:919-943` forces `pipelineStage = 'completed'` for **every** outcome (won, lost\_\*, cancelled). `fetchRevenueData` in `report-generators.ts:126-140` then sums berth prices for `pipelineStage='completed' AND archivedAt IS NULL` with **no `outcome` filter**, and the PDF prints the result as `TOTAL COMPLETED REVENUE` (`revenue-report.ts:97`). Result: a marina with 1 won + 10 lost deals at €1M berths reports €11M completed revenue. Add `eq(interests.outcome, 'won')` to the `completedRevenue` query (and probably to the per-stage breakdown).
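The arithmetic in the example above can be reproduced in memory. This is an illustrative sketch only: the row shape and the `lost_price` outcome value are assumptions, and the real fix belongs in the SQL filter.

```ts
// Assumed row shape for illustration; the real query joins interests to berths.
interface CompletedDealRow {
  outcome: string | null;
  berthPrice: number;
}

// Sums berth prices for completed deals, optionally restricted to won deals.
function sumCompletedRevenue(rows: CompletedDealRow[], wonOnly: boolean): number {
  return rows
    .filter((row) => !wonOnly || row.outcome === 'won')
    .reduce((total, row) => total + row.berthPrice, 0);
}

// The case from the finding: 1 won + 10 lost deals at 1M each.
// 'lost_price' is a hypothetical stand-in for any lost_* outcome.
const exampleRows: CompletedDealRow[] = [
  { outcome: 'won', berthPrice: 1_000_000 },
  ...Array.from({ length: 10 }, () => ({ outcome: 'lost_price', berthPrice: 1_000_000 })),
];
```

Without the filter the total is 11M; with it the total is 1M.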
### C3. Pipeline PDF `stageCounts` query has **no `GROUP BY`**
`report-generators.ts:54-60`:
```ts
db.select({ stage: interests.pipelineStage, count: count() })
.from(interests)
.where(...); // ← no .groupBy()
```
Postgres rejects a non-aggregated column without `GROUP BY` (`42803`). Either the pipeline PDF report has been crashing silently in the worker queue for any port with rows, or every run produces a single row that misses every stage but one. Add `.groupBy(interests.pipelineStage)`.
---
## HIGH
### H1. "Active interest" means **four different things** across surfaces
| Surface | Filter |
| ----------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `getKpis` / `getPipelineCounts` / `getRevenueForecast` (dashboard tiles + forecast) | `archivedAt IS NULL AND (outcome IS NULL OR outcome='won')` |
| `computePipelineFunnel` (analytics funnel) | same — but additionally bounded by `createdAt BETWEEN range` |
| `listInterestsForBoard` (kanban) — `interests.service.ts:194` | `archivedAt IS NULL` only ⇒ **lost & cancelled cards still appear** on the board (they all sit in the `completed` column because of C2) |
| `getHotDeals` | `archivedAt IS NULL AND outcome IS NULL` (also excludes **won** — intentional per comment but worth flagging) |
| `fetchPipelineData` / `fetchRevenueData` (PDF reports) | `archivedAt IS NULL` only ⇒ includes lost & cancelled |
| `computeRevenueBreakdown` (invoices) | unrelated definition — by invoice status |
A rep who reads "12 Active Deals" on the tile then opens the kanban can see 17 cards, because the kanban silently includes 5 lost deals routed to the `completed` column. Consolidate into a single `activeInterestsWhere(port)` helper and reuse everywhere.
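As a sketch of the single definition, here is the dashboard-tile rule from the table expressed as one predicate. `isActiveInterest` and the `InterestLike` shape are hypothetical; the suggested `activeInterestsWhere(port)` helper would express the same rule as a query condition.

```ts
// Assumed minimal shape; the real interests row has many more columns.
interface InterestLike {
  archivedAt: Date | null;
  outcome: string | null;
}

// Mirrors the dashboard-tile filter from the table:
// archivedAt IS NULL AND (outcome IS NULL OR outcome = 'won').
function isActiveInterest(interest: InterestLike): boolean {
  return (
    interest.archivedAt === null &&
    (interest.outcome === null || interest.outcome === 'won')
  );
}

// Counts active interests using the shared predicate.
function countActive(interests: InterestLike[]): number {
  return interests.filter(isActiveInterest).length;
}
```

Applied to the 17-card kanban example (12 open plus 5 lost), the shared predicate returns 12 on every surface.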
### H2. Occupancy rate uses **two different sources**, same dashboard
- `getKpis` (KPI tile) + `fetchOccupancyData` (PDF) compute occupancy from `berths.status IN ('sold','under_offer')`.
- `computeOccupancyTimeline` (chart on the analytics page) computes occupancy from `berth_reservations` overlap with each day, with `total = COUNT(berths)`.
The two are unrelated: a berth marked `sold` with no active reservation contributes to the tile but not the timeline; a berth marked `available` with an active reservation contributes to the timeline but not the tile. Reps will see the tile read 64% and the chart's right-most point read 12% on the same day. Pick one definition (status-based is the documented one in CLAUDE.md) and align the timeline.
### H3. Revenue PDF stage breakdown is **unweighted**; dashboard forecast is weighted
`fetchRevenueData.stageRevenue` (`report-generators.ts:107-118`) does `SUM(berths.price)` per stage with no `pipeline_weights` multiplier. The dashboard `RevenueForecast` widget multiplies by `pipeline_weights[stage]`. So:
- Tile shows €420K (weighted).
- Revenue PDF "Revenue by Pipeline Stage" for the same data shows €1.6M (unweighted).
The two are reconcilable in principle but no rep will guess that. Either weight the PDF the same way, or rename the PDF column to "Berth Price by Stage (gross)".
### H4. `pipeline_weights` defaults duplicated in two source files
`src/lib/constants.ts:68` (`STAGE_WEIGHTS`) and `src/components/admin/settings/settings-manager.tsx:76-86` hard-code the same object. Drift between the two means admins editing settings could see different defaults than the forecast actually uses. The settings form should `import { STAGE_WEIGHTS } from '@/lib/constants'` and spread it as `defaultValue`.
### H5. `getRevenueForecast` silently zeroes out stages with **missing weight keys**
`dashboard.service.ts:139` does `weights[stage] ?? 0`. If an admin saves `pipeline_weights` as `{ "in_comms": 0.2, ... }` (legacy key) or simply omits a stage, every active interest at the missing stage contributes €0 to the forecast — no warning, no fallback to `STAGE_WEIGHTS[stage]`. Validate the saved JSON against `PIPELINE_STAGES` at write time, OR fall back to the constant per-key (`weights[stage] ?? STAGE_WEIGHTS[stage]`).
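Both options can be sketched together. The helper names are hypothetical and the default values in the usage note are placeholders, since the real `STAGE_WEIGHTS` values are not quoted in this report.

```ts
type Weights = Partial<Record<string, number>>;

// Per-key fallback: a missing saved weight falls back to the default for that
// stage, and only then to 0.
function resolveWeight(saved: Weights, stage: string, defaults: Weights): number {
  return saved[stage] ?? defaults[stage] ?? 0;
}

// Write-time validation: returns the saved keys that are not canonical stages,
// e.g. a legacy 'in_comms' key.
function unknownWeightKeys(saved: Weights, validStages: readonly string[]): string[] {
  return Object.keys(saved).filter((key) => !validStages.includes(key));
}
```

For example, with placeholder defaults `{ in_communication: 0.25 }` and a saved object `{ in_comms: 0.2 }`, `resolveWeight` returns 0.25 for `in_communication` and `unknownWeightKeys` reports `['in_comms']`.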
---
## MEDIUM
### M1. Interests with no primary berth disappear from "pipeline value"
`getKpis` and `getRevenueForecast` use `INNER JOIN interest_berths ON isPrimary=true`. An interest without a primary-berth link (legitimate while the rep is still sourcing) contributes 0 to `pipelineValueUsd` and to `totalWeightedValue`, but is still counted in `activeInterests` and on the kanban. Mismatch between deal count and value. Surface a footnote (e.g. "5 deals not yet matched to a berth") or LEFT JOIN with a price-coalesce.
### M2. "Top Interests by Value" PDF includes lost deals
`fetchPipelineData.topInterestsRows` (lines 68-83) orders by `berths.price DESC NULLS LAST` with no outcome filter. A €4M lost deal will sit at the top of the report. Add `(outcome IS NULL OR outcome='won')`.
### M3. PDF stage order hardcoded inside both templates
`pipeline-report.ts:58-68` and `revenue-report.ts:55-65` redeclare the canonical stage order. Renaming a stage in `constants.ts` will leave the renamed stage appended to the "unknown stages" tail block instead of in its proper position. Import and iterate `PIPELINE_STAGES`.
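A sketch of iterating the canonical list inside a template. `orderStageCounts` is a hypothetical helper, and filling absent stages with 0 is an assumption about the desired output rather than current behaviour.

```ts
// Canonical stage order, copied from the list quoted at the top of this report.
const PIPELINE_STAGES = [
  'open', 'details_sent', 'in_communication', 'eoi_sent', 'eoi_signed',
  'deposit_10pct', 'contract_sent', 'contract_signed', 'completed',
] as const;

interface StageCount {
  stage: string;
  count: number;
}

// Orders rows by canonical position, fills absent stages with 0, and keeps
// any unknown stage names in a tail block in their original order.
function orderStageCounts(rows: StageCount[]): StageCount[] {
  const byStage = new Map<string, number>();
  for (const row of rows) byStage.set(row.stage, row.count);

  const known: StageCount[] = PIPELINE_STAGES.map((stage) => ({
    stage,
    count: byStage.get(stage) ?? 0,
  }));
  const unknown = rows.filter(
    (row) => !(PIPELINE_STAGES as readonly string[]).includes(row.stage),
  );
  return [...known, ...unknown];
}
```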
### M4. `selectDistinct` in `pipelineValueUsd` is correct but fragile
`dashboard.service.ts:39-47` `selectDistinct({ berthId, price })` happens to dedupe correctly because `berthId` is unique. If a future schema lets two `interest_berths` rows reference the same berth as primary (the partial unique index permits this if the other row has `isPrimary=false`), the join would still emit one row per primary-only match. Today's behaviour is fine; a comment in the code claims correctness but doesn't explain _why_. Add a one-line note tied to the partial unique index.
### M5. `getHotDeals` ordering tiebreaker uses `updatedAt` while UI shows `lastContact`
The query orders by `desc(rank), desc(updatedAt)` (`dashboard.service.ts:234`) but the card surfaces `last touched X ago` from `dateLastContact`. When a stage rank ties, the card with the most recent **edit** (rename, tag change, stage move) wins, not the most recent **contact**. Reps will be confused why an interest with 30-day-old `lastContact` sits above one with 2-day-old contact. Either order by `coalesce(dateLastContact, updatedAt)` or drop the "last touched" copy.
### M6. Source-conversion denominator includes still-open deals
`getSourceConversion` denominator is "every non-archived interest of that source" (`dashboard.service.ts:262`). For a source with 100 leads / 5 won / 0 lost / 95 still open, conversion = 5%. A source with 5 leads / 5 won shows 100%. The metric isn't wrong, but the description text "Won deals as a percentage of leads per source" implies a closed funnel; consider switching denominator to `won + lost` for the "true" rate, or rename the label.
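The two denominators can be compared side by side. The `SourceTally` shape and `conversionRates` name are hypothetical, for illustration only.

```ts
// Assumed per-source tally; not an existing type in the codebase.
interface SourceTally {
  won: number;
  lost: number;
  open: number;
}

// Returns both candidate rates: won over all leads (current behaviour as
// described above) and won over closed deals (the suggested alternative).
function conversionRates(tally: SourceTally): { ofAllLeads: number; ofClosed: number } {
  const all = tally.won + tally.lost + tally.open;
  const closed = tally.won + tally.lost;
  return {
    ofAllLeads: all === 0 ? 0 : tally.won / all,
    ofClosed: closed === 0 ? 0 : tally.won / closed,
  };
}
```

For the 100-lead source above (5 won, 0 lost, 95 open) the first rate is 0.05 and the second is 1, which is why the label matters.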
---
## Summary
3 CRITICAL bugs (hot-deals stage typos, lost-revenue mislabelled "completed", missing `GROUP BY` in pipeline PDF), 5 HIGH inconsistencies (active-deal definition splits 4 ways; occupancy split 2 ways; weighted vs unweighted revenue; duplicated weight defaults; silent zero-weighting), and 6 MEDIUM polish issues. The single most leveraged fix is consolidating one `activeInterestsWhere()` helper used by every surface, plus adding an `outcome='won'` filter to the revenue PDF and a `GROUP BY` to the pipeline PDF.
---
## 13. PDF + brand-asset correctness (pdf-auditor)
# PDF + brand-asset correctness — audit
Scope: `src/lib/pdf/**`, `src/lib/templates/{merge-fields,berth-range}.ts`,
`src/lib/services/{documenso-payload,brochures,berth-pdf}.service.ts`,
`docs/eoi-documenso-field-mapping.md`, `assets/eoi-template.pdf`.
Severity bands: **CRITICAL** = customer-visible silent data loss /
crash; **HIGH** = visible quality regression / wrong number on a
customer-facing artefact; **MEDIUM** = polish + future-proofing.
---
## CRITICAL
### C-1. Live Documenso template still missing `Berth Range` field
- `src/lib/services/documenso-payload.ts:157` always emits the
`Berth Range` formValue, and `formatBerthRange()` produces compact
range strings for the multi-berth bundle.
- `docs/eoi-documenso-field-mapping.md:34` flags that the live
template (id `8`) does **not** yet have the `Berth Range` field.
Documenso silently ignores unknown `formValues` keys.
- Net effect: every multi-berth EOI shipped via the Documenso pathway
currently renders only the primary `Berth Number`. The expanded
range (e.g. `A1-A3, B5-B7`) is dropped end-to-end, with no warning
on the Documenso side — the bundle context is lost from the signed
PDF.
- Same field is also addressed defensively in the in-app pathway
(`src/lib/pdf/fill-eoi-form.ts:60-72`), which logs a warning, but
only when the in-app template is the one being used.
- **Action:** Add the `Berth Range` text field to Documenso template
`8` (mirror the AcroForm field name + size on the source PDF). Once
added, single-berth EOIs are unaffected because `formatBerthRange`
collapses a single mooring to its raw form.
### C-2. tiptap→pdfme page break is wrong for letter / mixed for A4
- `src/lib/pdf/tiptap-to-pdfme.ts:51-54`:
- `PAGE_WIDTH_MM = 170` is correct for A4 (210 - 2×20) but is
treated as the only page format.
- `PAGE_BREAK_THRESHOLD = 250` is hard-coded; A4 page height is
297 mm and the threshold of 250 leaves 47 mm of unused space at
the bottom and ignores the real bottom margin (≈ 20 mm).
- `eoi-standard-inapp.ts:67` declares `@page { size: letter; ... }`,
i.e. the seeded HTML template is _Letter-sized_ while the serialiser
is working in A4 millimetre coordinates. The template body is
authored at a different page size than the engine that lays it out.
- Net effect: long custom templates either truncate (overflow into
the bottom margin, content clipped by pdfme when fields run past
page height) or break at the wrong vertical position. The bug is
invisible in the seeded template because its content is short, but
any port that edits the template to add a few clauses sees clipped
output.
- **Action:** Make page format a per-template attribute (Letter vs
A4), drive both the page width _and_ the break threshold from it
(Letter content height ≈ 254 mm, A4 ≈ 277 mm with 10 mm bottom
margin), and reject HTML-template `@page size:` values that
disagree with the per-template setting.
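
One way to drive both values from the page format is sketched below. `pageGeometry` is a hypothetical helper; the margins are parameters because the report does not fix them, and the page sizes are the standard A4 and US Letter dimensions.

```ts
type PageFormat = 'A4' | 'Letter';

// Physical page sizes in millimetres (A4 is 210 x 297; US Letter is 8.5 x 11 in).
const PAGE_SIZE_MM: Record<PageFormat, { width: number; height: number }> = {
  A4: { width: 210, height: 297 },
  Letter: { width: 215.9, height: 279.4 },
};

interface PageGeometry {
  contentWidthMm: number;
  breakThresholdMm: number;
}

// Derives content width and page-break threshold from the format instead of
// hard-coding 170 and 250.
function pageGeometry(
  format: PageFormat,
  sideMarginMm: number,
  bottomMarginMm: number,
): PageGeometry {
  const { width, height } = PAGE_SIZE_MM[format];
  return {
    contentWidthMm: width - 2 * sideMarginMm,
    breakThresholdMm: height - bottomMarginMm,
  };
}
```

With 20 mm margins, A4 gives a 170 mm content width and a 277 mm threshold; a Letter page with a 25.4 mm (1 in) bottom margin gives roughly 254 mm.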
### C-3. tiptap→pdfme silently drops inline italic / underline + the whole image node
- `extractParagraphContent` (`tiptap-to-pdfme.ts:146-164`) only
records `bold` and ignores `italic` and `underline` marks. The
validator accepts these marks (they're not in `UNSUPPORTED_NODES`)
so an admin saves a template with italics, the preview renders
bold-only, and they ship the wrong artefact to a client.
- `processNode` for `image` (line 354) does `state.y += 20` and
never adds a field. The serialiser reserves 20 mm of whitespace
and drops the image entirely. The "Insert image" affordance in
the template editor (if exposed) is non-functional today.
- The validator does NOT list the visible mark names it supports, so
admins cannot reason about what's safe to use.
- **Action:** Either honour italic/underline via per-segment fields,
or reject them at validation time the same way `blockquote` is
rejected. For images, either implement the `image` pdfme schema
or reject `image` nodes outright.
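
The reject-at-validation option could look roughly like this. It assumes the standard tiptap/ProseMirror JSON shape (`type`, `content`, `marks`); `findUnsupported` and the two sets are hypothetical names, and the supported-mark list reflects only what this finding says is rendered today.

```ts
// Minimal tiptap/ProseMirror JSON node shape.
interface TiptapNode {
  type: string;
  text?: string;
  marks?: { type: string }[];
  content?: TiptapNode[];
}

// Assumption: only bold is rendered by the serialiser, per the finding above.
const SUPPORTED_MARKS = new Set(['bold']);
const REJECTED_NODES = new Set(['image']);

// Walks the document depth-first and collects every mark or node the
// serialiser would silently drop.
function findUnsupported(node: TiptapNode, issues: string[] = []): string[] {
  if (REJECTED_NODES.has(node.type)) issues.push(`node:${node.type}`);
  for (const mark of node.marks ?? []) {
    if (!SUPPORTED_MARKS.has(mark.type)) issues.push(`mark:${mark.type}`);
  }
  for (const child of node.content ?? []) findUnsupported(child, issues);
  return issues;
}
```

An empty result means the template is safe to save; a non-empty one can be shown to the admin verbatim, which also documents what is supported.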
---
## HIGH
### H-1. No font registration → unsupported glyph silent fallback
- `src/lib/pdf/generate.ts` calls `pdfme/generator.generate({ template, inputs })`
with no `options.font`. pdfme ships only Roboto by default.
- The tiptap serialiser sets `fontName: 'Helvetica' | 'Helvetica-Bold'`
(`tiptap-to-pdfme.ts:205-237`). pdfme without a registered Helvetica
font silently falls back to its embedded Roboto; the bold variant
is also a substitution. This is invisible in dev because Roboto
has full Latin + Latin-1 coverage, but non-Latin glyphs (Greek,
Cyrillic, Arabic for AED-tagged clients, the `د.إ` AED symbol from
`currency.ts:14`) tofu out to `□`.
- The currency dropdown advertises AED and JPY. The JPY symbol `¥`
sits in Latin-1 and renders fine, but the Arabic AED symbol `د.إ`
is not covered by Roboto.
- **Action:** Register a Unicode-coverage font (Noto Sans + Noto
Sans Arabic + Noto Sans CJK) once and pass it to `generate()`.
Mirror the same font on the in-app EOI when that pipeline is
built. Until then, the AED currency code in `SUPPORTED_CURRENCIES`
is a footgun on every PDF that renders price.
### H-2. Locale inconsistency in money + date formatting
- Mixed locale strategy in the reporting + summary templates:
- `revenue-report.ts:78,86` → `Number(...).toLocaleString(undefined, ...)`
(default locale; in Node 20 inside Docker this is `en-US.UTF-8`
via the standalone image's `LANG`; on the dev mac it picks the
OS locale → different decimal/thousands separators server-side).
- `pipeline-report.ts:93` → `Number(...).toLocaleString()` (default
locale, no formatting opts).
- Almost every other template hard-codes `'en-GB'` for dates.
- The interest-summary and berth-spec templates render `Price` as
`${currency} ${Number(price).toLocaleString()}` — they bypass
`formatCurrency()` and therefore drop the proper currency symbol
formatting (`USD 45,000` instead of `$45,000.00`). `invoice-template.ts`
uses `formatCurrency()` correctly; the inconsistency is a UX bug.
- Pipeline report renders "Berth Price" with no currency at all
(`pipeline-report.ts:92-94`): a 45 000 figure is meaningless
without it.
- **Action:** Route every money render in `src/lib/pdf/templates/**`
through `formatCurrency()` (`src/lib/utils/currency.ts:37`), with
an explicit `locale: 'en-GB'` to match the dates. Same for the
reports' date stamps.
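
The effect of pinning the locale can be shown with the standard `Intl.NumberFormat` API. `formatMoney` is a hypothetical stand-in, not the project's `formatCurrency()`, whose signature is not quoted in this report.

```ts
// Hypothetical helper: formats an amount with an explicit locale so output
// does not depend on the host's default locale.
function formatMoney(amount: number, currency: string, locale = 'en-GB'): string {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(amount);
}

// Same amount, two explicit locales: the separators differ, which is exactly
// what an unpinned toLocaleString() leaves to the host environment.
const british = formatMoney(45000, 'EUR');
const german = formatMoney(45000, 'EUR', 'de-DE');
```

Passing the locale explicitly keeps the Docker image and a developer machine in agreement regardless of `LANG`.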
### H-3. Page overflow in fixed-height schemas
- Every template in `src/lib/pdf/templates/` uses fixed `position`
and `height` slots:
- `client-summary-template.ts` reserves 80 mm for the interests
list (line 51) and 60 mm for recent activity (line 60). pdfme
truncates text that exceeds the slot height; there is no
"overflow → next page" mechanism in the template definition.
- `interest-summary-template.ts:65-69` reserves 85 mm for the
timeline; with 30 events at 8 pt that's ~3 lines/event = clipped
after ~10 events.
- `activity-report.ts:46` reserves 120 mm for `activityDetails`,
and the data layer slices to `data.logs.slice(0, 30)`
(line 70) — the slice masks the bug, but if the report layer
sends more logs the bottom rows are clipped.
- `pipeline-report.ts:38-50` allocates 100 mm for summary and
100 mm for details; both can spill on ports with many stages
and many top interests.
- pdfme's failure mode is silent clipping, not visible truncation
with `…` or a "continued on next page" marker.
- **Action:** Either move large lists onto multi-page schemas (push
fields onto subsequent `schemas[i]`) or add explicit pagination
inside `build*Inputs` with a deterministic "showing N of M" tail.
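
The explicit-pagination option can be sketched as a small line-capping helper. `capLines` is hypothetical, and the wording of the tail marker is an assumption.

```ts
// Caps a list of rendered lines to what fits in a fixed-height slot, reserving
// the last line for a deterministic tail so clipping is never silent.
function capLines(lines: string[], maxLines: number): string[] {
  if (lines.length <= maxLines) return lines;
  const shown = Math.max(maxLines - 1, 0); // one line is reserved for the tail
  return [...lines.slice(0, shown), `(showing ${shown} of ${lines.length})`];
}
```

A `build*Inputs` function would call this with the slot's line capacity before joining the lines into the field value.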
### H-4. Numeric/date inputs pass `undefined`/`null` through `new Date()`
- `invoice-template.ts:117` renders `Due: ${invoice.dueDate}` raw.
When `dueDate` is null the field reads `Due: null`. Other templates
use `formatDate()`-style helpers that return `'N/A'`, but the
invoice template doesn't.
- `client-summary-template.ts:97,143` and `interest-summary-template.ts`
call `new Date(client.createdAt as string | Date)` without
guarding against `undefined`. `new Date(undefined)` yields
`Invalid Date` whose `toLocaleDateString` returns `'Invalid Date'`
— that string ends up in the PDF.
- **Action:** Add a single `formatDate(value, fallback='—')` helper
in `src/lib/utils/date.ts`, reuse across all templates; the
existing private one in `interest-summary-template.ts:83-86`
should be hoisted.
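
A possible shape for the shared helper is sketched below. The `'N/A'` default mirrors what other templates are said to return, and formatting in UTC is an assumption; neither is confirmed as the intended final behaviour.

```ts
type DateInput = string | number | Date | null | undefined;

// Formats a date for PDF output, returning the fallback for null, undefined,
// empty, or unparseable input instead of "null" or "Invalid Date".
function formatDate(value: DateInput, fallback = 'N/A'): string {
  if (value === null || value === undefined || value === '') return fallback;
  const date = value instanceof Date ? value : new Date(value);
  if (Number.isNaN(date.getTime())) return fallback;
  return date.toLocaleDateString('en-GB', {
    day: '2-digit',
    month: '2-digit',
    year: 'numeric',
    timeZone: 'UTC', // assumption: render dates in UTC for determinism
  });
}
```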
---
## MEDIUM
### M-1. No accessibility / tagged PDF output
- All PDFs we produce are untagged (pdfme uses raw pdf-lib under the
hood; `generate.ts` does not call `setTitle`, `setLanguage`,
`setProducer`, or anything to enable `StructTreeRoot`).
- WCAG 2.1 7.1 / PDF/UA-1 compliance is unmet. For a port that
contracts with a public-sector tenant or runs accessibility
reviews on outbound EOIs, this is a procurement blocker.
- The in-app EOI HTML template has zero `aria-*` attributes and a
table-based layout (`eoi-standard-inapp.ts:184-209`).
- **Action:** At minimum set `Title`, `Author`, `Subject`, `Lang=en-GB`
metadata in the in-app EOI fill path (`fill-eoi-form.ts`) — pdf-lib
supports `doc.setTitle()` etc. without adding accessibility tags.
Track tagged-PDF / PDF/UA as a follow-up item.
### M-2. EOI in-app source PDF — silent field-name drift
- `fill-eoi-form.ts:42-50` swallows every `getTextField()` /
`getCheckBox()` exception so a re-cut template whose AcroForm
field names changed (e.g. `Berth Number` → `Berth_Number`) will
produce a "successful" PDF with empty fields. Only `Berth Range`
is special-cased to log when missing.
- **Action:** Promote the silent-skip pattern to also log a warning
per missing field (already done correctly for the new `Berth
Range` field — apply same treatment to `Name`, `Email`,
`Address`, `Yacht Name`, `Length`, `Width`, `Draft`, `Berth
Number`, `Lease_10`, `Purchase`). Without it, the only way to
notice a corrupted template is QA on a signed PDF.
### M-3. Form not flattened → signer can edit pre-filled fields
- `fill-eoi-form.ts:124` saves the doc unflattened. The comment on
line 94 explicitly justifies this ("recipient can still tweak
fields if needed before signing"). For an EOI/LOI this is risky:
the signer can edit the address, yacht dimensions, or berth
number after the fact, and the unflattened PDF carries the
edits without the developer/approver re-acknowledging.
- Documenso pathway is fine — Documenso flattens server-side
before producing the signed artefact — but the in-app pathway
emits the raw filled AcroForm to the storage backend as-is.
- **Action:** Flatten the AcroForm (`form.flatten()` before
`doc.save()`) for the in-app pathway, OR mark the relevant
fields as read-only via `field.enableReadOnly()`. The "tweak
before signing" justification belongs to a _draft_ preview, not
the production artefact.
### M-4. `formatBerthRange` warning is noisy at warn-level
- `berth-range.ts:64` logs `WARN` per non-canonical mooring. The
CLAUDE.md mooring spec (`^[A-Z]+\d+$`) was data-normalised in
Phase 0, but historical archived rows + the `(deleted)` /
`(archived)` suffix scheme on entity folders can leak into the
bundle. Every multi-berth EOI containing a legacy mooring spins
a stack of warnings.
- **Action:** Downgrade the per-mooring warning to `debug`; emit a
single `warn` summary per `formatBerthRange()` call when the
passthrough list is non-empty.
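
A sketch of the debug-per-item, warn-once pattern. The regex is the mooring spec quoted above; the `Logger` interface and `reportNonCanonical` name are hypothetical.

```ts
// Mooring spec quoted in the finding above.
const CANONICAL_MOORING = /^[A-Z]+\d+$/;

// Minimal logger shape assumed for illustration.
interface Logger {
  debug(message: string): void;
  warn(message: string): void;
}

// Logs each non-canonical mooring at debug level and emits a single warn
// summary per call when at least one was passed through.
function reportNonCanonical(moorings: string[], log: Logger): string[] {
  const passthrough = moorings.filter((mooring) => !CANONICAL_MOORING.test(mooring));
  for (const mooring of passthrough) log.debug(`non-canonical mooring: ${mooring}`);
  if (passthrough.length > 0) {
    log.warn(
      `${passthrough.length} non-canonical mooring(s) passed through: ${passthrough.join(', ')}`,
    );
  }
  return passthrough;
}
```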
### M-5. `berth-spec-template.ts` waitingList truncation
- 50 mm × 8 pt ≈ 12 lines (`berth-spec-template.ts:67-70`); the
waiting-list join key is `position` ordered 1..N and there is no
data-side cap. Ports with > 12 waitlisted clients silently lose
the tail of the list on the spec PDF.
- Same shape problem as H-3 but lower impact (berth-spec is internal).
### M-6. `assets/eoi-template.pdf` — opacity / single source
- The whole in-app pathway depends on a single committed binary at
`assets/eoi-template.pdf`. There is no sha256 pinned in
`assets/README.md`, no script that regenerates it from a known
good source, and the AcroForm field shape is documented only in
the mapping doc + the JSDoc of `loadEoiTemplatePdf`. A swap of
this file by anyone with repo access changes legal output
silently.
- **Action:** Add `EXPECTED_SHA256` to `assets/README.md` + a
startup-time check (or test) that the source PDF's sha matches
before falling back to `EOI_TEMPLATE_PDF_PATH`. Same applies to
any shipped brochure default.
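
The check itself is small with Node's built-in `crypto` module. The function names are hypothetical, and the pinned hash is passed in as a parameter because no value is recorded in this report.

```ts
import { createHash } from 'node:crypto';

// Hex-encoded SHA-256 of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash('sha256').update(data).digest('hex');
}

// Throws when the bytes do not match the pinned hash; intended for a startup
// check or a unit test over the committed template binary.
function assertPinnedSha256(data: Uint8Array, expectedHex: string, label: string): void {
  const actual = sha256Hex(data);
  if (actual !== expectedHex.toLowerCase()) {
    throw new Error(`${label}: sha256 mismatch (expected ${expectedHex}, got ${actual})`);
  }
}
```

Running it in a test keeps the failure in CI; running it at startup additionally catches a swapped file on a deployed host.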
### M-7. Reports / pdfme schemas — no `portName` brand asset
- Every report template hard-codes `'Port Nimara'` as the fallback
in `build*Inputs`. The CRM is multi-tenant; an admin generating
a report for a different port falls back to the wrong brand if
the port lookup fails (e.g. report job runs without a hydrated
port). Default should be the empty string or `'(port)'`, not a
competitor port's brand.
### M-8. Brochures + per-berth PDF — no upload-time render audit
- These are user-uploaded PDFs, not engine-rendered, so the
template-quality items above don't apply. The relevant integrity
controls (magic-byte check, sha256, size cap, version snapshot)
are in place in `berth-pdf.service.ts:217-264` and the brochure
upload flow. No findings for these two flows.
---
## Summary
| # | sev | file | item |
| --- | ---- | --------------------------------- | ------------------------------------------------------------------- |
| C-1 | CRIT | Documenso template (live) | `Berth Range` field missing — multi-berth ranges dropped end-to-end |
| C-2 | CRIT | `tiptap-to-pdfme.ts` | A4 vs Letter page mismatch + hard-coded 250 mm break threshold |
| C-3 | CRIT | `tiptap-to-pdfme.ts` | italic/underline marks and `image` nodes silently dropped |
| H-1 | HIGH | `generate.ts` + tiptap serialiser | no font registration → AED/JP/Greek/Cyrillic glyphs missing |
| H-2 | HIGH | reports + summaries | locale-default `toLocaleString` server-side + currency bypass |
| H-3 | HIGH | every pdfme template | fixed-height slots clip overflow with no pagination |
| H-4 | HIGH | `invoice-template.ts` + summaries | raw null/undefined date passthrough renders "Invalid Date" |
| M-1 | MED | all PDFs | no tagged-PDF / PDF/UA metadata |
| M-2 | MED | `fill-eoi-form.ts` | silent field-name drift in source EOI PDF |
| M-3 | MED | `fill-eoi-form.ts` | in-app EOI ships unflattened AcroForm |
| M-4 | MED | `berth-range.ts` | noisy per-mooring warn log |
| M-5 | MED | `berth-spec-template.ts` | waitingList overflow |
| M-6 | MED | `assets/eoi-template.pdf` | no sha pinning of source binary |
| M-7 | MED | report templates | wrong-port fallback brand `'Port Nimara'` |
| M-8 | MED | brochure + per-berth uploads | no issues — upload integrity controls in place |
---
## 15. Customer-facing copy + terminology audit (copy-auditor)
# Task #15 Customer-facing copy + terminology audit
Scope: CRM (`src/components`, `src/app/(dashboard)`), client portal (`src/app/(portal)`, `src/components/portal`), branded email templates (`src/lib/email/templates`), PDF templates (`src/lib/pdf/templates`), public marketing site (`website/`). Read-only audit; no edits.
---
## CRITICAL
### C1. Four interchangeable nouns for the same domain entity
The same record is called **interest**, **lead**, **prospect**, and **deal** across surfaces. Sales reps and clients see all four within a single session.
- Entity / schema / URL: `interest` (everywhere — DB, `/interests`, portal nav, page titles).
- "Lead":
- `src/components/clients/client-interests-tab.tsx:30` `LEAD_CATEGORY_LABELS = { hot_lead: 'Hot lead', … }` and the column header literally rendered as `<dt>Lead</dt>`.
- `src/components/interests/interest-tabs.tsx:~736` section heading `<h3>Lead</h3>` + `<EditableRow label="Lead Category">`.
- `src/components/berths/berth-interests-tab.tsx:44` `hot_lead: 'Hot Lead'` (Title Case mismatch with sibling above).
- `src/components/dashboard/lead-source-chart.tsx` + `source-conversion-chart.tsx` widget title "Lead Source Attribution".
- "Prospect":
- `src/components/berths/berth-detail-header.tsx:~275` form label `Linked prospect (optional)` + helper `Link this status change to the prospect (interest) it relates to.` — explicitly parenthesises the canonical name as a synonym.
- Residential uses `prospect` as a _stage value_ (`Prospect` chip in `residential-clients-list.tsx`, residential-client tabs) — confusing because elsewhere "prospect" means the record itself.
- "Deal":
- `src/components/berths/berth-tabs.tsx` tab label `Deal Documents`, API path `/api/v1/berths/[id]/deal-documents`.
- `src/components/clients/bulk-archive-wizard.tsx` placeholder `Why are you archiving this late-stage deal?` and `smart-archive-dialog.tsx` heading `Late-stage deal — confirmation required`.
- `src/components/dashboard/hot-deals-card.tsx`, widget label "Hot deals".
- Pervasive in code comments inside `interest-tabs.tsx`, `inline-stage-picker.tsx` — comments will leak into future copy.
Recommendation: pick one client-facing noun (the domain choice is `interest`; "deal" is fine as marketing/internal shorthand for _hot_ interests but should never appear in fields/labels). Rename `Deal Documents` → `Interest Documents`, "Linked prospect" → "Linked interest", `<dt>Lead</dt>` / `<h3>Lead</h3>` → `Buyer profile` or `Category`. Residential `prospect` is a stage so leave alone but consider renaming to `enquiry` or `new` to free up the word.
### C2. Raw machine status strings leak to the client portal
`src/app/(portal)/portal/interests/page.tsx:80` renders
```
<span>EOI: {interest.eoiStatus.replace(/_/g, ' ')}</span>
```
and line 65 the same pattern for `leadCategory`. Clients see `EOI: waiting for signatures`, `EOI: partially signed`, `hot lead`, etc. — the underscores are stripped but the enum vocabulary is not translated. "hot lead" exposed to the client is also a privacy/optics issue (we are telling the prospect we classified them).
Fix: add a `PORTAL_EOI_STATUS_LABEL` map (e.g. `waiting_for_signatures → "Awaiting your signature"`, `signed → "Signed"`); never render `leadCategory` in the portal at all.
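A sketch of the label map with a safe fallback. Only the `waiting_for_signatures` and `signed` wordings come from the suggestion above; the `partially_signed` wording, the fallback text, and the `portalEoiStatusLabel` name are assumptions.

```ts
// Client-facing wording for EOI statuses shown in the portal.
const PORTAL_EOI_STATUS_LABEL: Partial<Record<string, string>> = {
  waiting_for_signatures: 'Awaiting your signature',
  partially_signed: 'Partially signed', // hypothetical wording
  signed: 'Signed',
};

// Never renders the raw enum: unknown statuses get a neutral fallback.
function portalEoiStatusLabel(status: string): string {
  return PORTAL_EOI_STATUS_LABEL[status] ?? 'In progress';
}
```

The fallback means a newly added enum value degrades to neutral copy instead of leaking machine vocabulary to the client.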
### C3. Signing-status labels diverge across five surfaces
For the same enum (`draft | sent | partially_signed | completed | expired | cancelled`):
| Surface | Label set |
| ------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| `interest-eoi-tab.tsx` / `interest-contract-tab.tsx` / `interest-reservation-tab.tsx` | Draft / **Awaiting signatures** / Partially signed / **Signed** / Expired / Cancelled |
| `documents-hub.tsx` `STATUS_PILL_MAP` + `document-list.tsx` | Renders raw enum (`<Badge>{doc.status}</Badge>`) — `sent`, `partially_signed`, `completed` displayed verbatim |
| `signing-progress.tsx` `STATUS_LABELS` | Only `Pending / Signed / Declined` — missing Sent, Expired, Cancelled |
| `notification-digest.ts` email | `eoi_signed: 'EOI signed'`, `eoi_completed: 'EOI completed'` — "signed" vs "completed" used as if different events |
| Realtime toast (`realtime-toasts.tsx`) | `EOI fully signed` (yet another phrase) |
A user clicks a status pill in Documents Hub (shows `partially_signed`), opens the interest EOI tab (shows `Partially signed`), gets a toast that says `EOI fully signed`, and an email that says `EOI completed` — four phrasings for one document. Centralise via `src/lib/labels/document-status.ts` (already a pattern for `seed-data` etc.) and import everywhere.
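A minimal sketch of what the shared module might export. The labels are taken from the interest-tab row of the table; pairing `sent` with "Awaiting signatures" and `completed` with "Signed" is inferred from the order of that row and should be confirmed.

```ts
// Enum values as listed at the top of this finding.
const DOCUMENT_STATUSES = [
  'draft',
  'sent',
  'partially_signed',
  'completed',
  'expired',
  'cancelled',
] as const;

type DocumentStatus = (typeof DOCUMENT_STATUSES)[number];

// Record<DocumentStatus, string> makes a missing label a compile error.
const DOCUMENT_STATUS_LABEL: Record<DocumentStatus, string> = {
  draft: 'Draft',
  sent: 'Awaiting signatures',
  partially_signed: 'Partially signed',
  completed: 'Signed',
  expired: 'Expired',
  cancelled: 'Cancelled',
};

// Single lookup used by pills, tabs, toasts, and emails.
function documentStatusLabel(status: DocumentStatus): string {
  return DOCUMENT_STATUS_LABEL[status];
}
```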
---
## HIGH
### H1. "Save" button verbiage has six forms
Inventory of submit buttons across `src/components/`:
- `Save` — inline editors, addresses-editor, contacts-editor, inline-phone-field, settings-manager, image-cropper.
- `Save Changes` (Title Case) — client-form, yacht-form, berth-form, expense-form, company-form, interest-form, reminder-form, role-form, tag-form, custom-field-form, port-form, webhook-form, template-form.
- `Save changes` (sentence case) — admin/users/user-form, interest-contact-log-tab.
- `Save profile`, `Save username`, `Save preferences`, `Save overrides`, `Save template`, `Save view` — descriptive variants.
- `Saving...` (ASCII three dots) vs `Saving…` (single ellipsis char) — both appear, ~50/50 split.
Decide: sentence case (`Save changes`) and standardise the loader as `Saving…` (Unicode ellipsis matches Prettier-friendly UTF-8 elsewhere in the codebase). The same form-pattern with a different casing in adjacent admin sections (user-form `Save changes` vs role-form `Save Changes`) is a likely playwright-visual diff source too.
### H2. "New X" vs "Create X" mismatch on the same surface
Empty-state CTA and form submit button often disagree:
- Clients list action: `label: 'New Client'` → opens sheet titled `New Client` → submit button `Create Client`.
- Same pattern for Yacht, Expense, Company, Role, Tag, Template, Webhook, Port.
- `aria-label="New interest"` on `interest-list.tsx`, button text `Create Interest`.
Pick one verb per action lifecycle (`New …` for affordances, `Create …` for confirm, OR unify to a single `Create …` throughout). The current pattern teaches the user two words for the same action.
### H3. Public marketing form CTAs are all `Submit`
All six website forms (`website/components/pn/specific/website/{form,contact,berths-item,supplement-eoi,register,news-item}/form.vue`) use a bare `Submit` button. The matching confirmation email subjects say "Thank You for Your Interest" and the PDF title is "Expression of Interest". The CTA doesn't mention what the user is submitting.
Recommendation: replace with action-specific verbs — `Register interest`, `Send enquiry`, `Request a call back` (already used as helper text above the `Submit` on `berths-item/form.vue`). Loading state `Submitting form...` is also redundant — `Sending…` is shorter and matches the CRM email send loader.
### H4. EOI vs Expression of Interest abbreviation discipline
Both forms appear, but the split is currently _inverse_ to what client-facing surfaces should do:
- **Client-facing surfaces** (portal `/portal/documents` page, email template `documentLabel`, website pages, PDF body, Documenso template-form select option) correctly spell out _Expression of Interest_.
- **But** the portal interests page (`/portal/interests`) and the portal documents page header both say `EOIs` alongside `Expression of Interest`; the `text-sm text-gray-500 mt-1` subtitle reads "Your contracts, **EOIs**, and signed agreements".
- **Realtime toast** to staff says `EOI fully signed` — fine for staff but the same toast also fires for portal users if they have a session? (Worth verifying; if so, full form needed.)
- **PDF body** (`eoi-standard-inapp.ts:177`) introduces the abbreviation correctly: _This Expression of Interest (the "EOI")_ — good. Other PDFs (`interest-summary-template.ts`) use raw `EOI status: …` without that introduction.
Rule of thumb: portal/email/marketing → full form; CRM internal UI → `EOI`. Audit the portal pages to remove all `EOI` mentions.
### H5. Email greeting + sign-off tone drift
Across `src/lib/email/templates/`:
- Greeting: `Dear {name},` (portal-auth, crm-invite, inquiry-client-confirmation, residential-inquiry, document-signing — all three modes), `Hello {name},` (admin-email-change), `Hi {name},` (notification-digest), `Welcome,` (fallback in crm-invite/portal-auth), `Dear Administrator,` (inquiry-sales-notification).
- Sign-off: `Best regards,` (inquiry-client, residential-inquiry), `Thanks,` (admin-email-change), `Thank you,` (document-signing), `The {portName} team` (most), `{senderName}` (signing-invitation when provided).
Pattern: client-touching emails should land on one greeting (`Dear {name},`) and one sign-off (`Best regards, / The {portName} team`). The casual `Hi {name},` on the notification-digest is fine because it's internal to staff, but `Hello` on admin-email-change is just a third style for the same internal audience.
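
One way to stop the drift is to route every template through a single helper keyed by audience tier. The sketch below is illustrative only: `Audience`, `EmailFrame`, and `emailFrame` are hypothetical names, and the staff sign-off (`Thanks,`) is an assumption based on the existing admin-email-change template rather than a decided standard.

```typescript
type Audience = "client" | "staff";

interface EmailFrame {
  greeting: string;
  signOff: string;
}

// Returns the greeting and sign-off pair for one audience tier.
// Client tier follows the recommendation above; staff tier is an assumption.
function emailFrame(audience: Audience, portName: string, name?: string): EmailFrame {
  const trimmed = name?.trim();
  if (audience === "client") {
    return {
      // "Welcome," mirrors the existing fallback in crm-invite / portal-auth.
      greeting: trimmed ? `Dear ${trimmed},` : "Welcome,",
      signOff: `Best regards,\nThe ${portName} team`,
    };
  }
  return {
    greeting: trimmed ? `Hi ${trimmed},` : "Hi,",
    signOff: `Thanks,\nThe ${portName} team`,
  };
}

console.log(emailFrame("client", "Port Nimara", "Ana").greeting);
console.log(emailFrame("staff", "Port Nimara", "Sam").greeting);
```

Templates would then interpolate `greeting` and `signOff` instead of hard-coding their own strings, so a future tone change is a one-file edit.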
---
## MEDIUM
### M1. "Signing envelope" jargon leaks into a user-facing dialog
`smart-archive-dialog.tsx` exposes:
- `<option value="leave">Leave envelope pending</option>`
- `<option value="void_documenso">Void the signing envelope</option>`
"Envelope" is Documenso/DocuSign internal vocabulary. Replace with `Leave signing request pending` / `Cancel the signing request`. The Documenso admin page is OK to keep `envelope` (dev-facing settings).
### M2. Override / Confirm overloaded action verb
`interest-stage-picker.tsx:179` shows `{overrideEffective ? 'Override stage' : 'Confirm'}`. The non-override label is too generic; users land on a stage-change dialog and the primary button just says "Confirm". Suggest `Move to {stage}` (parameterised) or `Update stage`.
### M3. Loading state punctuation inconsistency
`Saving...` (ASCII) vs `Saving…` (Unicode `…`). Easy global codemod; matters for Playwright visual diffs and for screen-reader pronunciation (three dots get read out as "dot dot dot").
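
A minimal sketch of the per-label transform such a codemod could apply. `normaliseLoaderEllipsis` is a hypothetical helper, and the regex deliberately matches only three dots that directly follow a letter at the end of the label, so relative paths and spread syntax in other strings are left alone.

```typescript
// Rewrites a trailing ASCII "..." (e.g. "Saving...") to the Unicode
// ellipsis U+2026. Labels that already use U+2026 are returned unchanged.
function normaliseLoaderEllipsis(label: string): string {
  return label.replace(/([A-Za-z])\.{3}$/, "$1\u2026");
}

const samples = ["Saving...", "Submitting form...", "Save changes", "Saving\u2026"];
for (const s of samples) {
  console.log(`${s} -> ${normaliseLoaderEllipsis(s)}`);
}
```

The function is idempotent, so the codemod can be re-run safely after new strings land.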
### M4. Reminder/alert verb spread
`Acknowledge` / `Dismiss` / `Mark complete` / `Resolve` (audit log) — four near-synonyms for "I dealt with it". Reminders use `Mark complete`, Alerts use `Acknowledge` + `Dismiss`. Acceptable if the semantics differ (acknowledge = seen, complete = done) but the current copy doesn't make that distinction clear.
### M5. "Hot Lead" / "Hot lead" casing within the same domain
- `client-interests-tab.tsx` and `interest-card.tsx`: `Hot lead`.
- `berth-interests-tab.tsx` and `interest-filters.tsx`: `Hot Lead`.
- `dashboard/hot-deals-card.tsx`: `EOI Signed`, `EOI Sent` (Title Case).
- General CRM trend is sentence case — Title Case in these three files is the outlier.
---
## Suggested follow-ups
1. Add `src/lib/labels/document-status.ts`; refactor `documents-hub`, `document-list`, `interest-{eoi,contract,reservation}-tab`, `signing-progress`, `notification-digest` to import it. (C3)
2. Portal: never render `eoiStatus` / `leadCategory` raw; map first. (C2)
3. Rename `Deal Documents` tab + `/deal-documents` route to `Documents`. (C1)
4. Codemod `Save Changes` → `Save changes`, `Saving...` → `Saving…`, unify `New X` vs `Create X`. (H1, H2, M3)
5. Website: replace bare `Submit` on five forms with action-specific verbs. (H3)
6. Portal/email/PDF: drop bare `EOI` abbreviation in favour of `Expression of Interest`. (H4)
7. Standardise email greeting/sign-off pair per audience tier. (H5)
8. Replace `envelope` jargon in `smart-archive-dialog.tsx`. (M1)
Verified clean: `Inquiry` spelling consistent (American); `crm-invite.ts` use of "CRM" is staff-only and intentional; reports PDFs only use enum strings internally.
---
## 16. Dependency + supply-chain hygiene audit (deps-auditor)
# Dependency + Supply-Chain Hygiene Audit
**Repo:** new-pn-crm @ `feat/documents-folders` · **Date:** 2026-05-12 · **Auditor:** task #16
**Inputs:** `pnpm audit`, `pnpm outdated`, `pnpm licenses list [--prod]`, `pnpm why`,
`pnpm install --frozen-lockfile`, `package.json`, `pnpm-lock.yaml`, `Dockerfile*`.
**Headline:** No known CVEs (`pnpm audit` → 0 across info/low/moderate/high/critical),
no GPL/AGPL anywhere in the tree, lockfile is intact and reproducible.
Real risk concentrates in two places: a **Node 20 base image at/past EOL**, and
a **`@types/node` major-version mismatch** that lets the type-checker greenlight
runtime APIs that don't exist in Node 20. Everything else is incremental.
---
## CRITICAL
### C1 — `@types/node@^25.6.2` against Node 20 runtime
- **What:** `package.json` line 111 pins `@types/node` to `^25.6.2`; resolved version
is `25.6.2`. Every Dockerfile and the esbuild target (`--target=node20`) ships Node 20.
- **Why it bites:** Node 25 is the _Current_ release line — it includes APIs added
_after_ Node 20 (e.g. recent `node:sqlite` evolution, `node:test` additions,
`process.permission`, newer `fs.glob` shapes, updated `Web*` globals). TypeScript
cannot tell you you've called something that won't exist on the runtime — the
build passes, the prod worker crashes at first call.
- **Severity:** CRITICAL — silent landmine, no compile warning, no audit signal.
- **Fix:** Downgrade to `@types/node@^20`. If you genuinely want to consume Node-22
APIs, also bump the base images to `node:22-alpine` (see C2) and `--target=node22`.
### C2 — Node 20 LTS at end-of-life
- **What:** All three Dockerfiles (`Dockerfile`, `Dockerfile.dev`, `Dockerfile.worker`)
use `FROM node:20-alpine` (no minor pin). Node 20 entered Maintenance LTS Oct 2024
and reaches **EOL on 2026-04-30** — i.e. ~2 weeks before today (2026-05-12). The
image will still build, but Node 20 no longer receives security patches from
upstream. Alpine's package security advisories will continue for OS libs only.
- **Severity:** CRITICAL — the base image is the largest surface in the SBOM and is
now unpatched against new V8/Node CVEs.
- **Fix:** Move to `node:22-alpine` (LTS line supported through Apr 2027). Pin the minor
digest (`node:22.11-alpine@sha256:…`) for reproducibility. Bump esbuild
`--target=node22` in `build:server` / `build:worker` scripts. No app-code change
expected — the codebase already uses ESM-native idioms.
---
## HIGH
### H1 — `@types/pdfkit` mis-classified as a runtime dependency
- **What:** `package.json` line 62 puts `@types/pdfkit@^0.17.6` under `dependencies`
alongside `pdfkit`. Type packages are compile-time only.
- **Impact:** Slightly bloats prod `node_modules` and the Docker prod image; more
importantly it's a **classification smell** — anyone reasoning about supply-chain
surface will look at `dependencies` and assume it's executed.
- **Fix:** Move to `devDependencies` alongside the other `@types/*`.
### H2 — Deprecated transitive: `glob@10.5.0`
- **What:** `pnpm-lock.yaml` carries `glob@10.5.0` with the upstream notice
_"Old versions of glob are not supported and contain widely publicized security
vulnerabilities … please update."_ `pnpm why glob` traces it to
`archiver-utils@5.0.2 ← archiver@7.0.1` (a direct dependency, used by GDPR
exports per CLAUDE.md).
- **Impact:** `glob` < 11 has known prototype-pollution-class issues. `pnpm audit`
doesn't flag them because the advisories require an exploitable callpath, but
the deprecation notice is the upstream signal to upgrade.
- **Fix:** Bump `archiver` to `^8.0.0` (already shown in `pnpm outdated`).
Archiver 8 pulls a current `glob` and the API is source-compatible for the
way `src/lib/services/gdpr-export.service.ts` uses it. Verify with the GDPR
export Playwright case after upgrade.
### H3 — Deprecated transitive: `@esbuild-kit/{core-utils,esm-loader}`
- **What:** Both are marked _"Merged into tsx: https://tsx.is"_ by upstream. They
come from `drizzle-kit@0.31.10` and `better-auth@1.6.9` — not directly fixable
here.
- **Severity:** HIGH (visibility only; no known exploit). The packages still
function but receive no upstream maintenance.
- **Fix:** Track `drizzle-kit` and `better-auth` releases; both maintainers have
open PRs migrating to bare `tsx`. No local change today — file as a watch item.
### H4 — `pnpm.overrides` uses floating ranges
- **What:** `package.json` `pnpm.overrides`:
```
vite: "8.0.5" // pinned ✓
esbuild: ">=0.25.0" // floating ✗
postcss: ">=8.5.10" // floating ✗
```
- **Impact:** The `>=` overrides re-resolve on every `pnpm install --no-lockfile`
/ `pnpm update`. They were added as CVE-fix safety nets, but their floating
shape defeats the lockfile's reproducibility guarantee on the very transitives
that prompted the override in the first place.
- **Fix:** Replace `">=0.25.0"` / `">=8.5.10"` with the actual resolved
versions (currently `0.27.7` and `8.5.14`), or use exact pins. Re-evaluate
whenever you bump esbuild/postcss.
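
A small check along these lines could run in CI to keep the overrides exact. This is a sketch, not existing tooling: `floatingOverrides` is a hypothetical helper, and it treats anything other than a bare `x.y.z` string (including prerelease tags) as floating.

```typescript
// Returns the names of override entries whose range is not an exact x.y.z pin.
function floatingOverrides(overrides: Record<string, string>): string[] {
  const exact = /^\d+\.\d+\.\d+$/;
  return Object.entries(overrides)
    .filter(([, range]) => !exact.test(range))
    .map(([pkg]) => pkg);
}

// Values copied from the override block quoted above.
const current = { vite: "8.0.5", esbuild: ">=0.25.0", postcss: ">=8.5.10" };
// Same block after applying the fix with the resolved versions.
const pinned = { vite: "8.0.5", esbuild: "0.27.7", postcss: "8.5.14" };

console.log(floatingOverrides(current)); // [ 'esbuild', 'postcss' ]
console.log(floatingOverrides(pinned)); // []
```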
---
## MEDIUM
### M1 — Major-version upgrades available
Captured from `pnpm outdated` (today vs latest):
| Package | Current | Latest | Risk |
| ------------------------------- | ------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `next` | 15.5.18 | 16.2.6 | App Router breaking changes in 16; defer until React/Next stabilise together |
| `eslint` + `eslint-config-next` | 9 / 15 | 10 / 16 | Lint-only; do alongside Next 16 |
| `zod` | 3.25.76 | 4.4.3 | **Wide blast radius** — every `src/lib/validators/*.ts` + `createTemplateSchema` `VALID_MERGE_TOKENS` allow-list logic. Plan as its own task |
| `tailwindcss` | 3.4.19 | 4.3.0 | Config migration (Tailwind 4 = Lightning CSS) — schedule with design tokens work |
| `@hookform/resolvers` | 3.10.0 | 5.2.2 | API change for zod resolver — paired with zod 4 |
| `react-day-picker` | 9 | 10 | Verify in calendar/date pickers |
| `archiver` | 7.0.1 | 8.0.0 | Clears H2 — do first, it's narrow |
| `esbuild` (dev) | 0.27.7 | 0.28.0 | Patch-y; trivial |
Minor upgrades (`bullmq`, `better-auth`, `@tanstack/react-query`, `vitest`,
`@playwright/test`, `@types/node` patch, `libphonenumber-js`, `tailwind-merge`,
`lint-staged`, `react-grab`) are all single-digit-bumps, no risk.
### M2 — `dotenv` lives in `devDependencies` but is imported by production-runnable scripts
- **What:** `dotenv` is `devDependencies` (line 117) but is imported by
`scripts/backfill-document-folders.ts` (documented in CLAUDE.md as a
deploy step), `scripts/import-berths-from-nocodb.ts`, `scripts/db-reset.ts`,
`src/lib/db/seed.ts`, etc.
- **Impact:** Anyone who runs `pnpm install --prod` and then `pnpm
db:backfill:doc-folders` (a documented deploy command) fails at module
resolution. Not exploited today because deploy runs `pnpm install` without
`--prod` for those steps, but the contract is implicit.
- **Fix:** Either (a) move `dotenv` to `dependencies`, or (b) document in
CLAUDE.md that the backfill must be run from a full-deps image / dev
workstation. (a) is the smaller foot-gun.
### M3 — `node:20-alpine` is unpinned (floats on minor + digest)
- **What:** No minor or digest pin on the `FROM` lines.
- **Impact:** Two builds an hour apart can land on different base layers; SBOM
drifts without code changes; pre-existing CVE-fix bumps reach prod
unnoticed (mostly a good thing, but it catches audits out).
- **Fix:** Use `node:22.11-alpine@sha256:<digest>` once you move to 22. Re-pin
monthly as part of dependency hygiene.
### M4 — No `engines` field in `package.json`
- **What:** `package.json` has no `engines.node` / `engines.pnpm`. The `packageManager`
field pins pnpm to `10.33.2`, but Node is implicit.
- **Impact:** `pnpm install` doesn't enforce the runtime; a contributor on
Node 18 will install successfully and only fail later. CI hides this because
Docker is the source of truth.
- **Fix:** Add `"engines": { "node": ">=22 <23", "pnpm": ">=10" }` and turn on
`engine-strict=true` in `.npmrc` if you want hard enforcement.
---
## LICENSE AUDIT (prod tree)
**No GPL or AGPL anywhere.** Non-permissive licenses found:
| Package | License | Disposition |
| ------------------------------------------------------- | --------------------- | ---------------------------------------------------------- |
| `@img/sharp-libvips-darwin-arm64` (and other arches) | LGPL-3.0-or-later | OK — dynamic link, native binding; LGPL §5 covers this use |
| `dompurify` | MPL-2.0 OR Apache-2.0 | OK — dual; you may rely on Apache-2.0 |
| `@zone-eu/mailsplit` (transitive of `mailparser`) | MIT OR EUPL-1.1+ | OK — dual; MIT chosen |
| `caniuse-lite` | CC-BY-4.0 | OK — data only, attribution satisfied by upstream notices |
| `postgres` (driver) | Unlicense | OK — public-domain-style |
| `axe-core` (dev only) | MPL-2.0 | OK — dev/test, not redistributed |
| `lightningcss`, `lightningcss-darwin-arm64` (dev/build) | MPL-2.0 | OK — build-time, MPL is file-scoped |
| `tslib` | 0BSD | OK |
No `UNLICENSED` / "Custom" / SSPL packages.
---
## LOCKFILE + AUDIT INTEGRITY
- `pnpm audit` → **No known vulnerabilities.**
- `pnpm audit --json metadata` shows 989 deps, **0** vulns, **0** dev (because
the audit metadata reports `dependencies` after filtering — clean).
- `pnpm install --frozen-lockfile` → **"Lockfile is up to date"**, no warnings,
no peer-dep `Unmet`/`Conflict` lines. Husky `prepare` hook ran clean.
- `pnpm-lock.yaml` has 139 `peerDependencies:` entries — all satisfied (no
`peer:missing` markers in the resolved graph).
- No "phantom" deps detected — only the two deprecated chains in H2/H3.
---
## RECOMMENDED FIX ORDER
1. **C1 + C2 together** — bump base image to `node:22.11-alpine`, drop
`@types/node` to `^22`, esbuild `--target=node22`. Smoke test build +
worker.
2. **H1** — move `@types/pdfkit` to `devDependencies` (1-line PR).
3. **H2** — `archiver@^8.0.0`; run GDPR-export Playwright case.
4. **H4** — replace floating overrides with exact pins.
5. **M2** — promote `dotenv` to `dependencies`, OR document deploy contract.
6. **M4** — add `engines` field.
7. **M1 majors** — schedule one-at-a-time, starting with `archiver` (done in
#3) and `esbuild`. Next 16 / Zod 4 / Tailwind 4 are each their own
project.
Total touch points: ~6 single-line PRs + 3 scheduled major-bump tracks.
---
## 17. Build + deploy + prod readiness audit (build-auditor)
# Audit #17 — Build + Deploy + Prod Readiness
Scope: `Dockerfile`, `Dockerfile.dev`, `Dockerfile.worker`, `docker-compose.yml`, `docker-compose.prod.yml`, `docker-compose.dev.yml`, `next.config.ts`, `src/lib/env.ts`, `.env.example`, plus the entry points `src/server.ts` / `src/worker.ts` and health endpoints.
Branch at audit time: `feat/documents-folders`.
---
## CRITICAL
### C1 — No `.dockerignore` in repo root
`cat .dockerignore` returns no such file. Build context at audit time:
- `node_modules` = 4.9 GB
- `.next` = 2.7 GB
- `.git` = 41 MB
- Plus `storage/`, `playwright-report/`, `test-results/`, `tests/`, `scripts/`, screenshots, `.env*`.
Every `docker build` ships ~7.6 GB to the daemon. Worse, `Dockerfile.dev` and the builder stage of `Dockerfile` / `Dockerfile.worker` all do `COPY . .`, which means:
- `.env`, `.env.local`, `.env.dev` (if present) end up in build-layer history of the builder stage. The runner stage doesn't re-copy them, but intermediate layers are cacheable and pushable; a careless `--target builder` push leaks secrets.
- Local `node_modules` (built on macOS) get shipped to an Alpine builder and then ignored — silent waste, and `node_modules/sharp` darwin binaries collide with the musl install.
- Test snapshots / fixtures get baked into the trace.
**Fix:** add `.dockerignore` covering at minimum: `node_modules`, `.next`, `.git`, `.env*`, `dist`, `storage`, `playwright-report`, `test-results`, `tests`, `coverage`, `*.log`, `.DS_Store`, `.vscode`, `.idea`, `.husky`, `docker-compose.*.yml` (not needed inside the image).
### C2 — `EMAIL_REDIRECT_TO` has no production refusal guard
`CLAUDE.md` is explicit: _"must be unset in production"_. The Zod schema in `src/lib/env.ts:41` accepts it unconditionally as `z.string().email().optional()`. If a staging `.env` leaks into a prod deploy (a very common ops mistake with the current `env_file: .env` setup — see M4), every outbound client email, EOI signing invite, and webhook delivery silently routes to the staging mailbox and the production user sees… nothing.
Evidence of the blast radius — `EMAIL_REDIRECT_TO` short-circuits:
- `src/lib/email/index.ts:131` — all SMTP recipients rewritten.
- `src/lib/services/documenso-client.ts:118-180` — all Documenso recipient lists + template formValues overridden.
- `src/lib/queue/workers/webhooks.ts:94-107` — webhook deliveries fully suppressed.
**Fix:** add a `superRefine` (or schema-level cross-check) that hard-fails when `NODE_ENV === 'production' && EMAIL_REDIRECT_TO` is set. Belt-and-braces: log a `logger.fatal` and `process.exit(1)` from `src/lib/email/index.ts` boot if the condition is reached.
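
The cross-check itself is a one-line condition. The sketch below is Zod-free so it runs standalone; `EnvLike` and `checkEmailRedirect` are hypothetical names, and in `src/lib/env.ts` the same condition would presumably sit inside the schema's `superRefine` callback and report through `ctx.addIssue`.

```typescript
interface EnvLike {
  NODE_ENV?: string;
  EMAIL_REDIRECT_TO?: string;
}

// Returns a list of problems; an empty list means the combination is allowed.
// An empty or whitespace-only EMAIL_REDIRECT_TO is treated as unset.
function checkEmailRedirect(env: EnvLike): string[] {
  const redirect = env.EMAIL_REDIRECT_TO?.trim();
  if (env.NODE_ENV === "production" && redirect) {
    return ["EMAIL_REDIRECT_TO must be unset when NODE_ENV is production"];
  }
  return [];
}

console.log(checkEmailRedirect({ NODE_ENV: "production", EMAIL_REDIRECT_TO: "qa@example.com" }));
console.log(checkEmailRedirect({ NODE_ENV: "development", EMAIL_REDIRECT_TO: "qa@example.com" }));
```

Failing at schema parse time means the process refuses to boot before any email, Documenso call, or webhook can be rerouted.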
### C3 — Custom server depends on `socket.io` that may not be in the standalone trace
`Dockerfile` runner stage copies only `.next/standalone/`, `.next/static/`, `public/`, and `dist/server.js` (renamed `server-custom.js`). There is no separate `pnpm install --prod` in the runner — every runtime dep must arrive via Next's `output: 'standalone'` tracer.
`src/server.ts` imports `@/lib/socket/server`, which in turn imports `Server` from `socket.io` and pulls in `@socket.io/redis-adapter`. esbuild bundles `server.ts` with `--packages=external`, so at runtime `server-custom.js` does `require('socket.io')` against `/app/node_modules`. **Neither `socket.io` nor `@socket.io/redis-adapter` is in `next.config.ts:66 serverExternalPackages`**, and no Next route ever imports `@/lib/socket/server` (the socket server is only instantiated by the custom entry point), so the Next tracer has no reason to include them in `.next/standalone/node_modules`.
If this has been working in prod it's only because the packages get pulled in transitively via something Next does see. The dependency is invisible to the build system — a Next minor upgrade could drop them from the trace tomorrow.
**Fix:** add both to `serverExternalPackages`, _and_ extend `outputFileTracingIncludes` for the custom-server bundle, or COPY them explicitly into the runner from the deps stage:
```
COPY --from=deps --chown=nextjs:nodejs /app/node_modules/socket.io ./node_modules/socket.io
COPY --from=deps --chown=nextjs:nodejs /app/node_modules/@socket.io ./node_modules/@socket.io
```
(Same risk applies to anything else only the custom server imports — audit `src/server.ts` import graph.)
---
## HIGH
### H1 — CSP keeps `'unsafe-inline'` on `script-src` in production
`next.config.ts:31` sets `script-src 'self' 'unsafe-inline'` regardless of `isProd`. Only `'unsafe-eval'` is gated. With `'unsafe-inline'` on, the entire XSS defence of CSP is defanged: any reflected/stored XSS still executes inline. The comment claims it's for Tailwind/Radix runtime styles, but those affect `style-src`, not `script-src`. Move to a nonce- or hash-based script policy in prod.
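
A sketch of what the gated directive could look like. `scriptSrc` is a hypothetical helper, and the dev branch is an assumption about what the current config allows outside production. A nonce has to be generated per request, so in practice this would likely be called from middleware rather than from the static headers in `next.config.ts`.

```typescript
// Builds the script-src directive. Production uses a per-request nonce in
// place of 'unsafe-inline'; development keeps the looser policy.
function scriptSrc(isProd: boolean, nonce: string): string {
  const sources = ["'self'"];
  if (isProd) {
    sources.push(`'nonce-${nonce}'`);
  } else {
    sources.push("'unsafe-inline'", "'unsafe-eval'");
  }
  return `script-src ${sources.join(" ")}`;
}

console.log(scriptSrc(true, "abc123"));
console.log(scriptSrc(false, "abc123"));
```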
### H2 — `NEXT_PUBLIC_APP_URL` not in Zod schema, but baked at build time
`.env.example:67` lists `NEXT_PUBLIC_APP_URL` but `src/lib/env.ts` does not validate it. The builder stage runs `pnpm build` with `SKIP_ENV_VALIDATION=1`, so Next inlines an empty string when the var is missing. `src/providers/socket-provider.tsx:67` then runs `io(process.env.NEXT_PUBLIC_APP_URL!, {...})` → `io('', {...})` → browser falls back to `window.location.origin`, which silently _works_ in most cases but breaks the moment the CRM is fronted by a different origin than the socket gateway. `src/lib/auth/client.ts:12` has the same risk for the auth base URL during SSR.
**Fix:** add `NEXT_PUBLIC_APP_URL: z.string().url()` to the schema and pass it into the builder stage (via `--build-arg` + `ARG NEXT_PUBLIC_APP_URL` in `Dockerfile`). Drop `SKIP_ENV_VALIDATION=1` from the builder stage, or at least surface a build-time warning for missing `NEXT_PUBLIC_*` vars.
### H3 — `Dockerfile.dev` runs as root and re-installs on every layer rebuild
- No `USER` directive → dev container is root inside the bind-mounted `/app`. Any `pnpm dev`-spawned child can write to host-mounted files as root.
- Combined with C1 (no `.dockerignore`), the `COPY package.json pnpm-lock.yaml ./` followed by `pnpm dev` over a bind mount means the host `node_modules` shadow the in-container install on macOS (different platform), and dev images frequently break on `sharp`/`tesseract.js` until rebuilt.
**Fix:** create a `node` user (or reuse uid 1001), chown `/app`, drop privileges, and ship a working `.dockerignore` so the build context isn't 7.6 GB.
### H4 — `docker-compose.prod.yml` has no resource limits, no log rotation
`crm-app`, `crm-worker`, `postgres`, `redis` all run with default `unlimited` memory and the default `json-file` log driver. On a small VPS one runaway worker OOMs the host. The default log driver has no rotation, so disks fill silently. Add `deploy.resources.limits` (or top-level `mem_limit` in non-swarm mode) and `logging: driver: json-file, options: { max-size: "10m", max-file: "5" }` to every service.
### H5 — Compose healthcheck targets `localhost:3000`, but `env.PORT` is configurable
`docker-compose.prod.yml:45` and `.yml:43` hardcode `http://localhost:3000/api/health`. If a deploy sets `PORT=8080` via `.env`, the container listens on 8080, the healthcheck stays on 3000 → permanent "unhealthy" → restart loop. Either drop PORT from `env.ts` (the schema validates it but compose ignores it) or templatize the healthcheck (`wget … http://localhost:${PORT:-3000}/api/health`).
---
## MEDIUM
### M1 — Worker healthcheck only pings Redis
`Dockerfile.worker:38-39` checks `Redis.ping()`. A wedged BullMQ consumer (silent disconnect from the queue stream but TCP alive) passes this probe while jobs queue forever. Upgrade to read a sentinel BullMQ heartbeat key the worker writes on each job loop, or expose a tiny HTTP `/healthz` from the worker that asserts `queue.client.status === 'ready'` on the named queues.
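
The heartbeat variant reduces to a freshness comparison. In the sketch below, `heartbeatFresh` is a hypothetical helper; it assumes the worker stores a millisecond timestamp under some sentinel key on each job loop, and that the probe reads that value (or gets `null` when the key is missing) before calling it.

```typescript
// True when the worker's last recorded heartbeat is recent enough to count
// as alive. A missing heartbeat (null) is treated as unhealthy.
function heartbeatFresh(lastBeatMs: number | null, nowMs: number, maxAgeMs: number): boolean {
  if (lastBeatMs === null) return false;
  return nowMs - lastBeatMs <= maxAgeMs;
}

const nowMs = 1_000_000;
console.log(heartbeatFresh(nowMs - 5_000, nowMs, 30_000)); // true
console.log(heartbeatFresh(nowMs - 120_000, nowMs, 30_000)); // false
console.log(heartbeatFresh(null, nowMs, 30_000)); // false
```

Unlike `Redis.ping()`, this fails when the consumer loop stops advancing even though the TCP connection is still alive.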
### M2 — Worker re-installs deps in the runner stage
`Dockerfile.worker:31-32` does `pnpm install --frozen-lockfile --prod` in the runner, which costs a network round-trip on every build even though the `deps` stage already has the full tree. Move to `COPY --from=deps /app/node_modules ./node_modules` then `pnpm prune --prod`, or use `pnpm deploy --prod --filter <pkg>`. Saves ~30–60s per build and removes a network failure mode.
### M3 — `next.config.ts` `serverExternalPackages` likely incomplete
`socket.io`, `@socket.io/redis-adapter`, `imapflow`, `mailparser`, `pdf-lib`, `pdfme`, `sharp`, `tesseract.js` are all heavy native/CJS-leaning deps used server-side. Only 8 are listed. Anything missing risks bundling into the Next route trace (slower cold start, larger lambda/standalone size, possible runtime require failures for native bindings). Audit the import graph and add the rest.
### M4 — `env_file: .env` puts every secret into the container env
`docker-compose.prod.yml:36,59`. Anyone with `docker inspect` or `/proc/<pid>/environ` access on the host reads `BETTER_AUTH_SECRET`, `EMAIL_CREDENTIAL_KEY`, `DOCUMENSO_API_KEY`, `DOCUMENSO_WEBHOOK_SECRET` in plaintext. Switch to docker secrets (`/run/secrets/...`) or a sidecar mount and have `env.ts` read from file paths when the `_FILE` suffix is present.
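
The `_FILE` convention can be handled by a small resolver that runs before schema validation. This is a sketch under assumptions: `resolveSecret` and `ReadFile` are hypothetical names, and the file reader is injected so the function stays testable (real code would pass something like `(p) => fs.readFileSync(p, "utf8")`).

```typescript
type ReadFile = (path: string) => string;

// Resolves NAME from NAME_FILE (a path such as /run/secrets/...) when that
// variable is present; otherwise falls back to the plain NAME variable.
function resolveSecret(
  name: string,
  env: Record<string, string | undefined>,
  readFile: ReadFile,
): string | undefined {
  const filePath = env[`${name}_FILE`];
  if (filePath) {
    // Secret files usually end with a newline; strip surrounding whitespace.
    return readFile(filePath).trim();
  }
  return env[name];
}

// Fake reader standing in for the filesystem in this example.
const fakeFiles: Record<string, string> = { "/run/secrets/auth": "s3cret\n" };
const reader: ReadFile = (p) => fakeFiles[p] ?? "";

console.log(resolveSecret("BETTER_AUTH_SECRET", { BETTER_AUTH_SECRET_FILE: "/run/secrets/auth" }, reader));
console.log(resolveSecret("BETTER_AUTH_SECRET", { BETTER_AUTH_SECRET: "inline" }, reader));
```

With this in place the plaintext value never appears in `docker inspect` output, only the path does.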
### M5 — `.env.example` missing schema entries
Not in `.env.example`: `MULTI_NODE_DEPLOYMENT`, `WEBSITE_INTAKE_SECRET`, `EMAIL_REDIRECT_TO` (omitted intentionally per the docs, but add a commented `# EMAIL_REDIRECT_TO=` line so devs _know_ it's an option), `DOCUMENSO_CLIENT_RECIPIENT_ID` / `DOCUMENSO_DEVELOPER_RECIPIENT_ID` / `DOCUMENSO_APPROVAL_RECIPIENT_ID` (env.ts has all three with defaults, but they should still be documented), `PORT`. The `EMAIL_CREDENTIAL_KEY` placeholder is 64 zeros, which is fine for dev but worth a comment that prod must rotate.
### M6 — Node 20-alpine, no PID-1 init
Both Dockerfiles use `node:20-alpine` (past upstream EOL as of 2026-04-30; `node:22-alpine` is a supported LTS line). Neither installs `tini`/`dumb-init`. Node handles SIGTERM itself in these entrypoints so it's not broken, but if any child process is ever spawned (e.g. tesseract worker pool) zombie reaping is on Node. Cheap upgrade: `RUN apk add --no-cache tini` followed by a separate `ENTRYPOINT ["/sbin/tini", "--"]` instruction.
### M7 — `Dockerfile` runner has no HEALTHCHECK directive (only compose has one)
Image-level `HEALTHCHECK` makes the image self-describing — useful for non-compose orchestrators (swarm, nomad, k8s readinessProbe via `exec`). Add the same `wget … /api/health` line to the app Dockerfile as the worker Dockerfile already does for Redis.
### M8 — CSP `connect-src https:` / `img-src https:` are wide
Tighten to an allow-list once per-port branding exposes the configured S3 host.
### M9 — Builder stage never sets `NODE_ENV=production`
`Dockerfile:14-15` sets `NEXT_TELEMETRY_DISABLED=1` + `SKIP_ENV_VALIDATION=1` but not `NODE_ENV`. `next.config.ts:3` branches on `isProd` for CSP — make this deterministic with `ENV NODE_ENV=production` above `RUN pnpm build`.
---
## Quick-win checklist
1. Add `.dockerignore` (C1).
2. Refuse-to-start when `EMAIL_REDIRECT_TO` set in prod (C2).
3. Pin socket.io into the standalone trace (C3).
4. Remove `'unsafe-inline'` from script-src in prod CSP (H1).
5. Validate `NEXT_PUBLIC_APP_URL` at build (H2).
6. Add compose resource + log limits (H4).
7. Templatize healthcheck PORT (H5).
---
## 18. Berth recommender quality audit (recommender-auditor)
# Audit — `src/lib/services/berth-recommender.service.ts`
Read-only audit. Scope per task #18: tier ladder, heat weights, max-oversize cap,
fallthrough policy paths, port-isolation defense-in-depth, CTE correctness,
cooldown / late-stage settings, n+1 risk, edge cases.
Code as of feat/documents-folders @ 660553c.
---
## CRITICAL
None blocking ship. The recommender's entry-point port guard
(`interestInput.portId !== args.portId` → `CodedError`) and the `feasible`
CTE's `b.port_id = $portId` correctly fence cross-tenant queries at the top
level. The remaining issues are correctness / defense-in-depth.
---
## HIGH
### H1. `active_interest_count` lacks `i.id IS NOT NULL` defense-in-depth filter
`aggregates` CTE (lines 475–479):
```sql
COUNT(*) FILTER (
WHERE i.archived_at IS NULL
AND i.outcome IS NULL
AND ib.is_specific_interest = true
) AS active_interest_count
```
The LEFT JOIN on `interests i ON i.id = ib.interest_id AND i.port_id = $portId`
intentionally sets `i.id = NULL` when an `interest_berths` row points at a
cross-port interest (orphan / legacy data). For an `ib` row with
`is_specific_interest = true` whose `i.id` was nulled by the port-filter,
the FILTER evaluates `archived_at IS NULL → TRUE`, `outcome IS NULL → TRUE`,
`is_specific_interest = true → TRUE` — and the row is **counted as an active
interest against the feasible berth**, mis-classifying it as Tier C (or D if
combined with the H2 issue below).
`total_interest_count` correctly guards with `FILTER (WHERE i.id IS NOT NULL)`
and the inline comment promises "FILTER also enforces port isolation
defense-in-depth," but only `total_interest_count` carries that guard. The
documented project precedent for the documents-hub aggregator is "defense-in-
depth `port_id` filter at every join — entry-point check alone is rejected."
The recommender should mirror that.
**Fix:** Add `AND i.id IS NOT NULL` to the `active_interest_count` filter
(also worth adding to `max_active_stage` for consistency — see M3).
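
The miscount can be reproduced with a small in-memory model of the joined rows. The sketch below is a TypeScript stand-in for the SQL `FILTER`, not the service code; `JoinedRow`, `activeCountCurrent`, and `activeCountGuarded` are hypothetical names.

```typescript
// One row of interest_berths LEFT JOIN interests. When the joined interest
// belongs to another port, every i.* column comes back as null.
interface JoinedRow {
  isSpecificInterest: boolean; // ib.is_specific_interest
  interestId: string | null; // i.id
  archivedAt: string | null; // i.archived_at
  outcome: string | null; // i.outcome
}

// Mirrors the current FILTER: no check on i.id.
function activeCountCurrent(rows: JoinedRow[]): number {
  return rows.filter((r) => r.archivedAt === null && r.outcome === null && r.isSpecificInterest).length;
}

// Mirrors the proposed FILTER with the extra i.id IS NOT NULL predicate.
function activeCountGuarded(rows: JoinedRow[]): number {
  return rows.filter(
    (r) => r.interestId !== null && r.archivedAt === null && r.outcome === null && r.isSpecificInterest,
  ).length;
}

const crossPortOrphan: JoinedRow = {
  isSpecificInterest: true,
  interestId: null,
  archivedAt: null,
  outcome: null,
};

console.log(activeCountCurrent([crossPortOrphan])); // 1
console.log(activeCountGuarded([crossPortOrphan])); // 0
```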
### H2. `max_active_stage` not filtered by `is_specific_interest = true`
Lines 483–496:
```sql
COALESCE(
MAX(CASE i.pipeline_stage ...) FILTER (
WHERE i.archived_at IS NULL AND i.outcome IS NULL
),
0
) AS max_active_stage,
```
The inline comment on `active_interest_count` is explicit: "An EOI-bundle-only
link (`is_specific_interest=false, is_in_eoi_bundle=true`) is legal coverage,
not a pitch, and shouldn't demote the berth." That intent is honoured by
`active_interest_count` but **violated by `max_active_stage`**, which takes the
maximum over all open `ib` rows regardless of the `is_specific_interest` flag.
Concrete failure: berth X is part of an EOI bundle for interest A
(at `deposit_10pct`, EOI-bundle-only — legal coverage, not a pitch). No
specific-interest link on X. The recommender computes
`active_interest_count = 0` (correct) but `max_active_stage = 6` (deposit_10pct).
`classifyTier` looks at `activeInterestCount > 0 && maxActiveStage >= 6`. The
first clause is false → tier A (correct). So in this specific case the bug is
masked.
But mixed case: berth X has both an EOI-bundle-only deep-stage link AND a
specific-interest link at `details_sent`. `active_interest_count = 1`,
`max_active_stage = 6` (from the bundle link) → Tier D. Per the documented
semantics it should be Tier C (the only pitch is at `details_sent` = 2).
This **falsely sends late-stage warnings into the UI** and, when
`tier_ladder_hide_late_stage = true`, hides the berth that should still be
recommendable.
**Fix:** Add `AND ib.is_specific_interest = true` to the `max_active_stage`
FILTER, to match `active_interest_count`.
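
The masked case and the mixed case can both be checked with a reduced model of the tier logic. This is a sketch under assumptions: `BerthLink` and `classify` are hypothetical, all links are taken to be open (not archived, no outcome), and `lostCount` is zero so Tier B never applies.

```typescript
interface BerthLink {
  isSpecificInterest: boolean;
  stage: number; // numeric pipeline stage, e.g. details_sent = 2, deposit_10pct = 6
}

const LATE_STAGE_THRESHOLD = 6;

// filterBySpecific = false models the current max_active_stage FILTER;
// filterBySpecific = true models the proposed fix.
function classify(links: BerthLink[], filterBySpecific: boolean): "A" | "C" | "D" {
  const activeCount = links.filter((l) => l.isSpecificInterest).length;
  const stageSource = filterBySpecific ? links.filter((l) => l.isSpecificInterest) : links;
  const maxStage = stageSource.reduce((m, l) => Math.max(m, l.stage), 0);
  if (activeCount > 0) return maxStage >= LATE_STAGE_THRESHOLD ? "D" : "C";
  return "A";
}

const bundleOnly: BerthLink[] = [{ isSpecificInterest: false, stage: 6 }];
const mixed: BerthLink[] = [
  { isSpecificInterest: false, stage: 6 },
  { isSpecificInterest: true, stage: 2 },
];

console.log(classify(bundleOnly, false)); // A (bug masked)
console.log(classify(mixed, false)); // D (bug visible)
console.log(classify(mixed, true)); // C (after the fix)
```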
---
## MEDIUM
### M1. Tier-B heat suppressed when berth has any active interest
`recommendBerths` (lines 587–598): heat is only computed when `tier === 'B'`.
A berth with strong fall-through history plus a single fresh tire-kicker
active interest is classified C (active > 0, no late stage), heat = null,
and all the recovery signal (recency / furthest stage / interest count /
EOI count) becomes invisible in the UI. The pipeline reason chip degrades to
`"1 active interest in early stage"` and the rep loses context about whether
the berth has a history of falling through at `contract_signed`.
Defensible as a design choice — the tier already encodes "needs attention" —
but documenting the trade-off (or surfacing a "history" indicator
independent of tier) would close the gap.
### M2. `pipeline_stage = 'completed'` (stage 9) absent from CASE expressions
Both `max_active_stage` and `fallthrough_max_stage` CASE blocks enumerate
`open … contract_signed` (1–8) with `ELSE 0`. The schema comment at
`src/lib/db/schema/interests.ts:18` lists `completed` as the ninth stage.
An interest at `pipeline_stage='completed'` with `outcome IS NULL` (defective
but possible) falls into the `ELSE 0` branch, producing the same maxStage as
"no data." Practically not harmful because `won` outcomes drop the row from
the active filter, but the silent collapse to 0 is fragile if the data
ever drifts. Either add the `completed` arm explicitly or replace the CASE
with a join against a stage-order lookup so the JS constant and the SQL
arm stay in lock-step.
### M3. CTE LEFT JOIN allows null-side rows into all aggregates
Same root cause as H1, narrower impact:
- `lost_count`: filter requires `i.outcome IS NOT NULL` → safe.
- `latest_fallthrough_at`, `fallthrough_max_stage`: same `outcome IS NOT NULL`
guard → safe.
- `eoi_signed_count`: `i.eoi_status = 'signed'` → null-on-null → safe.
- `max_active_stage`: filter is `i.archived_at IS NULL AND i.outcome IS NULL`
→ both NULLs match → row is included with CASE returning 0 → `COALESCE(...,
0)` masks it. Safe in practice but only by accident.
Adding the `i.id IS NOT NULL` predicate to every active-side filter is
cheap, matches the documents-hub precedent, and makes the intent
self-documenting.
---
## LOW
### L1. Negative / zero admin values not validated
`asNumber` accepts any finite number. `topNDefault = 0` returns an empty
recommendation list; `maxOversizePct = -50` produces a multiplier of 0.5
that combines with the `length_ft >= desiredLengthFt` filter to make every
berth infeasible; `fallthroughCooldownDays = -30` puts the cutoff in the
future and silently disables the cooldown (every fall-through is "before"
the future cutoff). Consider clamping at parse time (`Math.max(0, n)` for
non-negative settings, `Math.max(1, n)` for `topN`).
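
A minimal sketch of clamping at parse time, assuming the parser receives an untyped value and a per-setting floor. `clampSetting` and the fallback values used below are hypothetical; the real defaults live in `DEFAULT_RECOMMENDER_SETTINGS`.

```typescript
// Returns the fallback for non-numeric or non-finite input, otherwise
// the value raised to the minimum allowed for that setting.
function clampSetting(raw: unknown, min: number, fallback: number): number {
  if (typeof raw !== 'number' || !isFinite(raw)) return fallback;
  return Math.max(min, raw);
}

// Multiplier as described in the audit: 1 + pct / 100.
function oversizeMultiplier(maxOversizePct: number): number {
  return 1 + maxOversizePct / 100;
}

// Fallbacks here are placeholders, not the project's actual defaults.
const topN = clampSetting(0, 1, 5); // topN must be at least 1
const oversizePct = clampSetting(-50, 0, 20); // non-negative setting
const cooldownDays = clampSetting(-30, 0, 30); // non-negative setting
```

After clamping, `oversizeMultiplier(oversizePct)` can no longer drop below 1, so the upper length bound never falls under the `length_ft >= desiredLengthFt` lower bound.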
### L2. `outcome::text` cast is a no-op
`interests.outcome` is declared `text(...)` (not an enum) — the explicit
`::text` cast inside `LIKE 'lost%'` is redundant. Harmless; safe to drop.
### L3. Hard-coded heat normalisation constants
`computeHeat` uses 5 (interest count) and 3 (EOI count) as the
"saturate-at" caps and `30 / 365` days for the recency curve. These are
not admin-tunable. Per-port behaviour expectations may differ — a port
that sees 20+ interests on hot berths will have `interestCount`
saturating early. Promote to settings if tuning lands as a real need;
otherwise document the assumption.
### L4. Width-only feasibility cap uses 8× L/W heuristic
When `desiredLengthFt` is null but `desiredWidthFt` is set, the upper
length cap is `width * 8 * (1 + oversizePct/100)`. Inline comment owns
this as a pragmatic guard. Worth a unit test pinning the ratio so a
future tweak doesn't silently widen the cap.
---
## Architecture / structure — clean
- **Tier ladder** (`classifyTier`): A/B/C/D mapping is correct and matches
the doc-string. Tier C/D requires `activeInterestCount > 0`; D needs
`maxActiveStage >= LATE_STAGE_THRESHOLD (= 6, deposit_10pct)`. Tier B
requires `lostCount > 0`. Tier A is the fall-through default. Verified.
- **Heat defaults** (30 / 40 / 15 / 15) sum to 100, and `computeHeat`
re-normalises via `norm = 100 / weightSum` so admin tuning that doesn't
sum to 100 still produces a 0..100 score. Final `Math.max(0, Math.min(
100, ...))` clamps. Verified.
- **Max-oversize cap arithmetic**: `oversizeMultiplier = 1 + pct/100`,
applied as `length_ft <= desired * multiplier`. Inclusive upper bound;
the lower bound `length_ft >= desired` is also inclusive. Symmetric and
correct.
- **Fallthrough policy paths**:
- `immediate_with_heat` → no cooldown filter, heat surfaces immediately.
- `cooldown` → tier B berths whose `latestFallthroughAt > now -
cooldownDays` are skipped; non-B berths unaffected.
- `never_auto_recommend` → tier B berths skipped entirely (heat still
computed but never reaches the output).
All three paths correct.
- **`tier_ladder_hide_late_stage`**: default `true` → `showLateStage = false`
→ tier D rows dropped at line 564. Caller can override via the `showLateStage`
arg. Correct.
- **N+1 risk**: scoring loop is pure JS over the pre-fetched rowset. The
three-query shape (`loadRecommenderSettings`, `loadInterestInput`, main
CTE) is constant. No issue.
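
The heat re-normalisation described in the list above can be sketched as follows. This is not the project's `computeHeat`: the component names (`recency`, `interests`, `eoi`, `stage`) and the mapping of the 30 / 40 / 15 / 15 defaults onto them are assumptions made for illustration, and each signal is assumed to be pre-scaled to 0..1.

```typescript
// Combines 0..1 signals into a 0..100 score. Weights that do not sum to 100
// are rescaled via norm = 100 / weightSum, then the result is clamped.
function combineHeat(
  signals: Record<string, number>,
  weights: Record<string, number>,
): number {
  const keys = Object.keys(weights);
  let weightSum = 0;
  for (const key of keys) weightSum += weights[key];
  // Assumption: a non-positive weight sum yields 0 rather than dividing by zero.
  if (weightSum <= 0) return 0;

  const norm = 100 / weightSum;
  let score = 0;
  for (const key of keys) {
    const raw = key in signals ? signals[key] : 0;
    const signal = Math.max(0, Math.min(1, raw));
    score += signal * weights[key] * norm;
  }
  return Math.max(0, Math.min(100, score));
}

// Hypothetical component names; only the four default numbers come from the audit.
const defaultWeights = { recency: 30, interests: 40, eoi: 15, stage: 15 };
const doubledWeights = { recency: 60, interests: 80, eoi: 30, stage: 30 };
const saturated = { recency: 1, interests: 1, eoi: 1, stage: 1 };
```

Both `combineHeat(saturated, defaultWeights)` and `combineHeat(saturated, doubledWeights)` land on the same maximum score, which is the property the audit verified.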
---
## Edge cases — verified
- **No history**: LEFT JOIN yields one null-side row, all FILTER
predicates short-circuit, counts = 0, COALESCEs return 0 → Tier A. ✓
- **All-lost history**: `active = 0, lost > 0` → Tier B; cooldown /
never paths each gate correctly; heat computes from fall-through fields. ✓
- **Mixed open + lost**: `active > 0` dominates → Tier C/D, heat = null
(see M1 trade-off). ✓ (with caveat)
- **Won outcome**: not matched by `outcome LIKE 'lost%' OR outcome =
'cancelled'`, doesn't inflate lost_count or contaminate fallthrough
stage. ✓
- **Cross-port leakage**: prevented at the entry point and the `feasible`
CTE; partial defense-in-depth gap at the aggregates layer (H1, M3).
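
The tier ladder and the edge cases above can be restated as a small pure function. This is a sketch reconstructed from the rules quoted in this report, not a copy of the real `classifyTier`; the `BerthAggregates` shape is an assumption.

```typescript
type Tier = 'A' | 'B' | 'C' | 'D';

// Stage 6 is deposit_10pct according to the audit text.
const LATE_STAGE_THRESHOLD = 6;

// Hypothetical input shape holding the per-berth aggregates.
interface BerthAggregates {
  activeInterestCount: number;
  lostCount: number;
  maxActiveStage: number;
}

function classifyTierSketch(a: BerthAggregates): Tier {
  // Any active interest dominates: C, or D once a late stage is reached.
  if (a.activeInterestCount > 0) {
    return a.maxActiveStage >= LATE_STAGE_THRESHOLD ? 'D' : 'C';
  }
  // No active interest but at least one fall-through.
  if (a.lostCount > 0) return 'B';
  // Fall-through default: no history at all.
  return 'A';
}

const noHistory = classifyTierSketch({ activeInterestCount: 0, lostCount: 0, maxActiveStage: 0 });
const allLost = classifyTierSketch({ activeInterestCount: 0, lostCount: 2, maxActiveStage: 0 });
const mixed = classifyTierSketch({ activeInterestCount: 1, lostCount: 1, maxActiveStage: 3 });
const lateStage = classifyTierSketch({ activeInterestCount: 1, lostCount: 0, maxActiveStage: 6 });
```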
---
## 19. Search relevance audit (search-auditor)
# Search relevance audit — task #19
**Scope:** `src/lib/services/search.service.ts`, `src/lib/services/search-nav-catalog.ts`, `src/components/search/command-search.tsx`, `src/hooks/use-search.ts`, plus the `resolve-id` route used by paste detection.
**Method:** Read each file in full, traced the ranking formulas, simulated the three test queries against `scoreEntry`, audited the graph-expansion merge for permission leakage, and spot-checked the catalog for duplicates.
---
## Spot-check results (the three required queries)
All three pass — but with a duplicate-result wrinkle (see HIGH-2).
| Query | Top entry | Score | Why |
| --------------- | ------------------------------------------------ | ----- | -------------------------------------------------------------------------- |
| `ai` | `/admin/ai` "AI configuration" | 80 | `label.startsWith("ai")` |
| `smtp` | `/settings/email` "Email accounts (SMTP / IMAP)" | 60 | label.includes('smtp') beats keyword-exact (50) on the `/admin/email` twin |
| `client portal` | `/admin/settings` "System Settings" | 50 | exact keyword match |
Runner-ups for `ai`: System Settings (35, `ai interest scoring` keyword prefix) and Profile (20, "avatar" substring). Acceptable noise floor.
---
## CRITICAL
None. The system is solid overall — sanitization is correct, port isolation is consistent, the affinity boost is bounded, and paste detection is port-scoped via the resolve-id endpoint (good — prevents cross-tenant navigation on super-admin paste).
---
## HIGH
### HIGH-1 — Graph expansion bypasses per-bucket permission gates (authorization leak)
`search()` (lines 1809–1865) gates each direct-match bucket via `can(opts, '<x>.view')`. Then `expandGraph` runs unconditionally on whichever direct matches survived, and its output is pushed into `mergedClients` / `mergedInterests` / `mergedYachts` / `mergedCompanies` / `mergedBerths` via `mergeWithExpansion` (lines 1911–1915) — **without re-checking the destination bucket's permission**.
Concrete leak: a user with `berths.view` but **no** `interests.view` who searches `A12`:
- direct: berth A12 surfaces
- expansion: `interestsFromBerths` → populates `expandedInterests` → merged into `mergedInterests` → returned to the client
- The dropdown renders rows with the client's full name + pipeline stage from interests the user cannot otherwise read
Similar leaks: berth name via yacht-direct match → `expandedBerths`; client names via company-direct match → `expandedClients`; etc.
Fix: gate the expansion writes — only push `expanded.X` into `mergedX` when `can(opts, '<X>.view')`. Cleanest: pass the `can(...)` results into `mergeWithExpansion` as a "destination allowed" boolean.
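
A minimal sketch of the "destination allowed" gate, assuming rows are identified by an `id` string. The function name, the `Row` shape, and the signature are hypothetical; the real `mergeWithExpansion` may take different arguments.

```typescript
interface Row {
  id: string;
  label: string;
}

// Direct matches come first; expansion rows are appended only when the caller
// holds the destination bucket's view permission. Duplicates keep the direct copy.
function mergeWithExpansionGated(
  direct: Row[],
  expanded: Row[],
  destinationAllowed: boolean,
  cap: number,
): Row[] {
  const extras = destinationAllowed ? expanded : [];
  const seen: Record<string, boolean> = {};
  const out: Row[] = [];
  for (const row of direct.concat(extras)) {
    if (seen[row.id]) continue;
    seen[row.id] = true;
    out.push(row);
  }
  return out.slice(0, cap);
}

// Scenario from the finding: berths.view granted, interests.view denied.
// The direct interests bucket is already empty; expansion must stay out too.
const expandedInterests: Row[] = [{ id: 'i1', label: 'Interest via berth A12' }];
const leaked = mergeWithExpansionGated([], expandedInterests, false, 10);
const permitted = mergeWithExpansionGated([], expandedInterests, true, 10);
```

At the call site, the boolean would be the same `can(opts, '<X>.view')` result that already gates the direct-match query for that bucket.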
### HIGH-2 — Six catalog labels are duplicated under different hrefs
The catalog has both `/settings/X` and `/admin/X` entries with near-identical labels, so common queries return two visually-similar rows pointing at different pages:
| Query | Hits |
| ------------------- | ------------------------------------------------------------------------------------ |
| `tags` | `/settings/tags` "Tags" + `/admin/tags` "Tags" |
| `branding` | `/settings/branding` + `/admin/branding` |
| `templates` | `/settings/templates` "Document templates" + `/admin/templates` "Document templates" |
| `storage` | `/settings/storage` + `/admin/storage` |
| `analytics`/`umami` | `/website-analytics` + `/admin/website-analytics` |
| `email`/`smtp` | `/settings/email` + `/admin/email` |
For users who have both `manage_settings` permissions, the dropdown shows two indistinguishable rows. Recommendation: either (a) collapse to one canonical entry per concept, or (b) disambiguate the label suffix (e.g., "Email accounts (admin)" vs "Email accounts (self-serve)"). The duplication reflects the underlying double-page structure, which deserves its own product decision.
### HIGH-3 — `looksLikeEmail` / `wantPhone` are computed then discarded
Lines 1804–1807 compute `wantEmail` and `wantPhone`, then lines 1885–1886 do `void wantEmail; void wantPhone;` with a TODO-style comment. Dead code paid for on every request. Either delete or wire it into the bucket reordering the comment promises.
---
## MEDIUM
### M-1 — `applyAffinity` re-sorts AFTER `mergeWithExpansion`, breaking the direct-first guarantee
`mergeWithExpansion` (line 1754) carefully puts direct matches before expansion rows. Then `apply()` (line 1905) re-sorts the merged list by recently-touched membership — a recently-touched related-via row can leapfrog a direct (non-touched) match. Either intentional (and should be documented) or a bug (and the merge ordering is wasted work). The current behavior surprises me: I expect direct matches to always win at the top.
### M-2 — `searchOtherPorts` mixes tsvector + trigram + ILIKE inconsistently
Clients section uses tsvector OR ILIKE; berths section uses `b.mooring_number % ${query}` (pg_trgm operator with the default 0.3 threshold). Berths are short codes, so trigram matching on them is noisy: "A12" and "A13" share two of six distinct trigrams (similarity ≈ 0.33), so a query for one also surfaces the other. Standardize: berths should match via prefix only (consistent with the in-port `searchBerths`).
### M-3 — `searchNotes` interest-branch source_label drops when no primary berth
Line 1166: `b.mooring_number AS source_label` is null when the interest has no primary berth, so the row's `sourceLabel` falls back to the generic "Interest" via `labelForSource`. The interest's client name would be a far more useful label (the interests bucket uses it). Patch: COALESCE with the client name via an extra JOIN.
### M-4 — Paste-detection regex hard-codes invoice numbering shape
`INVOICE_RE = /^INV-\d{6}-\d+$/i` (line 92) assumes the legacy 6-digit prefix. The resolve-id endpoint also accepts `invoice_number` lookup, so non-matching shapes silently fall through to free-text search. Not a security issue, but if invoice numbering changes the paste shortcut breaks invisibly. Consider expanding to `/^INV[-_/].+$/i` and letting the resolve-id endpoint be the source of truth.
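
To illustrate the difference between the two patterns, here is a small sketch comparing the current strict regex with the relaxed one suggested above. The sample invoice strings are made up; only the two regex shapes come from this finding.

```typescript
// Current shape: INV-, exactly six digits, dash, one or more digits.
const INVOICE_RE_STRICT = /^INV-\d{6}-\d+$/i;

// Relaxed shape: INV followed by a separator and anything non-empty.
// The resolve-id endpoint then decides whether the value is a real invoice.
const INVOICE_RE_RELAXED = /^INV[-_\/].+$/i;

function looksLikeInvoice(input: string): boolean {
  return INVOICE_RE_RELAXED.test(input.trim());
}

// Hypothetical pasted values.
const legacyShape = 'INV-202605-12';
const futureShape = 'INV/2026/0012';
const freeText = 'invoice for A12';
```

The relaxed pattern still rejects ordinary free text such as `freeText`, because the character after `INV` must be a separator.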
### M-5 — Non-ASCII characters in names are stripped by tsquery sanitizer
`buildPrefixTsquery` (line 278) strips `[^a-z0-9_]`, so `Šibenik`, `Łukasz`, `Müller` all reduce to empty tokens. The trigram fallback `similarity()` saves most of these (it's diacritic-tolerant for >0.3 similarity), but exact-prefix matching on accented names is lost. For Croatian / Polish / German tenant names this matters. Consider `unaccent()` before sanitization or relax the regex to `\p{L}`.
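
A sketch of the relaxed sanitizer, assuming tokens are split on whitespace before being turned into prefix terms. `toPrefixTokens` is a hypothetical stand-in for the tokenising half of `buildPrefixTsquery`; whether accented lexemes then match depends on the Postgres text-search configuration (or an `unaccent()` step), which this sketch does not cover.

```typescript
// Keeps any Unicode letter or digit plus underscore. Built with the RegExp
// constructor so the `u` flag does not depend on the compile target.
const NON_WORD = new RegExp('[^\\p{L}\\p{N}_]', 'gu');

function toPrefixTokens(input: string): string[] {
  return input
    .toLowerCase()
    .split(/\s+/)
    .map((token) => token.replace(NON_WORD, ''))
    .filter((token) => token.length > 0);
}

const croatianGerman = toPrefixTokens('Šibenik Müller');
const mixedPunctuation = toPrefixTokens("O'Brien & Łukasz");
```

Punctuation-only tokens such as `&` are dropped entirely, so the tsquery builder never sees an empty term.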
### M-6 — `expandGraph` issues N+1-style queries for each direct ID set
The `LIMIT ${perBucketCap * direct.<X>Ids.length}` pattern (e.g., line 1387, 1463, 1486) scales the row cap by direct-match count. With limit=5 and 5 direct berth matches, that's 25 expansion rows fetched, then merged into the same 10-row `limit * 2` cap downstream — most fetched rows are thrown away. Minor cost; cap globally instead.
### M-7 — `searchDocuments` JOIN on `document_signers` has no port_id filter
Defense-in-depth: `ds.signer_email` ILIKE match is filtered through `d.port_id`, but the JOIN itself doesn't carry the port filter. Documents are FK-scoped, so no leak today, but the recommender pattern in this codebase (per CLAUDE.md) says "defense-in-depth port_id filter at every join." Apply the same here.
### M-8 — `import()` of `searchNavCatalog` inside `search()` is sync wrapped in two `await`s
Line 1867 — `await Promise.resolve((await import('@/lib/services/search-nav-catalog')).searchNavCatalog(...))`. The dynamic import is fine (avoids a circular dep), but `Promise.resolve` wrapping a sync result then awaiting it is dead ceremony. Inline or `await import(...).then(...)`.
### M-9 — Bucket ordering matches spec: notes second-to-last, navigation last ✓
`BUCKETS` in `command-search.tsx` (lines 60–81) — confirmed. Notes is index 14, Navigation is index 15. `buildFlatRows` preserves this order, and the comments at lines 75–79 and 1135–1138 document the rationale.
---
## What works well
- `scoreEntry` ladder (label-exact 100 → label-prefix 80 → label-substring 60 → kw-exact 50 → kw-prefix 35 → kw-substring 20) is correct and matches the spec.
- Paste detection: regex narrowness is fine because resolve-id is port-scoped and the fallback is normal search.
- The `NEVER_TSQUERY` / `NEVER_PHONE` sentinels (lines 385–386) correctly avoid Postgres-evaluation-order surprises that would otherwise break NULL guards in WHERE.
- `searchBerths` exact-match short-circuit (line 757) is the right UX call — typing "A1" when A1 exists should not also dump A10–A19.
- Catalog `requires` is permission-gated correctly and `searchNavCatalog` respects both `requires` and `superAdminOnly`.
- `mergeWithExpansion` uses a `Set` dedupe — direct match wins, no duplicate rows.
- `applyAffinity` is stable wrt original order (line 327) when the touched-set is empty.
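
The scoring ladder in the first bullet can be sketched as below. This is reconstructed from the ladder as stated in this report, not from the source of `scoreEntry`; the `CatalogEntry` shape and the keyword lists in the examples are assumptions, apart from the labels and keywords quoted in the spot-check table.

```typescript
interface CatalogEntry {
  href: string;
  label: string;
  keywords: string[];
}

// label-exact 100, label-prefix 80, label-substring 60,
// keyword-exact 50, keyword-prefix 35, keyword-substring 20, else 0.
function scoreEntrySketch(entry: CatalogEntry, query: string): number {
  const q = query.trim().toLowerCase();
  if (q.length === 0) return 0;

  const label = entry.label.toLowerCase();
  if (label === q) return 100;
  if (label.indexOf(q) === 0) return 80;
  if (label.indexOf(q) > 0) return 60;

  const keywords = entry.keywords.map((k) => k.toLowerCase());
  if (keywords.some((k) => k === q)) return 50;
  if (keywords.some((k) => k.indexOf(q) === 0)) return 35;
  if (keywords.some((k) => k.indexOf(q) > 0)) return 20;
  return 0;
}

const aiConfig: CatalogEntry = { href: '/admin/ai', label: 'AI configuration', keywords: [] };
const emailAccounts: CatalogEntry = {
  href: '/settings/email',
  label: 'Email accounts (SMTP / IMAP)',
  keywords: [],
};
const systemSettings: CatalogEntry = {
  href: '/admin/settings',
  label: 'System Settings',
  keywords: ['client portal', 'ai interest scoring'],
};
```

Run against the three spot-check queries, this ladder reproduces the scores in the table near the top of this report (80, 60, and 50), plus the 35 reported for System Settings on `ai`.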
---
## Recommendations, ranked
1. **Fix HIGH-1 immediately** — graph-expansion permission leak. One-line gate per bucket merge.
2. **Resolve HIGH-2 catalog duplicates** — product decision needed.
3. **Decide on M-1** — direct-first vs affinity-first. Document the chosen rule in the service docstring.
4. Clean up HIGH-3 dead code or wire it up to actually reorder buckets for email/phone-shaped queries.
5. Sweep through M-2 / M-5 / M-7 in a single pass — all are SQL-shape fixes in the same file.
---
## 12. Onboarding + first-run UX audit (onboarding-auditor)
# Audit · Onboarding + first-run UX (task #12)
Scope: `src/app/(dashboard)/[portSlug]/admin/onboarding`, `ensureSystemRoots`,
`seed-bootstrap.ts`, the required-settings gates (SMTP / branding / EOI
signers / recommender), empty-state copy on the main lists, and the
"what works out of the box" path after `POST /api/v1/admin/ports`.
Bottom line: the checklist is the right shape but **three of its nine auto-
checks read the wrong setting key**, the `forms` step links to nowhere,
fresh ports ship with zero domain data (no berths, no tags, no signers),
and nothing prompts a freshly-invited admin to even open the checklist.
A new port is technically usable for clients/companies but cannot
generate an EOI without manual SQL or several blind admin visits.
---
## CRITICAL
### C1. Three checklist auto-checks read keys that no admin page ever writes
`src/components/admin/onboarding-checklist.tsx` `STEPS` declares
`autoCheckSettingKey` values that don't match what the linked admin
pages actually persist:
| Step | Checklist reads | Admin page actually writes |
| ----------- | --------------------------- | --------------------------------------------------------------------------------- |
| `email` | `sales_email_smtp_host` | `smtp_host_override` (email page) / `sales_smtp_host` (sales-email card) |
| `documenso` | `documenso_api_url` | `documenso_api_url_override` |
| `settings` | `recommender_top_n_default` | nothing — `DEFAULT_RECOMMENDER_SETTINGS` covers all keys, admin never has to save |
Effect: a port that has actually been fully configured will still show
those three steps as incomplete. The "manual mark done" fallback is
hidden behind an extra click, and the percentage bar is permanently
stuck below 70 %. This makes the checklist actively misleading —
operators stop trusting it.
Fix: rename the keys to the `_override` variants (or both) and drop the
recommender auto-check (or check `heat_weight_*` whose presence
genuinely means "admin tuned it").
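
One possible shape for the "or both" option is an any-of key list per step. The sketch below assumes a plural `autoCheckSettingKeys` field, which does not exist today (the current field is the singular `autoCheckSettingKey`), and a flat settings map loaded for the port.

```typescript
// Hypothetical step shape with several acceptable setting keys.
interface ChecklistStep {
  id: string;
  autoCheckSettingKeys: string[];
}

// A setting counts as present when it is non-null and, for strings, non-blank.
function hasValue(value: unknown): boolean {
  if (value === undefined || value === null) return false;
  if (typeof value === 'string') return value.trim().length > 0;
  return true;
}

// The step is auto-checked when any one of its keys has been written.
function isAutoChecked(step: ChecklistStep, settings: Record<string, unknown>): boolean {
  return step.autoCheckSettingKeys.some((key) => hasValue(settings[key]));
}

// Keys taken from the "Admin page actually writes" column above.
const emailStep: ChecklistStep = {
  id: 'email',
  autoCheckSettingKeys: ['smtp_host_override', 'sales_smtp_host'],
};

// Hypothetical stored value, as if saved from the sales-email card.
const configuredViaSalesCard = isAutoChecked(emailStep, { sales_smtp_host: 'smtp.example.com' });
const onlyLegacyKey = isAutoChecked(emailStep, { sales_email_smtp_host: 'smtp.example.com' });
```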
### C2. `forms` step href is broken
`STEPS[8].href = '../'` resolves through the `Link` template to
`/${portSlug}/admin/../` → `/${portSlug}/` (the dashboard).
The intended target (`src/app/(dashboard)/[portSlug]/admin/forms/page.tsx`)
exists and is what the description references. Should be `'forms'`.
### C3. No gate on EOI signer identity
The checklist treats `documenso_api_url` (sic — see C1) as proof of
Documenso readiness, but the EOI pathway also requires
`documenso_developer_name`, `documenso_developer_email`,
`documenso_approver_name`, `documenso_approver_email`, and
`documenso_eoi_template_id`. Without these, `buildDocumensoPayload`
sends recipients with empty names/emails or the template-generate call
404s on a missing template id. There is no visible warning until a rep
tries to send the first EOI and Documenso bounces it. Add an
`autoCheckSettingKey` (or a derived multi-key check) for each so the
step doesn't go green until the developer + approver + template are all
populated.
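
A derived multi-key check could look like the sketch below: the step only goes green when every required Documenso setting is populated, and the list of missing keys can feed a warning. The function name and the settings-map shape are assumptions; the five keys are the ones listed in this finding.

```typescript
const EOI_REQUIRED_KEYS: string[] = [
  'documenso_developer_name',
  'documenso_developer_email',
  'documenso_approver_name',
  'documenso_approver_email',
  'documenso_eoi_template_id',
];

function isBlank(value: unknown): boolean {
  if (value === undefined || value === null) return true;
  return typeof value === 'string' && value.trim() === '';
}

// Returns the required keys that are still unset; an empty array means ready.
function missingEoiSettings(settings: Record<string, unknown>): string[] {
  return EOI_REQUIRED_KEYS.filter((key) => isBlank(settings[key]));
}

// Hypothetical port settings with the approver email left blank.
const partial = missingEoiSettings({
  documenso_developer_name: 'Dev Co',
  documenso_developer_email: 'dev@example.com',
  documenso_approver_name: 'Approver',
  documenso_approver_email: '',
  documenso_eoi_template_id: 42,
});
```

Returning the missing keys rather than a boolean lets the checklist name exactly which signer field still needs attention.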
### C4. `ensureSystemRoots` is awaited but its failure mode poisons port creation
`src/lib/services/ports.service.ts:46` awaits `ensureSystemRoots(...)`
**after** the `INSERT INTO ports` has committed (no surrounding tx).
The inline comment claims "non-fatal if this throws" — but a throw
propagates out of `createPort`, the route returns 500, and the operator
sees a failure even though the port row is live. The next admin action
self-heals through `ensureEntityFolder`'s fallback, but the failed
response leaves the operator suspicious and re-`POST`ing produces a
409 `slug already exists`. Either wrap port + folders in a transaction
or catch + log + continue here so the error message matches the
comment's promise.
---
## HIGH
### H1. `createPort` seeds **nothing** beyond folders
`createPort` only writes the port row and the three system folders. It
does not seed:
- Default tags (the checklist asks for "starter tags" but offers no
one-click default set)
- Default brochure (rep can't send the "send brochure" flow until one
is uploaded; nothing flags this)
- Berths (no UI to add berths; the only path is
`scripts/import-berths-from-nocodb.ts`)
- `berth_rules` (defaults vary per trigger and are off for
`berth_unlinked` — fine, but the absence isn't surfaced)
- `email_from_address` / `branding_app_name` (used in emails but not
validated; sending mail with a blank from address fails silently
on most providers)
- Recommender weight rows (defaults work but the onboarding step
reads the absence as "incomplete" — see C1)
Net effect: an admin can finish every onboarding step and still have a
port that can't generate an EOI (no berths, no developer/approver,
possibly no template) or send a brochure (no brochure exists). The
checklist needs either (a) a "Seed defaults" button on port creation
that writes recommended starter rows, or (b) explicit failing gates per
domain.
### H2. Storage step has no in-app action
`autoCheckSettingKey: 'storage_backend'` only flips green when a row
exists in `system_settings` — but the default backend (`s3`) is
inferred in code from `loadStorageConfig()` when no row is present, so
a perfectly functional s3-backed install never writes that row and the
step stays red forever. `/admin/storage` is read-only (status panel +
test connection); switching backends still requires a manual
`UPDATE system_settings` + `pnpm tsx scripts/migrate-storage.ts`.
Either add the writer UI or change this step to verify
`getStorageBackend()` round-trips a probe object.
### H3. Roles step auto-ticks immediately
`/api/v1/admin/roles` → `listRoles()` returns **all** roles unfiltered
by `portId`, so the six global system roles created by `seedBootstrap`
make the count > 0 on the freshest possible port. The step turns
green without the admin doing a thing, and the description "Create
roles & assign users" implies they did. Auto-check should be a
per-port subset, e.g. count rows in `user_port_roles` for `portId`.
### H4. No first-run prompt anywhere outside the buried nav link
The onboarding checklist lives under Admin → Tenancy → "Onboarding
checklist" (`admin-sections-browser.tsx:300`), described as
"read-only references" (which it is not — it has working manual
checkboxes). A freshly invited port-admin who logs in lands on
`/{portSlug}` and sees empty stat cards, with no banner, toast, or
"Finish setup" CTA pointing at the checklist. Discoverability is
effectively zero unless they know the URL. At minimum: dashboard banner
when `< X` of the auto-checks are passing, dismissible per user.
### H5. Berth list empty state misleads fresh ports
`src/components/berths/berth-list.tsx`:
`title="No berths found", description="Berths are imported from external sources. Adjust your filters..."`.
On a port with **zero** berths there is nothing to filter — the copy
implies the data exists but is hidden. Should branch on
`totalCount === 0 && noFiltersActive` and link to `/admin/import` with
the exact `pnpm tsx scripts/import-berths-from-nocodb.ts` command, or
to a future in-app importer.
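
The branch could be as small as the sketch below. The copy strings for the fresh-port case are placeholders rather than approved wording, and `berthEmptyState` is a hypothetical helper; only the condition `totalCount === 0 && noFiltersActive` and the import command come from this finding.

```typescript
interface EmptyStateCopy {
  title: string;
  description: string;
  // Optional link target rendered as a call to action.
  href?: string;
}

function berthEmptyState(totalCount: number, noFiltersActive: boolean): EmptyStateCopy {
  if (totalCount === 0 && noFiltersActive) {
    // Fresh port: nothing has been imported yet, so filters are irrelevant.
    return {
      title: 'No berths yet',
      description:
        'This port has no berths. Import them with: pnpm tsx scripts/import-berths-from-nocodb.ts',
      href: '/admin/import',
    };
  }
  // Data exists but the current filters exclude all of it.
  return {
    title: 'No berths found',
    description: 'No berths match the current filters.',
  };
}

const freshPort = berthEmptyState(0, true);
const filteredOut = berthEmptyState(0, false);
```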
### H6. Two competing `EmptyState` components
`src/components/ui/empty-state.tsx` uses `{body, actions}`, while
`src/components/shared/empty-state.tsx` uses `{description, action}`.
Different list pages consume different ones (e.g. clients/yachts use
`shared/`, documents-hub uses `ui/`). Same visual but divergent props
will trip up any future "improve onboarding copy" pass. Consolidate.
---
## MEDIUM
### M1. Branding auto-check anchors on logo only
`branding_logo_url` is the proxy for "branding done", but
`branding_app_name` and `branding_primary_color` are more functionally
load-bearing (app name shows in email subjects, color in CTAs).
Consider `branding_app_name` as the gate — or any-of.
### M2. Tags step has no "Apply default set" affordance
`/admin/tags` starts blank. Onboarding tells the operator to "define
starter tags" but offers no recommended palette. Add a one-click "Apply
recommended set (Hot / Warm / Cold / VIP / Press)" or similar so
operators have an opinionated baseline they can edit.
### M3. Settings auto-check confuses "value exists" with "operator chose it"
Once the admin opens `/admin/settings` and saves without changing
anything, `settings-manager.tsx` writes the default back as a real row
and the checklist turns green. That's a side effect, not informed
consent. Use a sentinel ("admin saw this page") rather than a
defaultable knob.
### M4. `admin-sections-browser` description is wrong
"Setup checklist for fresh ports (read-only references)" —
`OnboardingChecklist` has working `toggleManual` + persisted state.
Update the copy or it discourages clicking in.
### M5. Vocabularies are global-code-constant
Interest sources / statuses / contact reasons come from
`VOCABULARIES` in `src/lib/vocabularies.ts`, not from per-port settings.
Fine for MVP, but the onboarding doc says "vocabularies" implying
configurability. Either expose per-port overrides or remove the
mention.
### M6. Documents hub root view doesn't tell admins why `Clients/`/`Companies/`/`Yachts/` exist
On first visit to `/{portSlug}/documents`, the system roots are
present (from `ensureSystemRoots`) but with zero children. Empty-state
copy ("Upload a file...") doesn't explain that the three locked
system folders will auto-populate as deals progress.
---
## What works well
- `seedBootstrap` is genuinely idempotent and safe to re-run.
- `ensureSystemRoots` race semantics are clean; the partial-unique
index pattern is exemplary.
- `DEFAULT_RECOMMENDER_SETTINGS` plus `loadRecommenderSettings`'s
layered (port > global > default) lookup means recommender is the
one subsystem that genuinely works zero-config.
- The checklist UI affordances (progress bar, auto-detected hint,
manual-override button) are solid; only the wiring is wrong.
---
## 27. Type-safety + drizzle leak audit (types-auditor)
# Type-Safety + Drizzle Leak Audit — Task #27
Branch: `feat/documents-folders` · 2026-05-12
## Top-line counts (src/, ts+tsx)
| Pattern | Count |
| ------------------------------------------------------ | ----------------------------- |
| `as unknown as` | 72 |
| `as any` (raw, mostly route hrefs) | 69 |
| `// eslint-disable @typescript-eslint/no-explicit-any` | 73 |
| `// @ts-ignore` / `// @ts-expect-error` | 0 |
| `as Route` (typed-routes cast) | 17 |
| `$inferSelect` / `$inferInsert` direct exports | **0** |
| Bare `: any` parameter (not eslint-disabled) | 2 functional + 2 declarations |
**Good news up front**: no `@ts-ignore` / `@ts-expect-error` anywhere, and no `$inferSelect` type leaked through the API boundary as a public response contract. Service return shapes go through `{ data }` envelopes; drizzle row types stay internal.
---
## CRITICAL
### 1. `tx: any` in client-restore service — bypasses Drizzle's transaction type contract
`src/lib/services/client-restore.service.ts:361`
```ts
tx: any,
```
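This parameter receives a Drizzle transaction client and threads writes through 12+ downstream tables in a multi-step restore. A typo'd table or wrong column type goes undetected at compile time. Type it as the first parameter of the callback that `db.transaction` accepts, i.e. `Parameters<Parameters<typeof db.transaction>[0]>[0]` (note that `Parameters<typeof db.transaction>[0]` on its own is the callback type, not the transaction handle; see `src/lib/db/utils.ts:17` for the same shape applied via `as unknown as`).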
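
The type extraction can be demonstrated without Drizzle. In the sketch below, `db` is a synchronous stand-in with a hypothetical `FakeTx` handle (real Drizzle transactions are async and the handle is a full query builder); the point is only that the transaction handle is the first parameter of the callback passed to `db.transaction`.

```typescript
// Hypothetical stand-in for the transaction handle.
interface FakeTx {
  writes: string[];
  insert(table: string): void;
}

// Stand-in client: runs the callback with a fresh handle and returns its result.
const db = {
  transaction<T>(fn: (tx: FakeTx) => T): T {
    const writes: string[] = [];
    const tx: FakeTx = {
      writes,
      insert: (table: string) => {
        writes.push(table);
      },
    };
    return fn(tx);
  },
};

// Outer Parameters<...>[0] is the callback; inner [0] is its `tx` argument.
type DbTransaction = Parameters<Parameters<typeof db.transaction>[0]>[0];

// A helper typed this way rejects misspelled methods at compile time.
function restoreChildRows(tx: DbTransaction, tables: string[]): void {
  for (const table of tables) tx.insert(table);
}

// Table names are illustrative only.
const written = db.transaction((tx) => {
  restoreChildRows(tx, ['clients', 'interests']);
  return tx.writes.slice();
});
```

Exporting one alias such as `DbTransaction` from a shared module would let every service that threads a transaction through helpers use the same type.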
### 2. `useQuery<any>` + `apiFetch<{ data: any }>` on berth detail page
`src/components/berths/berth-detail.tsx:20–25, 60`
```ts
const { data, isLoading } = useQuery<any>({...});
apiFetch<{ data: any }>(`/api/v1/berths/${berthId}`)
const berth = data as any;
```
Three escape hatches stacked on the highest-traffic detail page. Every field access downstream is unchecked — a service-side rename `mooringNumber` → `mooring_number` would silently render `undefined`. Replace with a `BerthDetailResponse` type co-located with the service.
### 3. Portal-auth and public routes bypass `parseBody`
6 portal + 3 public-intake routes use raw `await req.json()` instead of the project-standard `parseBody(req, schema)`:
- `src/app/api/portal/auth/{forgot-password,reset-password,sign-in,activate,change-password}/route.ts`
- `src/app/api/auth/set-password/route.ts`
- `src/app/api/public/{residential-inquiries,website-inquiries,interests}/route.ts`
- `src/app/api/v1/admin/custom-fields/[fieldId]/route.ts` (intentional — comment explains)
CLAUDE.md mandates `parseBody` so 400 errors have field-level shape the toast hook recognizes. ZodErrors from `schema.parse` after raw `req.json()` become generic 500s. Custom-fields one is justified; the other 9 are not.
---
## HIGH
### 4. `mergePerms` double-cast in new permission-overrides route
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:254, 259`
```ts
const out = { ...(base as unknown as Record<string, Record<string, boolean>>) };
return out as unknown as RolePermissions;
```
Comment acknowledges this duplicates `withAuth`'s `deepMerge`. Either reuse `deepMerge` from `helpers.ts` (lines 202–205, 234–237 already use the same pattern) or extract a typed helper `mergePermsTyped(base, patch): RolePermissions`. Two implementations of permission merge is a divergence risk.
### 5. Audit-log `as unknown as Record<string, unknown>` epidemic
21 occurrences across services that write `oldValue` / `newValue` to `audit_logs`:
- `invoices.ts` × 7, `expenses.ts` × 6, `documents.service.ts` × 2, `berths.service.ts` × 2, `companies.service.ts`, `company-memberships.service.ts`, `yachts.service.ts`, `document-templates.ts`, `ocr-config.service.ts` × 2, `ai-budget.service.ts` × 2
Wide repetition of the same widening cast is a smell — every service does the same dance to fit Drizzle row types into the audit JSONB column. **Fix**: introduce `toAuditJson<T>(row: T): Record<string, unknown>` once in `src/lib/services/audit.ts` (same pattern `gdpr-bundle-builder.ts` already uses — `toJsonRow`, line 152 comment explicitly cites avoiding this). Removes 21 unsafe casts in one shot.
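
A minimal sketch of such a helper, assuming audit values only need to be JSON-serialisable. This version round-trips through `JSON.stringify`, which is one possible implementation and not necessarily how `toJsonRow` works; it converts `Date` values to ISO strings and drops `undefined` fields, and it would throw on `bigint` columns.

```typescript
// Converts any row object into a plain JSON-compatible record.
// JSON.parse returns an untyped value, so no `as unknown as` cast is needed.
function toAuditJson<T extends object>(row: T): Record<string, unknown> {
  return JSON.parse(JSON.stringify(row)) as Record<string, unknown>;
}

// Hypothetical invoice row for illustration.
const snapshot = toAuditJson({
  id: 'inv_1',
  total: 120.5,
  issuedAt: new Date(0),
  note: undefined,
});
```

Because the result is a detached copy, later mutation of the original row cannot alter what was written to `oldValue` / `newValue`.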
### 6. `next/typedRoutes` defeated by 49 `as any` href casts
`router.push(..)` and `<Link href={..}>` with template-literal dynamic URLs get widened to `string`, which isn't assignable to `Route<string>`. Components compensate via `as any` (49 sites) or `as Route` (17 sites). Hotspots: `command-search.tsx` (10), `topbar.tsx` (10), `user-menu.tsx` (8), `reservation-list.tsx` (6), residential lists/headers (8).
This nullifies the value of `experimental.typedRoutes` everywhere it matters most — dynamic navigation in shells, search, and detail headers. **Fix**: introduce `route(path: string): Route` helper in `src/lib/routes.ts` that does the cast in one audited place; ban `as any`/`as Route` for href via ESLint rule. Bonus: makes it possible to migrate to a real typed-routes wrapper later.
### 7. `RolePermissions` ↔ `Record<string, unknown>` round-trip in withAuth chain
`src/lib/api/helpers.ts:203, 235` — two layers of permission merge cast both directions to satisfy `deepMerge`'s untyped signature. `deepMerge` should be generic over `<T extends Record<string, unknown>>` and accept `RolePermissions` directly. Same problem as #4; same fix.
---
## MEDIUM
### 8. `Record<string, unknown>` JSONB writes without zod re-parse at write time
Server-side blobs stored to `system_settings.value`, `userProfiles.preferences`, audit JSONB columns:
- `ocr-config.service.ts:79, 85` → `value: value as unknown as Record<string, unknown>`. Upstream zod parse exists, so safe in practice, but the cast hides the relationship.
- `ai-budget.service.ts:88, 94` — same pattern.
- `me/route.ts:148–168` → **good model**: explicit `ALLOWED_PREF_KEYS` allow-list + 8KB size cap + zod via `parseBody`. Use as the template for the other two.
- `components/admin/settings/settings-manager.tsx:237, 250` and `settings-form-card.tsx:79–93` — client-side `Record<string, unknown>` state. Admin-only surfaces, low risk, but no per-key shape check before PUT.
### 9. Dynamic-sort key cast in invoices list
`src/lib/services/invoices.ts:199`
```ts
column: invoices[query.sort as keyof typeof invoices] as unknown as PgColumn,
```
The `query.sort` zod schema should already enum-restrict sort keys to actual columns; if so, the inner `as keyof typeof invoices` is redundant. If `query.sort` is a free string, this is also a SQL-shape risk surface (mitigated only because Drizzle column proxy will throw on unknown keys). Verify the validator enum is exhaustive.
### 10. Template-preview accepts arbitrary content as TipTap
`src/app/api/v1/admin/templates/preview/route.ts:32`
```ts
const doc = body.content as unknown as TipTapNode;
```
Admin-gated, so blast radius limited, but the renderer downstream assumes well-formed TipTap JSON. Add a minimal `tipTapNodeSchema` zod check at the boundary — a malformed node tree would otherwise throw deep in the renderer with a useless stack trace.
### 11. Node stream ↔ Web stream casts
5 sites cast between `NodeJS.ReadableStream`, `Readable`, and `ReadableStream<Uint8Array>` via `as unknown as`:
- `src/app/api/storage/[token]/route.ts:103`
- `src/lib/services/expense-pdf.service.ts:507–510`
- `src/lib/services/document-sends.service.ts:374`
- `src/lib/services/brochures.service.ts:255`, `berth-pdf.service.ts:350`
Type system gap is genuine (Node Readable ↔ Web ReadableStream don't have a structural match in lib.dom + @types/node). Centralize in `src/lib/storage/stream-bridge.ts` with named helpers `toWebStream(readable)` / `toNodeStream(web)` — removes the casts from feature code.
### 12. `as unknown as { destroy: () => void }` stream cleanup
`brochures.service.ts:255` and `berth-pdf.service.ts:350` reach into stream internals because the storage backend's return type doesn't expose `destroy`. Add `destroy?(): void` to the `StorageBackend.get()` return type so cleanup is part of the contract.
### 13. `as unknown as string` for pdfme BLANK_PDF sentinel
9 PDF templates carry `basePdf: 'BLANK_PDF' as unknown as string`. This is a known pdfme upstream type-def bug — the string literal `'BLANK_PDF'` is accepted at runtime but typed as `Uint8Array | string`. Wrap once: `const BLANK_PDF = 'BLANK_PDF' as unknown as string;` exported from `src/lib/pdf/constants.ts`. Removes 9 casts.
### 14. Drizzle self-FK uses `: any`
`src/lib/db/schema/system.ts:43`
```ts
revertOf: text('revert_of').references((): any => auditLogs.id),
```
Standard Drizzle workaround for forward-references, but the official typing is `(): AnyPgColumn`. Swap.
### 15. `phone-parse.ts` metadata require
`src/lib/dedup/phone-parse.ts:25` → `const metadata: any = require(...)` for `libphonenumber-js/metadata.min.json`. CommonJS interop hack; replace with `import metadata from 'libphonenumber-js/metadata.min.json'` + `resolveJsonModule: true` (already on in `tsconfig.json`).
---
## Drizzle leak check — clean
Searched for `$inferSelect` / `$inferInsert` exports crossing the API boundary: **zero hits in src/**. Services return Drizzle row types internally, but every API route wraps them in `{ data }` envelopes (confirmed by spot-check across invoices, berths, clients, documents). The `Record<string, unknown>` widenings flagged above happen at write time into JSONB columns, not at read time across the API surface. No PII columns or internal-only fields slip through.
---
## Recommended sequence
1. **Critical first**: fix #1 (`tx: any`), #2 (berth detail `<any>`), #3 (parseBody for portal auth) — 3–4h.
2. **One helper, big win**: `toAuditJson<T>` (#5) — removes 21 casts.
3. **Route helper**: `route()` (#6) — removes 49+ `as any` and unblocks future real typed-routes adoption.
4. **Stream bridge**: centralize Node↔Web conversion (#11) — removes 5 casts.
5. **PDF constant**: extract `BLANK_PDF` (#13) — removes 9 casts.
Net effect: ~85 of 145 escape hatches removed with five focused refactors, and the remaining ones become small enough to justify case-by-case.
---
## 31. Auth flow polish audit (auth-flow-auditor)
# Auth Flow Polish Audit — Task #31
Scope: CRM `(auth)/` pages (login, reset-password, set-password), portal `(portal)/` pages (login, activate, reset-password, forgot-password), email-change confirm/cancel landing, `/api/auth/resolve-identifier`, `withAuth` gates and Better Auth config.
Severities: **CRITICAL** = silent security/data risk · **HIGH** = real user can hit a dead-end · **MEDIUM** = polish/copy that erodes trust.
---
## CRITICAL
### C1 — Password reset does not revoke existing sessions on either flow
Better Auth's `sendResetPassword` (`src/lib/auth/index.ts:73`) is configured with no `onPasswordReset` / `revokeAllSessions` hook; the same is true for `resetPassword` in `portal-auth.service.ts:428`. Outcome: a compromised cookie keeps working forever after the legitimate owner does the "forgot password" dance. This is the canonical "session-bumping on reset" guarantee users assume and we're not delivering it. Add a step that deletes every row in `sessions` (CRM) and `portal_auth_tokens` + active `portal_sessions` (portal) for the affected user inside the same transaction that writes the new password hash.
### C2 — Disabled CRM user retains an active session cookie
`withAuth` rejects with 403 "Account disabled" when `userProfiles.isActive === false` (`src/lib/api/helpers.ts:152`), but:
- `auth.api.signInEmail` itself doesn't know about `isActive` — a disabled user can still complete `/login` and be redirected to `/dashboard`, where every API call then 403s.
- Setting `isActive=false` in `updateUser` (`users.service.ts:227`) never deletes the existing `sessions` row, so an already-logged-in disabled user keeps every page that doesn't hit `/api/v1` working, and any cached SSR page loads.
Fix: (a) on signIn add a profile lookup and reject before issuing the cookie; (b) on `isActive=false` flip, delete from `sessions` for that userId; (c) middleware should treat a 403 from an API as a global redirect to `/login?reason=disabled`.
---
## HIGH
### H1 — No dedicated "this link expired / already used / your account is disabled" landing pages
Every token failure today is surfaced as a toast on a still-functional form, or as a 400 JSON error that the user only sees if they actually submit the form.
- `set-password/page.tsx` (CRM) handles `!token` (the "Link is missing or invalid" branch, line 73-88) but does NOT distinguish "token present but expired" from "token present but already used" — both surface as a `toast.error(body.message)` and leave the form interactive, inviting an infinite retry loop.
- The portal `password-set-form.tsx` is identical (line 63-67): expired/used tokens render as a red `<p>` under the form.
- There is no `/account-disabled` page; the user just sees `403 Account disabled` text from the JSON response in DevTools and gets stuck on `/dashboard` rendering nothing (or rendered SSR shell with broken API calls).
Recommend: a single `<TokenStateMessage state="expired" | "used" | "invalid" />` component that the page server-renders by doing a HEAD-style `validate` call on mount, plus a `/disabled` route the middleware redirects to.
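A minimal sketch of the state classification that could sit behind such a component. The `TokenRow` shape, its field names, and `classifyToken` are hypothetical and not taken from the codebase; the point is only that the server decides the state once, before any form is rendered.

```typescript
// Hypothetical shape of a stored token row; field names are assumptions.
interface TokenRow {
  expiresAt: Date;
  usedAt: Date | null;
}

type TokenState = 'valid' | 'expired' | 'used' | 'invalid';

// Classify on the server so the page can render a dedicated message
// instead of an interactive form that can only fail on submit.
function classifyToken(row: TokenRow | undefined, now: Date = new Date()): TokenState {
  if (!row) return 'invalid'; // unknown or malformed token
  if (row.usedAt !== null) return 'used'; // single-use marker already set
  if (row.expiresAt.getTime() <= now.getTime()) return 'expired';
  return 'valid';
}
```

Only the `'valid'` state would render the password form; the other three map onto the `state` prop of the proposed message component.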
### H2 — Email-change `/settings?emailChange=confirmed|cancelled` query param is never consumed
The confirm/cancel redirect URLs at `api/v1/me/email/{confirm,cancel}/[token]/route.ts:71/50` set `?emailChange=confirmed|cancelled`. Grep shows ZERO consumer in `src/app` or `src/components`. The redirect succeeds, but the user lands on the bare `/settings` page with no banner, no toast, no confirmation — for a security-sensitive action this looks broken / makes users wonder if it took. Wire a banner in `user-settings.tsx` keyed off `useSearchParams().get('emailChange')`.
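One way the consumer could look, sketched as a pure function so it can be unit-tested outside React. The function name, the return shape, and the banner copy are assumptions; only the `emailChange` param and its two values come from the finding above.

```typescript
type EmailChangeBanner = { tone: 'success' | 'info'; text: string };

// Map the `emailChange` query param to banner copy.
// Unknown or missing values render nothing.
function emailChangeBanner(search: URLSearchParams): EmailChangeBanner | null {
  switch (search.get('emailChange')) {
    case 'confirmed':
      return { tone: 'success', text: 'Your email address has been updated.' };
    case 'cancelled':
      return { tone: 'info', text: 'The pending email change was cancelled.' };
    default:
      return null;
  }
}
```

In `user-settings.tsx` the argument would be the value returned by `useSearchParams()`, which exposes the same `get()` method.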
### H3 — Cancel-email-change link is GET-only with no friction
`api/v1/me/email/cancel/[token]/route.ts` is a one-click GET that wipes the pending change. Gmail/Outlook link prefetchers, antivirus URL scanners, and corporate proxies will auto-fetch links and cancel a legitimate request without the user ever clicking. Pattern for safety: GET renders a confirmation page (`Are you sure?` button), POST executes. Same fix needed on `confirm` if a link-scanner could pre-confirm an attacker's address-change before the real user sees the cancel link.
### H4 — `set-password` (CRM) success path has no auto-sign-in
`set-password/page.tsx:64` toasts "Password set successfully" then routes to `/login`. The user has to type their email and the password they just chose, again. For invite flows this is the worst conversion point. Either (a) auto-sign-in via `auth.api.signInEmail` after the consume call returns, or (b) at minimum prefill the email field on `/login`. (Portal's `activate` flow has the same problem in `password-set-form.tsx:97`).
### H5 — Reset-password expiry shows no time estimate; users hit "expired" cold
CRM `(auth)/reset-password/page.tsx:62` says "we have sent a password reset link" with no TTL. Better Auth default reset token expiry is 1 hour (the email body on line 79 mentions "expires in 1 hour" but the success page doesn't echo this). Portal forgot-password (`forgot-password/page.tsx:43`) correctly says "expires in 30 minutes". Make the CRM message say "Link expires in 1 hour" so users at the airport know whether to wait.
### H6 — `resolve-identifier` returns 429 with `{ email: '' }` which bypasses the synthetic miss path
`api/auth/resolve-identifier/route.ts:56` returns `{ email: '' }` on rate-limit. Client (`login/page.tsx:56`) does `payload.email?.trim() || identifier` — so the original username (without `@`) is passed into `authClient.signIn.email`. Better Auth rejects it as "invalid email format" instead of "invalid credentials", which is a distinguishably-different error from the normal miss case and re-opens the enumeration channel the synthetic-email defence was built to close. Return `{ email: syntheticEmail(raw) }` on the 429 path too (status code can stay 429).
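The intended behaviour can be sketched as follows. The real `syntheticEmail` is not shown in this audit, so the FNV-1a version here is a stand-in, and `resolveIdentifier`, its parameters, and the `invalid.example` domain are all hypothetical. What matters is that the 429 body has the same shape as a normal miss.

```typescript
// Stand-in for the project's syntheticEmail(); the real derivation is not
// shown in the audit, so this FNV-1a hash is purely illustrative.
function syntheticEmail(raw: string): string {
  const input = raw.trim().toLowerCase();
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  const hex = ('00000000' + h.toString(16)).slice(-8);
  return 'u' + hex + '@invalid.example';
}

interface ResolveResult {
  status: 200 | 429;
  body: { email: string };
}

// Every branch returns an email-shaped string, so the client never falls
// back to the bare username and Better Auth never sees a malformed address.
function resolveIdentifier(
  raw: string,
  lookup: (identifier: string) => string | undefined,
  rateLimited: boolean,
): ResolveResult {
  if (rateLimited) {
    return { status: 429, body: { email: syntheticEmail(raw) } };
  }
  return { status: 200, body: { email: lookup(raw) ?? syntheticEmail(raw) } };
}
```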
---
## MEDIUM
### M1 — Login error toast leaks Better Auth wording
`login/page.tsx:65` uses `result.error.message ?? 'Invalid credentials'`. Better Auth surfaces strings like "User not found" / "Invalid password" / "Email or password is invalid" depending on the path — the first two are an enumeration leak that bypasses the resolve-identifier defence. Always overwrite to a fixed `'Email or password is incorrect.'` and only log the underlying reason server-side.
### M2 — Portal sign-in error message is friendlier than CRM
Portal: `'Invalid email or password'`. CRM: raw Better Auth message OR `'Invalid credentials'`. Unify on "Email or password is incorrect" everywhere (matches CRM `(auth)/login/page.tsx:65` and portal `(portal)/portal/login/page.tsx:37`) — the CRM phrasing "Invalid credentials" is jargon.
### M3 — `set-password` divergence: form-validation TTL mismatch
CRM `set-password/page.tsx` requires min 9 chars (line 16). Portal `password-set-form.tsx` also 9 (line 23). But the activation/CRM invite TTL diverges silently: CRM invite = 72h (`crm-invite.service.ts:17`), portal activation = 72h (`portal-auth.service.ts:25`), portal reset = 30min, CRM reset = 60min. The "request a new link" copy in the invalid-token branch should embed the actual TTL so admins debugging "why doesn't this work" don't have to read the schema.
### M4 — `set-password` (CRM) error fallback is inconsistent shape
`(auth)/set-password/page.tsx:60` reads `body.message ?? body.error` — but `api/auth/set-password/route.ts` uses `errorResponse(err)` which emits `{ error }`. The `message` key is dead code, fine, but the legacy comment on `set-password/route.ts:24` says envelopes were normalised in commit "auditor-F §32" — the page should match: `body.error ?? 'Failed to set password.'`.
### M5 — Portal forgot-password 30-min TTL is short for international clients
30 minutes is aggressive when emails routinely sit in spam quarantine for 5-15 minutes before clearing. CRM reset's 60min is a sensible floor. Either lift to 60min or surface the 30min countdown more aggressively in the email + landing page.
### M6 — `login` Suspense fallback for set-password renders empty shell
`set-password/page.tsx:143` falls back to `<BrandedAuthShell>{null}</BrandedAuthShell>` — a flash of empty branded card while `useSearchParams` resolves. Replace with a skeleton or "Verifying link…" microcopy; the empty state reads as "page broken" for ~100ms on slow networks.
### M7 — `/portal/activate` Suspense fallback is unbranded grey div
`portal/activate/page.tsx:8` falls back to a plain `Loading…` div — jarring after the branded email. Mirror the CRM `set-password` pattern with `<BrandedAuthShell>`. Same on `portal/reset-password/page.tsx:8`.
### M8 — "Request a new link" target on portal `set-password` invalid-token is wrong for activation flow
`password-set-form.tsx:86` always points to `/portal/forgot-password`. For activation the user has no password yet — `/portal/forgot-password` returns the silent 200 and the admin has to manually `resendActivation`. Branch on `endpoint`, or give portal users a self-service "Resend activation".
### M9 — No "Remember me" / shared-device control
Better Auth session `expiresIn: 24h` (`auth/index.ts:94`); portal token also 24h. No checkbox to shorten on a shared device, no copy saying so. Add a session-only cookie path.
### M10 — Portal login `next` param is unvalidated
`portal/login/page.tsx:42`: `router.replace(next as never)` where `next = search.get('next')`. Open redirect: `/portal/login?next=https://evil.example` navigates cross-site after sign-in. Validate `next.startsWith('/portal/')` before using.
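A minimal sketch of the validation, slightly stricter than a bare `startsWith` so that `..` segments cannot escape the portal prefix. `safePortalNext` and the `PORTAL_HOME` fallback route are hypothetical names, not existing code.

```typescript
const PORTAL_HOME = '/portal/dashboard'; // hypothetical default landing route
const DUMMY_ORIGIN = 'https://placeholder.invalid';

// Returns a same-site path under /portal/, or the default landing route.
function safePortalNext(next: string | null): string {
  if (!next) return PORTAL_HOME;
  // Rejects absolute URLs, protocol-relative URLs (//host) and backslash tricks.
  if (!next.startsWith('/portal/') || next.includes('\\')) return PORTAL_HOME;
  // Resolve against a dummy origin so `..` segments are collapsed before checking.
  const resolved = new URL(next, DUMMY_ORIGIN);
  if (resolved.origin !== DUMMY_ORIGIN) return PORTAL_HOME;
  if (!resolved.pathname.startsWith('/portal/')) return PORTAL_HOME;
  return resolved.pathname + resolved.search + resolved.hash;
}
```

The login page would then call `router.replace(safePortalNext(search.get('next')))` instead of passing the raw param through.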
---
## Summary
- **2 CRITICAL** (no session-revoke on password reset; disabled-user keeps session)
- **6 HIGH** (no expired/used/disabled landing pages; emailChange success param consumed by nobody; GET-cancellation prefetch risk; no auto-sign-in after set-password; missing TTL copy; rate-limit branch leaks enumeration)
- **10 MEDIUM** (copy inconsistencies, branded-shell drift, open redirect in portal `next`, no shared-device session control)
Token mechanics themselves are sound (32-byte CSPRNG, SHA-256 storage, single-use markers, dual rate-limit buckets, anti-enumeration silent-200 on forgot-password, dummy-hash timing equalisation in portal signIn). The polish gaps are in _what happens after_ a token succeeds or fails — landing pages, banners, session lifecycle.
---
## 30. Image + asset hygiene audit (asset-auditor)
# Image + Asset Hygiene Audit (Task #30)
Scope: uploaded-image handling across avatar, brochure, berth-PDF, generic file
uploader, receipt scanner, and the new portrait avatar cropper. EXIF, MIME
spoof, polyglots, server-side resize, dimension caps, SVG/GIF risk, filename
sanitisation, Content-Disposition, per-surface size caps.
Files reviewed (highlights):
- `src/lib/constants/file-validation.ts` (allow-list + magic bytes)
- `src/lib/services/files.ts` (`uploadFile` + previews)
- `src/lib/services/storage.ts` (`sanitizeFilename`)
- `src/app/api/v1/me/avatar/route.ts` + `src/components/shared/image-cropper-dialog.tsx`
- `src/app/api/v1/files/upload/route.ts`
- `src/app/api/storage/[token]/route.ts` (filesystem-backend proxy)
- `src/lib/services/berth-pdf.service.ts`, `brochures.service.ts`, `expense-pdf.service.ts`
- `src/app/api/v1/documents/[id]/download/[...slug]/handlers.ts`
---
## CRITICAL
### C1. No server-side image normalisation on avatar / generic image uploads — EXIF (GPS) is persisted and served verbatim
**Where:** `src/app/api/v1/me/avatar/route.ts:46-68`, `src/lib/services/files.ts:45-72`.
The `/api/v1/me/avatar` handler takes the multipart body, checks size (≤2 MB),
runs `bufferMatchesMime` (first-bytes-only) and writes the bytes straight to
storage. The "cropper" (`image-cropper-dialog.tsx`) does run a Canvas re-encode
client-side, which incidentally drops EXIF — but a malicious user (or simply
any user with curl) can bypass the cropper by POSTing the raw image directly
to the same endpoint. The route accepts whatever JPEG/PNG/WebP/GIF arrives and
the generic uploader (`/api/v1/files/upload`) has the same property: no
`sharp().rotate().toBuffer()` normalisation, no EXIF strip, no ICC profile
reset, no re-encode.
Result: every photo uploaded from a phone — receipt scans (`/expenses/scan`),
client/yacht photo attachments via `FileUploadZone`, manual avatar PUTs — is
served from MinIO with full EXIF: GPS latitude/longitude, device serial,
photographer name, original capture timestamp.
GDPR/PII exposure (audit #8 already flagged related issues). The previewer
just hands a presigned URL straight to the browser, so any rep / client portal
visitor with a download URL gets the metadata.
**Fix:** run every accepted `image/*` payload through
`sharp(buf).rotate().toBuffer()` (or `.toFormat(jpeg|png|webp)`) in
`uploadFile()` before `backend.put(...)`. `.rotate()` with no arguments applies
the EXIF orientation, and sharp drops metadata from its output unless
`.withMetadata()` is called, so that call should be left out. Sharp is already
a dependency (used by `expense-pdf.service.ts`). Same wrapper enforces a
max-pixel cap (see C2).
### C2. No max-dimension / decompression-bomb gate on uploaded images
**Where:** `src/lib/services/files.ts:38-72`, `src/app/api/v1/me/avatar/route.ts`.
`MAX_FILE_SIZE` is 50 MB (avatar: 2 MB). Neither path inspects width/height.
A 2 MB highly-compressed PNG can decode to >300 megapixels (e.g. a 30000×30000
palette PNG). Any downstream consumer that decodes:
- the `<AvatarImage>` in the React UI,
- `pdf-lib`/`pdfme` embedding the image into a generated client/interest PDF,
- `sharp` resize in `expense-pdf.service.ts`,
will OOM or pin a worker. The receipt-PDF service runs sharp but only when
`raw.byteLength > 500 KB`, so a 400 KB decompression-bomb PNG skips the
threshold and `sharp` is called on the embed path with no dimension cap, and
PDFKit attempts to embed the raw bytes.
**Fix:** in the normalisation step from C1, cap output to `MAX_DIMENSION = 4096`
(or 2048 for avatar) using `sharp.resize({fit:'inside',withoutEnlargement:true})`
and reject any source whose `width * height` (as reported by `metadata()`)
exceeds `LIMIT` before allocating the decode buffer.
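To illustrate the "reject before decode" idea without pulling in sharp, here is a dependency-free sketch that reads the declared dimensions straight from a PNG's IHDR chunk. It is PNG-only and the function names and `MAX_PIXELS` value are assumptions; in the real helper the same check would more likely use the width and height that `sharp(buf).metadata()` reports.

```typescript
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Read width/height from the IHDR chunk without decoding any pixel data.
// Layout: 8-byte signature, 4-byte chunk length, "IHDR", width, height.
function readPngDimensions(buf: Uint8Array): { width: number; height: number } | null {
  if (buf.length < 24) return null;
  for (let i = 0; i < 8; i++) {
    if (buf[i] !== PNG_SIGNATURE[i]) return null;
  }
  // Bytes 12..15 must spell "IHDR"; it is always the first chunk.
  if (buf[12] !== 0x49 || buf[13] !== 0x48 || buf[14] !== 0x44 || buf[15] !== 0x52) return null;
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  // PNG integers are big-endian, which is DataView's default.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

const MAX_PIXELS = 4096 * 4096; // assumed ceiling, mirroring MAX_DIMENSION = 4096

function exceedsPixelBudget(buf: Uint8Array, maxPixels: number = MAX_PIXELS): boolean {
  const dims = readPngDimensions(buf);
  if (!dims) return true; // unreadable header: fail closed
  return dims.width * dims.height > maxPixels;
}
```

A 30000×30000 header fails this check after reading 24 bytes, regardless of how small the compressed file is.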
---
## HIGH
### H1. Magic-byte check is prefix-only — polyglots pass
**Where:** `src/lib/constants/file-validation.ts:48-87`.
`bufferMatchesMime` checks the leading 3-8 bytes. PNG/JPEG/GIF/WebP/ZIP-based
office formats all share short, well-known prefixes. A file beginning
`FF D8 FF ... <PDF body> ... <HTML> <script>...</script>` passes as
`image/jpeg`, lives in storage as `image/jpeg`, and gets served from a
presigned MinIO URL with `Content-Type: image/jpeg`. With `nosniff` set this
is mostly inert in modern browsers, but:
- The S3-presigned download URL is hit **directly by the browser** (the proxy
with `X-Content-Type-Options: nosniff` is only on the filesystem backend at
`/api/storage/[token]`). MinIO/S3 does not add `nosniff` automatically.
- The signed URL is on the **same origin's CDN** for portal users when MinIO
is fronted by the marketing site, raising same-origin sniff risk.
The avatar/general path has no trailing-byte gate. Compare with the PDF path,
which checks the `%PDF-` prefix (good) but **doesn't** enforce a
trailing EOF marker: the same shape of weakness.
**Fix:**
- After the prefix check, run `sharp(buf).toFormat(declared)` re-encode (from
C1) which strips any non-image trailer.
- Force `ResponseContentDisposition`/`ResponseContentType` on the presigned
download (minio-js supports both via `respHeaders`) so MinIO emits
`X-Content-Type-Options: nosniff` regardless of object metadata.
### H2. Filesystem backend proxy enforces stronger checks than the S3 path
**Where:** `src/app/api/storage/[token]/route.ts:217-225` vs `src/lib/storage/s3.ts:249-262`.
The filesystem PUT proxy does a magic-byte check on the streamed body when
the token's declared content-type is `application/pdf`. The S3 presigned PUT
(used in prod) lets the browser stream straight to MinIO — the only
post-upload verification is in `berth-pdf.service.ts:234-262` and
`brochures.service.ts:230-263`. **Generic image uploads via S3 presigned PUT
have no post-upload verification at all** because no caller currently mints
presigns for arbitrary images — but the abstraction allows it. If a future
caller ever presigns a non-PDF, the S3 path will accept anything.
**Fix:** make `presignUpload` accept a `verifyMagicBytes: true` flag and require
every caller to opt in/out explicitly. Or wrap S3 presigns in a one-shot
post-upload `head + first-5-bytes` verifier (the brochure path already does
this; lift it into `getStorageBackend().registerUpload(...)`).
### H3. Animated GIF is allowed with no frame cap
**Where:** `ALLOWED_MIME_TYPES` includes `image/gif`. No upstream consumer
inspects `metadata().pages` or `metadata().delay`.
A 50 MB animated GIF with 5000 frames at 5 ms delay will burn CPU on every
rep's client list view and on PDF embed. Also a known browser DoS vector.
**Fix:** during the sharp normalisation (C1), pass `{ animated: false }` so
only the first frame is kept, or set `pages: 1`. Or drop GIF from
`ALLOWED_MIME_TYPES` entirely — the CRM has no real reason to accept it
(reps share PNG/JPEG, brochures are PDF).
### H4. Avatar `Content-Type` echoes browser-declared MIME — preview endpoint trusts blindly
**Where:** `src/app/api/v1/me/avatar/route.ts:53`: `mimeType: fileEntry.type || 'image/jpeg'`.
`fileEntry.type` is the **browser-declared** type. Magic bytes are checked but
the **stored** content-type is still the declared one. If an uploader claims
`image/webp` but sends a plain JPEG, the magic-byte check rejects it. A crafted
polyglot, however, can pass WebP's RIFF check while embedding extra bytes, and
the stored `mimeType` is then wrong. The downstream `PREVIEWABLE_MIMES` check
goes off `files.mimeType`, so the server's content-type lies.
**Fix:** after the magic-byte check, derive the **canonical** MIME from the
matched signature (one entry per signature) and store that, not the
browser-declared form.
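A minimal sketch of signature-derived MIME detection for the four image types on the allow-list. `canonicalImageMime` and `hasBytesAt` are hypothetical helpers, not the existing `bufferMatchesMime`; the byte signatures themselves are the standard ones for each format.

```typescript
type ImageMime = 'image/jpeg' | 'image/png' | 'image/gif' | 'image/webp';

function hasBytesAt(buf: Uint8Array, bytes: number[], offset: number = 0): boolean {
  if (buf.length < offset + bytes.length) return false;
  return bytes.every((b, i) => buf[offset + i] === b);
}

// Returns the MIME implied by the file's own signature, ignoring whatever
// the browser declared. null means "not a recognised image".
function canonicalImageMime(buf: Uint8Array): ImageMime | null {
  if (hasBytesAt(buf, [0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (hasBytesAt(buf, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  if (hasBytesAt(buf, [0x47, 0x49, 0x46, 0x38])) return 'image/gif'; // "GIF8"
  // WebP: "RIFF" <4-byte size> "WEBP"
  if (hasBytesAt(buf, [0x52, 0x49, 0x46, 0x46]) && hasBytesAt(buf, [0x57, 0x45, 0x42, 0x50], 8)) {
    return 'image/webp';
  }
  return null;
}
```

The avatar route would then store `canonicalImageMime(buffer)` and reject the upload when it is `null` or disagrees with the declared type, rather than persisting `fileEntry.type`.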
---
## MEDIUM
### M1. `Content-Disposition` for `/api/v1/documents/[id]/download/...` lacks RFC 5987
**Where:** `src/app/api/v1/documents/[id]/download/[...slug]/handlers.ts:68,92-94`.
`sanitizeFilenameForHeader` replaces `"`/`\`/CRLF with `_` but emits only
`filename="..."` — Unicode filenames render as mojibake / get truncated by
Firefox + Safari. The `/api/storage/[token]` proxy gets this right
(`filename*=UTF-8''<encoded>`); the doc download doesn't. The other
ad-hoc PDF exports (`clients/[id]/export-pdf`, `berths/[id]/export-pdf`,
`interests/[id]/export-pdf`) hard-code ASCII filenames and skip RFC 5987 too
— acceptable because they're constant, but the doc download is dynamic.
**Fix:** mirror the storage-proxy form:
`attachment; filename="<sanitised-ascii>"; filename*=UTF-8''<encoded>` and
switch the disposition from `inline` to `attachment` for non-previewable
MIMEs (the current `inline` lets a malicious file open in-page even with
nosniff).
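A sketch of the header construction, assuming a hypothetical `contentDisposition` helper (the existing `sanitizeFilenameForHeader` is not reproduced here). It emits an ASCII fallback plus the RFC 5987 `filename*` form, and picks `inline` only for previewable MIMEs.

```typescript
// Build a Content-Disposition value with both the legacy ASCII filename
// and the RFC 5987 UTF-8 form.
function contentDisposition(filename: string, previewable: boolean): string {
  const type = previewable ? 'inline' : 'attachment';
  // ASCII fallback: anything outside printable ASCII, plus quote and
  // backslash, becomes "_" so the quoted-string cannot be broken out of.
  const ascii = filename.replace(/[^\x20-\x7e]/g, '_').replace(/["\\]/g, '_');
  // encodeURIComponent leaves ' ( ) * unescaped, but RFC 5987 attr-char
  // does not allow them, so percent-encode those as well.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase(),
  );
  return `${type}; filename="${ascii}"; filename*=UTF-8''${encoded}`;
}
```

Clients that understand `filename*` prefer it over `filename`, so Unicode names survive while older clients still get a safe ASCII name.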
### M2. `sanitizeFilename` doesn't strip RTL/zero-width Unicode
**Where:** `src/lib/services/storage.ts:15-22`.
Strips `[/\\:]`, NUL, and `\x01-\x1f\x7f`. Doesn't touch:
- U+202E RIGHT-TO-LEFT OVERRIDE — classic Windows-icon-spoof vector
(`invoice_fdp.exe` displays as `invoice_exe.pdf`).
- U+200B/U+200C/U+FEFF zero-widths — collision spoofs in folder listings.
- Surrogate halves.
**Fix:** Unicode-normalise (`name.normalize('NFC')`) then drop the
Cf/bidi-control category, e.g. via `/[\u200B-\u200F\u202A-\u202E\u2066-\u2069]/gu`.
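A minimal sketch of that extra pass, written with escaped code points so the pattern stays visible in source and in diffs. `stripInvisible` is a hypothetical helper meant to run alongside the existing `sanitizeFilename`; the character ranges are an assumption covering the bidi controls and zero-width characters listed above, and lone surrogates are not handled here.

```typescript
// Zero-width characters (U+200B..U+200F, U+2060..U+2064, U+FEFF) and bidi
// embedding/override/isolate controls (U+202A..U+202E, U+2066..U+2069).
const INVISIBLE_OR_BIDI = /[\u200B-\u200F\u202A-\u202E\u2060-\u2069\uFEFF]/gu;

function stripInvisible(name: string): string {
  // Normalise first so visually identical names compare equal,
  // then drop characters that render as nothing or reorder the text.
  return name.normalize('NFC').replace(INVISIBLE_OR_BIDI, '');
}
```

With this in place, a name carrying U+202E is stored as its literal character order, so the extension shown in a listing matches the real one.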
### M3. No per-surface size caps beyond avatar/PDF
**Where:** `file-validation.ts` (50 MB), avatar (2 MB), berth PDFs (admin
setting), brochure (admin setting). The generic uploader has only the
50 MB ceiling — applies equally to a yacht photo, a maintenance-log
attachment, a client document scan. Reps could legitimately upload a 49 MB
phone-camera PNG and it would be embedded into PDFs without resize.
**Fix:** `uploadFile` should branch on `category` (`avatar`, `yacht_photo`,
`maintenance`, `attachment`, …) and apply per-category byte + dimension caps,
not a flat 50 MB.
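One possible shape for the per-category table. Apart from the 2 MB avatar cap and the 50 MB global ceiling quoted in this audit, every number below is an illustrative assumption, as are the type and function names.

```typescript
type UploadCategory = 'avatar' | 'yacht_photo' | 'maintenance' | 'attachment';

interface UploadCap {
  maxBytes: number;
  maxDimension: number | null; // null = not an image surface, no pixel cap
}

const MB = 1024 * 1024;

// Illustrative limits only; tune per surface.
const UPLOAD_CAPS: Record<UploadCategory, UploadCap> = {
  avatar: { maxBytes: 2 * MB, maxDimension: 2048 },
  yacht_photo: { maxBytes: 10 * MB, maxDimension: 4096 },
  maintenance: { maxBytes: 20 * MB, maxDimension: 4096 },
  attachment: { maxBytes: 50 * MB, maxDimension: null },
};

// Returns an error message, or null when the upload is within its cap.
function checkUploadSize(category: UploadCategory, byteLength: number): string | null {
  const cap = UPLOAD_CAPS[category];
  if (byteLength > cap.maxBytes) {
    return `${category} uploads are limited to ${cap.maxBytes / MB} MB`;
  }
  return null;
}
```

Typing the table as `Record<UploadCategory, UploadCap>` means adding a new category without a cap is a compile error rather than a silent fall-through to 50 MB.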
### M4. `text/plain` / `text/csv` have no signature verification
**Where:** `file-validation.ts:71-72` (intentionally unconstrained), served
through the same presigned URL path as binary files. A user can upload
`evil.html` claiming `text/plain`; with `nosniff` plus the stored
`Content-Type: text/plain` modern browsers display it as text, but stale
links that get loaded via `<iframe src="...">` will render as the declared
type. Lower risk than C1/H1 but worth tightening.
**Fix:** sniff: reject when the first 512 bytes contain `<script`, `<html`,
`<!DOCTYPE`, or non-printable bytes outside common encodings. Or require an
explicit `category: 'text'` for these MIMEs and refuse them on the avatar /
attachment surfaces.
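A rough sketch of such a sniff, assuming ASCII or UTF-8 payloads (UTF-16 text would be rejected by the control-byte check and needs separate handling). `looksLikePlainText` and the marker list are hypothetical.

```typescript
// Markup markers that should never open a file stored as text/plain or text/csv.
const MARKUP_HINT = /<\s*(script|html|!doctype)\b/i;

function looksLikePlainText(buf: Uint8Array, windowSize: number = 512): boolean {
  const head = buf.subarray(0, windowSize);
  let text = '';
  for (let i = 0; i < head.length; i++) {
    const byte = head[i];
    // Allow tab, LF and CR; reject every other control byte (including NUL).
    const isAllowedControl = byte === 0x09 || byte === 0x0a || byte === 0x0d;
    if ((byte < 0x20 && !isAllowedControl) || byte === 0x7f) return false;
    text += String.fromCharCode(byte);
  }
  return !MARKUP_HINT.test(text);
}
```

This is a heuristic, not a parser: it only inspects the first 512 bytes, so it complements rather than replaces serving these files with `nosniff` and an `attachment` disposition.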
### M5. Image cropper outputs only 512 px JPEG @0.85 — no enforcement that the upload matches
**Where:** `src/components/shared/image-cropper-dialog.tsx:51,70`.
`outputWidth = 512` is the only client-side cap. Once the cropped JPEG hits
the server, the server does **not** verify that the avatar is square or under
some pixel ceiling — the server just sees a 2 MB image. A scripted client can
ship a 4000×4000 JPEG straight to `/api/v1/me/avatar` because the cropper is
client-side. Tied to C1's fix (normalise + resize server-side).
### M6. No HEIC/HEIF support — iOS share-sheet uploads silently fail
**Where:** `ALLOWED_MIME_TYPES`. The iPhone's default photo format is `image/heic`;
the receipt scanner (`scan-shell.tsx:494`) uses `accept="image/*"` so the
browser allows the pick, then the upload 400s with "type not allowed". UX
regression more than security.
**Fix:** add `image/heic` + `image/heif` to the allow-list and transcode to
JPEG in the same normalisation pass (sharp 0.34 supports HEIC via
libvips + libheif, but check the deploy image first).
---
## Already strong (no action)
- PDF magic-byte gate on **both** in-server and presigned-PUT paths
(`berth-pdf.service.ts`, `brochures.service.ts`, filesystem proxy).
- SVG **excluded** from `ALLOWED_MIME_TYPES` — no SVG-XSS surface in user
uploads. The only SVG generation is the chart-card data-URI which is
produced by the CRM, not user-controlled.
- Filesystem-backend proxy sets `X-Content-Type-Options: nosniff` +
`Cache-Control: private, no-store` + single-use HMAC tokens.
- Storage key derivation is UUID-based (`generateStorageKey`) so original
filename never controls a path — no path-traversal surface from filenames.
- `uploadFile` allow-list + size cap + magic-byte composition.
---
## Recommended order
1. C1 + C2 + H1 together: introduce a `normalizeAndStoreImage()` helper in
`src/lib/services/files.ts` that runs every accepted image through
`sharp().rotate().resize().toFormat()` (without `.withMetadata()`, which
would carry EXIF into the output) before `backend.put()`. Drops EXIF, kills
polyglot trailers, caps pixels.
2. H4 + M5: derive canonical MIME from the matched signature; treat
`mimeType` field as server-authoritative.
3. M1 + M2: tighten filename + disposition headers.
4. H2: lift the post-upload verify out of berth-pdf / brochure into the
storage abstraction.
5. M3 / M6: per-surface caps + HEIC transcode (deploy-image work).
6. H3 / M4: drop GIF or freeze to first frame; sniff text payloads.
---
## 21. Mobile + PWA + iOS quirks audit (mobile-pwa-auditor)
# Mobile + PWA + iOS quirks audit
Branch: `feat/documents-folders` · Scope: `src/app/(scanner)`, `src/components/layout/mobile/*`, `src/components/search/mobile-search-overlay.tsx`, `src/components/shared/drawer.tsx`, `src/middleware.ts`, `public/manifest.json`, `public/icon-*.png`, root layout viewport/metadata, `tailwind.config.ts` safe-area utilities.
---
## CRITICAL
_None blocking ship._
## HIGH
### H1. No service worker is registered — `/scan` PWA has zero offline capability
Grep `serviceWorker|navigator.serviceWorker|workbox|next-pwa` returns nothing across `src/` + `public/`. The per-port manifest declares `display: 'standalone'` and the scanner's whole product premise is "rep walks the marina with a phone capturing receipts", i.e. exactly the situation where Wi-Fi drops to nothing between pontoons. Consequences:
- iOS Add-to-Home Screen installs succeed but cold-launch with no signal fails at the first network call (Next.js page chunks 404 in WebView).
- The OCR + upload + create-expense chain in `ScanShell` (`src/components/scan/scan-shell.tsx`) has no offline-queue / retry. `kind: 'error'` is rendered and the only artifact is the in-memory blob — closing the PWA loses the photo and the manual-typed fields.
- Android Chrome will refuse to fire `beforeinstallprompt` without a service worker, so the install prompt never auto-surfaces.
Fix (in priority order): (1) ship a minimal Workbox/`next-pwa` SW that precaches the scanner route + Tesseract WASM + lucide icons, (2) wrap the expense submit in an outbox (IndexedDB queue → background sync), (3) capture `beforeinstallprompt` and surface an "Install" CTA inside the idle-state scanner card.
### H2. Two manifests overlap — root `/manifest.json` and dynamic `/[portSlug]/scan/manifest.webmanifest`
- Root layout (`src/app/layout.tsx:47`) declares `manifest: '/manifest.json'` with `start_url: '/'`, `theme_color: #0f172a` (slate-900). Root viewport says `themeColor: '#1e2844'` (navy). Two different theme colors → Chrome will pick the `<meta name="theme-color">` from `<head>` (navy) but the manifest install splash will use `#0f172a`. Cosmetic mismatch on install.
- Scanner manifest overrides scope to `/<portSlug>/scan` with `theme_color: #3a7bc8` (brand blue) and viewport `themeColor: '#3a7bc8'` — internally consistent. ✓
- Issue: if a rep visits `/<portSlug>/dashboard` and hits "Add to Home Screen" (rare but possible), they get a PWA whose `start_url` is `/` which redirects to `/login` on every cold-launch because the root `<head>` resolves the unscoped manifest first. There is no `<link rel="manifest">` swap between the two surfaces; Next.js's `generateMetadata` on the scanner route DOES override the root metadata (verified at `src/app/(scanner)/[portSlug]/scan/layout.tsx:28`), but root `/manifest.json` still defines a competing PWA.
Fix: either narrow the root manifest's `scope` and `start_url` to `/login` (so non-scanner installs land on auth), or remove root `manifest:` and lean solely on the per-port scoped scanner manifest. Add `start_url: '/<portSlug>/dashboard'` per-port via a second dynamic manifest for the main app, if installable main-app is even desired.
### H3. iOS standalone status-bar / safe-area mismatch in the scanner
- Scanner layout declares `appleWebApp.statusBarStyle: 'default'` (`src/app/(scanner)/[portSlug]/scan/layout.tsx:32`) — that's the white-bar-with-black-text style that iOS draws OPAQUELY above the WebView, NOT under it.
- `viewport.viewportFit: 'cover'` is set (line 46) which tells iOS to let content extend under safe areas.
- `ScanShell` (`src/components/scan/scan-shell.tsx:449`) renders `<main className="mx-auto ... min-h-[100dvh] w-full max-w-xl ... px-4 py-6 sm:py-10">`: **no `pt-safe-top`, no `pb-safe-bottom`, no `safe-left/safe-right`**.
- Result on iPhone 14/15 with home indicator + standalone install: the "Capture receipt" / "Save expense" buttons sit flush against (or under) the home-indicator stripe. The brand logo at the top is fine because `py-6 sm:py-10` happens to clear the notch — by accident, not by design.
Fix: add `pb-[calc(env(safe-area-inset-bottom)+1rem)]` to `<main>`, switch `statusBarStyle` to `'black-translucent'` so the brand-blue theme paints over the status area (or keep `'default'` and remove `viewportFit: 'cover'`), and add `pl-safe-left pr-safe-right` for the landscape edge case.
### H4. Dashboard mobile shell uses `min-h-screen` (`100vh`) instead of `100dvh`
`src/components/layout/mobile/mobile-layout.tsx:24,29` uses `min-h-screen` twice. On iOS Safari (not standalone) `100vh` is the LARGE viewport height (URL bar collapsed), so on first paint the page renders ~75-100px taller than visible. The bottom tab bar is `position: fixed` so it lands correctly, but `<main>`'s `min-h-screen` means content scrolls below the visible viewport on initial load: reps see a blank strip past the tab bar until the URL bar collapses on first scroll.
Fix: swap both `min-h-screen` for `min-h-[100dvh]` (Tailwind 3 supports dynamic viewport units). The scanner layout already does this correctly (`src/app/(scanner)/[portSlug]/scan/layout.tsx:68`).
## MEDIUM
### M1. Touch targets below 44pt in the mobile search overlay
`src/components/search/mobile-search-overlay.tsx`:
- "Cancel" button (line 273) is plain text — no min-height, hit-area ≈ 16px tall. Thumb-prone position next to the keyboard.
- Clear-X button (line 260) is `size-7` = 28px. Below Apple HIG 44pt.
- Bucket chips (line 344) are `px-3 py-1.5 text-xs` → ~28px tall. Apple HIG 44pt fail; they're scrollable so misses are recoverable, but each chip needs `min-h-[44px]` or a transparent expanded hit-box (`before:absolute before:inset-0 before:-my-2`).
### M2. Inline-editable-field hit-areas too small for marina-glove use
`src/components/shared/inline-editable-field.tsx:133,172,257` uses `h-8` (32px) and `h-7` (28px) for the edit-mode inputs and select triggers. Detail pages on mobile share this pattern. Apple HIG fail; reps with wet/salty fingers on a pontoon will mis-tap. Bump to `h-11` (44px) on mobile or guard with a `min-h-[44px] md:h-8` mobile-first override.
### M3. `visualViewport.offsetTop` ignored in search overlay positioning
`src/components/search/mobile-search-overlay.tsx:76-86` subscribes to `visualViewport.resize` + `scroll` and reads `vv.height`. The drawer uses `top: 12px` + computed height. But `vv.offsetTop` (the visual-viewport's vertical offset within the layout viewport) is not consulted. On iOS Safari with keyboard up + rubber-band scroll, the visual viewport can shift relative to layout; the drawer's `top: 12px` is layout-viewport-relative, so the top of the drawer can briefly clip up under the URL/status bar. Minor visual artifact; only affects scrolled-during-typing states.
Fix: `top: ${(vv?.offsetTop ?? 0) + 12}px`.
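Pulled out as a small pure function, the fix might look like the sketch below. `drawerTop`, `ViewportLike`, and `DRAWER_MARGIN_PX` are hypothetical names; the 12px margin is the value already used by the drawer.

```typescript
// Structural subset of the browser's VisualViewport, so the function
// can be exercised without a DOM.
interface ViewportLike {
  offsetTop: number;
}

const DRAWER_MARGIN_PX = 12;

// Anchor the drawer to the visual viewport rather than the layout viewport.
function drawerTop(vv: ViewportLike | null | undefined): string {
  const offsetTop = vv?.offsetTop ?? 0;
  return `${offsetTop + DRAWER_MARGIN_PX}px`;
}
```

In the overlay this would be called with `window.visualViewport` inside the existing `resize`/`scroll` handler, falling back to `12px` where the API is unavailable.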
### M4. Mobile bottom tabs lack `safe-left` / `safe-right` insets
`src/components/layout/mobile/mobile-bottom-tabs.tsx:42-47` uses `pb-safe-bottom` only. The dynamic manifest forces `orientation: 'portrait'` ONLY when installed as a PWA. In Safari (pre-install) on iPhone landscape, the bottom tab bar tucks under the notch. Add `pl-safe-left pr-safe-right` (Tailwind `pl-safe-left` resolves to `padding-left: env(safe-area-inset-left)`).
### M5. Stale memory + suspiciously small PNGs
`project_pwa_assets_pending.md` claims icons must be added; all four exist in `/public` (icon-192 = 688B, icon-512 = 2411B, 512-maskable = 2411B, apple-touch = 654B; dated 2026-05-03). Memory note is stale — delete it. **However**: 688B / 2411B is small for a real branded PWA icon — these look like placeholders. Swap in production artwork before launch.
### M6. apple-touch-icon at `/apple-touch-icon.png` not referenced by the scanner manifest
The root metadata icons block (`src/app/layout.tsx:40-46`) declares `apple: '/apple-touch-icon.png'` (180×180). The scanner layout only sets `manifest:` + `appleWebApp`; it inherits the root `icons.apple` because Next.js does shallow-merge of metadata. ✓ but only because of inheritance; explicit confirmation in a comment would prevent future regressions if someone overrides `icons:` in the scanner layout.
### M7. No `apple-mobile-web-app-status-bar-style` mismatch detection between routes
Root layout: `'black-translucent'` (matches navy theme + safe-area inset). Scanner: `'default'` (white opaque bar). When a rep navigates from `/scan` into the main CRM via a deep link inside the same PWA install, iOS uses the install-time status bar style and ignores per-page overrides — so depending on which surface they installed FROM, every other surface looks wrong. Pick one style and apply globally; recommend `'black-translucent'` plus consistent safe-area-inset usage on every shell.
### M8. Vaul drawer `repositionInputs={false}` defaults are correct, but iOS keyboard layoutViewport vs visualViewport edge case
`src/components/shared/drawer.tsx:20-22` defaults `shouldScaleBackground: false` and `repositionInputs: false`. The comments in `mobile-search-overlay.tsx:106-118` describe the iOS reasoning correctly. Verified ✓. However, the MoreSheet's `<DrawerContent>` uses default `bottom: 0` anchoring (no visualViewport-based height override). If MoreSheet ever gains a text input, it'll exhibit the same scroll-then-jump the search overlay had to special-case. Currently MoreSheet is link tiles only, so this is a non-issue unless inputs are added.
### M9. No `<NoScript>` or offline fallback page anywhere
If the scanner PWA cold-launches with no network and no service worker (H1), Next.js's standalone-mode router will fail-soft to a blank screen. There is no `not-found.tsx`, `error.tsx`, or `offline.tsx` in `src/app/(scanner)/[portSlug]/scan/`. Goes hand-in-hand with H1.
### M10. The legacy `/expenses/scan` page coexists with the new `/scan` PWA flow
`src/app/(dashboard)/[portSlug]/expenses/scan/page.tsx` is a desktop-flavored scan-receipt page inside the dashboard shell — different from the standalone PWA at `/[portSlug]/scan`. Both upload to the same `/api/v1/expenses/scan-receipt` and `/api/v1/expenses` endpoints, but the user-facing flows diverge (the dashboard one has both camera + file picker buttons; the PWA one is camera-first). Confusion risk; pick one or clearly label the dashboard surface as "Upload receipt (desktop)" vs the PWA "Scan receipt".
### M11. `interests/interest-list.tsx` FAB safe-area offset is hand-rolled
Line 350 hardcodes `bottom-[calc(env(safe-area-inset-bottom)+86px)]` where 86 = tab-bar height (56) + 30px gap. If tab-bar height changes, FAB collides. Extract `MOBILE_TAB_BAR_HEIGHT` to a shared constant or CSS var.
## Quality nits
- Scanner manifest `short_name: 'Scanner'` vs `appleWebApp.title: 'PN Scanner'` → installed-app label differs across iOS/Android. Unify on "PN Scanner".
- `safe-left`/`safe-right` Tailwind utilities are declared (`tailwind.config.ts:150-154`) but never referenced anywhere in `src/`.
- `must-revalidate` on manifest `Cache-Control` is redundant alongside `max-age=300`.
## What's solid
- Per-port dynamic manifest with proper `scope` + `start_url`.
- `viewportFit: 'cover'` + safe-area-inset utilities in topbar/bottom-tabs.
- `-webkit-tap-highlight-color: transparent` global (`globals.css:98`).
- Vaul defaults `shouldScaleBackground: false`, `repositionInputs: false` (`drawer.tsx:20-22`) match iOS+Vaul known-issue guidance.
- `visualViewport.height` tracking for above-keyboard sizing (modulo M3).
- Drawer GPU-compositing hints (`globals.css:261-267`).
- HEIC-safe capture (`accept="image/*"` + `capture="environment"`).
- Tesseract.js on-device first, AI optional — privacy-respecting fallback.
- Middleware correctly exempts `/scan/manifest.webmanifest` + `/scan` from auth (`middleware.ts:17,33`).
---
**Top 3 to fix before launch:** H1 (service worker + offline queue), H2 (manifest scope overlap), H3 (scanner safe-area bottom-button collision). Everything else is polish.
---
## 26. Multi-currency + FX correctness audit (currency-auditor)
# Multi-currency + FX correctness audit — task #26
Scope: USD-vs-port-currency across berths/invoices/reports/expenses, FX
snapshotting, `currency_rates` retention, rounding, mixed-currency
dashboard totals, PDF math, `berths_default_currency`, hardcoded USD,
`formatCurrency`. Read-only. Branch `feat/documents-folders`.
---
## CRITICAL
### C1. Dashboard "Pipeline Value" sums mixed currencies as USD
`dashboard.service.ts:39-51` and `:95-160` reduce `berths.price` into
`pipelineValueUsd` **without reading `berths.priceCurrency`**, then the
UI labels the result `'USD'`
(`pipeline-value-tile.tsx:45-47`, `kpi-cards.tsx:19`,
`revenue-forecast.tsx:25`). Same bug in `getRevenueForecast`
(weighted pipeline) and the stage-weights total. A single non-USD berth
poisons the headline KPI; this is masked today only because Port Nimara
is USD-only. With the new per-port `berths_default_currency` setting this
will break as soon as a second port chooses EUR/GBP.
Fix shape: either (a) refuse to aggregate mixed currencies and render a
grouped figure like the Revenue Breakdown chart already does, or (b)
convert per row via `convert(price, priceCurrency, 'USD')` and surface
the conversion timestamp. (a) is safer — (b) hides FX risk in one number.
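
Option (a) can be sketched as a per-currency grouping that never adds amounts across codes. `BerthPrice` and `pipelineValueByCurrency` are hypothetical names; only `price` and `priceCurrency` come from the findings above.

```typescript
// Row shape assumed for illustration: only the two columns the finding names.
interface BerthPrice {
  price: number | null;
  priceCurrency: string; // ISO 4217 code, e.g. 'USD'
}

// Sums prices per currency code. Amounts in different currencies are
// never added together, so no FX assumption is hidden in the result.
function pipelineValueByCurrency(rows: BerthPrice[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    if (row.price === null) continue; // unpriced berths contribute nothing
    const running = totals.get(row.priceCurrency) ?? 0;
    totals.set(row.priceCurrency, running + row.price);
  }
  return totals;
}
```

The tile would then render one figure per entry and only show a single headline number when the map has exactly one key.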
### C2. Revenue / Pipeline PDF reports drop currency entirely
`pdf/templates/reports/revenue-report.ts:78-97` and
`pipeline-report.ts:91-100` render amounts with
`Number(...).toLocaleString(undefined, …)` — no currency code, no
symbol, no `formatCurrency`. The generator
(`report-generators.ts:106-147`) sums `berths.price` across all
currencies, again ignoring `priceCurrency`. PDF output reads
`TOTAL COMPLETED REVENUE: 1,234,567.00` with no unit. Plus the implicit
`undefined` locale means the same PDF renders differently between US-en
and de-DE nodes — non-deterministic under Next.js standalone runtime.
Combined with C1 these are the highest-risk financial artefacts in the
app — they ship to ownership.
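
A minimal sketch of deterministic amount rendering for the report templates: pin the locale and pass the currency code explicitly instead of relying on the runtime default. `formatReportAmount` is a hypothetical helper; the project's own `formatCurrency` would be the preferred call site, but its signature is not shown in this report.

```typescript
// Formats an amount with an explicit currency and a pinned locale, so the
// output does not depend on the locale of the node rendering the PDF.
// The default locale 'en-US' is an assumption, not a project setting.
function formatReportAmount(
  amount: number,
  currency: string,
  locale: string = 'en-US',
): string {
  return new Intl.NumberFormat(locale, {
    style: 'currency',
    currency,
  }).format(amount);
}

// Example: the total line from the finding, now carrying a unit.
const totalLine = `TOTAL COMPLETED REVENUE: ${formatReportAmount(1234567, 'USD')}`;
```

This only fixes presentation; the generator still needs to stop summing across currencies before the figure is meaningful.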
### C3. `expenses.amountUsd` snapshot is brittle and date-misaligned
`expenses.ts:117-135` and `:227-249` snapshot `amountUsd` +
`exchangeRate` on the row at create/update — good. But:
- Frankfurter unreachable at create time → `amountUsd = null`,
`exchangeRate = null`. The PDF (`expense-pdf.service.ts:235-246`)
falls back to 1:1 with a footnote, but there is **no aggregate-total
guard**, so totals silently undercount the foreign-currency portion.
- The snapshot uses the rate **at edit time**, not `expenseDate`. An
expense from 6 months ago, edited today, gets today's FX. The
correct anchor is `expenseDate`.
Expenses is the only table that snapshots FX. Invoices, berths, yacht
maintenance costs, and EOIs store amount + ISO code only and re-resolve
FX live at display — see H1.
---
## HIGH
### H1. `currency_rates` has no history / retention
`db/schema/system.ts:207-222` — one row per `(base, target)`.
`refreshRates()` (`currency.ts:36-68`) **upserts in place**; only the
latest rate ever exists. Consequences:
- Cannot value an old invoice at its issue-date rate.
- No FX audit trail — if Frankfurter returns bad data the prior value
is gone.
- The 6-hourly cron (`queue/scheduler.ts:31`) overwrites silently.
Fix: append-only table (`fetchedAt` in PK), `getRate(from, to, asOf?)`
selects the most recent row ≤ `asOf`. Pairs with M8.
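
The `asOf` lookup can be sketched in memory as follows. `RateRow` and `getRateAsOf` are illustrative; the real version would be a SQL query against the append-only table, and the column names here are assumptions.

```typescript
// One historical rate observation (assumed shape of an append-only row).
interface RateRow {
  base: string;
  target: string;
  rate: number;
  fetchedAt: Date;
}

// Returns the most recent rate fetched at or before `asOf`,
// or null when no such row exists for the pair.
function getRateAsOf(
  history: RateRow[],
  from: string,
  to: string,
  asOf: Date = new Date(),
): number | null {
  if (from === to) return 1; // same-currency short-circuit
  let best: RateRow | null = null;
  for (const row of history) {
    if (row.base !== from || row.target !== to) continue;
    if (row.fetchedAt.getTime() > asOf.getTime()) continue; // too new
    if (best === null || row.fetchedAt.getTime() > best.fetchedAt.getTime()) {
      best = row;
    }
  }
  return best ? best.rate : null;
}
```

Passing `expenseDate` as `asOf` would also address the date-misalignment half of C3.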
### H2. Rounding policy is undocumented and currency-blind
`currency.ts:23` does `Number((amount*rate).toFixed(2))`, which pins to 2
decimals regardless of currency. JPY has 0 fractional digits, so a USD
→ JPY conversion stores a fractional `.45 JPY`, which is unspendable; a
JPY → USD conversion is limited to 1-cent precision when 1 yen ≈ $0.0066.
No banker's-rounding helper exists, no `Math.round` policy, no doc.
Invoice math (`services/invoices.ts:251-276`, `:435-466`) does
`(subtotal * discountPct) / 100` and `subtotal - discountAmount +
feeAmount` **with no rounding** before `String()`-ing into `numeric`
columns. A 2% discount on a $100.10 subtotal stores `'2.002'` and
`'98.098'`. The displayed total (Intl rounds to 2dp) and the stored
total diverge by sub-cent amounts for every percentage-discounted
invoice.
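
One possible shape for a currency-aware rounding helper, reading the minor-unit count from `Intl.NumberFormat` rather than hardcoding 2. `minorUnitDigits` and `roundToCurrency` are hypothetical names. Note that `Math.round` rounds halves upward and is not banker's rounding, and binary floating point can still misplace exact half-cent values; storing integer minor units avoids that class of error.

```typescript
// Number of fraction digits the currency uses (2 for USD, 0 for JPY),
// taken from the runtime's currency data.
function minorUnitDigits(currency: string): number {
  const resolved = new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency,
  }).resolvedOptions();
  return resolved.maximumFractionDigits ?? 2;
}

// Rounds an amount to the currency's minor unit before it is stored.
function roundToCurrency(amount: number, currency: string): number {
  const factor = Math.pow(10, minorUnitDigits(currency));
  return Math.round(amount * factor) / factor;
}

// The invoice example from above: a 2% discount on 100.10.
const discount = roundToCurrency((100.1 * 2) / 100, 'USD');
const total = roundToCurrency(100.1 - discount, 'USD');
```

Rounding each stored component, then deriving the total from the rounded parts, keeps the stored and displayed figures in agreement.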
### H3. `formatCurrency` cents-clamp hides fractions on berth pricing
`utils/currency.ts:55-56` clamps `minFractionDigits` to 0 when
`maxFractionDigits: 0` is passed — correct for headline tiles but also
the default for berth-card / berth-columns / berth-tabs price
(`berth-card.tsx:91`, `berth-columns.tsx:185`, `berth-tabs.tsx:410`).
€1,250,000.50 renders as "€1,250,001" with no tooltip. Low impact today;
will confuse yacht-show buyers once non-round prices land.
### H4. Berth recommender ranks prices currency-blind
`berth-recommender.service.ts` scores by `berths.price` with no FX
normalization. Multi-currency tier ranking is meaningless. Heat weights
in `system_settings` are tuned per-port; admins have no way to spot the
skew. Same root as C1 but isolated to the recommender.
### H5. `/api/v1/currency/convert` swallows rate-unavailable as `data: null`
`/api/v1/currency/convert/route.ts:19` does not differentiate "rate
unavailable" from "amount was zero" — both return `{ data: null }` with 200. Callers that distinguish these need a separate error envelope.
`expenses.ts` and `expense-pdf.service.ts` handle null correctly; the
API surface does not.
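
A sketch of a result envelope that keeps "rate unavailable" distinct from a legitimate zero. `ConvertResult`, `RateLookup`, and `convertWithEnvelope` are hypothetical; the route would map the error branch to whatever non-200 status the API conventions call for. Rounding is deliberately left out of this sketch.

```typescript
// Discriminated union: callers must check `ok` before reading `amount`.
type ConvertResult =
  | { ok: true; amount: number; rate: number }
  | { ok: false; error: 'rate_unavailable'; from: string; to: string };

// Stand-in for the service's rate lookup; returns null when no rate exists.
type RateLookup = (from: string, to: string) => number | null;

function convertWithEnvelope(
  amount: number,
  from: string,
  to: string,
  lookupRate: RateLookup,
): ConvertResult {
  if (from === to) return { ok: true, amount, rate: 1 };
  const rate = lookupRate(from, to);
  if (rate === null) {
    // Missing rate is an explicit error, never `data: null`.
    return { ok: false, error: 'rate_unavailable', from, to };
  }
  // A zero amount is a valid success with amount 0.
  return { ok: true, amount: amount * rate, rate };
}
```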
---
## MEDIUM
### M1. No cold-start bootstrap for `currency_rates`
`queue/scheduler.ts:31` runs every 6h; on a fresh `db:seed` the table is
empty for up to 6h and every `convert()` returns null. Seed initial
rates in `seed-bootstrap.ts` or self-trigger on cron registration. Masked
today because seeded ports are all USD (USD→USD short-circuits at
`currency.ts:9`).
### M2. `seed-bootstrap.ts` hardcodes USD for every port
`seed-bootstrap.ts:42,49` — both demo ports default to USD. The schema
admits per-port currency but no EUR/GBP demo port exists. Multi-currency
correctness has zero seed/fixture coverage. Adding one non-USD demo port
would surface C1/C2/H4 in smoke output.
### M3. Hardcoded "Rates (USD)" column header
`berth-columns.tsx:324` — header reads `'Rates (USD)'` regardless of
the row's `priceCurrency`. Column body is currency-aware; header lies
for non-USD rows.
### M4. EOI / interest-summary PDFs use prefix code instead of `formatCurrency`
`pdf/templates/interest-summary-template.ts:112`,
`berth-spec-template.ts:127,172` render `'USD 1,234,000'` rather than
`$1,234,000`. Inconsistent with the invoice template and in-app UI.
Surfaces to clients in EOI bundles.
### M5. OCR receipt parser maps `$` → USD unconditionally
`ocr/parse-receipt-text.ts:17`. CAD/AUD/HKD/SGD all print `$`. Force
confirmation when the port's `defaultCurrency` isn't USD.
### M6. Expense form/scan defaults hardcode USD rather than port default
`expense-form-dialog.tsx:61,85,215,227`,
`expenses/scan/page.tsx:63,314`, `scan-shell.tsx:102`. A rep at an
EUR-default port must change the dropdown on every expense.
### M7. Synthesized inverse rates drift
`refreshRates()` stores `1/rate` rounded to 6dp (`currency.ts:60`).
USD→EUR→USD round-trips diverge from identity by basis points; matters
for the `expense-pdf` USD→EUR chain. Fetch base=USD and base=EUR
separately from Frankfurter rather than synthesizing.
### M8. Unique index blocks the H1 fix
`currency_rates_base_target_idx` makes append-only history a breaking
migration. Flagged so the H1 fix is planned with the index drop.
---
## Notes / non-issues
- `formatCurrency` is well-defended; consolidate the ad-hoc
`toLocaleString({ style: 'currency' })` in `expense-columns.tsx` /
`expense-detail.tsx` onto it.
- `getRate` caching in `expense-pdf.service.ts:215-231` is the right
shape — reuse for any other batch conversion path.
- Documenso payloads carry currency through unchanged; no FX in that
path.
**Top 3 to fix first:** C1 (dashboard mixed-currency totals), C2
(report PDFs drop currency entirely), H1 (no FX history/retention).
---
## 29. Outbound webhook delivery audit (outbound-webhook-auditor)
# Outbound Webhooks — Audit (Task #29)
Scope: `src/app/(dashboard)/[portSlug]/admin/webhooks/`,
`src/app/api/v1/admin/webhooks/**`, `src/lib/services/webhooks.service.ts`,
`src/lib/services/webhook-dispatch.ts`, `src/lib/services/webhook-event-map.ts`,
`src/lib/queue/workers/webhooks.ts`, `src/lib/validators/webhooks.ts`,
`src/lib/db/schema/system.ts`, `src/lib/utils/encryption.ts`,
`src/lib/queue/index.ts`. Read-only.
---
## CRITICAL
### C1 — Signature has no replay protection
**`workers/webhooks.ts:120-134`** — HMAC covers only the JSON body
(`sha256=HMAC(secret, bodyString)`). The body contains a `timestamp`
field, but it's not separately authenticated/headered in a way the
receiver can verify in a freshness window. A captured request replays
verbatim, signature still valid. No `X-Webhook-Timestamp`, no nonce,
no documented receiver dedup contract.
Fix: Stripe-style ``signature = HMAC(secret, `${ts}.${body}`)`` with
`X-Webhook-Timestamp` header, and document that receivers must reject
`|now - ts| > 5 min`. Also document `X-Webhook-Delivery` (already
sent) as the receiver-side idempotency key.
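
A minimal sketch of the timestamped scheme, covering both the sender and the receiver-side check. `signWebhook`, `verifyWebhook`, and `TOLERANCE_SECONDS` are hypothetical names; the `sha256=` prefix follows the existing header format, and Unix seconds for the timestamp is an assumption.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Receivers reject anything older or newer than this window (5 min).
const TOLERANCE_SECONDS = 300;

// Sender side: the MAC covers `${timestamp}.${body}`, so a captured
// request cannot be replayed with a fresh timestamp header.
function signWebhook(secret: string, body: string, timestamp: number): string {
  const mac = createHmac('sha256', secret)
    .update(`${timestamp}.${body}`)
    .digest('hex');
  return `sha256=${mac}`;
}

// Receiver side: check freshness first, then compare in constant time.
function verifyWebhook(
  secret: string,
  body: string,
  timestampHeader: string,
  signatureHeader: string,
  nowSeconds: number,
): boolean {
  const timestamp = Number(timestampHeader);
  if (!Number.isFinite(timestamp)) return false;
  if (Math.abs(nowSeconds - timestamp) > TOLERANCE_SECONDS) return false;
  const expected = Buffer.from(signWebhook(secret, body, timestamp));
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so guard it.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Freshness alone still allows replay inside the window, which is why the receiver-side dedup on `X-Webhook-Delivery` remains part of the contract.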
### C2 — `webhook_deliveries` grows unbounded
**`schema/system.ts:107-126`** — no reaper anywhere; searches for
the table outside writers returned zero hits. Every event, retry,
test, and redeliver writes a row with full `payload` JSONB plus up
to 1 KB `response_body`. BullMQ's `removeOnComplete`/`removeOnFail`
only prunes Redis, not Postgres. On a port subscribed to high-volume
events (`berth.status_changed`, `interest.stage_changed`,
`invoice.*`) this is unbounded write-amplification.
Fix: maintenance job pruning by status + age (e.g. 30 d success,
90 d dead_letter), gated by a `system_settings` retention key. Add
`(status, createdAt)` index for the scan.
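
The retention rule can be expressed as a small predicate before it is turned into a `DELETE ... WHERE`. `DeliveryStatus`, `DEFAULT_RETENTION_DAYS`, and `isPrunable` are illustrative; the status names other than `pending` and `dead_letter` are assumptions, and the day counts are the example values from the fix above.

```typescript
// Status values assumed for illustration.
type DeliveryStatus = 'pending' | 'success' | 'failed' | 'dead_letter';

const DAY_MS = 24 * 60 * 60 * 1000;

// Statuses without an entry are never pruned (e.g. `pending`).
const DEFAULT_RETENTION_DAYS: Partial<Record<DeliveryStatus, number>> = {
  success: 30,
  dead_letter: 90,
};

// True when the row is older than the retention window for its status.
function isPrunable(
  row: { status: DeliveryStatus; createdAt: Date },
  now: Date,
  retention: Partial<Record<DeliveryStatus, number>> = DEFAULT_RETENTION_DAYS,
): boolean {
  const days = retention[row.status];
  if (days === undefined) return false;
  return now.getTime() - row.createdAt.getTime() > days * DAY_MS;
}
```

In the maintenance job the `retention` argument would come from the `system_settings` key rather than the defaults.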
### C3 — Worker dispatches with empty signature when secret is NULL
**`workers/webhooks.ts:111-134`, schema `:97`** — `secret` is
nullable; the worker silently sends header `X-Webhook-Signature: ''`
when missing. Compliant receivers reject, mis-coded ones accept.
Creation always generates a secret, so NULL implies DB tampering or a
future migration mistake — defence-in-depth still warrants a hard
fail.
Fix: dead-letter with reason `missing_signing_secret`; ideally make
`webhooks.secret` `NOT NULL`.
---
## HIGH
### H1 — DNS-rebinding TOCTOU
**`workers/webhooks.ts:18-45, 147-167`** — `resolveAndCheckHost()`
does its own `dns.lookup`, then hands the **hostname** back to
`fetch`, which resolves again before connecting. The validator's
comment (`validators/webhooks.ts:69-70`) defers rebind to the worker,
but the worker's check is independent of the actual connect; a rebind
between `lookup` and `fetch` still hits internal IPs.
Fix: pin the connect to the resolved IP via an undici `Agent` with
`connect: { lookup: () => allowedIp }`, keep the original hostname as
the `Host` header so TLS SNI works. Or inspect `socket.remoteAddress`
post-connect and abort on mismatch.
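
A sketch of the pinning idea: a `lookup` replacement that ignores the hostname and always answers with the IP that already passed the SSRF check. `pinnedLookup` and `LookupCallback` are hypothetical names. Wiring it into an undici `Agent` through its `connect` options, as the fix suggests, is not shown and should be verified against the undici version in use.

```typescript
// Callback shape used by Node's socket `lookup` option: either a single
// address plus family, or a list of addresses when `all: true` is set.
type LookupCallback = (
  err: Error | null,
  address: string | Array<{ address: string; family: number }>,
  family?: number,
) => void;

// Builds a lookup function that never touches DNS again: every call
// resolves to the pre-validated IP, closing the lookup-then-fetch gap.
function pinnedLookup(allowedIp: string, family: 4 | 6) {
  return (
    _hostname: string,
    options: { all?: boolean },
    callback: LookupCallback,
  ): void => {
    if (options.all) {
      // Newer Node versions may request the full address list.
      callback(null, [{ address: allowedIp, family }]);
    } else {
      callback(null, allowedIp, family);
    }
  };
}
```

Because the request URL still carries the original hostname, TLS SNI and certificate validation continue to use it.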
### H2 — Retry policy too short / no jitter
**`queue/index.ts:14, 29-34`** — `maxAttempts: 3`, exponential 1000 ms.
Real schedule ≈ 1 s / 2 s / 4 s — a 30 s receiver outage during a
deploy permanently dead-letters every in-flight event; super-admins
get a notification storm and events are lost unless redelivered
manually. Industry norm (Stripe, GitHub) is ≥5 attempts over hours
with jitter.
Fix: bump to 8-10 attempts, exponential base 30_000 ms with jitter,
surface the next-retry-time in the admin detail view.
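
A sketch of the delay calculation only, using "equal jitter" (half the exponential delay fixed, half random). `retryDelayMs`, the cap, and the injected `random` parameter are assumptions for illustration; how the function is registered with the queue library is not shown.

```typescript
// Delay before retry number `attempt` (1-based).
// Grows as baseMs * 2^(attempt - 1), capped at capMs, with the upper
// half of the delay randomised so retries from many deliveries spread out.
function retryDelayMs(
  attempt: number,
  baseMs: number = 30000,
  capMs: number = 6 * 60 * 60 * 1000,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
  const half = exponential / 2;
  return Math.round(half + random() * half);
}
```

With the defaults, the first retry lands between 15 s and 30 s after the failure, and each later retry roughly doubles until the cap. The same function can feed the next-retry-time shown in the admin detail view.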
### H3 — No circuit-breaker on chronically failing endpoints
**`workers/webhooks.ts:231-287`** — after a dead_letter the webhook
stays `is_active=true`. Five global worker slots (`concurrency: 5`)
get saturated by a broken subscriber's 3×10 s retry cycle, starving
other ports' webhooks. The dead_letter notification dedupes per
delivery, so 1000 events → 1000 alerts.
Fix: rolling failure counter on `webhooks`; auto-set
`is_active=false` after N consecutive dead_letters and alert once.
Coalesce notification `dedupeKey` by webhook+day.
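
The auto-disable rule can be sketched as a pure state transition. `WebhookHealth`, `applyOutcome`, and the default threshold of 5 are hypothetical; the report only specifies "N consecutive dead_letters" and "alert once".

```typescript
// Per-webhook health state (assumed new columns on `webhooks`).
interface WebhookHealth {
  isActive: boolean;
  consecutiveDeadLetters: number;
}

type DeliveryOutcome = 'success' | 'dead_letter';

// Applies one final delivery outcome. `shouldAlert` is true only on the
// transition from active to disabled, so admins are alerted once.
function applyOutcome(
  state: WebhookHealth,
  outcome: DeliveryOutcome,
  threshold: number = 5,
): { next: WebhookHealth; shouldAlert: boolean } {
  if (outcome === 'success') {
    // Any success resets the streak.
    return {
      next: { isActive: state.isActive, consecutiveDeadLetters: 0 },
      shouldAlert: false,
    };
  }
  const count = state.consecutiveDeadLetters + 1;
  const tripped = state.isActive && count >= threshold;
  return {
    next: { isActive: state.isActive && !tripped, consecutiveDeadLetters: count },
    shouldAlert: tripped,
  };
}
```

In the worker the read-modify-write would need to be a single atomic SQL update to stay correct under `concurrency: 5`.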
### H4 — `EMAIL_REDIRECT_TO` short-circuit writes status `dead_letter`
**`workers/webhooks.ts:94-109`** — semantically wrong;
"exhausted retries" is what `dead_letter` means in admin UI.
The SSRF-blocked path at `:148-166` shares the same status.
Doesn't fire alerts (alert path requires `isFinalAttempt`), but
pollutes the deliveries list.
Fix: introduce a `skipped` (or `paused`) status, use it for both
paths.
### H5 — Payloads are ID-only; redeliver re-sends stale data
All 18 `dispatchWebhookEvent(...)` callsites pass only
`{ clientId, berthId, interestId, ... }`. Receivers must call back
for anything beyond the ID — yet webhooks fire for archived /
merged / deleted entities (`client.archived`, `client.merged`,
`yacht.ownership_transferred`). Worse, `redeliverWebhookDelivery`
(`webhooks.service.ts:283-343`) clones the payload verbatim, so a
replay after GDPR erasure resurrects the deleted ID and a replay
after a client merge resurfaces the pre-merge identity.
Fix: snapshot a minimal `{id, name, status, archived}` at dispatch
time; on redeliver, re-check entity existence and short-circuit to
`skipped` if the row is gone.
### H6 — Test endpoint has no rate limit
`webhooks.service.ts:347-383` — combined with H3 a rapid-fire test
can stall the queue. Add a per-webhook test throttle (e.g. 1/sec).
---
## MEDIUM
### M1 — SSRF denylist gap
`validators/webhooks.ts:18-43` covers RFC1918, loopback, link-local
(incl. AWS IMDS 169.254/16), CGNAT 100.64/10 (catches Alibaba's
`100.100.100.200`), IPv6 ULA / link-local, GCP/Azure named metadata
hosts. **Missing:** Oracle Cloud metadata `192.0.0.192`. Add the
literal.
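
For illustration, an IPv4-only range check that includes the missing literal as a `/32`. This is not the validator's actual code: `ipv4ToInt`, `BLOCKED_V4_CIDRS`, and `isBlockedIpv4` are hypothetical, and the list below is a subset (no IPv6, no named hosts).

```typescript
// Parses dotted-quad IPv4 into an unsigned integer, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// [network, prefix length] pairs taken from the ranges named above.
const BLOCKED_V4_CIDRS: Array<[string, number]> = [
  ['10.0.0.0', 8],
  ['172.16.0.0', 12],
  ['192.168.0.0', 16],
  ['127.0.0.0', 8],
  ['169.254.0.0', 16], // link-local, incl. AWS IMDS
  ['100.64.0.0', 10], // CGNAT, incl. 100.100.100.200
  ['192.0.0.192', 32], // Oracle Cloud metadata literal (the M1 gap)
];

// Fails closed: unparseable input counts as blocked.
function isBlockedIpv4(ip: string): boolean {
  const addr = ipv4ToInt(ip);
  if (addr === null) return true;
  return BLOCKED_V4_CIDRS.some(([network, bits]) => {
    const base = ipv4ToInt(network) as number;
    const size = Math.pow(2, 32 - bits); // addresses per block
    // Division avoids signed 32-bit pitfalls of bitwise operators.
    return Math.floor(addr / size) === Math.floor(base / size);
  });
}
```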
### M2 — HTTPS check is create-time only
`webhook.url` isn't re-validated at dispatch. A bad migration or DB
edit could let `http://` through. Add `url.startsWith('https://')`
in the worker before `fetch`.
### M3 — `secretMasked` decrypts on every list/get
`webhooks.service.ts:80-99, 103-123` runs `decrypt()` per row to
compute a 5+3-char mask. The mask is deterministic from the
plaintext; cache it in a new column `secret_masked` so the read path
doesn't exercise the encryption key per webhook.
### M4 — Shared encryption key naming
`utils/encryption.ts:7` reads `EMAIL_CREDENTIAL_KEY` but encrypts
webhook secrets, SMTP creds, and IMAP creds with it. Rotation
spans multiple tables; the name implies email-only and invites config
drift. Rename to `APP_CREDENTIAL_KEY` (alias the old name) and
document the rotation runbook.
### M5 — No event-name versioning
`webhook-event-map.ts` exports a flat list. When `data` inside
`interest.stage_changed` changes shape, every receiver breaks
silently. Add either a `X-Webhook-Version` header or `event` name
suffix (`.v2`) before this surfaces to external integrators.
### M6 — `responseBody` may carry third-party PII
`workers/webhooks.ts:191, 197` stores up to 1 KB of the receiver's
response and surfaces it through the admin deliveries list. If a
receiver echoes data in 4xx bodies it lands in
`webhook_deliveries.response_body` and the pino warn line. Flag for
the GDPR DPIA; consider redacting headers / scrubbing on storage.
### M7 — SSRF-blocked delivery isn't audit-logged
`workers/webhooks.ts:148-166` updates the row but skips
`createAuditLog`. Success and final-fail paths both write audit rows.
Add one here — these are the deliveries you most want forensics on.
### M8 — No port-id assertion on the BullMQ payload at worker entry
`workers/webhooks.ts:69` trusts `portId` from the job and uses it for
notifications + audit. The producer is internal and writes consistent
data, so this is defence-in-depth, but the worker could fetch the
webhook row and assert `webhook.portId === payload.portId` before
proceeding.
---
## What's good
- AES-256-GCM at rest; secret returned to admin **once only** on
create / regenerate (`webhooks.service.ts:71-75, 225-229`).
- HTTPS-only create-time validation, comprehensive IPv4 + IPv6
private-range denylist, named cloud metadata hosts.
- DNS re-resolution at dispatch time — intent is right (TOCTOU gap
noted in H1).
- Idempotent delivery row created **before** enqueue
(`webhook-dispatch.ts:48-57`); worker crash leaves a recoverable
`pending` row.
- BullMQ retries + dead_letter handling + super-admin notification;
redeliver path preserves the original failed row + tags the replay
payload with `retried_from` / `retried_at`.
- Multi-tenant guard at every read/write
(`webhooks.service.ts:108, 137, 174, 200, 244, 292, 352`) and on
the dispatcher subscription query (`webhook-dispatch.ts:33-39`).
- `EMAIL_REDIRECT_TO` dev kill-switch.
- 10 s `fetch` timeout via `AbortController`.
- Permission gating: every admin route wraps in
`withPermission('admin', 'manage_webhooks', …)`; redeliver +
regenerate-secret included.
---
## Priority
Land **C1** (timestamp-in-signature) + **C2** (deliveries reaper) +
**H2** (retry policy) + **H3** (auto-disable) before exposing
webhooks to external integrators. **C3, H1, H5** are smaller patches
but should ship in the same release. MEDIUM items can batch behind a
single "webhook hardening" follow-up.
---
## 20. Authorization model integrity audit (authz-auditor)
# Authorization model integrity audit
**Branch:** feat/documents-folders · **Date:** 2026-05-12 · **Auditor:** authz-auditor
**Scope:** every API route's permission gate, port-scope SQL filters, per-user override
merge semantics, `isSuperAdmin` bypass paths, residential toggle, Documenso webhook
port resolution. Read-only.
---
## CRITICAL
### C-1 — Privilege escalation via user-permission-overrides PUT
**File:** `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts` lines 153-244
(plus the parallel issue in `src/lib/services/users.service.ts` `updateUser` lines
285-292).
The PUT endpoint gates on `withPermission('admin', 'manage_users')` and refuses
self-target (line 163), but it **does NOT verify that the caller already holds the
permission they are granting to the target user**. A port admin holding ONLY
`admin.manage_users` can therefore:
1. Mint a colleague with `admin.permanently_delete_clients = true`,
`admin.system_backup = true`, `admin.manage_settings = true`,
`documents.delete = true`, `interests.override_stage = true`, etc.
2. Have the colleague execute those actions on their behalf, or
3. Re-flip leaves on the colleague's record at will because nothing in the
override-merge path knows the granting admin was unprivileged.
The same path exists in **`updateUser` (role reassignment)** — `roleId` is validated
to exist (line 289) but there is no "you can only assign a role whose effective
permission set ⊆ your own" check. Because `admin/roles POST` is super-admin-only,
role-creation is safe, but **role assignment** is the privilege-escalation surface
since a sales_director-equivalent could promote a peer to a super-admin-flavoured
role.
The audit log records the change so the activity is detectable, but detection is not
prevention. Self-target block on the override route is necessary but not sufficient
— the admin can just bounce the elevated permission off a sock-puppet account.
**Fix:** before writing `permissionOverrides`, compute the caller's effective
permission map and reject any leaf in the new override that is `true` while the
caller's matching leaf is `false`. Same check on the `roleId` change in
`updateUser` — compare the new role's effective permission set against the
caller's and refuse on any superset.
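
The subset check can be sketched as below, assuming a two-level `resource.action` permission map as implied by `ctx.permissions?.<resource>?.<action>`. `PermissionMap` and `findEscalatedLeaves` are hypothetical names; the route would reject the request whenever the returned list is non-empty.

```typescript
// Two-level permission map: resource -> action -> granted.
type PermissionMap = { [resource: string]: { [action: string]: boolean } };

// Returns every `resource.action` leaf the request sets to true
// that the caller does not hold themselves.
function findEscalatedLeaves(
  caller: PermissionMap,
  requested: PermissionMap,
): string[] {
  const escalated: string[] = [];
  for (const [resource, actions] of Object.entries(requested)) {
    for (const [action, granted] of Object.entries(actions)) {
      // Revoking (false) is always allowed; granting needs the caller's own leaf.
      if (granted && caller[resource]?.[action] !== true) {
        escalated.push(`${resource}.${action}`);
      }
    }
  }
  return escalated;
}
```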
---
## HIGH
### H-1 — Listing endpoints without an explicit `withPermission` gate
`grep -L "withPermission|requireSuperAdmin|requirePermission"` against `withAuth`
routes turns up 31 files. Most are legitimate self-service surfaces
(`/me`, `/notifications`, `/currency/*`, `/users/me/preferences`,
`/alerts/*/dismiss|acknowledge`, `/saved-views/[id]` — ownership-checked) or
correctly do an in-handler check (`/clients/bulk`, `/companies/bulk`,
`/yachts/bulk`, `/interests/bulk` — all gate on `ctx.permissions?.<resource>?.<action>`).
The outliers worth flagging:
- **`/api/v1/alerts/route.ts` GET** — no permission gate. Anyone in the port
with valid auth can read every alert row (audit blockers, GDPR alerts,
permission-denied alerts, etc.). Service `listAlertsForPort` scopes on
`portId` so cross-tenant leakage is contained, but the alert payload exposes
internal-only signals (e.g. who triggered a `permission_denied`). Either
gate on `admin.view_audit_log` or filter the payload by sensitivity tier.
- **`/api/v1/vocabularies/route.ts` GET** — intentionally permissionless per the
comment (vocabularies feed pickers across the app). Fine — port-scoped.
- **`/api/v1/settings/feature-flag/route.ts` GET** — port-scoped, returns a
single boolean for a key the client names. Acceptable.
- **`/api/v1/search/route.ts` GET** — relies on the service's `can()` helper
to skip buckets the caller can't see (`search.service.ts` line 305). Good.
`includeOtherPorts` correctly gates on `ctx.isSuperAdmin` (line 20).
### H-2 — Search service: residential buckets
`src/lib/services/search.service.ts` line 305 (`can()`) honours `permissions.residential_clients.view`
and `.residential_interests.view`. The withAuth resolver sets these to `true` when
`portRole.residentialAccess` is true (helpers.ts lines 209-221) BEFORE the per-user
override layer runs (lines 227-238). So a per-user override with
`residential_clients.view = false` _will_ take effect. This was verified by tracing
`deepMerge` (helpers.ts lines 73-98): the source `false` boolean replaces the target
`true` at the leaf because the recursion only triggers when _both sides_ are
objects. Per-user `false` correctly bubbles through. **Pass.**
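
The merge behaviour described here can be reproduced with a small stand-in. This is not the code in `helpers.ts`; `PermissionTree`, `isTree`, and `deepMergeSketch` are a hypothetical reconstruction of the rule "recurse only when both sides are objects, otherwise the source wins".

```typescript
// Nested permission structure: leaves are booleans.
type PermissionTree = { [key: string]: boolean | PermissionTree };

function isTree(value: unknown): value is PermissionTree {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Recurses only when BOTH sides are objects; any other combination
// (including boolean over boolean) assigns the source value directly.
function deepMergeSketch(
  target: PermissionTree,
  source: PermissionTree,
): PermissionTree {
  const out: PermissionTree = { ...target };
  for (const key of Object.keys(source)) {
    const s = source[key];
    const t = out[key];
    out[key] = isTree(s) && isTree(t) ? deepMergeSketch(t, s) : s;
  }
  return out;
}

// Residential toggle granted everything; the user override revokes `view`.
const afterToggle: PermissionTree = {
  residential_clients: { view: true, delete: true },
};
const userOverride: PermissionTree = { residential_clients: { view: false } };
const effective = deepMergeSketch(afterToggle, userOverride);
```

Here `effective.residential_clients.view` ends up `false` while `delete` stays `true`, matching the traced behaviour.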
### H-3 — `withAuth` userOverride fetch costs a round-trip on every request
Not a security issue but a perf+coupling note: every authenticated request now
runs **three** sequential queries when the user is not a super-admin
(`userPortRoles → portRoleOverrides → userPermissionOverrides`). Hot routes
inherit the latency tax. Consider a `Promise.all` for the override pair, or
a per-request memoize keyed on `(userId, portId)`: a single request does not
call `withAuth` more than once, but middleware-adjacent paths may repeat the lookup.
---
## MEDIUM
### M-1 — `withAuth` residential toggle bypasses `portRoleOverrides` for residential.\*
`helpers.ts` lines 209-221: when `portRole.residentialAccess === true`, the resolver
_replaces_ `permissions.residential_clients` and `permissions.residential_interests`
with a hardcoded all-true map. If a port-role-override set
`residential_clients.delete = false` (e.g. "this port lets reps see but not delete
residential rows"), the residential toggle silently overrides that. By design? Maybe
— the toggle is documented as "full residential access" — but it would be
surprising if an admin set up the port-role-override expecting it to constrain
toggled users. Document or compose more carefully.
The per-user permission override still wins (it runs after, line 227), so a
deliberate admin can recover, but the precedence is subtle.
### M-2 — `parseBody`-vs-`req.json` consistency on bulk routes
All four bulk routes (`clients`, `yachts`, `companies`, `interests`) use
`parseBody` correctly. The bulk permission check pattern is repeated four times
with the same shape — extract a `requireOneOf(ctx, [{resource, action}, ...])`
helper to avoid drift when a new bulk route ships.
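
One possible shape for the shared helper. `AuthCtx`, `Requirement`, `hasOneOf`, and the thrown error are assumptions for illustration; the real context type and the project's forbidden-error class are not shown in this report.

```typescript
// Two-level map as used by `ctx.permissions?.<resource>?.<action>`.
type PermissionMap = {
  [resource: string]: { [action: string]: boolean } | undefined;
};

// Only the part of the auth context this helper needs.
interface AuthCtx {
  permissions?: PermissionMap;
}

interface Requirement {
  resource: string;
  action: string;
}

// True when the caller holds at least one of the listed permissions.
function hasOneOf(ctx: AuthCtx, requirements: Requirement[]): boolean {
  return requirements.some(
    ({ resource, action }) => ctx.permissions?.[resource]?.[action] === true,
  );
}

// Throwing variant for route handlers; a plain Error stands in for
// whatever forbidden-error type the API layer maps to 403.
function requireOneOf(ctx: AuthCtx, requirements: Requirement[]): void {
  if (!hasOneOf(ctx, requirements)) {
    const wanted = requirements.map((r) => `${r.resource}.${r.action}`).join(', ');
    throw new Error(`Forbidden: requires one of ${wanted}`);
  }
}
```

Each bulk route would then reduce its check to a single `requireOneOf(ctx, [...])` call.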
### M-3 — `documents.feature-flags` and `documents.wizard`
Both routes wrap with `withAuth + withPermission('documents', 'view')`. The
`feature-flags` route returns Documenso/template feature toggles — fine. The
`wizard` route fetches drafts. Spot-check passed; both scope to `ctx.portId`
in the service.
### M-4 — Documenso webhook port resolution: verified correct
`src/app/api/webhooks/documenso/route.ts` lines 58-101: secret enumeration over
`listDocumensoWebhookSecrets()` with `verifyDocumensoSecret` (timing-safe). The
matched `portId` threads through `portScope` (line 143) to every per-recipient
and per-document handler. `resolveWebhookDocument` (documents.service.ts
lines 967-996) refuses to mutate when the lookup is ambiguous across ports without
a portId. **Pass.** No cross-tenant write surface.
One small nit: the webhook returns `200` on invalid-secret to avoid leaking
signal (line 100) but the audit row records a `webhook_failed` with
`portId: null`. Rate-limited per IP (line 77). Fine.
### M-5 — `requireSuperAdmin` always requires a portId in `userPortRoles`?
No. Super admins skip the `userPortRoles` lookup entirely (helpers.ts line
174 condition), but still need `portId` set somewhere (header or
`profile.preferences.defaultPortId`, lines 161-164) unless they're hitting a
no-port endpoint. The gate on line 166 only fires when `!portId && !isSuperAdmin`.
A super admin without a portId in the request will have `ctx.portId = ''` and
`ctx.portSlug = ''`; any route that uses `ctx.portId` in a SQL filter will
match nothing, which is a fail-safe but produces confusing empty UIs. Worth
documenting that super-admin requests SHOULD always carry an `X-Port-Id`.
### M-6 — `requireSuperAdmin` audit-logs denials with empty entityId
`helpers.ts` line 298: `entityId: ''`. The audit row is functional but harder
to query later. Set to `attemptedAction` or the route path for forensics.
---
## Pass / verified
- **`deepMerge` false-propagation:** `false` at any leaf correctly overwrites
`true` in the role baseline because the recursion guard requires both sides
to be objects (helpers.ts lines 81-88). Boolean → boolean falls into the
`else` branch and assigns directly.
- **Override layer ordering:** role → port-role-override → residential toggle →
user-permission-override. User override wins last. Self-target on PUT
rejected (route line 163).
- **Listing port_id SQL filter (sampled):** `clients`, `interests`, `yachts`,
`companies`, `documents`, `files`, `berths`, `invoices`, `expenses`,
`reminders`, `residential_clients`, `residential_interests`, `alerts`,
`error_events`, `audit_logs`, `notes` (all four polymorphic types) — every
service `list*` function constrains on `portId` in the WHERE clause.
Search service goes further with defense-in-depth port_id filters inside
each per-bucket query (search.service.ts lines 361, 443, 490, 539, 600,
667, 717, 772, 805, 856, 902, 952, 995, 1035, 1069, 1107, 1156, 1172,
1186, 1200, 1384).
- **`/admin/ports/[id]`:** explicit `assertPortInScope` blocks
cross-tenant access by non-super-admins (route lines 15-20). Pass.
- **`/admin/error-events`:** super admins see all, regular admins scoped to
`ctx.portId`; the [requestId] route additionally re-checks the row's
`portId` and returns 404 (not 403) on mismatch to avoid existence leak.
- **`isSuperAdmin` writability:** not in `createUserSchema` /
`updateUserSchema`. Only settable via the invitation flow with an explicit
`if (body.isSuperAdmin && !ctx.isSuperAdmin) throw` guard
(`/admin/invitations/route.ts` line 40). Pass.
- **Documenso webhook:** secret enumeration is timing-safe; ambiguous
cross-port `documensoId` lookups refuse to mutate; portScope threaded to
every handler. Pass.
---
## Summary punch list
| Sev | Item | File / fix |
| ---- | ---------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| CRIT | Privilege escalation via permission-overrides PUT and role reassignment | `permission-overrides/route.ts`, `users.service.ts updateUser` — refuse to grant any leaf the caller doesn't hold |
| HIGH | `/api/v1/alerts` GET ungated | Add `withPermission('admin','view_audit_log')` or filter payload |
| MED | Residential toggle silently overrides port-role-override on `residential_*` | Document precedence in helpers.ts or compose via deepMerge |
| MED | `withAuth` runs three sequential override queries per request | Parallelize override fetches |
| MED | Bulk-route permission check duplicated 4× | Extract `requireOneOf` helper |
| MED | `requireSuperAdmin` audit row carries empty `entityId` | Set to `attemptedAction` |
| INFO | Super-admin request without `X-Port-Id` produces empty `ctx.portId` and silently empty queries | Document; consider 400 |
---
## 25. Inquiry → CRM funnel correctness audit (funnel-auditor)
# Inquiry → CRM Funnel Audit
Scope: `src/app/api/public/{interests,residential-inquiries,website-inquiries}`, `src/app/api/v1/admin/website-submissions/*`, `src/lib/services/inquiry-notifications.service.ts`, `src/components/admin/inquiry-inbox.tsx`, `src/lib/validators/{interests,residential}.ts`, settings keys `inquiry_contact_email` / `inquiry_notification_recipients` / `residential_notification_recipients`.
Read-only; no edits.
---
## CRITICAL
### C1. "Convert to client" prefill goes nowhere — every conversion is double-typed
`inquiry-inbox.tsx:127-135` flips the row to `converted`, then pushes
`/clients?prefill_name=…&prefill_email=…&prefill_phone=…&prefill_source=website&prefill_inquiry_id=…`.
A repo-wide grep for any of those five keys returns only the writer — **no client form / page / hook ever reads `prefill_*`**. So Convert: (a) flushes the inbox row to "converted" eagerly, (b) drops the operator on a blank New Client form, (c) loses the `inquiry_id ↔ client/interest` linkage permanently because nothing persists it. The triage state is now lying ("converted" with no downstream entity), and operators retype the payload from the inbox card. Either consume the params in `client-form.tsx` (with a hidden `inquiry_id` that the create endpoint persists into `clients.metadata` or a new `inquiry_origin_id` FK), or revert the eager state flip so the inbox stays honest until the client is actually saved.
### C2. Two parallel intake pipelines with no correlation → duplicate interests + zombie inbox rows
`/api/public/interests` directly creates `clients + yacht + interest` rows and queues notifications. `/api/public/website-inquiries` is the new "dual-write capture" that stores the raw payload in `website_submissions` for triage. The website is expected to call both (the docstring on `website-inquiries/route.ts` says "AFTER its existing NocoDB write succeeds"). Nothing links them. Result for a single berth-form POST:
1. `interests` row created automatically with notifications fired.
2. `website_submissions` row inserted with `triage_state='open'`.
3. Operator opens the inbox, sees the "open" card, clicks Convert → second `interests` row.
4. Inbox UI never sees that step 1 already happened; the heat-scored interest from step 1 is silently shadowed.
Either the public form path should write `submission_id` onto the created `interests` row (and the inbox should auto-mark `converted` whenever a matching interest exists), or the two pipelines need to be merged into one. Right now they coexist and contradict each other.
### C3. Email dedup is case-sensitive — capital-letter resubmission spawns duplicates
`public/interests/route.ts:91` matches `clientContacts.value === data.email` (no `lower()`). The supporting `idx_cc_email` index (`clients.ts:91`) is also raw-value, not `lower(value)`. Two POSTs as `Matt@Example.com` and `matt@example.com` produce **two separate clients, yachts, interests**, and the recommender now has split history for the same human. The companies branch (`route.ts:122`) gets this right (`` sql`lower(${companies.name}) = lower(${data.company.name})` ``); the email branch must match: lowercase on insert and on lookup, plus a partial unique index on `lower(value) WHERE channel='email'`.
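A minimal sketch of the lowercase-on-insert, lowercase-on-lookup rule. `normalizeEmail` and `ContactIndex` are hypothetical names, and the in-memory `Map` only stands in for the `client_contacts` lookup; the real change would live in the route's transaction and be backed by the unique index on `lower(value)`.

```typescript
// Hypothetical helper: one canonical form used for both insert and lookup.
function normalizeEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

// In-memory stand-in for the client_contacts lookup, keyed by canonical email.
class ContactIndex {
  private readonly byEmail = new Map<string, string>();

  // Returns the existing clientId for this email, or registers the new one.
  findOrRegister(email: string, newClientId: string): { clientId: string; reused: boolean } {
    const key = normalizeEmail(email);
    const existing = this.byEmail.get(key);
    if (existing !== undefined) {
      return { clientId: existing, reused: true };
    }
    this.byEmail.set(key, newClientId);
    return { clientId: newClientId, reused: false };
  }
}

const contactIndex = new ContactIndex();
const firstSubmit = contactIndex.findOrRegister("Matt@Example.com", "client-1");
const secondSubmit = contactIndex.findOrRegister("matt@example.com", "client-2");
console.log(firstSubmit.reused, secondSubmit.reused, secondSubmit.clientId);
```

With the same key on both paths, the second submission resolves to the first client instead of creating a new one.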
---
## HIGH
### H1. Residential clients have **no dedup at all**
`residential-inquiries/route.ts:73-93` always inserts a fresh `residential_clients` row. There's no email/phone match, no unique index on `residential_clients.email` (verified — `schema/residential.ts` has no `uniqueIndex`). Every resubmit = new prospect. Sales gets a bloated list of phantoms. Mirror the berth-path dedup (lowercased-email lookup → reuse → open a new `residential_interests` row only).
### H2. `findUsersWithInterestsPermission` ignores `user_permission_overrides` (migration 0055)
`inquiry-notifications.service.ts:139-158` reads only `roles.permissions`. The request-time auth path in `lib/api/helpers.ts:227-238` correctly layers role → port-role-override → user-override, but this fan-out helper does not. Symptoms:
- User granted `interests.view` via override only → never gets new-inquiry pings.
- User had `interests.view` in role but their override removed it → still gets pinged.
Either (a) collect every user on the port and run the same `deepMerge` chain per user, or (b) move permission-resolution into a single service helper both callers use.
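Option (b) could look roughly like the sketch below: a single resolver that layers role, then port-role override, then user override, which both the request-time path and the notification fan-out would call. All names here (`PermissionTree`, `deepMergePermissions`, `resolveEffectivePermissions`, `usersToNotify`) are hypothetical, and the merge is an assumed stand-in for the repo's own `deepMerge`.

```typescript
// Nested permission map, e.g. { interests: { view: true } }.
type PermissionTree = { [key: string]: boolean | PermissionTree };

// Recursively merge `override` onto `base`; values in `override` win.
function deepMergePermissions(base: PermissionTree, override: PermissionTree): PermissionTree {
  const out: PermissionTree = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const current = out[key];
    if (typeof value === "object" && typeof current === "object") {
      out[key] = deepMergePermissions(current, value);
    } else {
      out[key] = value;
    }
  }
  return out;
}

interface PermissionLayers {
  role: PermissionTree;
  portRoleOverride?: PermissionTree;
  userOverride?: PermissionTree;
}

// Same precedence as the request-time path: role, then port-role override, then user override.
function resolveEffectivePermissions(layers: PermissionLayers): PermissionTree {
  const withPort = deepMergePermissions(layers.role, layers.portRoleOverride ?? {});
  return deepMergePermissions(withPort, layers.userOverride ?? {});
}

function hasPermission(tree: PermissionTree, resource: string, action: string): boolean {
  const branch = tree[resource];
  return typeof branch === "object" && branch[action] === true;
}

interface PortUser {
  id: string;
  layers: PermissionLayers;
}

// Fan-out helper: keep only users whose effective permissions include interests.view.
function usersToNotify(users: PortUser[]): string[] {
  return users
    .filter((user) => hasPermission(resolveEffectivePermissions(user.layers), "interests", "view"))
    .map((user) => user.id);
}
```

Under this sketch, a user granted `interests.view` only through an override is included, and a user whose override removes it is excluded, which covers both symptoms listed above.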
### H3. Bogus `portId` fails as a 500 (FK violation) instead of 400
`public/interests/route.ts:51-52` accepts the `portId` query/header but never verifies the port exists before the transaction. An invalid id surfaces as a Postgres FK error from `clients.port_id`, returned as a generic 500. The residential endpoint (`residential-inquiries/route.ts:58-61`) validates upfront via `db.query.ports.findFirst` — make the berth route do the same.
### H4. Cross-port email collisions are non-deterministic
`public/interests/route.ts:90-114`: when a `client_contact` with the same email exists on a _different_ port, the code creates a new client. But `tx.query.clientContacts.findFirst` returns "any matching row" with no `ORDER BY`, so subsequent submissions may pick either port's row first. Net: same email used cross-port, then resubmitted to the original port, can spawn 2nd/3rd same-port clients. Fix: filter the lookup by joining to `clients.port_id`, or scope the contact lookup to clients owned by the target port from the start.
### H5. `portName` hardcoded as `'Port Nimara'` in four call sites
- `inquiry-notifications.service.ts:57, 126` (client confirmation + sales alert)
- `residential-inquiries/route.ts:158, 207` (subject tokens)
The author left a `// future: resolve from getPortBrandingConfig` comment. The moment a second port has a marketing site, every email reads "Port Nimara" regardless of recipient. Wire through `getBrandingShell(portId).portName` (already loaded for the HTML body via `branding-resolver.ts`).
### H6. Residential confirmation ignores `inquiry_contact_email`
`residential-inquiries/route.ts:150` hardcodes `contactEmail: 'sales@portnimara.com'` in the client confirmation email. The berth path reads the per-port `inquiry_contact_email` setting (`inquiry-notifications.service.ts:44`). Settings UI (`settings-manager.tsx:96`) advertises this setting controls both — but it doesn't. Admins can't reroute residential replies.
### H7. Residential sales alert bypasses the email queue
`residential-inquiries/route.ts:214` calls `await sendEmail(recipients, …)` synchronously inside `sendResidentialNotifications`. The berth path enqueues via BullMQ (`inquiry-notifications.service.ts:51,118`). Because the call is wrapped in `.catch` and awaited _after_ the response is returned, a slow or down SMTP server does not hang or 500 the residential POST itself; the worker event loop is the only victim, and the notification is lost with no retry. Move to the email queue like the berth path.
---
## MEDIUM
### M1. No UTM / referrer / attribution capture anywhere
`publicInterestSchema` and `publicResidentialInquirySchema` have no `utm_source`, `utm_medium`, `utm_campaign`, `referrer`, `landing_page` fields. `source` is hardcoded `'website'` (`interests.ts:231`). Berth-recommender heat scoring and lead-source dashboards (audit #11) cannot differentiate organic vs paid vs broker referral. The `website_submissions.payload` JSONB at least preserves whatever the website chooses to forward — but `interests` itself stores only the literal string `'website'`. Add an attribution block to both validators + columns (`interests.utm_*`, `residential_interests.utm_*`) and persist what the website hands us.
### M2. Public routes use `req.json(); schema.parse(body)` instead of `parseBody`
`public/interests/route.ts:47-48` and `public/residential-inquiries/route.ts:51-52`. CLAUDE.md explicitly flags this: "Always use `parseBody(req, schema)` from `@/lib/api/route-helpers`" so the error envelope is field-level 400 instead of a generic 500.
### M3. Company / yacht / phone matching missing `trim` + phone-E164 dedup
- `companies` match (`route.ts:121-124`) is case-insensitive but **not whitespace-trimmed**: `"Acme Ltd "` ≠ `"Acme Ltd"`.
- Phone contact dedup uses raw `clientContacts.value`, never `valueE164`. The same number formatted differently is a duplicate row.
- Yachts always insert; resubmissions create a fresh yacht every time even if the hull/registration is identical.
### M4. Response envelope inconsistency
- Berth route: `{ data: { id, message } }` at 201 — close to canonical envelope.
- Residential route: `{ success: true, clientId, interestId }` at 201, the legacy `success:true` shape that CLAUDE.md says was normalized away on 2026-05-07.
Pick one and update consumers.
### M5. Inquiry inbox payload-key extractor is brittle
`pickName/pickEmail/pickPhone` in `inquiry-inbox.tsx:58-78` use a small set of candidate keys but never compose `first_name + last_name`. Website payloads that send `{first_name, last_name}` without a `name`/`fullName` field render as `(no name supplied)`. Two-line addresses and contact-form payloads silently lose the operator's first hint of who submitted.
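A sketch of a name extractor that falls back to composing first and last name. `pickDisplayName` and `readString` are hypothetical, and the candidate keys beyond `name`, `fullName`, `first_name` and `last_name` are assumptions about what website payloads might send.

```typescript
type SubmissionPayload = Record<string, unknown>;

// Returns a trimmed, non-empty string for `key`, or undefined.
function readString(payload: SubmissionPayload, key: string): string | undefined {
  const value = payload[key];
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

function pickDisplayName(payload: SubmissionPayload): string {
  // 1. Prefer a single full-name field when present.
  for (const key of ["name", "fullName", "full_name"]) {
    const direct = readString(payload, key);
    if (direct !== undefined) return direct;
  }
  // 2. Otherwise compose whichever of first/last name exists.
  const first = readString(payload, "first_name") ?? readString(payload, "firstName");
  const last = readString(payload, "last_name") ?? readString(payload, "lastName");
  const composed = [first, last]
    .filter((part): part is string => part !== undefined)
    .join(" ");
  // 3. Keep the inbox's existing placeholder as the last resort.
  return composed.length > 0 ? composed : "(no name supplied)";
}

console.log(pickDisplayName({ first_name: "Ada", last_name: "Lovelace" }));
```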
### M6. Audit log misses dedup decisions
`createAuditLog` (`public/interests/route.ts:254-271`) records the interest creation but not whether the client+yacht were created fresh vs reused. Forensics ("did this lead come from the form or get manually entered?") become guesswork. Add `metadata.dedup` = `{ clientReused: boolean, companyReused: boolean }`.
### M7. Yacht inserted with `status: 'active'` even on speculative form leads
`route.ts:179`. There's no "prospect" yacht state, so every unconverted interest still leaves an "active" yacht. Active-yacht counts in reports become inflated. Consider a `'prospect'` status or a deferred-insert pattern keyed on `interests.outcome != 'lost_*'`.
### M8. Admin website-submissions list permission mismatch
The inbox at `/[portSlug]/admin/inquiries` is the marketing-funnel triage surface, but `/api/v1/admin/website-submissions/route.ts:23` gates GET on `admin.view_audit_log`. A sales lead reviewing submissions doesn't conceptually need audit-log access. Introduce a dedicated `inquiries.view` / `inquiries.triage` permission (consistent with the rest of the permission matrix) so this can be granted independently.
---
## Settings application — verified flow
- `inquiry_contact_email` (string, per-port): consumed by berth client-confirmation email (`inquiry-notifications.service.ts:44`); **not** consumed by residential confirmation (H6). Falls back to `sales@portnimara.com` literal.
- `inquiry_notification_recipients` (JSON array, per-port): consumed by berth sales-alert fan-out (`inquiry-notifications.service.ts:106`). Empty array = no external alert. No de-dup against role-based recipients (a user listed here who _also_ has `interests.view` gets two pings).
- `residential_notification_recipients` (JSON array, per-port): consumed by residential alert; falls back to `[inquiry_contact_email]` if empty (`residential-inquiries/route.ts:174-179`). Correct envelope.
All three settings are surfaced on the admin Settings page (`settings-manager.tsx:96-117`) so admins can edit them; default values match the service-side fallbacks.
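The double-ping case noted for `inquiry_notification_recipients` could be avoided by merging the two recipient sources before fan-out. The sketch below assumes both sources can be reduced to email addresses; `mergeRecipients` is a hypothetical helper, not an existing function in the service.

```typescript
// Union of configured and role-based recipients, de-duplicated case-insensitively.
// The first spelling seen for an address is the one that is kept.
function mergeRecipients(configured: string[], roleBased: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const address of [...configured, ...roleBased]) {
    const trimmed = address.trim();
    const key = trimmed.toLowerCase();
    if (key.length === 0 || seen.has(key)) continue;
    seen.add(key);
    merged.push(trimmed);
  }
  return merged;
}

console.log(mergeRecipients(["Sales@portnimara.com"], ["sales@portnimara.com", "rep@example.com"]));
```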
---
## 32. Improvements + nice-to-haves + genuine AI integration opportunities (improvements-auditor)
_This is a forward-looking proposal report, not a defect audit. Grouped HIGH-VALUE / MEDIUM / EXPLORE with effort estimates and "what NOT to AI-ify" critical pass._
# Audit #32 — Improvements, Nice-to-Haves & AI Opportunities
**Scope:** Forward-looking proposals, not a defect audit. Every proposal grounded in real surfaces seen in this repo (file paths cited). For each: user benefit, implementation sketch, effort estimate (S/M/L), and risk note where it matters.
**Effort key:** S = ≤½ day, M = 1–3 days, L = >3 days / cross-cutting.
---
## Section A — UX / Feature Improvements
### A · HIGH-VALUE
**A1. Bulk actions on Berths, Companies, Yachts**
Bulk archive/tag/move flow exists in `src/components/clients/client-list.tsx` + `src/components/interests/interest-list.tsx` (single `/bulk` endpoint per domain), but Berths, Companies, and Yachts use the same `data-table.tsx` shell with `BulkAction[]` support and never pass any. Reps regularly need to retag a batch of yachts after import or move 30 berths to a new pricing band.
- _Sketch:_ Add `bulkActions=[...]` wired through the existing `data-table.tsx` API; mirror the `/api/v1/clients/bulk` and `/api/v1/interests/bulk` endpoint pattern for `berths`, `companies`, `yachts`. `interest-list.tsx` lines 124–280 are the reference implementation.
- _Effort:_ M
- _Risk:_ low — pattern already tested for two domains; ensure permission gate per action mirrors single-entity gates.
**A2. Smart undo banner for archive / outcome / stage-change**
Already have `client-restore.service.ts` + a smart-restore-dialog component, and stage rollback would be supported by audit logs. Reps lose minutes every time they fat-finger an archive or set an outcome on the wrong card on the pipeline board.
- _Sketch:_ After any archive / outcome-set / `interest_archived` / `interest_completed` trigger, raise a Sonner toast with an "Undo" action for 8s, calling the existing restore service or a tiny reverse-mutation endpoint. Hook into the mutation `onSuccess` in `interest-list.tsx`, `client-list.tsx`, `pipeline-board.tsx`, and `interest-outcome-dialog.tsx`.
- _Effort:_ M
- _Risk:_ berth-rules-engine has already fired side-effects (`berth_unlinked`, `interest_completed` cascade). Undo must replay the reverse rule or explicitly skip rule-engine via a `skipRules` flag — otherwise undo leaves stale berth status.
**A3. "What changed since I last looked" digest on detail pages**
The `entity-activity.service.ts` + `use-track-entity-view.ts` infrastructure is already in place — every detail view is tracked. Reps open a deal they haven't touched in a week and have to manually scroll the activity feed.
- _Sketch:_ On detail page load, query activity items with `createdAt > lastViewedAt` (from `recently-viewed.service.ts`) and render a dismissable "3 new things since 5 days ago: signed EOI, +€2k deposit, new note from María" strip above `entity-activity-feed.tsx`.
- _Effort:_ M
- _Risk:_ none meaningful — purely additive.
**A4. j/k row navigation + `o` open + `e` edit + `/` focus filter on list pages**
Cmd-K is already wired in `command-search.tsx`; reps still mouse-hop between rows in `data-table.tsx`. Power users on busy pipeline days are the loudest beneficiaries.
- _Sketch:_ Add a `useListKeyboardNav(rows, activeIndex)` hook used inside `data-table.tsx`. `j/k` move active row, `o`/Enter opens detail, `e` triggers inline-edit on the first inline-editable cell, `/` focuses the filter input. Respect `e.target` being an input.
- _Effort:_ S
- _Risk:_ must be globally disabled inside dialogs/forms — use the same `document.activeElement instanceof HTMLInputElement` guard already in command-search.
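The key handling in A4 can be kept as a pure function that the `useListKeyboardNav` hook would call from its `keydown` listener, which makes the bindings easy to unit-test. `resolveListKey` and the `ListNavAction` type are hypothetical; the `typingInField` flag stands in for the active-element guard mentioned in the risk note.

```typescript
type ListNavAction =
  | { kind: "move"; index: number }
  | { kind: "open"; index: number }
  | { kind: "edit"; index: number }
  | { kind: "focusFilter" }
  | { kind: "none" };

// Maps a pressed key to a list action. Does nothing while the user is typing in a field.
function resolveListKey(
  key: string,
  activeIndex: number,
  rowCount: number,
  typingInField: boolean,
): ListNavAction {
  if (typingInField) return { kind: "none" };
  if (key === "/") return { kind: "focusFilter" };
  if (rowCount === 0) return { kind: "none" };
  switch (key) {
    case "j":
      // Clamp at the last row instead of wrapping around.
      return { kind: "move", index: Math.min(activeIndex + 1, rowCount - 1) };
    case "k":
      return { kind: "move", index: Math.max(activeIndex - 1, 0) };
    case "o":
    case "Enter":
      return { kind: "open", index: activeIndex };
    case "e":
      return { kind: "edit", index: activeIndex };
    default:
      return { kind: "none" };
  }
}

console.log(resolveListKey("j", 0, 3, false));
```

Clamping rather than wrapping at the ends of the list is a design choice made for this sketch; the proposal does not specify either behaviour.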
**A5. Quick-create overlay (cmd-K → "+ New …")**
Command-search currently navigates but doesn't create. Reps regularly want to drop a client/interest/reminder without leaving the current page (e.g. a quick call comes in while reviewing a berth).
- _Sketch:_ Extend `command-search.tsx` palette with `+ New client`, `+ New interest`, `+ New reminder`, `+ Log call`. Each opens a drawer-mounted minimal form (3 fields max) using the existing forms wrapped in `Drawer` instead of `Dialog`. Re-use `client-form.tsx`, `reminder-form.tsx` in a "compact" mode prop.
- _Effort:_ M
- _Risk:_ low — entirely additive UI.
### A · MEDIUM
**A6. Smarter defaults from "my last used"**
Today `client-form.tsx`, `interest-form.tsx`, `expense-form-dialog.tsx`, and `reminder-form.tsx` reset every field. A rep doing 12 interests in a row re-types the same source / currency / lead source.
- _Sketch:_ Persist last-submitted values per form per user under `user_profiles.preferences.formDefaults` (same shape used for `dashboardWidgets` per `widget-registry.tsx` comments). On form open, prefill from preferences, mark prefilled fields with subtle "(last used)" hint. Provide a "Reset defaults" link in the form footer.
- _Effort:_ S
- _Risk:_ leaks tag/source preference into the wrong port for super-admins switching ports — scope key by `(userId, portId, formName)`.
**A7. Pipeline board: drag-to-stage with confirm on "won/lost"**
`pipeline-board.tsx` exists. Today reps must click a card → open the interest → open outcome dialog. Drag-to-stage is the natural kanban gesture.
- _Sketch:_ Add `@dnd-kit/sortable` (if it is not already in the tree, it is a very light addition). Wire `onDragEnd` to `inline-stage-picker.tsx`'s mutation. Dropping into `won/lost` columns opens `interest-outcome-dialog.tsx` instead of silently setting the outcome.
- _Effort:_ M
- _Risk:_ berth-rules-engine fires on `eoi_sent` / `contract_signed` triggers — make sure stage drag uses the same `advanceStageIfBehind` codepath, not a raw stage update.
**A8. Saved-view sharing within a port**
`saved-views.service.ts` is per-user. Sales teams want a shared "Hot leads — March" view.
- _Sketch:_ Add `visibility: 'private' | 'shared'` column to `savedViews`; service `list()` returns own + shared. Permission gate: `savedViews.share` (new). Show a "Share" toggle in `save-view-dialog.tsx`.
- _Effort:_ M
- _Risk:_ low — additive; ensure shared views can't expose entity rows the viewer lacks permission for (filter happens server-side on data fetch, not view definition, so already safe).
**A9. Bulk "Move to folder" in documents hub**
Documents hub (`hub-root-view.tsx`, `entity-folder-view.tsx`, `flat-folder-listing.tsx`) supports single-item move via `move-to-folder-dialog.tsx`. No multi-select. Admins post-importing 200 docs spend 200 clicks.
- _Sketch:_ Add row-checkboxes to `document-list.tsx`, surface `Move to folder` as a bulk action. Reuse existing `move-to-folder-dialog.tsx` accepting an array. Service already supports the operation per-item; wrap in a single transaction.
- _Effort:_ S
- _Risk:_ system-managed folders already reject mutations via `assertNotSystemManaged` — bulk move must respect this per-item and report per-item errors (partial success).
**A10. Reminder snooze presets in a single hotkey**
`snooze-dialog.tsx` exists with a Date picker. Reps want "tomorrow morning", "next Mon", "in 1 week" one-tap.
- _Sketch:_ Add quick-buttons row to snooze dialog. Same options as Gmail's snooze. Pre-compute target dates relative to user timezone (already wired via `inline-timezone-field.tsx`).
- _Effort:_ S
- _Risk:_ DST — use the existing `formatInTimezone` helpers, don't add raw ms.
**A11. Dashboard widget: "My open EOIs — needs nudge"**
13 widgets in `widget-registry.tsx`; none surface "EOIs sent ≥ 5 days ago, not yet signed, no reminder set". This is the single most actionable rep widget — the deal that's slipping.
- _Sketch:_ New widget `eoi_followups` querying `documents` where `status='sent'`, computed `sent_age_days > N` (from `system_settings.eoi_nudge_days`, default 5), grouped by client. Include "Send reminder" action calling existing `sendReminder` Documenso wrapper.
- _Effort:_ M
- _Risk:_ none.
**A12. Dashboard widget: "Berths I'm watching"**
Multiple reps end up specialising on berth subsets. Today no way to pin.
- _Sketch:_ Add a `watchedBerths` array under user preferences, "watch" toggle in `berth-detail-header.tsx`, widget rendering status changes since last view.
- _Effort:_ S–M
### A · EXPLORE
**A13. Pipeline "what's due this week" board view**
A second pipeline-board view mode that columns by next-action-date instead of stage. Useful when stage is similar across many deals but timing varies.
- _Sketch:_ Toggle in `pipeline-board.tsx` header switching between stage-mode and date-mode. Bin into "Today / This week / Next week / Later".
- _Effort:_ M
**A14. Inline-editable pipeline-board cards**
`pipeline-card.tsx` is read-only; double-click → edit value/notes in place, mirroring the `<InlineEditableField>` pattern already used everywhere on detail pages.
- _Effort:_ S
**A15. "Open in new tab" cmd-click on any entity row**
`data-table.tsx` row click navigates. Need to make every row a real `<a href>` so cmd-click + middle-click behave natively. Power users coming from Linear / Notion will expect this.
- _Effort:_ S
- _Risk:_ keyboard-nav handler from A4 must not interfere with native link semantics.
---
## Section B — Subtle Ergonomic Wins
### B · HIGH-VALUE
**B1. Auto-save indicator on `<InlineEditableField>`**
Inline-editable fields blur-save silently. Reps occasionally close the tab thinking their edit didn't take.
- _Sketch:_ Tiny "Saved · just now" timestamp ghost-text near the field for 2s after mutation success; "Saving…" spinner while pending. Surface in `inline-editable-field.tsx` and `inline-tag-editor.tsx`.
- _Effort:_ S
**B2. Empty-state CTAs everywhere**
`empty-state.tsx` exists but several lists fall back to "No results" plain text (e.g. interest-eoi-tab when no EOI yet, client-yachts-tab when none linked).
- _Sketch:_ Audit every list/tab consumer, wire `<EmptyState>` with a primary CTA (e.g. "Generate EOI", "Link yacht").
- _Effort:_ S
**B3. Copy-to-clipboard with smarter format**
Mooring numbers (`A1`), client phones, IBANs all benefit from "Copy" affordance. Today users select-and-copy from inline-editable fields which produces inconsistent whitespace.
- _Sketch:_ Add tiny "copy" icon-button next to `inline-phone-field.tsx`, mooring number display in `berth-detail-header.tsx`, and bank details in invoice detail. Use the standard `navigator.clipboard.writeText` with a 1s "Copied" tooltip.
- _Effort:_ S
### B · MEDIUM
**B4. Visual indicator for system-managed folders**
CLAUDE.md says folder-tree-sidebar shows lock markers on system folders. Add the same visual rule to `move-to-folder-dialog.tsx` — today the dialog lets you select system folders (and gets rejected later by `assertNotSystemManaged`).
- _Effort:_ S
**B5. "Recently viewed" rail in command-search**
`recently-viewed.service.ts` exists; cmd-K opens to all-purpose search. Show last-5-viewed entities at top of palette when no query typed.
- _Effort:_ S
**B6. Inline phone-to-call / phone-to-WhatsApp links**
`inline-phone-field.tsx` renders text. Wrap in `tel:` and append a WhatsApp icon linking to `https://wa.me/<E.164>`. For a port-side sales team WhatsApp is the primary channel.
- _Effort:_ S
- _Risk:_ phone numbers without a `+` country code break `wa.me`; only render the link when the number is E.164-valid.
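A sketch of the E.164 guard for B6. `buildPhoneLinks` is a hypothetical helper; the pattern accepts a leading `+`, a non-zero first digit and at most 15 digits in total. WhatsApp's click-to-chat links take the number in international format without the `+`, so the sketch strips it for the `wa.me` URL.

```typescript
// "+" followed by 2 to 15 digits, the first of which is non-zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns tel: and wa.me links, or null when the number is not E.164-valid.
function buildPhoneLinks(phone: string): { tel: string; whatsapp: string } | null {
  // Drop common visual separators before validating.
  const compact = phone.replace(/[\s().-]/g, "");
  if (!E164_PATTERN.test(compact)) return null;
  return {
    tel: `tel:${compact}`,
    whatsapp: `https://wa.me/${compact.slice(1)}`,
  };
}

console.log(buildPhoneLinks("+48 600 123 456"));
console.log(buildPhoneLinks("0600 123 456"));
```

Returning `null` lets `inline-phone-field.tsx` fall back to plain text when the stored value has no country code.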
**B7. Toast deduplication for realtime invalidation**
`realtime-toasts.tsx` (touched in current branch). Multi-edit sessions where one rep edits 8 fields generate 8 toasts on the watching rep's screen. Coalesce within 2s.
- _Effort:_ S
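One way to coalesce is to group events per entity whenever the gap since that entity's previous event is within the window. The sketch below is a batch version of that idea with hypothetical types (`ToastEvent`, `CoalescedToast`); a live implementation in `realtime-toasts.tsx` would more likely buffer with a timer, which is not shown here.

```typescript
interface ToastEvent {
  entityId: string;
  field: string;
  at: number; // epoch milliseconds
}

interface CoalescedToast {
  entityId: string;
  fields: string[];
  firstAt: number;
  lastAt: number;
}

// Groups events on the same entity while each arrives within `windowMs` of the previous one.
function coalesceToasts(events: ToastEvent[], windowMs = 2000): CoalescedToast[] {
  const sorted = [...events].sort((a, b) => a.at - b.at);
  const result: CoalescedToast[] = [];
  const openGroups = new Map<string, CoalescedToast>();
  for (const event of sorted) {
    const current = openGroups.get(event.entityId);
    if (current !== undefined && event.at - current.lastAt <= windowMs) {
      if (!current.fields.includes(event.field)) current.fields.push(event.field);
      current.lastAt = event.at;
    } else {
      const fresh: CoalescedToast = {
        entityId: event.entityId,
        fields: [event.field],
        firstAt: event.at,
        lastAt: event.at,
      };
      openGroups.set(event.entityId, fresh);
      result.push(fresh);
    }
  }
  return result;
}
```

Each returned group maps to one toast, for example "3 fields updated on this client", instead of one toast per field.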
**B8. Filter chip "save as view" shortcut**
`filter-chips.tsx` + `saved-views-dropdown.tsx` exist. Add a small "Save current filters as view" inline button when there's an unsaved filter delta.
- _Effort:_ S
### B · EXPLORE
**B9. Command-palette macros**
"send EOI to last-viewed client", "create reminder in 3 days for current client", etc. Recorded by holding a key while performing actions, then invokable via cmd-K → "Run macro".
- _Effort:_ L
- _Risk:_ niche; design-heavy for low payoff. Push to backlog.
**B10. Inline timezone awareness on dates**
`timezone-drift-banner.tsx` warns of drift. Extend: every `formatDate` in detail headers shows `Mon 14 May · 14:32 (your time) · 15:32 (client time)` on hover when client timezone is known.
- _Effort:_ S
**B11. "Pin" comment/note**
`notes.service.ts` is polymorphic; add a `pinned: boolean` column and surface pinned notes at the top of every tab.
- _Effort:_ S
---
## Section C — Genuine AI Integration Opportunities
Existing AI surfaces grounded in this repo: `admin/ai` and `admin/ocr` admin pages; `email-draft.service.ts` (compose suggestion via `/api/v1/ai/email-draft`); `interest-scoring.service.ts` (pure SQL — _not_ AI today, candidate for AI uplift); `berth-pdf-parser.ts` (AI is the 3rd parser tier); `expense-ocr.service.ts` + `receipt-scanner.ts` (OCR + structuring); `ai-budget.service.ts` (cost-budget gate). The OpenAI SDK is wired but optional. All proposals below assume model calls go through a service that respects `ai-budget` and an explicit per-port enable flag.
### C · HIGH-VALUE
**C1. Auto-summarize a client / interest on detail open**
When a rep opens a client/interest, summarize: "5 EOIs over 18 months, 2 archived, last touched 12 days ago by María, current stage is contract-out — last note suggests cash-flow concern; berth A4 is the primary." Plays directly into A3 (what-changed digest).
- _Sketch:_ New `/api/v1/ai/entity-summary` endpoint accepting `entityType + entityId`, gathering activity log + notes + linked entities (already available via `entity-activity.service.ts`), prompting GPT for a 3-sentence summary. Cache by `(entityId, last_activity_id)` in Redis. Surface as a collapsible card above `detail-header-strip.tsx`. Always show "View source" → activity feed; never hide raw data.
- _Effort:_ M
- _Risk:_ confabulation — model invents a number. Mitigate: structured prompt that returns JSON with `claims: [{text, sourceActivityIds: []}]`, render only claims with non-empty source IDs. Hard 200-token cap.
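The mitigation in the risk note can be enforced mechanically before anything is rendered. The sketch below parses the model's JSON and keeps only claims that cite at least one source. Checking cited IDs against the activity IDs that were actually sent to the model is an extra assumption of this sketch, and `renderableClaims` is a hypothetical name.

```typescript
interface SummaryClaim {
  text: string;
  sourceActivityIds: string[];
}

// Parses `{ claims: [...] }` from raw model output and drops unsupported claims.
function renderableClaims(rawModelOutput: string, knownActivityIds: ReadonlySet<string>): SummaryClaim[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawModelOutput);
  } catch {
    return []; // Malformed output renders nothing rather than something unverified.
  }
  if (typeof parsed !== "object" || parsed === null) return [];
  const claims = (parsed as { claims?: unknown }).claims;
  if (!Array.isArray(claims)) return [];

  const kept: SummaryClaim[] = [];
  for (const claim of claims) {
    if (typeof claim !== "object" || claim === null) continue;
    const { text, sourceActivityIds } = claim as { text?: unknown; sourceActivityIds?: unknown };
    if (typeof text !== "string" || !Array.isArray(sourceActivityIds)) continue;
    // Keep only IDs that exist in the activity set the prompt was built from.
    const verified = sourceActivityIds.filter(
      (id): id is string => typeof id === "string" && knownActivityIds.has(id),
    );
    if (verified.length > 0) kept.push({ text, sourceActivityIds: verified });
  }
  return kept;
}
```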
**C2. Semantic search across notes, email bodies & document content**
`search-nav-catalog.ts` is keyword-based. Reps searching "the client who was worried about wave exposure" can't find anything. The biggest practical AI win in a CRM.
- _Sketch:_ Add an `embeddings` table (pgvector, a Postgres extension that must be enabled on the database). Embed `notes.body`, `email_messages.text`, signed-document OCR text, on insert via a new BullMQ `embeddings` worker (sibling to `workers/ai.ts`). Add `/api/v1/search/semantic` returning ranked entityIds. Toggle in cmd-K palette between "Exact match" and "Semantic". Cite source row per hit.
- _Effort:_ L
- _Risk:_ PII flowing to OpenAI embeddings. Use a local embedding model (gte-small via fastembed/onnx) per `lib/ai` design — never ship raw notes to OpenAI for embedding. Document this clearly in CLAUDE.md.
**C3. Interest scoring uplift — hybrid SQL + lightweight learned model**
`interest-scoring.service.ts` is pure rule-based (pipelineAge, stageSpeed, etc.). It works but reps disagree on signal weights. Train a per-port logistic regression on historical `outcome` (won/lost) using current factors + a few new ones (days since last note, last email response time, deposit pattern). Output a calibrated probability.
- _Sketch:_ New nightly job `train-interest-model` in `workers/ai.ts` using a tiny library (no GPT — pure numerical). Persist coefficients in `system_settings.interest_model`. Service applies them at scoring time. Expose model AUC on `admin/ai`.
- _Effort:_ L
- _Risk:_ per-port data thin (cold start). Default to SQL weights until ≥30 closed interests exist. Document drift detection — refuse to serve a model with AUC ≤ 0.6.
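The serve-or-fall-back rule from the risk note, together with the logistic scoring step, might be sketched as follows. The `TrainedModel` shape and all function names are hypothetical; the thresholds (at least 30 closed interests, AUC above 0.6) are taken from the text, and how the SQL score is scaled relative to a probability is left open.

```typescript
interface TrainedModel {
  coefficients: Record<string, number>;
  intercept: number;
  auc: number;
  closedInterestCount: number;
}

// Cold-start and drift guard: serve the model only with enough data and AUC above 0.6.
function isModelServable(model: TrainedModel | null): model is TrainedModel {
  return model !== null && model.closedInterestCount >= 30 && model.auc > 0.6;
}

// Standard logistic regression: sigmoid of intercept + sum(weight * factor).
function predictWinProbability(model: TrainedModel, factors: Record<string, number>): number {
  let z = model.intercept;
  for (const [name, weight] of Object.entries(model.coefficients)) {
    z += weight * (factors[name] ?? 0); // Missing factors contribute nothing.
  }
  return 1 / (1 + Math.exp(-z));
}

// Returns the learned probability when allowed, otherwise the existing SQL score.
function scoreInterest(
  model: TrainedModel | null,
  factors: Record<string, number>,
  sqlScore: number,
): { source: "model" | "sql"; value: number } {
  if (isModelServable(model)) {
    return { source: "model", value: predictWinProbability(model, factors) };
  }
  return { source: "sql", value: sqlScore };
}
```

Tagging the result with its `source` would let `admin/ai` show which ports are still on the SQL weights.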
**C4. Smart reminder suggestions from email content**
Inbox (`email-threads-list.tsx`) already exists. When a client email contains "Let's chat next Tuesday" or "I'll get back to you in two weeks", surface a one-click "Create reminder for 21 May".
- _Sketch:_ On new `email_messages` insert, the existing worker calls a new `extractActionableDates(body)` GPT prompt returning JSON `{candidates: [{date, summary, confidence}]}`. Surface as a banner in `email-threads-list.tsx` and in the matching interest's reminder rail. **Never auto-create** — always suggest.
- _Effort:_ M
- _Risk:_ dates in client signatures / disclaimers ("This email was generated on …") fool the model. Filter low-confidence; cap one suggestion per message.
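The filtering described in the risk note could be applied to the extractor's JSON before anything is shown. In the sketch below, `pickReminderSuggestion` is a hypothetical helper and the `0.8` confidence floor is an assumed value, not one taken from the codebase.

```typescript
interface DateCandidate {
  date: string; // ISO date, e.g. "2026-05-21"
  summary: string;
  confidence: number; // 0..1 as reported by the extractor
}

// Returns at most one suggestion per message: the most confident candidate above the floor.
function pickReminderSuggestion(candidates: DateCandidate[], minConfidence = 0.8): DateCandidate | null {
  let best: DateCandidate | null = null;
  for (const candidate of candidates) {
    if (candidate.confidence < minConfidence) continue;
    if (best === null || candidate.confidence > best.confidence) best = candidate;
  }
  return best;
}

console.log(
  pickReminderSuggestion([
    { date: "2026-05-12", summary: "email generated on", confidence: 0.35 },
    { date: "2026-05-21", summary: "chat next Tuesday", confidence: 0.92 },
  ]),
);
```

The result is only ever a suggestion for the banner; creating the reminder stays a rep action.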
### C · MEDIUM
**C5. "Why this berth?" + "Why not?" explanation for the recommender**
`berth-recommender.service.ts` outputs a tier (A/B/C/D) + heat score. Reps can't always articulate to the client why a specific berth made the shortlist.
- _Sketch:_ Add an LLM rephrasing step over the structured tier-reasoning JSON (already produced by the service). Returns plain-English: "Tier A: matches your yacht's 22m LOA + 5m beam, on the protected pontoon, currently available, no historical pushback." Render inside `berth-recommender-panel.tsx`. Source data is fully structured → low confabulation risk.
- _Effort:_ S
- _Risk:_ explanation must never contradict the structured tier. Add an automated unit assertion that the explanation contains the tier label and the dimensions field.
**C6. Auto-draft post-meeting note from a voice memo**
Reps walk back from a viewing with a 90s phone recording. Today they re-type. Drop the audio into the client's notes tab, Whisper transcribes + GPT summarizes into note-friendly bullet points.
- _Sketch:_ Add `audio-note-upload` action to `notes-list.tsx`. Worker pipeline: upload via storage backend → Whisper → GPT bullets → insert as a draft note flagged `ai_generated=true`. Rep reviews + saves.
- _Effort:_ M
- _Risk:_ Whisper accent accuracy on Polish / Italian names. Always preserve the raw audio + transcript alongside the bullets; never delete the source.
**C7. Translation for portal/client comms**
Polish reps writing English. English reps writing Polish. Currently they paste into Google Translate.
- _Sketch:_ Add a translate-icon button to `compose-dialog.tsx` and `notes-list.tsx`. One-click translates a draft into the client's preferred language (already tracked on `clients.preferredLanguage`). Show both versions side-by-side before send.
- _Effort:_ S
- _Risk:_ never auto-translate without rep confirmation, especially for any contractual phrasing.
**C8. Document-template merge-field auto-population from client context**
`merge-fields.ts` catalog + `eoi-context.ts` already do structured population. Where merge fields lack a structured source (admin templates with `{{custom_intro}}` blanks), an LLM could draft from notes + client profile. Rep then reviews.
- _Sketch:_ New "Suggest draft" button on each blank merge field at template-fill time. Returns 2–3 phrasings; rep picks one.
- _Effort:_ M
- _Risk:_ see "what NOT to AI-ify" below — this is borderline. Allowable only for non-legal merge fields (greeting, intro paragraph), explicitly blocked for legal/financial blanks.
**C9. Photo categorisation for berth/yacht uploads**
Berth PDFs are parsed; raw photos uploaded to yacht/berth detail aren't tagged. AI auto-tagging would speed search for "yachts with a bowsprit" or "berths with a fixed davit".
- _Sketch:_ On image upload via `image-cropper-dialog.tsx`'s completion, queue a vision job that returns 3–5 tags (drawn from a controlled vocabulary). Store as photo metadata. Search filters use vocabulary terms.
- _Effort:_ M
- _Risk:_ vision-model bias / hallucinated features. Constrain output to a port-defined vocabulary list; reject anything outside it.
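Constraining the output to the port's vocabulary can happen after the vision call, independent of the prompt. `constrainTags` is a hypothetical helper; the cap of five tags follows the sketch above, and matching is done on a trimmed, lowercased form.

```typescript
// Keeps only tags present in the vocabulary, de-duplicated, capped at `maxTags`.
// Returned values use the vocabulary's own spelling, not the model's.
function constrainTags(modelTags: string[], vocabulary: readonly string[], maxTags = 5): string[] {
  const canonical = new Map<string, string>();
  for (const term of vocabulary) {
    canonical.set(term.trim().toLowerCase(), term);
  }
  const accepted: string[] = [];
  for (const tag of modelTags) {
    const match = canonical.get(tag.trim().toLowerCase());
    if (match === undefined || accepted.includes(match)) continue; // Reject out-of-vocabulary or repeated tags.
    accepted.push(match);
    if (accepted.length >= maxTags) break;
  }
  return accepted;
}

console.log(constrainTags(["Bowsprit", "fixed davit", "unicorn"], ["bowsprit", "fixed davit", "teak deck"]));
```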
### C · EXPLORE
**C10. Conflict / clause-mismatch detection across templates and signed copies**
When admins edit a template, did the new clause contradict something they wrote in another template? When a counterparty returns a "with edits" PDF (currently uploaded via `external-eoi-upload-dialog.tsx`), did they alter a non-trivial clause?
- _Sketch:_ Embed each clause; on template save, surface "this clause is 0.92 similar to but materially differs from a clause in Template X". On external-EOI upload, diff against the canonical template's text and flag deltas in a yellow strip with "Reviewed by [rep]" before the rep can finalize.
- _Effort:_ L
- _Risk:_ false confidence — see "what NOT to AI-ify". Acceptable only as an _assistive flag_, never as a green-light. UI copy must say "Possible material difference detected — review required" not "No material difference".
**C11. Expense anomaly detection beyond `expense-dedup.service.ts`**
`expense-dedup.service.ts` handles exact duplicates. Layered AI: detect amounts outside the rolling p95 for the same vendor, or trip-labels that look mismatched against expense date.
- _Sketch:_ Nightly job computes per-vendor p95 and flags outliers as `expense_anomaly` reminders for the admin.
- _Effort:_ M
- _Risk:_ low — it's a soft flag, not an auto-action. No money movement is gated.
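The per-vendor check in C11 needs no model call at all. The sketch below uses the nearest-rank definition of the 95th percentile and skips vendors with too little history; `flagVendorOutliers`, the `ExpenseRow` shape and the `minSamples` default of 10 are assumptions made for illustration.

```typescript
interface ExpenseRow {
  id: string;
  vendor: string;
  amount: number;
}

// Nearest-rank percentile: the value at rank ceil(p/100 * n) in the sorted sample.
function percentileNearestRank(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of empty sample");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] as number;
}

// Flags incoming expenses whose amount exceeds the vendor's historical p95.
function flagVendorOutliers(pastRows: ExpenseRow[], incoming: ExpenseRow[], minSamples = 10): ExpenseRow[] {
  const amountsByVendor = new Map<string, number[]>();
  for (const row of pastRows) {
    const amounts = amountsByVendor.get(row.vendor) ?? [];
    amounts.push(row.amount);
    amountsByVendor.set(row.vendor, amounts);
  }
  return incoming.filter((row) => {
    const amounts = amountsByVendor.get(row.vendor);
    // Too little history means no baseline, so nothing is flagged.
    if (amounts === undefined || amounts.length < minSamples) return false;
    return row.amount > percentileNearestRank(amounts, 95);
  });
}
```

The nightly job would pass the rolling window as `pastRows` and turn each flagged row into an `expense_anomaly` reminder.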
**C12. Smart vocabulary maintenance**
`vocabularies` table holds lead-sources etc. Over time, reps spawn synonyms ("Inst.", "Instagram", "IG"). Cluster + suggest merges to the admin.
- _Effort:_ S–M
---
## Section C+ — What NOT to AI-ify (critical pass)
These places either carry liability if the model confabulates, or have a tighter ground-truth than AI can match. **Refuse the AI proposal even if it sounds appealing.**
- **Legal text in EOIs, contracts, reservation agreements.** `eoi-context.ts`, `document-templates.service.ts`, `reservation-agreement-context.ts`. The merge-field allow-list (`VALID_MERGE_TOKENS` in `merge-fields.ts`) exists _precisely_ to keep AI out of legal copy. Never AI-generate a clause; never AI-paraphrase a clause "for readability"; never AI-translate a clause and present the translation as binding. Keep all legal text rep-authored or counsel-authored, period.
- **Money flow.** Invoice amounts, deposit allocation, currency conversions, FX rate selection (`currency.ts`, `invoices.ts`). The audit-26 multi-currency audit is in flight precisely because money math has to be deterministic and reconcilable. AI here = unrecoverable customer trust damage on a single mistake.
- **Regulatory / GDPR responses.** `gdpr-export.service.ts`, `gdpr-bundle-builder.ts`. Subject-access requests must return _exactly_ what's in the database, with no LLM summarization layer that could omit a record.
- **Signing decisions.** The Documenso webhook (`handleDocumentCompleted` idempotency, audit-tier 1) is the source of truth that a contract was signed. AI must never infer signing state from email content. If the contract isn't in the webhook stream as `DOCUMENT_COMPLETED`, it isn't signed.
- **Berth assignment auto-commit.** `berth-recommender.service.ts` is intentionally pure SQL; the rules engine is intentionally `suggest` by default. Don't change that — auto-binding a berth to a client based on an LLM "judgment" is exactly the kind of mistake that ends in a refund and an apology. Recommend, never auto-assign.
- **Mooring-number / dimensions parsing.** The 3-tier PDF parser (AcroForm → OCR → AI) escalates to AI only when OCR confidence is low _and_ a rep clicks "AI parse" _and_ a mooring-mismatch confirmation is required at apply time (`berth-pdf.service.ts`). Don't lower any of those guards.
- **Pipeline outcome ("won" / "lost").** This drives revenue reporting (`reports.service.ts`). Setting an outcome must remain a human decision. AI may suggest "this looks won based on the signed contract", but the human clicks the button.
- **Email send-side text in template-driven send-outs.** `document-sends.service.ts` rate-limits and audits. AI-generated wording is fine for free-form composes (`compose-dialog.tsx`) where the rep reviews. AI-generated wording is _not_ fine on bulk template sends where one bad phrasing reaches 50 clients before anyone notices.
- **Audit log entries.** Audit logs (`audit.service.ts`) must remain raw structured events. Never let AI rewrite or compress them.
- **Permission overrides.** `user_permission_overrides` (new in this branch). AI must never suggest or auto-apply grant/revoke — that's a security primitive.
---
## Implementation sequencing recommendation
If the team wants a 2-sprint shipping bundle aligned with the existing branch's themes:
1. **Sprint 1 (UX, low risk):** A1, A4, A5, A6, A11, B1, B2, B3, B5 — everything tagged S or low-M, no new infra.
2. **Sprint 2 (AI runway):** Build the `lib/ai` skeleton (budget gate is in place; need a local-embedding pipeline + a worker) → land C1 (entity summary) and C5 (recommender explanation), both low-risk because they wrap structured data. Defer C2 (semantic search) until the embedding worker is proven.
3. **Backlog:** A2 (smart undo — needs rules-engine reverse design), A7 (drag-to-stage on board), C3 (learned scoring — needs sufficient closed-deal volume per port), C10 (clause conflict — handle with extreme care).
Every C-section proposal should ship behind a per-port admin toggle (`system_settings.ai_features.<name>`) and respect `ai-budget.service.ts`. Every AI surface must cite its source rows or be flagged as "AI assistance".
— End of report —
---
## 22. Date/time + DST + scheduled jobs audit (datetime-auditor)
# Date/time + DST + scheduled jobs audit — 2026-05-12
Scope: BullMQ cron schedules, reminder dueAt round-trip, TZ drift banner,
server-side date formatting, ISO-8601, jobs that fire around midnight in
user TZ vs server UTC, DST transitions, leap years, end-of-month.
## CRITICAL
### C1 — Reminder `dueAt` round-trip shifts by user-TZ offset on every edit
`src/components/reminders/reminder-form.tsx:86,99,119`
```ts
setDueAt(reminder.dueAt.slice(0, 16)); // line 86 — load
tomorrow.toISOString().slice(0, 16); // line 99 — default
new Date(dueAt).toISOString(); // line 119 — submit
```
`reminder.dueAt` is an ISO-8601 UTC string (`...Z`). `.slice(0, 16)` keeps
the first 16 characters, dropping the seconds, milliseconds, and `Z`, which
yields `2026-05-15T13:30`. That value is fed into a
`<input type="datetime-local">`, which interprets it as **local time**. On
submit, `new Date('2026-05-15T13:30')` parses as local time and
`.toISOString()` converts back to UTC, **subtracting the user's UTC
offset**. So in Warsaw (CEST, UTC+2) every save of an existing reminder
shifts the time backward by 2 h. Open and save again, and it shifts another
2 h. End result: a reminder created at "10:00 local" (08:00 UTC) is stored
as 06:00 UTC after the first save and 04:00 UTC after the second, and it
eventually wraps past midnight onto the previous day.
The "default tomorrow 9 AM" path has the same bug:
`tomorrow.setHours(9,0,0,0)` gives 09:00 _local_, then
`.toISOString().slice(0,16)` converts to UTC and drops the `Z`, so the input
shows 07:00 (UTC) to the user, who reads it as 07:00 local. On submit it
stores 05:00 UTC.
The contact-log dialog at `src/components/interests/interest-contact-log-tab.tsx:459-469`
already implements the correct pattern (`localIsoString` building the local
HH:MM from `getHours()`/`getMinutes()`). Port it to `reminder-form.tsx` and
`snooze-dialog.tsx`. Same applies to any other future `datetime-local`
binding.
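
For illustration, a round-trip-safe pair of helpers might look like the sketch below. This is not the project's `localIsoString`; the names `toDatetimeLocalValue` and `fromDatetimeLocalValue` are assumptions, and the code only shows the pattern of building the picker value from local getters.

```ts
// Formats an instant as the `YYYY-MM-DDTHH:mm` string that
// <input type="datetime-local"> expects, in the runtime's local zone.
function toDatetimeLocalValue(iso: string): string {
  const d = new Date(iso);
  const pad = (n: number): string => String(n).padStart(2, '0');
  return (
    `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}` +
    `T${pad(d.getHours())}:${pad(d.getMinutes())}`
  );
}

// Inverse direction: the picker value is local wall-clock time, so
// `new Date(value)` parses it as local and `.toISOString()` yields the
// UTC instant to send to the API.
function fromDatetimeLocalValue(value: string): string {
  return new Date(value).toISOString();
}
```

With this pair, loading and saving a reminder without touching the picker leaves `dueAt` unchanged, whatever the user's offset.
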
### C2 — BullMQ recurring jobs run in UTC, not in port-local time
`src/lib/queue/scheduler.ts:66-72`
```ts
await queue.upsertJobScheduler(
job.name,
{ pattern: job.pattern }, // no `tz` option
{ data: {}, name: job.name },
);
```
BullMQ's `RepeatOptions` defaults `tz` to UTC when unset. Concrete
fallout for the Warsaw port (CET/CEST, UTC+1/+2):
| Pattern | Intent | Actual fire (CET / CEST) |
| -------------------------------------------- | ------------- | ------------------------------- |
| `0 8 * * *` (invoice-overdue, tenure-expiry) | "8 AM local" | 09:00 winter / 10:00 summer |
| `0 2 * * *` (database-backup) | "2 AM local" | 03:00 winter / 04:00 summer |
| `0 4 * * *` (session-cleanup, gdpr cleanup) | "4 AM local" | 05:00 winter / 06:00 summer |
| `0 3 * * 0` (backup-cleanup) | "Sunday 3 AM" | Sun 04:00 winter / 05:00 summer |
Twice a year (last Sun of March, last Sun of October) the local firing
time visibly shifts by an hour and admin docs ("daily check at 8 AM")
silently break. Fix: pass `tz: process.env.SCHEDULER_TZ ?? 'Europe/Warsaw'`
(or read per-port — see also C3) to every `upsertJobScheduler`. The
hourly/sub-hourly patterns (`* * * * *`, `*/N * * * *`, `0 * * * *`) are
TZ-invariant and don't need a `tz`.
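
One way to apply the fix without touching the hourly jobs is a small helper that decides whether a pattern needs a `tz`. The sketch below is an assumption-laden illustration: `RepeatOpts` mirrors only the `pattern` and `tz` fields used in this finding, and `isTzInvariant` / `repeatOptsFor` are hypothetical names.

```ts
// Only the two repeat-option fields relevant to this finding.
interface RepeatOpts {
  pattern: string;
  tz?: string;
}

// A five-field pattern is treated as zone-independent when everything after
// the minute field is `*`, i.e. it fires every hour or more often.
function isTzInvariant(pattern: string): boolean {
  const fields = pattern.trim().split(/\s+/);
  return fields.length === 5 && fields.slice(1).every((f) => f === '*');
}

// Adds `tz` only where the wall-clock meaning of the pattern depends on it.
function repeatOptsFor(pattern: string, tz: string): RepeatOpts {
  return isTzInvariant(pattern) ? { pattern } : { pattern, tz };
}
```

At the call site quoted in this finding, the second argument to `upsertJobScheduler` would become `repeatOptsFor(job.pattern, tz)`, with `tz` resolved from the environment or the port.
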
### C3 — `report-scheduler` never advances `next_run_at`
`src/lib/queue/workers/reports.ts:22-50`, `src/lib/services/reports.service.ts`
The minutely scheduler selects `WHERE next_run_at <= now()`, enqueues a
`generate-report` job, and inserts a `generated_reports` row — but does
not bump `scheduled_reports.next_run_at`. There is no other write of
that column anywhere in the service layer or API. Effect: once a
scheduled report comes due, the worker re-queues it **every minute,
forever**, until a human zeros the row out. For weekly/monthly reports
this means an instant flood of duplicate emails to recipients.
After enqueueing, write a new `next_run_at` derived from the cron
expression (use `cron-parser` or equivalent; project already vendors
`croner`-style logic via BullMQ's repeat machinery). Wrap the SELECT +
UPDATE in a transaction with `FOR UPDATE SKIP LOCKED` so two scheduler
ticks racing on the same row can't double-fire.
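
The control flow of the fix can be sketched in memory as follows. This is a simplified model, not the worker's code: `ScheduledReport`, `runSchedulerTick`, and the injected `computeNext` (standing in for a cron-expression evaluator) are all hypothetical, and the real version would run the select and update inside one transaction with `FOR UPDATE SKIP LOCKED`.

```ts
// Minimal stand-in for a `scheduled_reports` row.
interface ScheduledReport {
  id: string;
  nextRunAt: Date;
}

// One scheduler tick: enqueue every due report, then move its nextRunAt
// forward so the following tick no longer selects it.
function runSchedulerTick(
  rows: ScheduledReport[],
  now: Date,
  computeNext: (after: Date) => Date,
  enqueue: (reportId: string) => void,
): number {
  let enqueued = 0;
  for (const row of rows) {
    if (row.nextRunAt.getTime() > now.getTime()) continue; // not due yet
    enqueue(row.id);
    // Advance from `now` rather than from the stale nextRunAt, so a
    // scheduler that was down for a week does not replay every missed run.
    row.nextRunAt = computeNext(now);
    enqueued += 1;
  }
  return enqueued;
}
```

Advancing from `now` is a design choice of this sketch; a team that wants missed runs replayed would advance from the previous `nextRunAt` instead.
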
## HIGH
### H1 — `detectOverdue` compares against UTC "today"
`src/lib/services/invoices.ts:763`
```ts
const today = new Date().toISOString().split('T')[0]!;
// ... lt(invoices.dueDate, today)
```
`invoices.due_date` is a `DATE`. Building "today" from `toISOString()`
returns the UTC calendar date. The cron fires at 08:00 UTC (= 09:00 / 10:00
local) so today-in-UTC and today-in-Warsaw agree at that moment, but if a
human ever calls `detectOverdue` between 00:00 and 02:00 local (still
yesterday in UTC), the comparison date lags by a day: invoices that became
overdue at local midnight are not flagged until the UTC date catches up.
For a port west of UTC the error reverses and invoices due "today" get
flagged overdue a day early.
Compute the comparison date in port-local time (Intl + `formatToParts`).
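
A port-local "today" can be derived with `Intl.DateTimeFormat` alone. The helper below is a sketch; `localDateString` is a hypothetical name and the zone would come from wherever the port's timezone is stored.

```ts
// Returns the calendar date (`YYYY-MM-DD`) that `instant` falls on in `timeZone`.
function localDateString(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? '';
  return `${get('year')}-${get('month')}-${get('day')}`;
}
```

The result is directly comparable with a `DATE` column serialized as `YYYY-MM-DD`, so it could replace the `toISOString().split('T')[0]` expression in the overdue check.
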
### H2 — Server-side PDF/email date formatting has no `timeZone`
`src/lib/pdf/templates/reports/*.ts`, `src/lib/pdf/templates/*.ts`,
`src/lib/email/templates/document-signing.ts:141`
Many calls of the form `new Date().toLocaleString('en-GB')` or
`new Date(...).toLocaleDateString('en-GB')` with no `{ timeZone }`
option. On a UTC-deployed Docker container the output is UTC even when
the PDF context is per-port-local. "Generated: 11/05/2026, 22:30:00" on
a report a Warsaw rep opens at 00:30 the next morning is confusing.
Pass `{ timeZone: portTimezone }` (resolve from `ports.timezone` or
`port_settings`) into every server-side formatter.
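
As a small illustration of the difference, the wrapper below (hypothetical name `formatGeneratedAt`) pins the zone explicitly instead of inheriting the container's.

```ts
// Formats a timestamp for a PDF or email footer in the port's own zone.
// `portTimezone` is an IANA name such as 'Europe/Warsaw'.
function formatGeneratedAt(instant: Date, portTimezone: string): string {
  return instant.toLocaleString('en-GB', { timeZone: portTimezone });
}
```

For the instant in the example above (22:30 UTC on 11 May), the Warsaw rendering falls on 12 May at 00:30, matching what the rep sees on the wall clock.
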
### H3 — Notification-digest TZ gate skips a day on DST spring-forward
`src/lib/services/notification-digest.service.ts:79-83`
The local-hour gate works correctly in steady state, but on the
spring-forward boundary (e.g. Warsaw 31 Mar, 02:00 → 03:00 CEST), if the
configured digest time is `02:00` it is _skipped entirely_ — local hour
goes from 01 to 03. Conversely on fall-back (CEST → CET) at 03:00 → 02:00
a `02:00` digest fires twice in the same calendar day. Document the gap
or, better, gate on `(port_id, local-date)` last-sent rather than the
hour alone.
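
A last-sent gate along those lines could look like the sketch below. The names `localClock`, `shouldSendDigest`, and `lastSentLocalDate` are assumptions; the per-port marker would have to be persisted after each send.

```ts
interface LocalClock {
  date: string; // YYYY-MM-DD in the port's zone
  hour: number; // 0-23 in the port's zone
}

// Reads the local calendar date and hour for an instant.
// en-GB is a 24-hour locale, so the hour part is already 00-23.
function localClock(instant: Date, timeZone: string): LocalClock {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? '';
  return {
    date: `${get('year')}-${get('month')}-${get('day')}`,
    hour: Number(get('hour')),
  };
}

// Send once per local calendar day, at or after the configured hour.
function shouldSendDigest(
  now: Date,
  timeZone: string,
  configuredHour: number,
  lastSentLocalDate: string | null,
): boolean {
  const clock = localClock(now, timeZone);
  return clock.hour >= configuredHour && clock.date !== lastSentLocalDate;
}
```

Comparing with `>=` instead of equality lets a `02:00` digest go out at 03:00 on the spring-forward day, while the date marker suppresses the repeated 02:00 hour on fall-back.
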
### H4 — Reminders fire/list use `new Date()` against UTC-stored timestamps but UI shows port-local
`src/lib/services/reminders.service.ts:87, 105, 515`
`lte(reminders.dueAt, new Date())` is correct (`dueAt` is `timestamptz`),
but `processOverdueReminders` runs every 15 minutes and emails users
the second the UTC instant matches. If a rep sets a reminder for "Friday
17:00" in Warsaw, the email lands 17:00 CEST → fine. But the email
template (`notifications` insert) renders the _server_ time — same H2
issue. Verify the user-facing email body renders `dueAt` in the recipient's
preferred timezone (`userProfile.preferences.timezone`), not server UTC.
## MEDIUM
### M1 — TZ-drift banner endpoint asymmetry
`src/components/dashboard/timezone-drift-banner.tsx:62-75`
Reads from `GET /api/v1/me` (returns `profile.preferences.timezone`),
writes to `PATCH /api/v1/users/me/preferences` (a _different_ endpoint over
the same preferences JSONB). Both endpoints exist and both ultimately update
`user_profiles.preferences`, so functionally fine — but having two
endpoints write the same blob with different validators (`/me`
allow-lists `{dark_mode, locale, timezone, tablePreferences}`,
`/users/me/preferences` uses `updateUserPreferencesSchema`) means a key
accepted on one endpoint may be silently dropped on the other. Either
merge into a single endpoint or document which is canonical.
### M2 — Alpine small-ICU risk for per-port `Intl.DateTimeFormat({ timeZone })`
`notification-digest.service.ts` `localHourFor` and any future per-TZ
formatter need full-ICU. If the Docker base is Alpine without `full-icu`,
named zones silently fall back to UTC and the catch swallows it. Add a
startup self-test confirming `Intl.DateTimeFormat('en',{ timeZone: 'Europe/Warsaw'}).format(new Date())` differs from UTC.
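
Such a self-test might be as small as the sketch below; `assertNamedTimeZonesWork` is a hypothetical name and where it is called at boot is left open.

```ts
// Throws if this runtime does not honour named IANA zones.
function assertNamedTimeZonesWork(): void {
  // Fixed summer instant: Warsaw is on UTC+2, so its local hour must
  // differ from the UTC hour.
  const probe = new Date('2026-07-01T12:00:00Z');
  const hourIn = (timeZone: string): string =>
    new Intl.DateTimeFormat('en-GB', { timeZone, hour: '2-digit' }).format(probe);
  if (hourIn('Europe/Warsaw') === hourIn('UTC')) {
    throw new Error('Intl ignores named time zones; check the ICU data in this image');
  }
}
```

Failing loudly at startup is preferable here because the digest's own catch block would otherwise hide the problem.
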
### M3 — Contact-log `followUpAt` validator is looser than reminders
`src/lib/validators/interest-contact-log.ts:14,23`
`z.coerce.date()` accepts unzoned strings. Tighten to `z.string().datetime()`
to match the direct reminders endpoint.
### M4 — BR-060 follow-up uses raw ms-arithmetic for "days since"
`src/lib/services/reminders.service.ts:438`
`(now - lastActivity) / 86_400_000` under/over-counts by 1 h across DST
boundaries. Cosmetic for 14-day windows; document the rounding bias.
### M5 — Greeting hourly tick uses `setInterval(3_600_000)`
`src/components/dashboard/dashboard-shell.tsx:113` — drifts across DST.
Use a recursive `setTimeout` keyed to next local hour boundary.
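
A sketch of that pattern, with hypothetical names `msUntilNextLocalHour` and `scheduleHourlyTick`:

```ts
// Milliseconds from `now` until the next local hour boundary.
function msUntilNextLocalHour(now: Date): number {
  const next = new Date(now.getTime());
  next.setHours(now.getHours() + 1, 0, 0, 0);
  return next.getTime() - now.getTime();
}

// Re-arms itself after every tick, so the delay is recomputed from the
// wall clock each time. Returns a cancel function for cleanup.
function scheduleHourlyTick(onTick: () => void): () => void {
  let timer: ReturnType<typeof setTimeout>;
  const arm = (): void => {
    timer = setTimeout(() => {
      onTick();
      arm();
    }, msUntilNextLocalHour(new Date()));
  };
  arm();
  return () => clearTimeout(timer);
}
```

In a React component the returned cancel function would be the effect's cleanup, so the timer does not outlive the dashboard shell.
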
## ISO-8601 conformance summary
- Reminder writes/emit: `z.string().datetime()` + `.toISOString()`
- Contact-log writes: `z.coerce.date()` — loose, see M3.
- `type="date"` fields serialize as `YYYY-MM-DD` matching DB `DATE`. ✓
- PDF/email render: mixed; H2 covers the missing `timeZone`.
## Round-trip recap (picker → DB → email)
1. `datetime-local` value is **local time**, no TZ marker.
2. `new Date(v).toISOString()` → UTC Z form to API.
3. DB `timestamptz` stores the instant.
4. Re-render to picker via `localIsoString(iso)` (build local YMD/HM
from `getHours()` etc.) — **never** `iso.slice(0,16)`.
5. Email/PDF render with `{ timeZone: portOrUserTz }`.
C1 is the only place this breaks today. Once fixed plus C2/C3, the
chain is consistent.
## Out of scope
- No `node-cron` / `croner` jobs outside BullMQ.
- No `Date.UTC` construction; everything via `new Date(...)` / `Date.now()`.
- No `Temporal` adoption; defer until Node 22 LTS unflags it.
---
## 24. File lifecycle + storage drift audit (file-lifecycle-auditor)
# Audit — File lifecycle + storage drift
Scope: orphan blobs, stale folder rows, avatar cleanup, EOI signed-PDF orphans, brochure / berth_pdf version retention, storage-swap migration completeness, demoteSystemFolderOnEntityDelete, file_id orphans after document delete, GDPR-export ZIP retention.
Branch: `feat/documents-folders` @ 660553c. Read-only.
---
## CRITICAL
### C1. Avatar replacement leaks `files` rows + S3 blobs forever
`src/app/api/v1/me/avatar/route.ts` POST uploads a NEW file via `uploadFile()` and overwrites `user_profiles.avatar_file_id` — but never reads or deletes the previous id. Every "Replace photo" leaks one DB row + one blob, untethered (no `client_id`/`yacht_id`/`company_id`), so invisible to every existing UI sweep.
```ts
// no read of old avatarFileId, no cleanup
await db
.update(userProfiles)
.set({ avatarFileId: record.id, updatedAt: new Date() })
.where(eq(userProfiles.userId, ctx.userId));
```
**Fix:** SELECT the prior `avatar_file_id`, call `deleteFile()` (already handles ref-check + blob + audit), wrapped in try/catch so a stale-blob failure doesn't block the new avatar.
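
The ordering of that fix can be sketched with injected dependencies. Everything in `AvatarDeps` is a hypothetical port onto the real database and file service; only the sequence (read old id, point the profile at the new file, then best-effort delete) is the point.

```ts
// Hypothetical ports onto the real db / file service.
interface AvatarDeps {
  getAvatarFileId(userId: string): Promise<string | null>;
  setAvatarFileId(userId: string, fileId: string): Promise<void>;
  deleteFile(fileId: string): Promise<void>;
  warn(message: string, error: unknown): void;
}

async function replaceAvatar(
  deps: AvatarDeps,
  userId: string,
  newFileId: string,
): Promise<void> {
  const previousId = await deps.getAvatarFileId(userId);
  await deps.setAvatarFileId(userId, newFileId);
  if (previousId === null || previousId === newFileId) return;
  try {
    // Best effort: the new avatar is already live at this point.
    await deps.deleteFile(previousId);
  } catch (error) {
    deps.warn(`avatar cleanup failed for file ${previousId}`, error);
  }
}
```

Deleting after the update means a cleanup failure can leave at most one stale file, which the warning makes visible, rather than blocking the user's upload.
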
---
## HIGH
### H1. `handleDocumentCompleted` put-before-insert leaks signed-PDF blobs on retry storms
`src/lib/services/documents.service.ts:1131-1188`. Sequence: `storage.put` → `db.insert(files)` → `db.update(documents).set(signedFileId)`. The idempotency gate at line 1110 stops a _second_ webhook from minting a _second_ blob — but only if `doc.status === 'completed'` AND `signedFileId` is set, which requires step 3 to have run. If step 2 OR step 3 throws on attempt N, the blob from step 1 survives with no DB pointer; Documenso retries; the gate doesn't trip (status still not 'completed'); step 1 runs again with a fresh UUID storage path. Each retry compounds an orphan.
**Fix:** either insert the `files` row in a `pending` state BEFORE `storage.put` (so failure rolls back via FK / explicit cleanup), or reuse a stable storage key derived from `documents.id` so retries overwrite the same blob.
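
The second option (a stable key) is easy to picture with an in-memory stand-in for the storage backend. The key layout and the `MemoryStorage` class below are hypothetical; the only property that matters is that the key is a pure function of the document id.

```ts
// Hypothetical key layout: deterministic per document, so every retry
// targets the same object instead of minting a new UUID path.
function signedPdfStorageKey(portId: string, documentId: string): string {
  return `${portId}/documents/${documentId}/signed.pdf`;
}

// In-memory stand-in for the storage backend, to show the overwrite behaviour.
class MemoryStorage {
  private readonly blobs = new Map<string, Uint8Array>();

  put(key: string, body: Uint8Array): void {
    this.blobs.set(key, body);
  }

  count(): number {
    return this.blobs.size;
  }
}
```

With such a key, a webhook that fails after `put` and is retried three times still leaves exactly one blob behind.
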
### H2. `deleteDocument` strands `fileId` + `signedFileId` rows + blobs
`src/lib/services/documents.service.ts:596-616` does `db.delete(documents)` only. Both file FKs are plain `references()` (no cascade, no SET NULL) — the document row vanishes but the `files` rows + blobs survive with no link back. For a `cancelled`/`expired` doc with `signedFileId` (the `sent`/`partially_signed` block at line 599 doesn't cover these), the signed contract PDF — containing PII — is permanently orphaned in storage.
**Fix:** in `deleteDocument`, also delete dependent `files` rows via `deleteFile()`, or refuse the delete if files attached (mirroring `deleteFile`'s ref-check).
### H3. Brochure versions: zero cleanup, ever
`src/lib/services/brochures.service.ts:191` `archiveBrochure` only flips `archivedAt` + clears `isDefault`. No version-row delete, no blob delete. No "delete prior version" admin endpoint, no retention cron, no rolling cap. CLAUDE.md says "Archived brochures retain version history" — that's by design, but there's also zero path to ever drop one. With ~10 MB PDFs iterated monthly, linear unbounded growth.
**Fix:** admin `deleteBrochureVersion(brochureId, versionId)` endpoint (blob delete via `getStorageBackend().delete()` + row delete in tx); refuse to delete the only remaining non-archived version. Optionally `brochure_version_retention_count` system setting.
### H4. `berth_pdf_versions` has no cleanup mechanism
Symmetric problem. `src/lib/services/berth-pdf.service.ts` inserts a fresh row + UUID-keyed blob per upload (line 213); old versions accumulate forever. `current_pdf_version_id` advances; history-by-design is unbounded-by-default. For a port with hundreds of berths reuploaded under parser iterations, this is the largest storage footprint in the system.
**Fix:** admin "Delete this version" action on the version-history list, gated so the `current_pdf_version_id` cannot be deleted. Storage delete + row delete in a tx.
---
## MEDIUM
### M1. `files.client_id` lacks an explicit `onDelete` — fragile
`src/lib/db/schema/documents.ts:30`: `clientId: text('client_id').references(() => clients.id)` (no `onDelete`). Migration 0000 records `ON DELETE no action`. The only existing client-delete path (`client-hard-delete.service.ts:193`) explicitly nullifies `files.client_id` first, so it works — but any future bulk-delete / port-teardown / dev script bypassing `hardDeleteClient` will FK-violate. Compare `files.yacht_id` + `files.company_id`, both `set null` (added in 0042).
**Fix:** new migration to `ON DELETE SET NULL` `files.client_id`. Removes the implicit invariant that hard-delete is the only legal path.
### M2. `demoteSystemFolderOnEntityDelete` is wired for clients only
One caller (`client-hard-delete.service.ts:236`). No hardDeleteYacht / hardDeleteCompany exists today, so not currently broken — but it's a landmine when those flows ship. Both must call `demoteSystemFolderOnEntityDelete(portId, 'yacht'|'company', id)`.
### M3. Hard-deleted-client files become un-swept root orphans
`client-hard-delete.service.ts:193` nullifies `files.clientId` and demotes the system folder to `"{name} (deleted)"`. The file rows now have `clientId=null` + `folder_id` pointing at the demoted folder — discoverable in the demoted folder but never automatically dropped. The HARD delete of the client doesn't actually hard-delete their files. Inconsistent with the "hard" naming AND with GDPR Article 17.
**Fix:** mid-transaction (before the nullify), capture the affected file IDs; post-transaction call `deleteFile()` on each (handles blob + audit). Alternatively: nightly worker that drops file rows where every entity FK is null + no doc/expense/maint reference + `created_at < N days`.
### M4. GDPR export cleanup retries forever on storage failure
`src/lib/queue/workers/maintenance.ts:97-108`. If `storage.delete(row.storageKey)` throws, the catch increments `failed` but does NOT delete the DB row. Next 4 AM run, same row reappears; same failure; same warn. No max-retry, no dead-letter, no admin escalation. A permanently broken storage path silently piles up infinite warns AND the GDPR-erasure obligation never completes.
**Fix:** track `delete_attempts` per row; after N failures either force-delete the DB row + log the orphan-blob to an admin-visible orphans table, or escalate at pino `error` + Sentry.
### M5. `migrate.ts` table list has no drift guard
`src/lib/storage/migrate.ts:52` explicitly admits: _"The `report_snapshots` table called out in the audit does not exist yet. Add it here when it lands."_ This is a manual checklist with no enforcement — any future table that adds a `storage_key`/`storage_path` and forgets to extend `TABLES_WITH_STORAGE_KEYS` will silently leave its blobs behind on every backend swap.
**Fix:** integration test that diffs `information_schema.columns WHERE column_name IN ('storage_key','storage_path')` against `TABLES_WITH_STORAGE_KEYS`. Failing test forces an update before the new table can ship.
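
The comparison at the heart of that test is a set difference. A sketch, assuming the `information_schema.columns` rows have already been fetched and using the hypothetical helper name `findUnregisteredStorageTables`:

```ts
// Shape of the rows returned by the information_schema query.
interface ColumnRow {
  table_name: string;
  column_name: string;
}

// Tables that own a storage column but are missing from the migrate list.
function findUnregisteredStorageTables(
  columns: ColumnRow[],
  registered: readonly string[],
): string[] {
  const known = new Set(registered);
  const missing = new Set<string>();
  for (const col of columns) {
    const isStorageColumn =
      col.column_name === 'storage_key' || col.column_name === 'storage_path';
    if (isStorageColumn && !known.has(col.table_name)) missing.add(col.table_name);
  }
  return Array.from(missing).sort();
}
```

The integration test would assert that this returns an empty array when called with `TABLES_WITH_STORAGE_KEYS`, so a new storage-backed table fails CI until it is registered.
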
### M6. `deleteFolderSoftRescue`: no per-row audit + opaque sibling-name collision
`src/lib/services/document-folders.service.ts:283-326`:
- Only the folder delete itself is audit-logged; the bulk re-parent of N documents + N files leaves no per-row trail. An auditor cannot reconstruct "which folder did this signed contract land in?"
- If a re-parented child folder's name collides with an existing sibling at the destination, the UPDATE throws on `uniq_document_folders_sibling_name` and the tx rolls back. Error propagates as a raw "duplicate key" — compare `moveFolder`, which catches via `isSiblingNameConflict` and returns a useful 409.
**Fix:** (a) emit one bulk audit row with `metadata: { docsMoved, filesMoved, rescuedTo }`; (b) wrap the UPDATE in the same conflict catch.
### M7. `listTree` silently drops orphan folder rows
`document-folders.service.ts:95` logs `"listTree: orphan folder row … dropped from tree"`. Defensive — but the orphans aren't auto-healed and aren't surfaced anywhere. Post-soft-rescue this shouldn't happen, but if it does (race, manual SQL, future bug), the row hides forever.
**Fix:** daily maintenance worker counts `documentFolders WHERE parent_id IS NOT NULL AND parent_id NOT IN (SELECT id FROM documentFolders)` and emits a metric / log.
---
## Summary
| Sev | Finding | File | Effort |
| ---- | ----------------------------------------------------- | -------------------------------------------- | ------------ |
| CRIT | C1 — Avatar replace leaks rows + blobs | `api/v1/me/avatar/route.ts` | XS |
| HIGH | H1 — completed-webhook put-before-insert orphan | `services/documents.service.ts:1131` | S |
| HIGH | H2 — `deleteDocument` strands signed PDF | `services/documents.service.ts:596` | S |
| HIGH | H3 — Brochure versions: no cleanup ever | `services/brochures.service.ts` | M |
| HIGH | H4 — Berth PDF versions: no cleanup ever | `services/berth-pdf.service.ts` | M |
| MED | M1 — `files.client_id` lacks `onDelete` | `schema/documents.ts:30` | XS migration |
| MED | M2 — `demoteSystemFolderOnEntityDelete` client-only | `services/document-folders.service.ts:733` | XS (future) |
| MED | M3 — Hard-delete client leaves orphan files | `services/client-hard-delete.service.ts:193` | S |
| MED | M4 — GDPR cleanup loops on storage failure | `queue/workers/maintenance.ts:97` | S |
| MED | M5 — Migrate table list has no drift guard | `lib/storage/migrate.ts:55` | S test |
| MED | M6 — Soft-rescue: no per-row audit + opaque collision | `services/document-folders.service.ts:283` | S |
| MED | M7 — Orphan folder rows logged, never healed | `services/document-folders.service.ts:95` | XS |
Biggest cumulative storage waste: H3 + H4 (uncapped version retention) and C1 (per-user avatar churn). Most dangerous correctness/GDPR findings: H1 (silent signed-PDF orphan under Documenso retry) and H2 (signed PII PDFs surviving document deletion).
---
## 28. Code quality + maintainability hotspots audit (maintainability-auditor)
# Audit — Code Quality & Maintainability Hotspots (task #28)
**Scope:** cyclomatic complexity hotspots, files >500 lines, services violating SRP,
monster components, cross-domain duplication, abandoned scaffolding. Read-only.
**Top-line numbers:** 9 source files >700 lines; 22 files >500 lines.
TODO/FIXME/HACK markers: only **3 files** (3 markers total) — drift is not the
problem here; sheer file size and per-entity duplication are.
---
## CRITICAL
### C1. `src/lib/services/documents.service.ts` — 1982 lines, 33 exports, 30 imports, ~7 distinct concerns
One file owns: document CRUD, hub listing, signing send-flow (`sendForSigning`,
~200 lines, 10+ branches), manual upload (`uploadSignedManually`), 6 Documenso
webhook handlers (`handleDocumentCompleted` 224 lines / 11 branches,
`handleRecipientSigned`, `…Expired`, `…Rejected`, `…Cancelled`, `…Opened`),
template-driven wizard (`createFromWizard`), and aggregated-by-entity projection
(`listInflightWorkflowsAggregatedByEntity` + `fetchWorkflowGroupRows`). Single
strongest SRP violation in the codebase. **Recommend split:**
`documents.service.ts` (CRUD+detail), `documents-signing.ts`
(send/cancel/manual-upload), `documents-webhook-handlers.ts` (the 6
handlers), `documents-aggregation.ts` (the hub projection). Webhook handlers
in particular are inbound-event logic, not service CRUD, and dynamic-import
circular deps with `interests.service.advanceStageIfBehind` cross the
boundary today.
### C2. `src/lib/services/search.service.ts` — 2163 lines, single file
26 exports, 14 per-entity `searchX` helpers (clients, residential clients,
yachts, companies, interests, residential interests, berths, invoices, expenses,
documents, files, reminders, brochures, tags, notes, otherPorts), plus
`expandGraph` (~420 lines, 14+ branches), `search` orchestrator, and
recent-search storage. Cohesive in purpose but no single dev can hold this
in head.
**Recommend:** `search/buckets/*.ts` (one per entity), `search/expand-graph.ts`,
`search/orchestrator.ts`. Touching one bucket today forces reading 2000+ lines
of unrelated context.
### C3. `src/lib/services/notes.service.ts` — 1121 lines, near-pure duplication
6 entity-type branches per operation (clients / interests / yachts / companies
/ residential_clients / residential_interests). The `create` function alone
(lines 689–846) is 158 lines of 6 copy-pasted insert-then-profile-lookup
blocks; same for `update` (847–1019) and `deleteNote` (1020+). A
`tableForEntity()` dispatcher is _defined_ at line 82 then immediately silenced
(`void tableForEntity;` line 98) — i.e. the abstraction was started, abandoned,
and the dead helper left in place. Aggregated listers
(`listForClient/Yacht/Company/ResidentialClientAggregated`) are 4
near-identical 100-line bodies.
**Recommend:** dispatch table `{ table, fk, link }` keyed by entityType +
single generic insert/update/delete; collapses ~600 lines.
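
A dispatch table of that shape might look like the sketch below. Every table name, column name, and route in `NOTE_TARGETS` is a placeholder, and `buildNoteInsert` is a hypothetical helper that only assembles the insert payload rather than touching the database.

```ts
type NoteEntityType =
  | 'clients'
  | 'interests'
  | 'yachts'
  | 'companies'
  | 'residential_clients'
  | 'residential_interests';

interface NoteTarget {
  table: string; // notes table for this entity type
  fk: string; // foreign-key column pointing at the parent entity
  link: (entityId: string) => string; // UI route for the parent entity
}

// Placeholder names throughout; the real schema decides these.
const NOTE_TARGETS: Record<NoteEntityType, NoteTarget> = {
  clients: { table: 'client_notes', fk: 'client_id', link: (id) => `/clients/${id}` },
  interests: { table: 'interest_notes', fk: 'interest_id', link: (id) => `/interests/${id}` },
  yachts: { table: 'yacht_notes', fk: 'yacht_id', link: (id) => `/yachts/${id}` },
  companies: { table: 'company_notes', fk: 'company_id', link: (id) => `/companies/${id}` },
  residential_clients: {
    table: 'residential_client_notes',
    fk: 'residential_client_id',
    link: (id) => `/residential/clients/${id}`,
  },
  residential_interests: {
    table: 'residential_interest_notes',
    fk: 'residential_interest_id',
    link: (id) => `/residential/interests/${id}`,
  },
};

interface NoteInsert {
  table: string;
  values: Record<string, string>;
}

// One generic builder replaces six copy-pasted branches.
function buildNoteInsert(
  entityType: NoteEntityType,
  entityId: string,
  body: string,
): NoteInsert {
  const target = NOTE_TARGETS[entityType];
  return { table: target.table, values: { [target.fk]: entityId, body } };
}
```

Typing the table as `Record<NoteEntityType, NoteTarget>` makes the compiler reject a new entity type until its row is added, which is the guard the copy-pasted branches lack.
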
### C4. `src/components/interests/interest-tabs.tsx` — 959 lines, single file
`OverviewTab` is 415 lines of inline JSX (456–870). Inline helpers
`MilestoneSection`, `MilestoneAdvanceButton`, `FutureMilestones`,
`EditableRow`, `InfoRow`, `useInterestPatch`, `useStageMutation`,
`humanizeStatus` all share this file. Single file owns the entire detail-page
overview, milestone widget, mutation hooks, and tab definition. **Recommend
split:** `interest-overview-tab.tsx`, `interest-milestones.tsx`,
`hooks/use-interest-patch.ts`.
---
## HIGH
### H1. Two near-named template services live side-by-side
`src/lib/services/document-templates.ts` (955 lines — CRM template flow:
listTemplates, generateAndSign, EOI generation) **and**
`src/lib/services/document-templates.service.ts` (262 lines — Admin TipTap
template flow with audit-log versioning). Both export `listTemplates`,
`getTemplateById`, `createTemplate`, `updateTemplate` against different
schemas. Different consumers import each by accident-prone path. Strongly
recommend renaming the admin one to `admin-document-templates.service.ts` (it
already prefixes its functions with `…AdminTemplate…`).
### H2. Per-entity component duplication is system-wide (4× scaffolding)
For each of clients / yachts / companies / interests there exist near-parallel:
`<entity>-list.tsx`, `<entity>-columns.tsx`, `<entity>-filters.tsx`,
`<entity>-form.tsx`, `<entity>-detail-header.tsx`, `<entity>-card.tsx`,
`<entity>-tabs.tsx`, `<entity>-files-tab.tsx`. **Confirmed near-identical
pairs:**
- `client-files-tab.tsx` vs `company-files-tab.tsx` — 88 lines each, only
difference is the entity-key parameter (`clientId` vs `companyId`) in 6
spots. **~95% byte-identical.** Should be `<EntityFilesTab entityType=…>`.
- `client-list.tsx` (350) / `yacht-list.tsx` (295) / `company-list.tsx` (308) /
`interest-list.tsx` (469): same imports, same TanStack-table wiring, same
bulk-action shape, parameterised only by columns + filters + form
components.
A generic `<EntityListShell columns={…} filters={…} form={…} />` would collapse
~1400 lines into ~400. Similarly forms: `interest-form` (756) + `company-form`
(706) share the same react-hook-form skeleton.
### H3. `src/lib/services/expense-pdf.service.ts` — 987 lines, SRP-spanning
Mixes: query/fetch (`fetchExpenseRows`, `resolveReceiptFiles`), grouping
(`groupRows`, `groupKey`, `computeTotals`), image processing
(`maybeResizeImage`, `streamToBuffer`), and PDFKit layout primitives
(`addHeader`, `addSummaryBox`, `addExpenseTable`, `addReceiptPages`,
`renderReceiptHeader`, `addReceiptErrorPage`, `addFooter`). 17 functions, 3
unrelated concerns. **Recommend:** `expense-pdf/data.ts`,
`expense-pdf/layout.ts`, `expense-pdf/index.ts`.
### H4. `src/components/search/command-search.tsx` — 1177 lines, 10 inline subcomponents
`CommandSearch` (268 lines) + `FilterChipRow`, `ChipButton`, `EmptyStateBeforeSearch`,
`ResultsRegion`, `ZeroState`, `QuickCreateButton`, `ResultRow`, `Badge`,
`SectionHeading`, `BucketSection`, plus `buildFlatRows` (327 lines, branch-heavy).
The inline subcomponents are reusable in principle but private to this file by
virtue of co-location. **Recommend:**
`search/internal/{filter-chips,result-row,bucket-section,build-flat-rows,empty-states}.tsx`.
`buildFlatRows` deserves its own file with its own test.
### H5. `src/lib/services/interests.service.ts` — 1273 lines, 17 exports
Owns 6 state-transition mutations (`changeInterestStage`, `advanceStageIfBehind`,
`setInterestOutcome`, `clearInterestOutcome`, `archiveInterest`,
`restoreInterest`), berth-linking (`linkBerth`/`unlinkBerth`), tag setter, board
projection (`listInterestsForBoard`, ~75 lines), list+detail. State-transition
logic could move to `interests-lifecycle.ts`; board projection to
`interests-board.ts`. Two interest CRUD helpers
(`getInterestById` 112 lines, `listInterests` 184 lines) both build elaborate
shaped reads — they're load-bearing but should probably both run through a
single projection helper.
---
## MEDIUM
### M1. Cyclomatic-density hotspots (informal — branch-count per body)
- `documents.service.handleDocumentCompleted` — 224 lines, 11 branches.
- `documents.service.sendForSigning` — 200 lines, 10 branches.
- `search.service.expandGraph` — 420 lines, 14+ branches across entity types.
- `documents.service.uploadSignedManually` — ~110 lines.
- `interests.service.changeInterestStage` — ~140 lines.
- `notes.service.create/update/deleteNote` — 6 inline entity branches each.
### M2. Abandoned scaffolding — `void <identifier>` silencing
The codebase has 7+ deliberate `void <symbol>` statements added to keep
imports/symbols around for future use:
- `src/lib/services/notes.service.ts:98`: `void tableForEntity;` (full helper
abandoned)
- `src/lib/services/alert-rules.ts:331` — `const _unused = { gt, desc,
alertsTable }; void _unused;` (3 stale imports)
- `src/app/api/v1/clients/bulk/route.ts:227-228` — `void HIGH_STAKES_STAGES;
void ({} as PipelineStage);`
- `src/app/api/v1/admin/email-templates/route.ts:91`: `void eq;`
- `src/app/api/v1/admin/website-submissions/route.ts:76`: `void lt;`
- `src/app/api/v1/interests/bulk/route.ts:134-135` — `void inArray; void
withPermission;`
Either the future-PR landed without removing the placeholder, or the abstraction
was never built. Each is a small lint-clean-up; collectively they signal
unfinished refactors. Decide per case: implement the dispatcher (notes), or
delete the dead imports.
### M3. Real TODO/FIXME — only 3 in the entire src tree
- `src/lib/queue/workers/import.ts:13` — `// TODO(L2): implement import job
handlers` (worker is a stub).
- `src/lib/queue/scheduler.ts:44` — `// TODO(L2): make per-user schedule
configurable`.
- `src/components/interests/interest-detail.tsx:26` — JSDoc remark, not a
todo.
The import worker stub is the only real loose end — confirm whether
import jobs are needed before shipping, otherwise delete the worker registration
to avoid an empty queue.
### M4. Cross-service implicit coupling via dynamic-import circles
`documents.service` imports `advanceStageIfBehind` from `interests.service`
statically; `interests.service` imports `evaluateRule` from
`berth-rules-engine`; `berth-rules-engine` calls services via `await
import(...)` to dodge the cycle. The dynamic-import workaround masks circular
ownership: the rules engine is effectively the orchestrator of state changes
across documents + interests + invoices. Worth either (a) hoisting the rules
engine to a top-level coordinator that the services don't import back, or (b)
documenting the cycle explicitly in CLAUDE.md so the next dev doesn't break
it.
### M5. Largest leaf components without inline subcomponents
- `interest-form.tsx` (756) and `company-form.tsx` (706) are single
components. Both define schema + form + nested pickers in one file. Could
benefit from `interest-form-fields/{dimensions,category,picker}.tsx`.
- `interests/linked-berths-list.tsx` (530) and `documents/documents-hub.tsx`
(537) sit just above the threshold; readable but on the edge.
### M6. Re-export shims (legacy import boundary)
`src/components/clients/pipeline-constants.ts` — "Re-export from the canonical
source so legacy imports keep working." Audit the consumer list and migrate
imports to the canonical path; remove the shim.
---
## Notes / non-issues
- TODO/FIXME hygiene is **excellent** (3 markers across 148k LOC).
- The 18 services with `audit.service.ts`-style pattern are short and
cohesive — no monster spread.
- Drizzle schema split (one file per domain in `src/lib/db/schema/`) is clean;
`relations.ts` (953 lines) is large but central by design.
- `dashboard-shell.tsx` (243 lines) is **not** a monster — single composition
surface, leaves widgets in their own files. Healthy pattern.
## Suggested order of operations
1. Rename `document-templates.service.ts` → `admin-document-templates.service.ts`
(H1; one-day safety win).
2. Build `<EntityFilesTab entityType="…">` and delete the two copies (H2; warm-up).
3. Replace notes.service entity-switch ladders with a dispatch table (C3).
4. Split `documents.service` along the natural seams: CRUD / signing / webhooks
/ aggregation (C1).
5. Split `search.service` into per-bucket files (C2).
6. Split `interest-tabs.tsx` and `command-search.tsx` (C4, H4).
7. Sweep `void <symbol>` placeholders (M2).
Total estimated reduction: ~3500 lines of code via deduplication + better split
points, no functional change.
---
## 23. Multi-port super-admin flow audit (multi-port-auditor)
# Audit: Multi-Port Super-Admin Flow (Task #23)
Scope: super-admin "otherPorts" search extension, port-switcher UX, cross-port report queries, every `isSuperAdmin` bypass path, accidental data bleed, `X-Port-Id` header handling, port_id default resolution from preferences, the super-admin-only `/admin/ports` listing.
Read-only audit. No edits made. Roughly ranked by blast-radius.
---
## CRITICAL
### C1. Port-switcher race — first request after navigation can hit the WRONG port
`src/providers/port-provider.tsx:38-48`, `src/components/layout/user-menu.tsx:65-73`, `src/lib/api/client.ts:50-63`.
`PortProvider` reads the URL slug at render and reconciles Zustand inside a `useEffect`. `apiFetch` reads `useUIStore.getState().currentPortId` synchronously. For a super-admin who is on `/port-A/clients` and clicks `/port-B/clients` (or hits a deep link from search/external nav), the first round of queries fires **before** the reconcile effect commits — sending `X-Port-Id = port-A` while the page chrome renders port-B. Listings come back from port-A and render inside port-B's shell ⇒ **silent cross-port data bleed** in the UI.
`handlePortChange` does invalidate React Query AND push the route, but `setPort` (Zustand setter) is sync — and the `router.push` is async. Any queries kicked off by the new route's components before the next tick can still read stale state on the initial mount. The reconcile happens on the second render.
**Fix sketch:** Have `apiFetch` derive `portId` from `window.location.pathname` FIRST and fall back to Zustand, not the reverse. The slug is authoritative; Zustand is a cache. (The current code only consults the URL when Zustand is empty.)
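A minimal sketch of the URL-first resolution, assuming dashboard routes have the shape `/{portSlug}/...` and that the store caches both the slug and the id. `PortCache`, `slugFromPathname`, and `resolvePortId` are hypothetical names, not the existing `apiFetch` internals.

```typescript
// Hypothetical cache shape; stands in for the Zustand UI store.
type PortCache = { slug: string | null; id: string | null };

/** Extracts the first path segment, e.g. "/port-b/clients" -> "port-b". */
function slugFromPathname(pathname: string): string | null {
  const first = pathname.split("/").filter(Boolean)[0];
  return first ?? null;
}

/**
 * Decides which port id to send as X-Port-Id.
 * The URL slug wins; the cached id is used only when it belongs to that slug.
 * Returns null when the slug must be resolved to an id before sending.
 */
function resolvePortId(pathname: string, cache: PortCache): string | null {
  const urlSlug = slugFromPathname(pathname);
  if (urlSlug === null) return cache.id; // no slug in the URL: fall back to cache
  if (cache.slug === urlSlug) return cache.id; // cache agrees with the URL
  return null; // cache is stale: caller must look up slug -> id first
}
```

Returning `null` for a stale cache forces the caller to wait for a lookup instead of sending the previous port's id, which trades a short delay on the first request for never mixing ports.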
### C2. `apiFetch` slug-to-id fallback is dead for non-super-admins
`src/lib/api/client.ts:18-40`.
The fallback for "Zustand not hydrated yet" calls `/api/v1/admin/ports`. That endpoint has `requireSuperAdmin(ctx, 'admin.ports.list')` (`src/app/api/v1/admin/ports/route.ts:16`). For a port director on a hard refresh, the request 403s, `resolvePortIdFromSlug` returns `null`, `apiFetch` ships the request with **no X-Port-Id header** — and `withAuth` then falls back to `preferences.defaultPortId`, which (per the next finding) is also unwritable. End state for the user: a 400 "Port context required" on every initial request after a cold reload, until Zustand re-hydrates from localStorage. Suggest an authenticated but permission-free `/api/v1/me/ports` lookup that returns only the ports the caller can access.
### C3. `defaultPortId` preference is read by `withAuth` but the `/me` PATCH allow-list refuses to write it
`src/lib/api/helpers.ts:160-164` reads `(profile.preferences as { defaultPortId?: string })?.defaultPortId` as the X-Port-Id fallback.
`src/app/api/v1/me/route.ts:45-66` defines `preferences` with `z.object({...}).strict()` and the allow-list `ALLOWED_PREF_KEYS = new Set(['dark_mode', 'locale', 'timezone', 'tablePreferences'])` at line 154. `defaultPortId` is silently stripped at every write. The fallback in `withAuth` is therefore dead — `preferences.defaultPortId` can only ever be set by a hand-rolled `db.update`. For super-admins this means: no header ⇒ no portId ⇒ `ctx.portId = ''` ⇒ every WHERE `port_id = ''` returns empty. Mild UX bug for super-admins but **silent**. Either remove the dead fallback or add `defaultPortId` to the strict schema + allow-list.
---
## HIGH
### H1. `searchOtherPorts` ignores per-port ACL for super-admin extension (theoretical, currently fine)
`src/lib/services/search.service.ts:1232-1314`. The docstring at line 31 promises "ports the user can access other than `portId`". The implementation just excludes `excludePortId` and joins every other row in `ports`. Today super-admins can access every port, so the behavior matches. Risk: if a future "regional super-admin" role lands and reuses this code path (`opts.includeOtherPorts && opts.isSuperAdmin`) the leak is total — no ACL filter. Recommend passing in the **set of accessible portIds** as a parameter and using it in the `port_lookup` CTE WHERE, even though the current gate is binary.
### H2. `/api/v1/admin/users/[id]/permission-overrides` PUT — port directors can promote anyone in their port to "owns everything"
`src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:153-244`.
The route gates on `admin.manage_users` (port-scoped), and rejects self-target (line 163) + targets not assigned to the same port (line 173). But there is no guard preventing a port director from writing `admin.permanently_delete_clients: true`, `system_backup: true`, `manage_users: true`, etc. onto a _different_ user in the same port — and then logging in as that user (or asking that user) to act with elevated permissions. Self-target is blocked but **co-conspirator escalation** is not. Mitigation idea: cap the overrides a non-super-admin can set to the leaves they themselves hold (effectively `ctx.permissions ∩ overrides`). The audit log is recorded, so this is detectable post-hoc, but not prevented.
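The cap described above can be sketched as a pure function. This is an illustration under assumptions: `capOverrides` is a hypothetical helper, permission leaves are modelled as plain strings, and revoking a leaf (`false`) is treated as always allowed, which the real route may or may not want.

```typescript
type Overrides = Record<string, boolean>;

/**
 * Splits requested overrides into those the caller may apply and those rejected.
 * A non-super-admin may only GRANT (true) leaves they hold themselves.
 * Assumption: revoking (false) is always permitted in this sketch.
 */
function capOverrides(
  requested: Overrides,
  callerPermissions: ReadonlySet<string>,
  callerIsSuperAdmin: boolean,
): { allowed: Overrides; rejected: string[] } {
  const allowed: Overrides = {};
  const rejected: string[] = [];
  for (const [leaf, value] of Object.entries(requested)) {
    const mayApply =
      callerIsSuperAdmin || value === false || callerPermissions.has(leaf);
    if (mayApply) allowed[leaf] = value;
    else rejected.push(leaf);
  }
  return { allowed, rejected };
}
```

Returning the rejected leaves separately lets the route choose between a hard 403 and a partial apply, and gives the audit log something concrete to record either way.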
### H3. AdminLayout vs admin-API permission asymmetry
`src/app/(dashboard)/[portSlug]/admin/layout.tsx:31-33` redirects every non-super-admin away from `/[portSlug]/admin/...`. Meanwhile `/api/v1/admin/**` endpoints are mostly gated on `admin.manage_settings` / `admin.manage_users` / `admin.view_audit_log` — leaves that the port-director role holds. So a port director can hit the APIs (via curl, scripts, or non-`/admin` UI surfaces such as `settings/`) but the matching UI is hidden behind a super-admin redirect. Pick a side: either gate the API endpoints on `requireSuperAdmin`, or let port directors into the corresponding sub-pages of `/admin/` (alerting on the ones that should remain super-admin only — backup, queues, storage, ports, invitations).
### H4. Super-admin with empty `ctx.portId` silently filters to zero rows
`src/lib/api/helpers.ts:166-168` — only non-super-admins are blocked when `portId` is null. A super-admin without an X-Port-Id header AND without a preferences.defaultPortId (which is currently every super-admin per C3) gets `ctx.portId = ''`. Downstream services that do `WHERE port_id = ${portId}` silently return empty data, which is harmless. But endpoints that BRANCH on `isSuperAdmin ? undefined : ctx.portId` (e.g. error-events `route.ts:32`) will hand `undefined` to the service and return EVERY tenant's rows. Currently only the error-events listing does this — but the pattern is risky. A scoped super-admin with the wrong header today sees one port; without the header they see ALL ports — surprising to admins debugging "why am I seeing port-X data on port-Y?". Recommend an explicit `?allPorts=true` opt-in on those endpoints rather than coupling cross-port reads to a missing header.
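One way to decouple cross-port reads from a missing header is to resolve an explicit scope value up front. The sketch below assumes the opt-in arrives as a raw `?allPorts=true` query value; `PortScope` and `resolvePortScope` are hypothetical names.

```typescript
type PortScope = { kind: "single"; portId: string } | { kind: "all" };

/**
 * Resolves the read scope for an endpoint that supports cross-port listing.
 * Cross-port access must be requested explicitly; a missing header is an error
 * for everyone, super-admins included.
 */
function resolvePortScope(input: {
  isSuperAdmin: boolean;
  headerPortId: string | null; // value of X-Port-Id, if any
  allPortsParam: string | null; // raw value of ?allPorts, if any
}): PortScope {
  if (input.allPortsParam === "true") {
    if (!input.isSuperAdmin) throw new Error("allPorts requires super-admin");
    return { kind: "all" };
  }
  if (!input.headerPortId) throw new Error("Port context required");
  return { kind: "single", portId: input.headerPortId };
}
```

With a discriminated union, a service cannot receive `undefined` by accident; it has to handle `kind: "all"` deliberately.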
---
## MEDIUM
### M1. Port switcher only invalidates queries, doesn't abort in-flight ones
`src/components/layout/user-menu.tsx:65-73`. `queryClient.invalidateQueries()` marks queries stale but lets in-flight ones finish and write into the cache. If a long-running fetch (e.g. PDF generation, expensive report) was started under port-A and resolves after the user switches to port-B, the cache entry is now port-A data keyed on a query that the new page treats as port-B. Worth pairing with `cancelQueries()` and a re-key on portId (most query keys appear to not embed portId).
### M2. `/api/v1/expenses/export/parent-company` lost its `isSuperAdmin` guard
`src/app/api/v1/expenses/export/parent-company/route.ts:9-12`. The comment says "Hard isSuperAdmin check used to lock out port admins who held expenses.export = true" — but the check is no longer in the route body, it now relies on the perm gate alone. The service `exportParentCompany` is single-port (filters `expenses.port_id`), so this is not a cross-port leak today. But the doc-vs-code drift should be reconciled either by adding `requireSuperAdmin` back or by deleting the stale comment.
### M3. Search "otherPorts" cross-port hits expose port-level metadata to ALL super-admin queries
`src/lib/services/search.service.ts:1862-1864`, `src/app/api/v1/search/route.ts:20`. Toggle `includeOtherPorts` defaults to false — but any super-admin can flip the query param. The merge into `SearchResults.otherPorts` returns `portId/portSlug/portName/type/id/label/sub` from up to 5 other ports per request without rate-limiting the cross-port enumeration. Pairs with the existing search rate-limit (if any) — confirm and add a tighter ceiling on `searchOtherPorts(query, limit)`. Currently `limit` defaults to whatever the searchQuery schema permits.
### M4. Super-admin dashboard redirect always picks first port alphabetically
`src/app/dashboard/page.tsx:24-27` — `db.query.ports.findFirst({ orderBy: portsTable.name })`. Predictable and stable, but ignores any "last-used port" signal. Combined with C3, a super-admin who manually picks port-B then closes the tab returns to port-A on next login. Cosmetic but disorienting. Easiest fix: persist `last_used_port_id` in `userProfiles.preferences` and read it here.
### M5. Webhook + document workers fan out to ALL super-admins for in-app notifications
`src/lib/queue/workers/webhooks.ts:264`, `src/lib/queue/workers/documents.ts:73`. Both fetch every `isSuperAdmin=true AND isActive=true` user to send notifications. Not a security issue; flagging because a future "regional super-admin" rollout will make the broadcast list quietly cross-tenant. Wrap the queries in a `notifySuperAdmins(portId)` helper now so the porting work is one diff later.
### M6. `/admin/ports/[id]` PATCH lets super-admin mutate any port without the rate-limit gate
`src/app/api/v1/admin/ports/[id]/route.ts:34-50` — no `withRateLimit` on a PATCH that touches every port-wide setting (timezone, currency, branding…). Lower priority because the caller list is short and trusted, but pairs naturally with the audit log.
### M7. AuthContext has no `accessiblePortIds` set
Every cross-port-aware code path re-derives "which ports can this user touch?" from `userPortRoles` or `isSuperAdmin`. Hoist into `AuthContext` (computed once in `withAuth`) so future endpoints don't have to re-implement the resolution and so cross-port filters can apply `inArray(table.portId, ctx.accessiblePortIds)` uniformly.
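A sketch of the one-time computation, assuming the role rows and the full port list are already loaded inside `withAuth`. `PortRole` and `computeAccessiblePortIds` are hypothetical; the real `userPortRoles` rows carry more columns.

```typescript
// Hypothetical minimal projection of a userPortRoles row.
interface PortRole {
  userId: string;
  portId: string;
}

/**
 * Computes the set of ports a user may touch, once per request.
 * Super-admins get every port; everyone else gets the ports they hold a role on.
 */
function computeAccessiblePortIds(
  user: { id: string; isSuperAdmin: boolean },
  allPortIds: readonly string[],
  roles: readonly PortRole[],
): string[] {
  if (user.isSuperAdmin) return [...allPortIds];
  const held = new Set(
    roles.filter((r) => r.userId === user.id).map((r) => r.portId),
  );
  // Filter against allPortIds so stale role rows for deleted ports drop out.
  return allPortIds.filter((id) => held.has(id));
}
```

The result can then feed a single `inArray(table.portId, ctx.accessiblePortIds)` filter, so a future regional role only changes this function rather than every call site.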
---
## Findings that audit clean
- `/api/v1/admin/ports` GET/POST correctly require `requireSuperAdmin` (`route.ts:16,28`).
- `/api/v1/admin/ports/[id]` correctly enforces port-in-scope for non-super-admins (`assertPortInScope`, line 15-20).
- `/api/v1/admin/invitations` correctly rejects port-director-minted super-admin invites (line 40-42).
- `/api/v1/admin/audit` is strictly port-scoped (line 40) — no cross-tenant peek even for super-admins.
- `withAuth` correctly refuses requests where the body tries to set `portId` (header-only); body-based `portId` is documented as forbidden (line 156-159).
- Reports service consistently uses `ctx.portId` in WHERE clauses (`reports.service.ts:103-163`); no super-admin cross-port aggregation paths.
- Public berth/inquiry endpoints take their portId from a query param / dedicated header, never from auth context — correctly decoupled from session port.
---
## Recommended next steps (in order)
1. Fix C1 by making the URL slug authoritative inside `apiFetch`.
2. Fix C2 with a small `/api/v1/me/accessible-ports` endpoint usable by every authed user.
3. Add `defaultPortId` to the `/me` PATCH allow-list (C3) — or strip the fallback from `withAuth`.
4. Add the "overrides ∩ caller's own perms" cap to permission-overrides PUT (H2).
5. Reconcile AdminLayout vs admin-API gating (H3) — write one document of which leaves are super-admin only.
6. Hoist `accessiblePortIds` into `AuthContext` (M7) ahead of the next cross-port feature.
---
## 33. S3 vs internal DB pathing + storage routing audit (storage-pathing-auditor)
# Audit — S3 vs Internal DB Pathing + Storage Routing
Scope: `src/lib/storage/*`, every `getStorageBackend()` consumer, migration script, magic-byte enforcement, encryption-at-rest boundary.
Date: 2026-05-12
## Boundary summary (what lives where)
- **In DB (Postgres):** file metadata only — `files.storage_path`, `berth_pdf_versions.storage_key`, `brochure_versions.storage_key`, `gdpr_exports.storage_key`, `backup_jobs.storage_path`, user-avatar FK (`user_profiles.avatar_file_id` → `files`), document signing state (`documents.signed_file_id`). AES-256-GCM-encrypted **secrets**: `system_settings.storage_s3_secret_key_encrypted`, `storage_proxy_hmac_secret_encrypted`, `email_accounts.credentials_enc`, `webhooks.secret`, `ocr_config.api_key_encrypted`. No BYTEA / JSONB blobs found (`grep BYTEA → 0`).
- **In backend (S3/filesystem):** every uploaded blob — signed PDFs (`buildStoragePath(slug,'eoi-signed',…)`), per-berth PDFs (`berths/{id}/…`), brochures, avatars, GDPR exports, pg_dump backups, expense receipts, generated reports, template source PDFs, send-out attachment fallbacks.
- **Routing:** `getStorageBackend()` reads global `system_settings.storage_backend` ('s3'|'filesystem'), caches by config fingerprint, invalidated via `resetStorageBackendCache()` on settings write or migration flip. Code never imports `minio/Client` outside `s3.ts` (verified — only legacy `buildStoragePath` helper survives in `src/lib/minio/index.ts`). Interface methods: put/get/head/delete/listByPrefix/presignUpload/presignDownload — both backends implement all 7.
## CRITICAL
### C1. `backup_jobs.storage_path` missing from `TABLES_WITH_STORAGE_KEYS` — silent backup loss on backend swap
`src/lib/storage/migrate.ts:55-60` lists only `files`, `berth_pdf_versions`, `brochure_versions`, `gdpr_exports`. `backup_jobs.storage_path` (pg_dump artefacts written by `src/lib/services/backup.service.ts:54+72`) is **not** in the list. Flipping S3 → filesystem (or vice-versa) leaves every historical database backup pointing at the old backend — `getBackupDownloadUrl(id)` will 404 / NoSuchKey, and the admin won't know until they try to restore. This is the worst category of data loss because backups are the recovery path of last resort. The comment in `migrate.ts:51` calls out `report_snapshots` as a future addition but mentions nothing about `backup_jobs`. **Add `{ table: 'backup_jobs', keyColumn: 'storage_path', pkColumn: 'id' }` and ship the line with a smoke test.**
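The smoke test could take the form of a coverage check between the schema and the migrator list. The sketch below is an assumption-laden illustration: the `pkColumn: 'id'` values for the four existing tables are guessed from the suggested `backup_jobs` entry, `findUncoveredStorageColumns` is a hypothetical helper, and the schema map would in practice come from `information_schema.columns` or the Drizzle schema.

```typescript
interface StorageKeyTable {
  table: string;
  keyColumn: string;
  pkColumn: string;
}

// Column names taken from the boundary summary; pkColumn values are assumed.
const TABLES_WITH_STORAGE_KEYS: StorageKeyTable[] = [
  { table: "files", keyColumn: "storage_path", pkColumn: "id" },
  { table: "berth_pdf_versions", keyColumn: "storage_key", pkColumn: "id" },
  { table: "brochure_versions", keyColumn: "storage_key", pkColumn: "id" },
  { table: "gdpr_exports", keyColumn: "storage_key", pkColumn: "id" },
  { table: "backup_jobs", keyColumn: "storage_path", pkColumn: "id" },
];

/**
 * Returns "table.column" for every storage_path / storage_key column in the
 * schema that the migrator list does not cover.
 */
function findUncoveredStorageColumns(
  schemaColumns: Record<string, string[]>,
  covered: StorageKeyTable[],
): string[] {
  const coveredSet = new Set(covered.map((t) => `${t.table}.${t.keyColumn}`));
  const missing: string[] = [];
  for (const [table, columns] of Object.entries(schemaColumns)) {
    for (const column of columns) {
      const isStorageColumn = /^storage_(path|key)$/.test(column);
      if (isStorageColumn && !coveredSet.has(`${table}.${column}`)) {
        missing.push(`${table}.${column}`);
      }
    }
  }
  return missing;
}
```

A test that asserts the returned array is empty would have flagged `backup_jobs` the day the column was added, and will flag `report_snapshots` when it lands.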
### C2. Orphan-blob risk: every `backend.put` runs outside the `db.insert(files)` transaction
Pattern repeated across 9+ services (`files.ts:68-92`, `documents.service.ts:833-854` and `1134-1183`, `external-eoi.service.ts:71-96` — comment at L67-70 explicitly acknowledges "orphan reaper handles those" but **no reaper exists**, `invoices.ts:603`, `document-templates.ts:537,674`, `reports.service.ts:231`, `gdpr-export.service.ts:169`, `backup.service.ts:62`, `berth-pdf.service.ts:229`). Sequence is: PUT bytes → DB INSERT. If insert fails or the process dies in between, the blob is permanent and unreferenced. Only `handleDocumentCompleted` (`documents.service.ts:1110`) has an early-return idempotency gate; the rest leak. Over months of operation an S3 prefix accumulates dozens-to-hundreds of orphans that pay storage cost forever and survive every backup-restore. **Add an orphan-reaper maintenance job** that walks `listByPrefix()` against the union of all `storage_*` columns and deletes blobs older than 24 h without a DB pointer. Also wrap the `put + insert` pairs in a try/catch that explicitly deletes on insert failure (cheap defense in depth).
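The try/catch defence can be sketched as a small wrapper. `BlobBackend` here is a reduced, hypothetical view of the storage interface (only `put` and `delete`), and `putThenInsert` is an illustrative name, not an existing helper.

```typescript
// Reduced view of the storage backend; the real interface has seven methods.
interface BlobBackend {
  put(key: string, body: Uint8Array): Promise<void>;
  delete(key: string): Promise<void>;
}

/**
 * Writes the blob, then runs the DB insert. If the insert throws, the blob is
 * deleted on a best-effort basis and the original error is rethrown.
 * A process crash between the two steps still leaks, so a reaper is still needed.
 */
async function putThenInsert<T>(
  backend: BlobBackend,
  key: string,
  body: Uint8Array,
  insertRow: () => Promise<T>,
): Promise<T> {
  await backend.put(key, body);
  try {
    return await insertRow();
  } catch (err) {
    // Swallow cleanup failures so the caller sees the insert error, not ours.
    await backend.delete(key).catch(() => undefined);
    throw err;
  }
}
```

This covers the common failure (constraint violation, validation error) cheaply; it does not replace the reaper, because nothing in-process can clean up after a kill between `put` and the insert.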
### C3. S3 backend stores blobs without server-side encryption (SSE-S3 / SSE-KMS)
`S3Backend.put()` (`src/lib/storage/s3.ts:191`) passes only `Content-Type` to `client.putObject`. No `x-amz-server-side-encryption` header. Bytes-at-rest encryption depends entirely on the bucket's default-encryption policy, which is invisible to the application — a customer who provisions a MinIO/B2/R2 bucket without server-side encryption gets cleartext signed contracts, GDPR exports, and `pg_dump` archives sitting on disk. Same audit posture as SMTP/IMAP creds (which **are** AES-GCM in the DB) demands the same guarantee for the blob plane. **Either add `ServerSideEncryption: 'AES256'` to every `putObject` call, or surface a boot-time check that asserts the bucket has default-encryption enabled and refuses to start otherwise** (similar to the `MULTI_NODE_DEPLOYMENT` guard on FilesystemBackend).
## HIGH
### H1. Berth-PDF presigned-upload keys are not port-scoped
`src/app/api/v1/berths/[id]/pdf-upload-url/handlers.ts:58` builds `berths/{berthId}/uploads/{uuid}_{name}` — no leading `${portSlug}/`. Result: the optional port-binding (`p` field on the HMAC token, enforced in `filesystem.ts:184-188` and documented in `index.ts:43-49`) cannot be wired here, and the storage-key namespacing convention diverges from `buildStoragePath` (which always prefixes the port slug). Tenant isolation today relies on the up-front `berths.portId === ctx.portId` check before mint, but the defense-in-depth port-binding is unwired. **Normalise the key to `${portSlug}/berths/...` and pass `portSlug` into `backend.presignUpload`.**
### H2. `presignDownload` callers never pass `portSlug` — port-binding token guard is dead code
`presignDownloadUrl(...)` (`storage/index.ts:233`) accepts `portSlug` and only 1 of ~12 callers uses it. `files.ts:117,128`, `backup.service.ts:115`, `portal.service.ts:351`, `reports.service.ts:170`, `gdpr-export.service.ts:224,282` all pass undefined. The filesystem-proxy GET will therefore accept any valid HMAC token regardless of the storage-key's port prefix. The check is genuinely defensible (see `filesystem.ts:179`) but never engaged. **Plumb the active port slug through every call site, or remove the optional `p` field and the verifier code so the contract isn't misleading.**
### H3. `S3Backend.put` and `backup.service` buffer entire blobs into memory
`s3.ts:187` (`Buffer.isBuffer(body) ? body : await streamToBuffer(body)`) and `backup.service.ts:60-62` (concatenates the entire pg_dump output into memory before put). For a multi-GB database dump the worker OOMs. Comment at `s3.ts:184-187` explicitly says "typical files are under 50MB" but `runPgDump` writes a dump file whose size scales with the tenant. **Use `client.fPutObject` (file-path streaming) for backups; for streamable callers expose a `putStream(key, stream, sizeBytes, opts)` overload that pipes without `streamToBuffer`.**
### H4. Migrator's `copyAndVerify` double-buffers every blob and has no streaming hash
`storage/migrate.ts:170-204` reads source → Buffer, sha256, put, then re-reads target → Buffer, sha256 again. For a 5 GB pg_dump (see C1 — once added) this allocates ~10 GB of heap. The sha256-verify round-trip is the right idea; **pipe through `crypto.createHash` on both legs**, never buffer.
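A sketch of the streaming variant, assuming both backends can expose reads and writes as async iterables of byte chunks (Node `Readable` streams already satisfy this). `sha256OfStream` and `copyAndVerifyStreaming` are hypothetical names; only `createHash` from `node:crypto` is a real API here.

```typescript
import { createHash } from "node:crypto";

/** Hashes a chunk stream without holding more than one chunk at a time. */
async function sha256OfStream(chunks: AsyncIterable<Uint8Array>): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of chunks) hash.update(chunk);
  return hash.digest("hex");
}

/**
 * Copies source -> target while hashing the source in flight, then re-reads
 * the target and compares digests. Assumes writeTarget consumes the whole
 * stream before resolving; otherwise the source digest would be incomplete.
 */
async function copyAndVerifyStreaming(
  readSource: () => AsyncIterable<Uint8Array>,
  writeTarget: (chunks: AsyncIterable<Uint8Array>) => Promise<void>,
  readTarget: () => AsyncIterable<Uint8Array>,
): Promise<string> {
  const sourceHash = createHash("sha256");
  async function* hashingPassThrough(): AsyncGenerator<Uint8Array> {
    for await (const chunk of readSource()) {
      sourceHash.update(chunk); // hash each chunk as it flows to the target
      yield chunk;
    }
  }
  await writeTarget(hashingPassThrough());
  const expected = sourceHash.digest("hex");
  const actual = await sha256OfStream(readTarget());
  if (actual !== expected) {
    throw new Error(`checksum mismatch: expected ${expected}, got ${actual}`);
  }
  return expected;
}
```

Heap usage is bounded by the chunk size rather than the blob size, and the source is read once instead of twice.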
### H5. `S3Backend.presignUpload` lacks content-type / content-length binding
`s3.ts:249-256` only calls `presignedPutObject(bucket, key, expiry)`. The signed URL does not bind `Content-Type` or `Content-Length` — a browser can PUT 1 GB of arbitrary bytes against an EOI-signed key. Caps and magic-byte checks fire only on the **register** call afterwards (`registerBrochureVersion` and `uploadBerthPdf` HEAD-then-stream-first-5-bytes path). That's sufficient for the two consumers today, but the gate is one-deep — a future caller that forgets to wire a register endpoint exposes raw S3 directly. **Switch to MinIO `presignedPostPolicy` with `content-length-range` + `Content-Type` conditions so the binding is on the signature itself.**
## MEDIUM
### M1. CLAUDE.md drift on "TABLES_WITH_STORAGE_KEYS populated in 9a5ba87"
CLAUDE.md says the migrator covers "every blob in `files`, `berth_pdf_versions`, `brochure_versions`, `gdpr_exports`". Verified true — but **backup_jobs is the missing 5th** (see C1). Update the doc + add a unit test that asserts the array matches the set of tables with a `storage_*` column.
### M2. `email-compose.service.ts:124` reads attachment bytes into a Buffer
Each attachment under the `email_attach_threshold_mb` cap is fetched via `storage.get(...)` and concatenated. With multiple recipients × multiple attachments the worker holds N × size MB simultaneously. **Stream into `nodemailer`'s `content: <Readable>` API directly.**
### M3. UUID storage keys never check existence before put (no `If-None-Match: "*"`)
A `crypto.randomUUID()` collision is astronomically unlikely, but a buggy caller passing a duplicate key (or a re-run of a worker after a partial DB rollback) silently overwrites the existing blob. **Cheap belt: pass `If-None-Match: '*'` (S3) or `O_EXCL` (filesystem) — surfaces double-writes loudly.**
### M4. Per-port S3 routing not possible / `listByPrefix` unbounded
Storage config rows are global (`portId IS NULL`). Multi-tenant can't direct port-A vs port-B to separate buckets / KMS keys. `listByPrefix` returns every key in one array — script-only today but a footgun if called with empty prefix in production. **Document the global-config assumption; add a cursor variant before any per-port-bucket customer lands.**
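A cursor variant might look like the async generator below. This is a sketch only: `ListPage` is a hypothetical page-fetching callback that each backend would implement (for S3, on top of its continuation token), and the empty-prefix refusal is one possible guard, not current behaviour.

```typescript
interface Page {
  keys: string[];
  nextCursor: string | null; // null when there are no more pages
}

// Hypothetical per-backend callback that fetches one page of keys.
type ListPage = (prefix: string, cursor: string | null, limit: number) => Promise<Page>;

/**
 * Yields keys under a prefix one page at a time, so callers never hold the
 * full listing in memory. Refuses an empty prefix to avoid whole-bucket scans.
 */
async function* listKeysPaged(
  listPage: ListPage,
  prefix: string,
  pageSize = 1000,
): AsyncGenerator<string> {
  if (prefix === "") throw new Error("refusing to list with an empty prefix");
  let cursor: string | null = null;
  do {
    const page: Page = await listPage(prefix, cursor, pageSize);
    yield* page.keys;
    cursor = page.nextCursor;
  } while (cursor !== null);
}
```

Callers iterate with `for await (const key of listKeysPaged(...))` and can stop early, which also suits an orphan reaper that only needs to process a bounded batch per run.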
### M5. `storage_filesystem_root` change invalidates outstanding HMAC tokens silently
Cache swaps, but tokens minted under the old root still verify HMAC; `resolveKeyForProxy` then 404s under the new root. Customer download links emailed an hour earlier break with no warning. **Either refuse runtime root changes, or warn in admin UI.**
### M6. Avatar URLs re-presign every 15 min — browser cache broken
No CDN / `s-maxage` fronts hot reads. Per-page avatar GET burns a presign + S3 round-trip. **Issue 24 h URLs for `category='avatar'`, or front with the Next.js Image route.**
### M7. Verified clean
- `withTimeout(...)` wraps every minio call (s3.ts L143/150/190/203/219/237/285/292/300); `system-monitoring.service.ts:153` adds its own 5 s deadline. **No bare minio calls escape.**
- `MULTI_NODE_DEPLOYMENT` guard reads `env.MULTI_NODE_DEPLOYMENT` (zod-coerced, `env.ts:80`), test at `filesystem-backend.test.ts:139`. ✓
### M8. Magic-byte enforcement
- **In-server uploads:** `files.ts:58` (`bufferMatchesMime`), `berth-pdf.service.ts:218` (`isPdfMagic`). ✓
- **Presigned-PUT post-upload register:** `brochures.service.ts:258` (first-5-byte stream + `%PDF-`), `berth-pdf.service.ts:259` (`readFirstBytes` + `isPdfMagic`). ✓
- **Filesystem proxy PUT:** inline check `route.ts:220` when token's `c=application/pdf`. ✓
- **S3 direct PUT:** no inline check (relies on the register endpoint). Acceptable per CLAUDE.md, but document divergence: a future S3 consumer that forgets to call register leaks the gate.
## Verified-clean (informational)
- No BYTEA / binary-JSONB blob columns. ✓
- Single canonical key format mismatch (`storage_path` vs `storage_key`) is documented + handled by per-table column mapping. ✓
- `validateStorageKey` rejects traversal, absolute paths, dotfiles, and >1024 chars. ✓
- Proxy token op-binding (`get` vs `put`) is enforced — replay across ops blocked. ✓
- Proxy single-use replay protection via Redis SET NX with TTL pinned to token expiry. ✓
- Filesystem HMAC secret falls back to a derived dev value but **throws in production** when unset. ✓
- All blob keys are UUID-namespaced — collision-safe, not deterministic-audit-style. ✓
## Recommended ordering
1. **C1** (one-line fix + smoke test) before any backend migration ships.
2. **C2** orphan reaper — cron job behind `maintenance` worker.
3. **C3** SSE-S3 — single-line putObject change + bucket-policy assertion at boot.
4. **H1 + H2** port-binding plumbing (small refactor, big invariant).
5. **H3 + H4 + M2** streaming pass over backup + migrator + email attachments.
6. Remainder during next storage-config UI sweep.
---
## 34. Dependency upgrade analysis — Context7-assisted (follow-up after deps-auditor)
_Post-session follow-up. Where the original `deps-auditor` covered abandonment + vulnerabilities, this section queries upstream changelogs via Context7 to weigh the pros/cons of pulling every available major. Use this as your bump roadmap._
# Dependency upgrade analysis (Context7-assisted)
Companion to the deps-auditor report from the original 33-agent run.
That auditor checked vulnerabilities + abandonment + license risk; this
follow-up adds **per-dep pros/cons of bumping to the latest stable**,
informed by upstream changelogs/docs queried via Context7.
**Top-line baseline:** `pnpm audit` reports **0 known vulnerabilities**.
No GPL/AGPL contamination. Lockfile reproducible. We are safe TODAY
without any upgrade; everything below is "should we pull the next
major in?" prioritization.
---
## At a glance — what's outdated
| Package | Current | Latest | Bump size |
| ----------------------- | ------- | -------- | --------------------------------- |
| `next` | 15.5.18 | 16.2.6 | major |
| `eslint-config-next` | 15.5.18 | 16.2.6 | major (matches next) |
| `zod` | 3.25.76 | 4.4.3 | major |
| `tailwindcss` | 3.4.19 | 4.3.0 | major |
| `@hookform/resolvers` | 3.10.0 | 5.2.2 | TWO majors |
| `archiver` | 7.0.1 | 8.0.0 | major |
| `react-day-picker` | 9.14.0 | 10.0.0 | major |
| `eslint` | 9.39.4 | 10.3.0 | major |
| `esbuild` | 0.27.7 | 0.28.0 | pre-1.0 minor (effectively major) |
| `@playwright/test` | 1.59.1 | 1.60.0 | minor |
| `libphonenumber-js` | 1.12.43 | 1.13.1 | minor |
| `tailwind-merge` | 3.5.0 | 3.6.0 | minor |
| `bullmq` | 5.76.6 | 5.76.8 | patch |
| `@tanstack/react-query` | 5.100.9 | 5.100.10 | patch |
| `better-auth` | 1.6.9 | 1.6.10 | patch |
| `vitest` | 4.1.5 | 4.1.6 | patch |
| `lint-staged` | 17.0.3 | 17.0.4 | patch |
| `@vitest/coverage-v8` | 4.1.5 | 4.1.6 | patch |
`@types/node` is deliberately pinned to ^20.19 to match the Node 20 runtime
(an audit finding: it was previously ^25 against a Node 20 runtime, so the
types greenlit APIs that do not exist at runtime).
---
## Tier A — Pull the patches in now (zero-risk wins)
`bullmq`, `@tanstack/react-query` + `react-query-devtools`,
`better-auth`, `vitest` + `@vitest/coverage-v8`, `lint-staged`,
`@playwright/test`, `libphonenumber-js`, `tailwind-merge`.
**Pros:** patch / minor bumps, bug fixes only, no API changes documented.
**Cons:** none material; bump the pins, then verify with a 30-second
`pnpm install` and a full vitest run.
**Recommended:** **DO** as one batch commit. ~5 minutes.
---
## Tier B — Per-major analysis
### B-1 — Next.js 15.5 → 16.2 _(touches every API route + middleware)_
**Upstream summary (via Context7):**
- **`middleware.ts` is renamed to `proxy.ts`** in Next 16. The named export `middleware` → `proxy`. Config flags rename (`skipMiddlewareUrlNormalize` → `skipProxyUrlNormalize`). **Edge runtime is NOT supported in `proxy`** — if you need edge runtime you must keep `middleware.ts` (we already use the Node runtime, so this is just a rename for us).
- Async `cookies()` / `headers()` / `params` / `searchParams` was the Next-15 change; Next 16 hardens the warning into an error. We're already async-safe (CLAUDE.md confirms the upgrade landed).
- Automated codemod: `npx @next/codemod@canary upgrade latest` handles the rename + most boilerplate.
**Risk for us:**
- `src/middleware.ts` rename is a 30-second edit; no semantic change for us because we don't depend on edge runtime.
- The Documenso webhook + websocket server custom-server path (`src/server.ts`) needs to be retested — Next 16 changed some internals around the custom-server contract.
- `eslint-config-next` must bump in lockstep (already at 15.5.18 → 16.2.6).
- Turbopack defaults shifted; our dev script (`next dev --turbopack -H 0.0.0.0`) needs a quick smoke run.
**Recommended:** **WAIT 2-4 weeks.** Next 16 dropped recently; let the field's bug reports settle. Then run the codemod + a full playwright smoke. Effort: 1-2h.
---
### B-2 — Zod 3 → 4 _(touches every validator file)_
**Upstream summary (via Context7):**
- Top-level format helpers — `z.email()` / `z.uuid()` / `z.url()` etc. replace `z.string().email()` / `.uuid()` / `.url()`. Old form is **deprecated** but still works.
- Error customization unified: `{ message: '...' }` → `{ error: '...' }`. Old form deprecated.
- `z.function()` API completely redesigned — now takes `input`/`output` schemas upfront, returns a function factory (not a schema).
- ~14× perf improvement on parse paths.
- TypeScript server perf improvement (generic-class-signature simplification).
**Risk for us:**
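- We have ~30 validator files using `z.string().email()` / `.uuid()` style and `{ message: '...' }` style throughout. Both still work in 4.x but are flagged as deprecated. As far as we can tell the deprecation surfaces through `@deprecated` type annotations rather than runtime logging, so the noise lands in editors and lint output, not in application logs.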
- `@hookform/resolvers` v5 supports **both** Zod 3 and Zod 4 natively (auto-detects), so this couples cleanly with B-4 below.
- We don't use `z.function()` anywhere, so the biggest breaking change is a non-issue for us.
**Recommended:** **GO once Tier A is in.** Codemod-friendly: a single Find/Replace pass on `z.string().email()` → `z.email()` etc. covers ~95% of the churn. Effort: 2-3h including running full vitest + writing replacement codemods.
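
The Find/Replace pass can be sketched as a small string rewrite. This is a minimal illustration, not the codemod we would ship: `rewriteZodFormatHelpers` and `FORMAT_HELPERS` are hypothetical names, the helper list covers only the three formats named above, and the sketch deliberately skips chains such as `z.string().min(1).email()`, where dropping `z.string()` would also drop the `.min(1)` check.

```typescript
// Hypothetical helper list: only the formats named in this section.
const FORMAT_HELPERS = ["email", "uuid", "url"];

// Rewrites `z.string().email(...)` to `z.email(...)` (same for uuid/url).
// Only matches when the format call follows `z.string()` directly, so
// chains with intermediate checks are left for manual review.
function rewriteZodFormatHelpers(source: string): string {
  const pattern = new RegExp(
    `\\bz\\s*\\.string\\(\\)\\s*\\.(${FORMAT_HELPERS.join("|")})\\(`,
    "g",
  );
  return source.replace(
    pattern,
    (_match: string, helper: string) => `z.${helper}(`,
  );
}
```

Arguments are preserved as written, so `{ message: '...' }` options survive the rewrite and can be migrated to `{ error: '...' }` in a separate pass. A regex pass like this has no notion of scope; a real run would want a review of the resulting diff before committing.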
---
### B-3 — Tailwind CSS 3 → 4 _(touches `tailwind.config.ts`, `globals.css`, every dynamic-class site)_
**Upstream summary (via Context7):**
- **All-new Oxide engine** — 5× faster full builds, 100× faster incremental.
- **CSS-first config:** `tailwind.config.ts` is gone. Theme defined in `globals.css` via `@theme` + CSS custom properties (`--color-brand: …`).
- **PostCSS plugin consolidation:** `postcss.config.mjs` switches from `tailwindcss + autoprefixer + postcss-import` plugins to single `@tailwindcss/postcss`.
- Built on native cascade layers, OKLCH colors, container queries, `@starting-style`, popovers.
- Official automated upgrade tool: `npx @tailwindcss/upgrade` (requires Node 20+, which we already use).
**Risk for us:**
- We have a custom `tailwind.config.ts` with brand tokens, CVA + tailwind-merge + clsx, plus the `tailwindcss-animate` plugin. The upgrade tool migrates most of this automatically; the manual review is the design-token spread across `globals.css`.
- shadcn/ui components (`components/ui/*`) use `cn()` + arbitrary values heavily. Some `[--variable]` syntax has changed in v4.
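- `tailwindcss-animate` may not yet support v4 — need to confirm or swap for a v4-compatible replacement. shadcn/ui's v4 setup uses `tw-animate-css`; `tailwindcss-animated` is a separate plugin with its own class names, not a drop-in successor.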
**Recommended:** **HIGH-RISK / HIGH-REWARD.** Park until you have a clear afternoon. The build-time speedup is genuinely meaningful for dev experience. Run the official upgrade tool on a throwaway branch first; visually diff a handful of critical pages before merging. Effort: 3-4h on a focused day; visual regressions are the variable.
---
### B-4 — `@hookform/resolvers` 3 → 5 _(touches every form file)_
**Upstream summary (via Context7):**
- v5 supports **both Zod 3 and Zod 4** simultaneously via auto-detection — pulls `zod/v4` if you opt into it explicitly.
- Resolver options shape is the same as v3 (`{ mode: 'async' | 'sync', raw?: boolean }`).
- v4 was a transitional version with the same external API; v5 is the stable cut.
**Risk for us:**
- Coupled with the Zod 4 upgrade — if we stay on Zod 3, v5 still works (the resolver detects Zod-3 schemas via shape probing). Bumping resolvers without bumping Zod is safe.
**Recommended:** **GO IN LOCKSTEP with B-2 (Zod 4).** Effort: 5 min once Zod 4 is in.
---
### B-5 — `archiver` 7 → 8 _(touches GDPR-export bundle + backup-restore)_
**Upstream summary:** Library "/gajus/archiver" not found in Context7 — fallback to npm changelog. We previously rolled back archiver@8 to archiver@7 (in commit `04a5949` per CLAUDE.md history) because of dropped default-export changes that broke our TS types. v8 stabilised since then.
**Risk for us:**
- Last time we tried this it broke. Read the v8 changelog before retrying.
- Used only for GDPR export + backup-restore — narrow blast radius. A failed upgrade is non-customer-facing.
**Recommended:** **DEFER.** Stay on 7 until either v8 demonstrably fixes a CVE / bug we care about, or until we have a green test suite to verify nothing regressed. Re-attempt only when there's a forcing function.
---
### B-6 — `react-day-picker` 9 → 10 _(touches every date-picker site)_
**Upstream summary:** v10 is a recent cut. Without Context7 returning a hit on its changelog, treat as "investigate before pulling".
**Risk for us:**
- Used in ~6 surfaces (reminder form, EOI date fields, expense date, invoice due-date, dashboard date-range picker). A breaking change to the calendar render path would affect every form.
**Recommended:** **DEFER 2-3 weeks** to let bug reports surface. Effort to actually do it: ~1h once the spec is reviewed.
---
### B-7 — `eslint` 9 → 10 + `eslint-config-next` _(touches CI)_
**Risk for us:**
- ESLint 10 likely drops support for some legacy rule configs.
- `eslint-config-next` should bump in lockstep with `next` (B-1).
**Recommended:** **PAIR WITH B-1.** No standalone value to bumping eslint without bumping Next.
---
### B-8 — `esbuild` 0.27 → 0.28 _(touches build pipeline)_
**Risk for us:**
- We use esbuild via `pnpm.overrides` plus directly in `build:server` and `build:worker` scripts.
- Pre-1.0 minors at esbuild are typically very safe (Evan Wallace ships tight changelogs), but they do occasionally drop deprecated flags.
**Recommended:** **GO.** Bundle the bump with the Tier A patches. Effort: 1 min + a `pnpm build` smoke.
---
## Tier C — Things to leave alone
- **`drizzle-orm 0.45.2`** — current major. No upgrade needed.
- **`react 19.2.6` / `react-dom 19.2.6`** — current React 19. Stable.
- **`@radix-ui/*`** — all current. These ship patch updates frequently; consider a quarterly sweep but not blocking.
- **`@dnd-kit/*`, `@pdfme/*`, `socket.io`, `bullmq`, `pino`, `postgres`, `minio`, `ioredis`, `pdf-lib`, `pdfkit`, `sharp`, `tesseract.js`, `recharts`, `cmdk`, `vaul`, `sonner`, `zustand`, `next-themes`, `date-fns`, `clsx`, `class-variance-authority`, `jose`, `nodemailer`, `mailparser`, `imapflow`, `openai`, `lucide-react`, `react-easy-crop`, `react-hook-form`** — all current within their major lines and either no risk-worthy bump available or already bumped.
---
## Recommended sequencing
1. **Now** — pull Tier A patches as one commit (~5 min).
2. **Now** — `esbuild` 0.27 → 0.28 in same commit.
3. **Next focused half-day** — Zod 4 + `@hookform/resolvers` v5 together. Coupled because resolvers v5 supports both. Codemod-able.
4. **2-4 weeks** — Next 15 → 16 + `eslint-config-next` 16 + `eslint` 10. Lockstep. Run `@next/codemod` first.
5. **When a tester-friendly afternoon opens up** — Tailwind 4 via the official upgrade tool, with visual review across critical pages.
6. **Defer indefinitely** — archiver 8, react-day-picker 10 (neither is delivering us anything we need).
**Non-goal:** chasing the bleeding edge on every dep. The audit's baseline finding stands — we are secure today. These are mostly developer-experience and perf wins, not security blockers.
---
## 35. Package adoption + PDF stack overhaul (Context7-assisted follow-up)
Companion to section 34. The deps-upgrade analysis answered "should we bump
what we already have?" — this section answers two follow-on questions:
1. **PDF stack** — are pdfme + pdfkit + pdf-lib the right tools? (No.)
2. **What aren't we using that we should be?** — comprehensive sweep of the
modern ecosystem against our actual pain points and codebase patterns.
User-directed exclusions:
- `react-hotkeys-hook` (no keyboard-shortcut UX target).
---
### 35.A — PDF stack overhaul
#### Current state (5 packages, 4 distinct use cases)
| Package | Where it lives in our code | Use case |
| ----------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------- |
| `@pdfme/common` + `generator` + `schemas` v6.1.2 | `src/lib/pdf/generate.ts` + 8 template files | Declarative report/invoice/EOI templates |
| `pdf-lib` v1.17.1 | `src/lib/pdf/fill-eoi-form.ts`, `src/lib/services/berth-pdf-parser.ts` | AcroForm fill (EOI) + uploaded-PDF parsing (berth specs) |
| `pdfkit` v0.18.0 + `@types/pdfkit` | `src/lib/services/expense-pdf.service.ts` (only site) | Streaming receipt-attached expense reports |
| `tesseract.js` v7.0.0 | `src/lib/ocr/tesseract-client.ts` + scan-shell | Berth PDF OCR fallback |
| **Bridge layer**: 571-line `src/lib/pdf/tiptap-to-pdfme.ts` | Admin template builder | Tiptap JSON → pdfme schema converter |
#### Pain points
- **The 571-line `tiptap-to-pdfme.ts` bridge** is fragile glue between a rich-text
format (Tiptap JSON) and a declarative PDF schema (pdfme). Every supported
formatting feature (bold, italic, headings, lists, tables, images) is
hand-coded. Templates that use `blockquote`, `codeBlock`, `horizontalRule`,
or `taskList` are currently rejected at save time because the bridge
doesn't support those node types.
- **pdfme** templates are JSON blobs with positional `{ x, y }` coordinates.
Reading/editing them is painful (compare `invoice-template.ts` vs a
declarative React component).
- **`@pdfme/generator` ships a heavy runtime** including the schema engine
and font loaders — irrelevant for our use case because we're code-driven,
not visual-editor-driven.
- **3 different generation libraries** (pdfme + pdfkit + pdf-lib) means three
different mental models, three different test patterns, three different
failure modes.
#### Recommendation per use case
**Use case 1 — Template-driven PDFs (8 templates):** invoice, client-summary,
interest-summary, berth-spec, revenue-report, occupancy-report, pipeline-report,
eoi-standard-inapp.
**→ Replace with `@react-pdf/renderer`** (`/diegomura/react-pdf`, 161 snippets,
benchmark 87.75).
Why it wins for us:
- **Declarative React components** — uses the same skills we already have. No
more positional `{ x, y }` JSON.
- **Server-side rendering modes**: `renderToBuffer` (HTTP responses),
`renderToStream` (large reports), `renderToFile` (background jobs). All
three usage patterns are documented and idiomatic — replaces pdfme's
`generate()` call cleanly.
- **First-class page-break controls** — `break`, `wrap={false}`,
`minPresenceAhead`, `orphans`, `widows`. pdfme has none of these; we'd be
hand-implementing them today if we needed them.
- **Fixed headers/footers via `fixed` prop** with auto page-number rendering
(`render={({ pageNumber, totalPages }) => …}`). We currently re-render
header content per page in pdfme.
- **The Tiptap bridge problem dissolves**: a rich-text component renders
Tiptap JSON directly via a recursive component (~80 LOC, replaces 571 LOC).
No more constrained-subset rejections at save time.
- **Tree-shakes** — only the components we import ship; pdfme's generator
pulls the full schema engine regardless.
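
The recursive renderer mentioned in the Tiptap bullet could look roughly like the sketch below. It is not taken from the codebase: `TiptapNode` is a simplified assumption about Tiptap's JSON shape, and the function returns a plain `PdfNode` tree (a hypothetical stand-in) so the example runs without dependencies. A real version would return `@react-pdf/renderer` elements such as `<View>` and `<Text>` instead.

```ts
// Simplified, assumed shape of Tiptap JSON (real documents carry more fields).
type TiptapMark = { type: 'bold' | 'italic' };
type TiptapNode = {
  type: string;
  text?: string;
  marks?: TiptapMark[];
  attrs?: { level?: number };
  content?: TiptapNode[];
};

// Neutral output tree standing in for react-pdf <View>/<Text> elements.
type PdfNode = {
  kind: 'view' | 'text';
  style: string[];
  text?: string;
  children: PdfNode[];
};

function renderNode(node: TiptapNode): PdfNode {
  // Children are rendered first, so every container case can reuse them.
  const children = (node.content ?? []).map(renderNode);
  switch (node.type) {
    case 'text':
      return {
        kind: 'text',
        style: (node.marks ?? []).map((m) => m.type),
        text: node.text ?? '',
        children: [],
      };
    case 'heading':
      return { kind: 'view', style: [`h${node.attrs?.level ?? 1}`], children };
    case 'doc':
    case 'paragraph':
    case 'blockquote':
    case 'bulletList':
    case 'orderedList':
    case 'listItem':
      return { kind: 'view', style: [node.type], children };
    default:
      // Unknown nodes degrade to a plain container instead of failing the save.
      return { kind: 'view', style: ['unknown'], children };
  }
}

const sampleDoc: TiptapNode = {
  type: 'doc',
  content: [
    { type: 'heading', attrs: { level: 2 }, content: [{ type: 'text', text: 'Berth A12' }] },
    {
      type: 'blockquote',
      content: [
        {
          type: 'paragraph',
          content: [{ type: 'text', text: 'Prefers WhatsApp', marks: [{ type: 'bold' }] }],
        },
      ],
    },
  ],
};

const tree = renderNode(sampleDoc);
```

Because unsupported node types fall through to a default container, adding a new Tiptap extension would degrade gracefully rather than being rejected at save time.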
Concrete migration cost: rewrite 8 templates as JSX. The shape is 1:1
with our current pdfme schemas (header section, repeating items, footer
totals), so it's a mechanical translation. ~4-6 hours total. Bridge layer
(571 LOC) goes to zero.
Caveats from Context7:
- Font registration is explicit (`Font.register({ family, src })`) — our
current fonts move from pdfme's font loader to a one-time call at boot.
- No Tailwind class support — uses `StyleSheet.create({ ... })` with a
flexbox-style subset. Familiar to React Native devs.
**Use case 2 — AcroForm fill (EOI):**
**→ Keep `pdf-lib`.** Best-in-class for editing existing PDFs. No replacement
candidate is better. Already used correctly in `fill-eoi-form.ts`.
**Use case 3 — Uploaded PDF parsing (berth specs):**
**→ Add `unpdf`** (`/unjs/unpdf`, 66 snippets) for text extraction; keep
`pdf-lib` for AcroForm field extraction.
Why:
- `unpdf` is the unjs ecosystem's serverless-friendly PDF parser built on
pdf.js. Returns `{ totalPages, text }` per page in one call.
- Better than `pdf-lib` for text extraction because pdf-lib's text APIs are
read-positional, not read-flow.
- `getDocumentProxy()` lets us share one parse across `extractText`,
`extractLinks`, `getMeta` — useful for the 3-tier berth parser (AcroForm
first, OCR fallback, AI fallback) because we can grab all metadata in one
pass.
Our current parser uses `pdf-lib`'s low-level text extraction which has known
issues with positionally-rendered text (the OCR fallback fires more often
than necessary). `unpdf.extractText` would reduce that fallback rate.
**Use case 4 — Streaming receipt-attached expense reports:**
**→ Keep `pdfkit` short-term, migrate to `@react-pdf/renderer.renderToStream`
medium-term.**
Why keep:
- `expense-pdf.service.ts` is the only `pdfkit` consumer. Its streaming
pattern (500 receipts at <100MB RSS) is the load-bearing reason for
pdfkit's existence in our deps.
- `@react-pdf/renderer.renderToStream` documented in Context7 supports the
same use case — but verification needs an actual perf test against a
500-receipt fixture before committing.
Migration plan:
- Phase 1 (now): replace pdfme templates with @react-pdf/renderer.
- Phase 2 (after we have @react-pdf/renderer in the codebase): re-test
expense-pdf with `renderToStream` against the 500-receipt fixture. If
memory stays under 200MB, swap pdfkit out. If not, keep pdfkit and
document the constraint.
#### Net result after Phase 1
Remove: `@pdfme/common`, `@pdfme/generator`, `@pdfme/schemas`, 571-line
bridge file.
Keep: `pdf-lib` (AcroForm), `pdfkit` (streaming expenses, pending Phase 2),
`tesseract.js` (OCR).
Add: `@react-pdf/renderer`, `unpdf`.
Net effect: three `@pdfme/*` packages out, two packages in, 571 LOC of
bridge code deleted, and one standard declarative API for all templates.
---
### 35.B — High-value package additions (prioritized)
Each row below has been validated via Context7 unless marked otherwise.
#### Tier 1 — Adopt alongside the planned Zod 4 / Tailwind 4 work
| Package | Replaces / unlocks | Where it lands in our code | Effort |
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------- |
| **`drizzle-zod`** (separate package, peer of `drizzle-orm`) | ~30 hand-maintained validators in `src/lib/validators/` | `createInsertSchema(clients).omit({ id: true, portId: true })` etc. | 2-3h |
| **`@react-pdf/renderer`** | 8 pdfme templates + 571-line tiptap bridge | `src/lib/pdf/templates/*` | 4-6h |
| **`react-email`** + `@react-email/components` | 8 hand-strung HTML templates in `src/lib/email/templates/` | Each becomes a `.tsx` component, rendered via `await render(<…/>)` then handed to nodemailer unchanged | 2-3h (one template at a time) |
| **`@tanstack/react-virtual`** | Pagination on `client-list`, `yacht-list`, `berth-list`, `audit-log-list`, `inbox` | `useVirtualizer({ count, estimateSize })` inside the list shells | 1h per list × 5 lists |
| **`ts-pattern`** | 19-case dispatch in `search.service.ts`, 13-case Documenso webhook, 12-case `client-restore.service.ts`, 10-case `recently-viewed/route.ts`, 10-case `custom-fields/[entityId]/route.ts` | `match(input).with(...).exhaustive()` | 30 min per site; start with the Documenso webhook |
| **`unpdf`** | Hand-rolled text extraction in `berth-pdf-parser.ts` | `extractText(await getDocumentProxy(buf))` | 1h |
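
To make the `@tanstack/react-virtual` row concrete, the sketch below shows the windowing arithmetic a virtualizer performs for fixed-height rows. `visibleRange` and its parameters are hypothetical names for illustration; the real library also handles variable heights, measurement, and scroll events.

```ts
type VisibleRange = { start: number; end: number }; // end is exclusive

function visibleRange(
  count: number,
  itemHeight: number,
  scrollTop: number,
  viewportHeight: number,
  overscan = 2,
): VisibleRange {
  if (count === 0) return { start: 0, end: 0 };
  // First row touching the top edge and first row past the bottom edge.
  const first = Math.floor(scrollTop / itemHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / itemHeight);
  // Overscan renders a few extra rows on each side to hide scroll flicker.
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(count, last + overscan),
  };
}

// 10,000 clients, 48px rows, scrolled 4,800px into a 600px viewport.
const range = visibleRange(10000, 48, 4800, 600);
const renderedRows = range.end - range.start;
```

In this example only 17 of the 10,000 rows are mounted, which is the whole point of virtualizing the long lists.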
#### Tier 2 — Independent adopts (polish + perf)
| Package | What it does for us | Effort |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- |
| **`@formkit/auto-animate`** | One-liner `useAutoAnimate()` ref on any list. Drops into: deal pipeline kanban (pipeline-board.tsx), reminders rail, alerts rail, files list, notes list. Zero CSS. ~2kb. | 5 min per site |
| **`motion`** (formerly framer-motion) | Layout animations for kanban reorder (currently snaps), Vaul drawer enter/exit polish, sheet/drawer slides, `<AnimatePresence>` for inline edits. ~15kb gzip but tree-shakes well. | 1-2h to wire the kanban first |
| **`use-debounce`** | Replaces ad-hoc `setTimeout` debounce in `yacht-picker`, `client-picker`, `audit-log-list`, `send-document-dialog`, `custom-fields-section`, `berth-picker`, `interest-picker`, `dedup-suggestion-panel` (8 sites). Typed `useDebouncedCallback`. ~3kb. | 30 min total |
| **`fast-deep-equal`** | Memo comparator for `DataTable` and React Query `select` functions. Drops re-renders when stable references arrive with new identity. ~1kb. | 20 min |
| **`@upstash/ratelimit`** | Replaces hand-rolled rate limiters in `src/lib/rate-limit.ts`, `api/helpers.ts`, `route-helpers.ts`, `document-sends.service.ts`. Sliding-window / fixed-window / token-bucket algorithms tested at scale. The library is built around the `@upstash/redis` REST client, so confirm it can talk to our existing Redis before committing. | 1-2h |
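
For readers unfamiliar with the sliding-window idea named in the rate-limit row, here is a minimal in-memory sketch. It is the simple "log of timestamps" variant with an injected clock, not `@upstash/ratelimit`'s API or its Redis-backed implementation; `SlidingWindowLimiter` is a hypothetical name.

```ts
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the call is allowed at time `now` (ms) and records it if so. */
  allow(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}

// Two requests per second per key.
const limiter = new SlidingWindowLimiter(2, 1000);
const decisions = [
  limiter.allow('user-1', 0),
  limiter.allow('user-1', 100),
  limiter.allow('user-1', 200), // third hit inside the window: rejected
  limiter.allow('user-1', 1050), // the hit at t=0 has aged out
  limiter.allow('user-2', 200), // keys are independent
];
```

Passing `now` in explicitly keeps the limiter deterministic under test, which is one of the things the four hand-rolled limiters would need regardless of which library replaces them.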
#### Tier 3 — Strategic adopts (bigger commitments)
| Package | What it unlocks | Notes |
| ------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`next-safe-action`** | Type-safe server actions with built-in Zod validation, ownership middleware, `useHookFormAction` hook. Each form drops ~30 LOC of `apiFetch + toastError + mutation-hook` plumbing to ~5. Pairs with `useHookFormAction` which already speaks Zod/RHF. | Migrate gradually — use for new forms first, keep API routes for external callers. Couples with Zod 4 (since safe-action v8+ targets Zod 4 best). |
| **`@axe-core/playwright`** | Accessibility audit during smoke tests. The 33-agent audit flagged WCAG gaps; this catches regressions automatically. | ~30 LOC of test setup. Fails CI on new violations. |
| **`@tiptap/core`** + `@tiptap/react` + extension packs | Real rich-text editor for `notes` (clients/interests/yachts/companies all have polymorphic notes). Currently plain text. Sales reps note things like "call after 4pm UTC, prefers WhatsApp" — bold/italic/links/lists/mentions would help. Tiptap's JSON output format is _already_ in our codebase (the bridge layer), so we'd be storing the same shape we already render. | Decision: keep notes plain or upgrade to rich? If yes, ~3h to wire one entity's notes; the others copy the pattern. |
| **`@next/bundle-analyzer`** | Wraps `next.config.ts`. Generates client + server bundle treemaps after every build. Catches when a tiny PR pulls in recharts on a route that shouldn't have it. The 33-agent audit flagged recharts + pdfme as bundle bloat — this is the tooling to keep that honest. | 15 min setup. Run with `ANALYZE=true pnpm build`. |
| **`@sentry/nextjs`** | Error tracking with frontend + backend correlation, release tracking, source maps, performance traces, replay (optional). We have pino logs but no aggregation/alerting/correlation. Important once we have customer-facing users. | Decision: do we want a SaaS dependency? Self-hosted GlitchTip is also an option (Sentry-protocol-compatible). |
| **`@vercel/og`** (or `satori`) | Generate Open Graph images for shared docs/portal links. Currently the portal has no social previews; if a client shares their EOI link in WhatsApp/Email, the preview is blank. ~10 LOC per route. | 1h for portal share routes. |
| **`papaparse`** | CSV import/export. Sales reps frequently ask for "export to Excel." Plays well with our existing TanStack Table data. ~17kb. | 30 min for client/interest list export. |
| **`@formkit/tempo`** OR **date-fns helpers** | We have **44 files** with hand-rolled `new Date().toLocaleString()` / `.toLocaleDateString()`. Centralize via a `formatDate(date, format, timezone)` helper using `date-fns` (already installed) — no new package needed if we use date-fns's `format`, `formatDistance`, `formatRelative` which we already have. **This is a refactor, not an adoption.** | 2-3h sweep |
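
A rough sketch of the centralized `formatDate(date, format, timezone)` helper from the last row follows. To stay dependency-free it uses the built-in `Intl.DateTimeFormat` as a stand-in for the `date-fns` calls the row proposes, and the format names (`'date'`, `'dateTime'`) are assumptions for illustration only.

```ts
type DateFormat = 'date' | 'dateTime';

function formatDate(date: Date, format: DateFormat, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
  }).formatToParts(date);

  // Look parts up by type so the output does not depend on locale ordering.
  const get = (type: Intl.DateTimeFormatPartTypes): string =>
    parts.find((p) => p.type === type)?.value ?? '';

  const day = `${get('year')}-${get('month')}-${get('day')}`;
  return format === 'date' ? day : `${day} ${get('hour')}:${get('minute')}`;
}

const instant = new Date('2026-05-12T23:30:00Z');
const inUtc = formatDate(instant, 'dateTime', 'UTC');
const inTokyo = formatDate(instant, 'date', 'Asia/Tokyo');
```

The same instant lands on different calendar days in UTC and Tokyo, which is exactly the class of bug that scattered `.toLocaleDateString()` calls tend to hide.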
#### Tier 4 — Defer or skip
| Package | Reason |
| ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `next-pwa` / `@serwist/next` | PWA assets pending (per MEMORY.md). When that lands, **`@serwist/next`** is the modern choice (next-pwa is unmaintained). For now, skip. |
| `next-intl` / `i18next` / `@lingui/core` | No i18n target today. When we localize, **`next-intl`** is the strongest Next.js App Router integration. For now, skip. |
| `@knocklabs/node` + `@knocklabs/react` | Notification center + channel routing + preferences UI. Likely overkill — we have a simple in-app + email notification system that works. Revisit if we add SMS or push. |
| `inngest` / `trigger.dev` | Background jobs with observability. We use BullMQ; revisit only if we need step functions / cross-service workflows. |
| `posthog-js` | Product analytics + feature flags + session recording. We have Umami for web analytics; PostHog adds product-level tracking. Decision pending. |
| `@growthbook/growthbook` | Feature flags only. We don't have any flagged features today. |
| `fuse.js` / `minisearch` | Client-side fuzzy search. Useful for already-loaded list filtering, but TanStack Table's built-in filter is usually enough. |
| `@uppy/core` + `@uppy/dashboard` | Rich file upload UI with resume, chunking. We have basic file inputs (0 patterns found in audit grep) — not currently a pain point. |
| `@tanstack/react-form` | TanStack's form library, an alternative to react-hook-form (not a successor from the RHF team). RHF is mature, well-known, and we have 8 forms on it. No compelling migration. |
| `valibot` / `arktype` | Faster zod alternatives. We're committed to Zod 4. |
| `react-hotkeys-hook` | **Excluded per user direction.** |
---
### 35.C — Deprecation / cleanup candidates
| Package | Reason | Action |
| ------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------- |
| `@radix-ui/react-icons` | We use `lucide-react` everywhere. Audit grep shows no imports from `@radix-ui/react-icons`. | Drop after grep-confirm. ~30s. |
| `@pdfme/common` + `@pdfme/generator` + `@pdfme/schemas` | Replaced by `@react-pdf/renderer` in Phase 1. | After PDF migration. |
| `tailwindcss-animate` v1.0.7 | Last published 2024, no v4 support. Replace with **`tw-animate-css`** (the v4-native successor shadcn now recommends). | Required if we move to Tailwind 4. |
| `@types/pdfkit` | Tops at v7.0.0. We're on `pdfkit` v0.18 — types are loose but functional. Keep until we migrate expense-pdf to @react-pdf/renderer. | Defer. |
| `pino-pretty` in `dependencies` | Should be `devDependencies` only — ships ~500kb to prod worker images if it leaks into the runtime path. Audit-verify the build doesn't include it; move if it does. | 5 min check. |
---
### 35.D — Surfaced refactor opportunities (no new package required)
These came up while sweeping for package gaps. They're refactor wins, not
package adoptions.
| Opportunity | Concrete sites | Tool |
| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| Centralize date formatting | 44 files with hand-rolled `.toLocaleString()` / `.toLocaleDateString()` | `formatDate(date, format, timezone)` helper using existing `date-fns` |
| Centralize debounce | 8 picker/list components | `use-debounce` (or hand-rolled hook) |
| Centralize rate-limiting | 4 hand-rolled limiters | `@upstash/ratelimit` |
| Replace 5-9 large switch statements with exhaustive matchers | `search.service.ts` (19 cases), Documenso webhook (13), `client-restore.service.ts` (12), `recently-viewed/route.ts` (10), `custom-fields/[entityId]/route.ts` (10) | `ts-pattern` |
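
The value of an exhaustive matcher is the compile-time guarantee that every case is handled. The sketch below shows that guarantee with a plain `never` check, which is what `ts-pattern`'s `.exhaustive()` packages into an expression-level API. The event names are hypothetical and are not the actual Documenso webhook payloads.

```ts
// Hypothetical event union; the real webhook dispatch has 13 cases.
type WebhookEvent =
  | { type: 'document.sent'; documentId: string }
  | { type: 'document.signed'; documentId: string; signerEmail: string }
  | { type: 'document.completed'; documentId: string };

function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${JSON.stringify(value)}`);
}

function describeEvent(event: WebhookEvent): string {
  switch (event.type) {
    case 'document.sent':
      return `sent:${event.documentId}`;
    case 'document.signed':
      return `signed:${event.documentId}:${event.signerEmail}`;
    case 'document.completed':
      return `completed:${event.documentId}`;
    default:
      // Adding a member to WebhookEvent without a case makes this a compile error.
      return assertNever(event);
  }
}

const summary = describeEvent({
  type: 'document.signed',
  documentId: 'doc_1',
  signerEmail: 'a@example.com',
});
```

Either approach removes the silent fall-through that large `switch` statements invite; `ts-pattern` adds nested pattern matching on top.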
---
### 35.E — Final adoption order (revised, incorporating section 35)
This supersedes section 34's sequencing where they overlap.
1. **Now (one focused day)** — Zod 4 + `@hookform/resolvers` 5 + **`drizzle-zod`**. One PR. Codemod-friendly. Highest correctness payoff.
2. **Independent (any time)** — **`react-email`** migration of one template (`portal-auth.ts` recommended first), then expand. Independent of any version upgrade.
3. **Independent (any time)** — **`@react-pdf/renderer`** + **`unpdf`**. Replace 8 pdfme templates, delete 571-LOC bridge, add unpdf to berth parser.
4. **Independent (any time)** — **`ts-pattern`** in the Documenso webhook switch first (the audit's bug-class poster child), then sweep the other 4 sites.
5. **Independent (any time)** — **`@tanstack/react-virtual`** on `client-list` first, copy pattern to 4 other lists.
6. **Independent (any time)** — **`@formkit/auto-animate`** sprinkle. 5-minute wins per site.
7. **Independent (any time)** — **`@next/bundle-analyzer`** install. 15-min setup; ongoing bundle hygiene.
8. **Next focused half-day** — **`motion`** wire to the kanban for smooth reorder.
9. **2-4 weeks** — Next 15 → 16 + eslint-config-next 16 + eslint 10 (lockstep, codemod).
10. **Focused afternoon** — Tailwind 4 via official upgrade tool + swap `tailwindcss-animate` for `tw-animate-css`.
11. **When we have a new form to build** — pilot **`next-safe-action`** there; backfill existing forms gradually.
12. **Decision required first** — `@sentry/nextjs` (SaaS dep), `@tiptap/*` (rich notes Y/N?), `posthog-js` (analytics scope), `papaparse` (CSV export priority).
---
### 35.F — Skipped per user direction
- **`react-hotkeys-hook`** — no keyboard-shortcut UX target across the platform.
---
## 36. Second-pass package sweep — mobile, fluidity, data speed, DX
Section 35 covered the headline adoption candidates. This section is the
deliberate second sweep the user requested — looking specifically for
libraries we may have missed across four dimensions: **current
functionality gaps**, **optimization (mobile included)**, **UI fluidity**,
and **data retrieval/writing speed**.
Findings are grouped by dimension. Each entry says (a) what we have now,
(b) what the library adds, (c) where in our codebase it'd land, (d) effort.
---
### 36.A — Data speed & concurrency
#### 36.A.1 `p-queue` + `p-limit` + `p-retry` (Sindre Sorhus suite)
**Concrete pain:** 74 `Promise.all(...)` sites in services/routes. 8
mass-operation services (`expense-pdf`, `berth-pdf`, `brochures`, `backup`,
`document-templates`, `email-compose`, `documents`, `email-threads`).
Naive `Promise.all([...mapped])` will:
- Fire all 500 expense receipts to S3 simultaneously → MinIO connection
pool exhaustion + memory spike (`expense-pdf.service.ts` docs explicitly
call this out as a past problem).
- Fire all bulk-send-document calls at Documenso simultaneously → hit
Documenso's per-second rate limit, cause cascade failures.
- Fire all email-compose attachments at SMTP simultaneously → SMTP
connection limit on Mailgun/SES drops requests silently.
**`p-limit`** caps concurrency: `pLimit(5)` runs at most 5 at a time.
**`p-queue`** is `p-limit` + interval rate limiting + pause/resume.
**`p-retry`** handles exponential backoff retries for transient failures.
**Land sites:**
- `expense-pdf.service.ts` — already has streaming logic, but the
per-receipt S3 `get` calls are unbounded.
- `email-compose.service.ts` — bulk send-out is the obvious one.
- `backup.service.ts` — GDPR export streaming.
- `documents.service.ts` — multi-file folder operations.
**Effort:** 30 min per service. ~1.5kb each.
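
To show what capping concurrency means in practice, the sketch below implements a tiny limiter and runs five fake receipt fetches through it two at a time. `createLimiter` and `fakeReceiptFetch` are hypothetical stand-ins written for this example; they are not `p-limit`'s source, and in real code `pLimit(5)` would replace `createLimiter`.

```ts
type Task<T> = () => Promise<T>;

function createLimiter(concurrency: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) next(); // the slot passes straight to the next waiter
    else active--;
  };

  return async function run<T>(task: Task<T>): Promise<T> {
    if (active < concurrency) {
      active++;
    } else {
      // Wait for a finished task to hand over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

let inFlight = 0;
let maxInFlight = 0;

async function fakeReceiptFetch(id: number): Promise<number> {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await Promise.resolve(); // stand-in for the S3 round trip
  inFlight--;
  return id;
}

const limit = createLimiter(2);
const done: Promise<number[]> = Promise.all(
  [1, 2, 3, 4, 5].map((id) => limit(() => fakeReceiptFetch(id))),
);
```

The call sites keep their `Promise.all` shape; only the mapped function is wrapped, which is why the per-service effort is small.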
#### 36.A.2 `@tanstack/query-broadcast-client-experimental`
**Concrete pain:** A rep has the CRM open in two tabs. They update a
client in tab A — tab B's stale cache continues showing old values until
the next refetch.
**What it adds:** BroadcastChannel sync between tabs. Free cross-tab cache
coherence with no server roundtrips.
**Land site:** One line in `src/providers/query-provider.tsx`:
```ts
import { broadcastQueryClient } from '@tanstack/query-broadcast-client-experimental';
broadcastQueryClient({ queryClient, broadcastChannel: 'pn-crm' });
```
**Effort:** 5 minutes. ~2kb.
#### 36.A.3 Underused Drizzle ORM features (no new package)
We have `drizzle-orm` 0.45.2 and use ~60% of its capabilities.
- **`db.batch(...)`** is not an option for us today: Drizzle's batch API
covers the LibSQL, Neon HTTP, and D1 drivers, not `postgres.js`. Keep the
explicit `db.transaction(async (tx) => {...})` blocks for atomic
multi-statement work.
- **Prepared statements** via `.prepare()` — repeated queries (e.g.,
`getClient(id)` called per-request) can be prepared once at boot and
reused. Postgres saves the parse+plan cost.
- **`with` (CTE) clauses** — we have 30+ places where we'd benefit from
`WITH active_interests AS (...) SELECT ...` instead of joining the same
subquery twice. Audit found N+1 patterns; CTEs flatten them.
**Land sites:** the recommender SQL aggregate (already uses CTEs),
`dashboard.service.ts` analytics queries, `search.service.ts` graph
expansion. These are all "we already wrote raw SQL strings; rewriting as
typed Drizzle CTEs" wins.
**Effort:** opportunistic. No package change.
#### 36.A.4 `postgres.js` cursor for large reads
We have `postgres` ^3.4.9. Its ``await sql`...`.cursor(rows => ...)`` streams
large result sets in batches without buffering all rows. Currently the
GDPR-export bundling and the backup `dump-tables` paths buffer everything in
memory.
**Land sites:** `backup.service.ts`, `gdpr-export.service.ts` (when we
build it — currently parked).
**Effort:** opportunistic refactor when we touch those services.
---
### 36.B — UI fluidity & animation
#### 36.B.1 `@use-gesture/react` (mobile gestures)
**Concrete pain:** mobile users can't swipe-to-dismiss the Vaul drawer,
swipe sideways between kanban columns, or pinch-zoom berth photos. The
audit's mobile pass flagged these.
**What it adds:** declarative gesture handlers (`useDrag`, `usePinch`,
`useScroll`). Composes with `motion` for spring-physics responses.
**Land sites:**
- Pipeline kanban: swipe between stage columns on mobile.
- Vaul drawer: swipe-down to dismiss (Vaul already does this, but adding
custom velocity thresholds via `@use-gesture` polishes the feel).
- Berth/yacht photo galleries: pinch-zoom.
**Effort:** 1h to wire one site as the template. ~5kb.
#### 36.B.2 `embla-carousel-react`
**Concrete pain:** berth photos and yacht photos render as static grids
(per the audit). On mobile, users want to swipe through them.
**What it adds:** lightweight, touch-native, accessibility-compliant
carousel. Plays with framer-motion if we want fancy transitions.
shadcn/ui has a `Carousel` component built on this — drop-in via the
shadcn CLI.
**Effort:** `npx shadcn@latest add carousel`, then 10 lines to render the
photo array. ~10kb gzip.
#### 36.B.3 `yet-another-react-lightbox`
**Concrete pain:** clicking a berth photo currently navigates to a fullscreen
image route or doesn't expand at all. Sales reps want lightbox-style preview.
**What it adds:** fullscreen lightbox with keyboard nav, zoom, swipe, slideshow,
captions. Plugin system for video/PDF embed if we extend.
**Land sites:** berth/yacht detail pages, client docs preview.
**Effort:** 1h. ~15kb gzip with plugins.
#### 36.B.4 `react-resizable-panels`
**Concrete pain:** the docs hub has a fixed-width folder sidebar (per
CLAUDE.md's documents-hub rewrite). Power users on wide monitors want
to drag-resize it.
**What it adds:** keyboard-accessible resizable split panes with
persistent sizing (localStorage). shadcn/ui has a `Resizable` component
built on this.
**Land sites:** docs hub (sidebar | content), email inbox (folder | thread),
admin settings (nav | section).
**Effort:** `npx shadcn@latest add resizable`, drop in. ~5kb.
---
### 36.C — Mobile optimization
#### 36.C.1 `browser-image-compression`
**Concrete pain:** the expense-scanner (`scan-shell.tsx`) and receipt
upload paths accept full-resolution phone photos (typically 4-12 MB each).
Mobile users on cellular pay bandwidth + battery for sending 4× more
data than necessary. The server then re-runs `sharp` to resize anyway.
**What it adds:** client-side image compression in WebWorker before
upload. Targets `maxSizeMB`, `maxWidthOrHeight`, `useWebWorker`. The
server still validates magic-bytes + sharp-resizes, but receives a
500KB-resized JPG instead of a 12MB original.
**Concrete win:** a rep on 3G uploading a receipt: ~30s wait → ~5s wait.
Server CPU on the `sharp` resize drops too, since the input is already close
to the target size.
**Effort:** 30 min to wire `scan-shell.tsx`. ~25kb gzip (worker-bundled so
zero main-thread cost).
#### 36.C.2 `partysocket`
**Concrete pain:** mobile users on flaky networks frequently lose the
Socket.IO connection. Our current client uses Socket.IO's built-in
reconnect, which is good but not great for mobile.
**What it adds:** drop-in WebSocket wrapper with:
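- Exponential backoff with jitter (Socket.IO's built-in reconnect already
backs off exponentially with a randomization factor, so this is parity).
- Message queue while disconnected (Socket.IO also buffers emits by default;
the `volatile` flag opts out).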
- Auto-reconnect on `online` event + `visibilitychange` (page wake).
- Optional auto-detect connection quality (slow vs fast).
**Land site:** `src/providers/socket-provider.tsx`.
**Effort:** depends — `partysocket` works with raw WS, not Socket.IO's
protocol. For Socket.IO we'd need `socket.io-client` + manual reconnect
tuning, or migrate the realtime layer to plain WebSockets (significant).
**Park as a "mobile flake" investigation, not an immediate adoption.**
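
For the reconnect-tuning investigation, the delay schedule being discussed can be sketched as below. This is a generic "exponential backoff with full jitter" calculation with an injectable random source; it is neither `partysocket`'s nor Socket.IO's implementation, and `reconnectDelayMs` and its defaults are assumptions.

```ts
function reconnectDelayMs(
  attempt: number, // 0 for the first retry
  baseMs = 1000,
  maxMs = 30000,
  random: () => number = Math.random,
): number {
  // The ceiling doubles per attempt until it hits the cap.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling) so clients do not reconnect in lockstep.
  return Math.floor(random() * ceiling);
}

// With a fixed random value of 0.5 the schedule is easy to read.
const schedule = [0, 1, 2, 3, 10].map((n) => reconnectDelayMs(n, 1000, 30000, () => 0.5));
```

Comparing a schedule like this against the delays actually observed from mobile clients would be a cheap first step before touching the realtime layer.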
#### 36.C.3 `react-virtuoso` (alternative to TanStack Virtual)
**Concrete pain:** the inbox (`src/components/layout/inbox.tsx`) uses a
plain `<ScrollArea className="max-h-[400px]">` with no virtualization.
For users with hundreds of unread items, mobile scrolling chugs.
**What it adds:** specialized virtualization for chat-like / inbox-like
UIs with variable-height items and "scroll to bottom on new message"
semantics. **TanStack Virtual is more headless / generic; Virtuoso is
opinionated and better for inbox-shaped UIs.**
**Land site:** `inbox.tsx` specifically. For the regular lists
(client/yacht/berth), TanStack Virtual is still the right call (section
35.B, Tier 1).
**Effort:** 45 min. ~10kb.
#### 36.C.4 `@formkit/auto-animate` (revisit for mobile)
Already in section 35.B but worth re-emphasising: on mobile, list items
appearing/disappearing without animation feels janky. Free polish.
---
### 36.D — Input quality & forms
#### 36.D.1 `react-imask` or `react-number-format`
**Concrete pain:** we have currency inputs, phone inputs, date inputs
spread across berth-form, expense-form, invoice-form, client-form. The
audit flagged inconsistent formatting (decimals, thousand-separators,
phone-prefix handling).
**What it adds:** declarative input masks, e.g.
`<IMaskInput mask={Number} scale={2} thousandsSeparator="," />`. Plays
cleanly with react-hook-form.
`react-number-format` is the lighter-weight, currency-specific option.
`react-imask` covers more patterns (phone, date, custom).
**Land sites:** ~6 form components.
**Effort:** 30 min per form × 6 = 3h. **OR** keep our hand-rolled
formatters and don't add the dep. Decision pending.
#### 36.D.2 `@hookform/devtools` (dev-only)
**What it adds:** a floating panel in the browser showing react-hook-form
state in real time (values, errors, isDirty, isValid, touched fields).
Massive debug-time win.
**Land site:** wrap forms in `<DevTool control={form.control} />` in dev
builds only.
**Effort:** 15 min. dev-only, ships zero to prod.
---
### 36.E — Security & sanitization
#### 36.E.1 `isomorphic-dompurify`
**Concrete pain:** `src/lib/utils/markdown-email.ts` hand-rolls HTML
escape + safe-link rendering for email bodies. The audit raised XSS
concerns (CRIT-2 in section 4) about admin-supplied content in templates
and email bodies. Our hand-rolled `escapeHtml` is correct for the basic
cases, but DOMPurify handles edge cases the audit listed (data URLs,
nested encoding, javascript: in href attrs).
**What it adds:** battle-tested HTML sanitizer used by Google, Microsoft,
GitHub. Works in Node + browser (the `isomorphic-` prefix is the
SSR-compatible wrapper around the regular `dompurify`).
**Land sites:**
- `renderEmailBody()` in `markdown-email.ts`.
- Anywhere we render user-supplied HTML (template preview, document
body display).
**Effort:** 1h migration + audit. ~25kb (Node) / ~50kb (browser),
acceptable.
#### 36.E.2 `@noble/hashes` (already covered by `better-auth`)
We use `better-auth` for password hashing. No need to add.
#### 36.E.3 WebAuthn / Passkeys (`@simplewebauthn/server` + `/browser`)
**What it adds:** passwordless authentication via device passkeys (Touch
ID, Windows Hello, YubiKey). Better Auth has a WebAuthn plugin that
wraps these.
**Decision required:** is passwordless a 2026 roadmap item?
---
### 36.F — Observability & perf measurement
#### 36.F.1 `web-vitals`
**Concrete pain:** we have no real-user perf data. We don't know our
P75 LCP, P75 INP, or P75 CLS across our user base. Any future perf
optimization (Cache Components, Tailwind 4, dynamic imports) is shooting
in the dark without baseline measurement.
**What it adds:** Google's official Core Web Vitals library. Ships
`onLCP`, `onINP`, `onCLS`, `onFCP`, `onTTFB` callbacks. Reports values
once per page lifecycle.
**Land site:** `src/app/(dashboard)/layout.tsx` — wire a listener that
POSTs vitals to `/api/v1/internal/vitals` (new endpoint, append to
existing `client_metrics` table or similar). 30 LOC end-to-end.
**Effort:** 1h including backend logging. ~2kb. **High value** because
without this we're guessing about perf wins.
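
Once vitals are being collected, the P75 figures mentioned above are a small aggregation. The sketch below uses the nearest-rank percentile on an in-memory array; `VitalSample` is a simplified, hypothetical shape rather than the `web-vitals` metric type, and in production the same calculation would more likely be a SQL aggregate.

```ts
type VitalSample = { name: 'LCP' | 'INP' | 'CLS'; value: number };

function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)]!;
}

function p75(samples: VitalSample[], name: VitalSample['name']): number {
  const values = samples.filter((s) => s.name === name).map((s) => s.value);
  return percentile(values, 75);
}

const samples: VitalSample[] = [
  { name: 'LCP', value: 1800 },
  { name: 'LCP', value: 2400 },
  { name: 'LCP', value: 3100 },
  { name: 'LCP', value: 5200 },
  { name: 'INP', value: 120 },
];

const lcpP75 = p75(samples, 'LCP');
const inpP75 = p75(samples, 'INP');
```

With four LCP samples the 75th percentile is the third-smallest value, so one slow outlier does not dominate the number the way a mean would.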
#### 36.F.2 `pino-http`
**Concrete pain:** we have request logging via custom middleware. `pino-http`
is the canonical pino HTTP request logger with automatic request-id
propagation, response time, status code, and integration with our pino
logger. Likely already partially implemented via our hand-rolled
middleware.
**Effort:** check existing middleware first — may already cover this.
#### 36.F.3 `@sentry/nextjs` (revisit from section 35)
Covered in 35.B Tier 3. Adoption gated on the SaaS-dep decision.
---
### 36.G — TypeScript ergonomics
#### 36.G.1 `@total-typescript/ts-reset`
**Concrete pain:** TypeScript's stdlib types have well-known foot-guns:
- `Array.isArray(x)` narrows to `any[]` (drops the actual type).
- `JSON.parse(s)` returns `any` (defeats type safety entirely).
- `fetch().json()` returns `Promise<any>`.
- `.filter(Boolean)` doesn't remove `null | undefined` from the type.
- `Array.prototype.includes` is too strict on its argument.
ts-reset is **a single `.d.ts` import** (`import '@total-typescript/ts-reset'`)
that fixes all of these globally. Used by Anthropic, Stripe, Vercel internally.
**Concrete impact:** likely catches 10-20 latent bugs across our 1000+
TS files where someone called `JSON.parse(body)` and continued treating
the result as a typed object without parsing through Zod.
**Effort:** 1 line in `src/types/globals.d.ts`. **dev-time only**, ships
zero runtime.
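
To make the foot-guns concrete, here is a sketch of the manual discipline that ts-reset turns into the default. It does not import ts-reset; the explicit `unknown` annotation and the type predicate are what the reset would force on us. The `Berth` shape and both function names are hypothetical, and the hand-written guard stands in for a Zod schema.

```ts
// Hypothetical domain type, for illustration only.
type Berth = { id: string; lengthM: number };

// Runtime check standing in for a Zod schema.
function isBerth(value: unknown): value is Berth {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["id"] === "string" && typeof record["lengthM"] === "number";
}

function parseBerth(body: string): Berth {
  // Stock TypeScript types JSON.parse as `any`, so skipping the guard would
  // still compile. Annotating `unknown` by hand mimics what ts-reset enforces.
  const parsed: unknown = JSON.parse(body);
  if (!isBerth(parsed)) {
    throw new Error("body is not a Berth");
  }
  return parsed;
}

// Stock `.filter(Boolean)` leaves `null | undefined` in the element type.
// The explicit type predicate below is the manual equivalent of the reset.
const rawNames: Array<string | null | undefined> = ["Aurora", null, "Mistral", undefined];
const presentNames: string[] = rawNames.filter((n): n is string => n != null);
```

With the reset in place, call sites that skip the guard stop compiling, which is where the latent bugs mentioned above would surface.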
#### 36.G.2 `type-fest`
**What it adds:** ~150 utility types (`SetRequired`, `SetOptional`,
`PartialDeep`, `MergeDeep`, `Promisable`, `Jsonifiable`, etc.) that
extend TypeScript's built-ins.
**Land sites:** anywhere we're hand-rolling `Omit<X, Y> & Pick<Z, W>`
gymnastics — type-fest usually has a named util that's clearer.
**Effort:** opportunistic. ~0kb runtime (types only).
#### 36.G.3 `tsc-files`
**Concrete pain:** pre-commit hook runs ESLint on staged files (fast) but
no type-check. Type errors slip through to CI.
**What it adds:** typecheck _only the staged TS files and their
dependencies_, not the full repo. Drops a pre-commit hook from "skip
because too slow" to "always on, sub-2-second."
**Land site:** `.husky/pre-commit` + `lint-staged.config.mjs`, with an entry
such as `"*.ts": ["tsc-files --noEmit"]` (widen the glob to `*.{ts,tsx}` if
components should be covered too).
**Effort:** 15 min.
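
One wrinkle worth checking before adoption: as far as we understand it, `tsc-files` type-checks only the files it is handed, so ambient declarations (such as the `globals.d.ts` proposed for ts-reset) may need to be passed explicitly. The sketch below uses lint-staged's function form, which receives the staged paths and returns the command string. `typecheckStaged` and `AMBIENT_DECLARATIONS` are hypothetical names, and the behaviour should be verified against the installed versions.

```ts
// Assumed location of ambient declarations, taken from the ts-reset item above.
const AMBIENT_DECLARATIONS = ["src/types/globals.d.ts"];

// lint-staged function form: gets the staged file paths, returns the command.
// In this form lint-staged does not append the file names itself.
function typecheckStaged(stagedFiles: string[]): string {
  return ["tsc-files", "--noEmit", ...AMBIENT_DECLARATIONS, ...stagedFiles].join(" ");
}

// In lint-staged.config.mjs this object would be the default export
// (with the type annotations removed, since that file is plain JS).
const lintStagedConfig = {
  "*.{ts,tsx}": typecheckStaged,
};
```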
---
### 36.H — In-browser PDF viewing
#### 36.H.1 `pdfjs-dist` + a viewer wrapper
**Concrete pain:** the docs hub (per CLAUDE.md) lets users upload and
file PDFs. There's currently no in-app preview — clicking a file likely
downloads it or opens in a new tab. A real CRM should preview the PDF
inline.
**What it adds:**
- **`pdfjs-dist`** is Mozilla's pdf.js — the engine.
- **`@react-pdf-viewer/core`** is the most feature-rich React wrapper
(zoom, search, annotations).
- Alternatively, **`react-pdf`** (Wojtek Maj's, not @react-pdf/renderer)
is a lighter wrapper.
**Land site:** docs hub file detail / preview pane. EOI signing preview
in admin.
**Effort:** 2-3h for a polished viewer with zoom + page nav. ~150kb gzip
(pdf.js is unavoidable; lazy-load only when preview opens).
**Note vs section 35.A:** `@react-pdf/renderer` (generator) and `pdfjs-dist`
(viewer) are complementary. We need both: one to _make_ PDFs, one to
_show_ them.
---
### 36.I — Testing & development data
#### 36.I.1 `@faker-js/faker`
**Concrete pain:** seed data is currently hand-maintained (mostly).
Faker would replace hand-rolled fake names, emails, addresses, phone
numbers, vehicle/yacht names, dates, marina locations with reproducible,
locale-aware fakes.
**Land site:** `src/lib/db/seed.ts`, `src/lib/db/seed-synthetic.ts`.
**Effort:** 1-2h. ~3MB gzip — **dev-only**, not shipped.
#### 36.I.2 `msw` (Mock Service Worker)
**Concrete pain:** integration tests that hit external services
(Documenso, SMTP, IMAP) either skip in CI or fail intermittently.
**`msw`** intercepts fetch/HTTP at the network layer in tests so we can
mock external responses deterministically. It only sees HTTP(S) traffic,
so it covers the Documenso API; SMTP and IMAP are not HTTP and need a
separate stub.
**Land site:** `tests/integration/` setup: register MSW handlers for the
Documenso endpoints.
**Effort:** 2-3h. dev-only.
---
### 36.J — Workflow & state machines
#### 36.J.1 `@xstate/react`
Audit found only one multi-step flow (`send-document-dialog.tsx`).
EOI signing has steps but they're sequential, not state-machine-y. The
GDPR export job is a backend state machine but `bullmq` handles it.
**Verdict:** **not warranted right now.** Revisit if we build the
client-onboarding flow or the multi-step EOI-with-multi-berth-and-
payment-and-signing wizard the roadmap mentions.
---
### 36.K — Search & filtering
#### 36.K.1 Postgres-native FTS (no new package — schema migration)
**Concrete pain:** `search.service.ts` uses `LIKE '%term%'` on client/yacht/
company tables. Slow at scale; doesn't rank.
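**What we could add:** Postgres `tsvector` columns + `GIN` indexes + a
single `to_tsquery()` call per search (or `websearch_to_tsquery()` for raw
user input, since `to_tsquery()` raises a syntax error on text that is not
valid query syntax). This is **all native Postgres**, with no new npm dep.
Drizzle supports it via `` sql`...` `` template literals.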
**Effort:** migration (30 min) + service refactor (2h) + e2e re-run.
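
A rough sketch of what the migration and the query could look like, kept as plain SQL strings so it does not presume how we would express it in Drizzle. Table and column names (`clients.full_name`, `clients.email`) are hypothetical, `buildClientSearch` is an illustrative helper, and the generated column assumes Postgres 12 or later. The query uses `websearch_to_tsquery` rather than `to_tsquery` because it accepts free-form user input.

```ts
// Hypothetical schema names; adjust to the real tables before use.
const MIGRATION_SQL = `
ALTER TABLE clients
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    to_tsvector('simple', coalesce(full_name, '') || ' ' || coalesce(email, ''))
  ) STORED;

CREATE INDEX clients_search_vector_idx ON clients USING GIN (search_vector);
`;

// $1 is the raw search term, $2 the row limit. Results are ranked by ts_rank.
const SEARCH_SQL = `
SELECT id, full_name, ts_rank(search_vector, query) AS rank
FROM clients, websearch_to_tsquery('simple', $1) AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT $2;
`;

type ParameterizedQuery = { text: string; values: [string, number] };

// Builds a parameterized query object; returns null for blank input so the
// caller can skip the round trip entirely.
function buildClientSearch(term: string, limit = 20): ParameterizedQuery | null {
  const trimmed = term.trim();
  if (trimmed.length === 0) return null;
  return { text: SEARCH_SQL, values: [trimmed, limit] };
}
```

The `'simple'` configuration skips stemming and stop words, which tends to suit names and emails; a language configuration such as `'english'` would be the alternative for free-text notes.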
#### 36.K.2 External search engines (`meilisearch`, `typesense`)
**Verdict:** overkill until we're past 100k clients per port. Postgres
FTS will hold for years. **Defer indefinitely.**
---
### 36.L — Final updated adoption order (incorporating section 36)
Layered on section 35.E:
**Same-day adopts (low-risk, high-leverage):**
- **`@total-typescript/ts-reset`** — 1-line type-safety upgrade. Do this
before any Zod 4 work — it'll catch latent bugs along the way.
- **`web-vitals`** — establish perf baseline before any optimization.
- **`@hookform/devtools`** — dev-only DX win.
**Adopt alongside section 35.B Tier 1:**
- **`p-limit`** — pair with the section 35 mass-operation refactors. The
Documenso bulk-send path is the highest-priority site.
- **`@tanstack/query-broadcast-client-experimental`** — 1-liner in the
query provider.
**Adopt with mobile/UX work:**
- **`browser-image-compression`** — wire into scan-shell first.
- **`embla-carousel-react`** + **`yet-another-react-lightbox`** — pair
with berth/yacht photo gallery work.
- **`react-resizable-panels`** — pair with docs hub UX work.
- **`@use-gesture/react`** — pair with kanban-on-mobile polish.
**Adopt with security pass:**
- **`isomorphic-dompurify`** — replaces hand-rolled escapeHtml. Pair
with the audit's XSS hardening pass.
**Adopt with the docs hub Phase 2:**
- **`pdfjs-dist`** + viewer wrapper — when in-app PDF preview becomes a
user request.
**Park / defer:**
- `partysocket` (requires Socket.IO investigation first).
- `@xstate/react` (no current target).
- External search engines.
- WebAuthn / passkeys (roadmap decision).
---
### 36.M — Final summary
The first sweep (section 35) found the headline replacements:
**Zod 4 + drizzle-zod + react-email + @react-pdf/renderer** is the
single highest-leverage week of work.
This second sweep (section 36) found the **operational hardening
layer**:
- **`p-limit` family** for the 74 unbounded `Promise.all` sites.
- **`@total-typescript/ts-reset`** for free type safety across 1000+ files.
- **`web-vitals`** to establish a perf baseline before we optimize.
- **`isomorphic-dompurify`** to harden the email/template rendering.
- **`browser-image-compression`** for mobile bandwidth / battery.
- **`@tanstack/query-broadcast-client-experimental`** for free cross-tab
cache sync.
- **`react-resizable-panels`** + **`embla-carousel-react`** +
**`yet-another-react-lightbox`** for the photo/preview surfaces.
Together with section 35, this gives us a concrete shopping list of
~20 packages with explicit land-sites in our code and effort estimates,
plus 5-6 cleanup-candidate removals. Adopting all of them would shed
~600 LOC of hand-rolled code, eliminate ~5 categories of latent bugs
(timezone, XSS, race conditions, type stdlib quirks, missing
exhaustiveness), and meaningfully improve mobile UX + perf measurability.
---
**Bottom line:** the deps audit (section 34) showed we're secure today.
This section (35) shows where we can make the codebase _meaningfully better_
— smaller, cleaner, more declarative — by leveraging packages we don't yet
use. The single highest-leverage move is **Zod 4 + drizzle-zod + react-email
in the same focused day**: it kills the validator-drift problem, lands the
14× parse-perf win, and starts paying down the hand-strung-email-templates
debt all at once. The PDF stack overhaul (35.A) is the second-highest-leverage
move: removing pdfme + the 571-line Tiptap bridge in favor of declarative
React components is a category-of-bug eliminator, not just a refactor.