Four low-risk adds before the Zod 4 / drizzle-zod headliner:

- @total-typescript/ts-reset: tightens TS stdlib types globally (JSON.parse → unknown, fetch().json() → unknown, .filter(Boolean) narrows, Set literals respect typed Set targets). Caught 179 latent type errors; fixed all production sites (8 files) and added `any` cast escape hatch in test files (ESLint exemption scoped to tests/).
- web-vitals + /api/v1/internal/vitals endpoint + WebVitalsReporter client component: establishes Core Web Vitals baseline (LCP/INP/CLS/FCP/TTFB) via navigator.sendBeacon. Required before optimisation work.
- @hookform/devtools + FormDevtool wrapper: dev-only RHF state inspector, lazy-loaded via next/dynamic so the chunk is excluded from prod bundles entirely.
- @tanstack/query-broadcast-client-experimental: cross-tab cache sync via BroadcastChannel, wired in query-provider.tsx, 1-liner.

Audit doc updated with sections 35 + 36 (PDF stack overhaul + comprehensive second-pass package sweep) covering ~20 package adoption candidates and 4-5 deprecation candidates.

Verified: tsc clean, vitest 1293/1293 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Port Nimara CRM — Comprehensive Platform Audit
Generated: 2026-05-12 (session run)
Branch: feat/documents-folders
Method: 19 parallel audit agents on Claude Opus 4.7, read-only static analysis. Each agent owned a single domain and wrote a CRITICAL/HIGH/MEDIUM-grouped report. This document consolidates the reports and overlays the fixes already shipped during the session.
How to read this document
- Executive summary lists every CRITICAL finding (must address before production), per domain.
- Already fixed in this session is a manifest of the changes I shipped while the audit was running. Don't re-fix these.
- Cross-cutting priority queue is the top ~15 highest-impact findings across the entire codebase, ordered. Tackle these first.
- Per-domain reports below contain the full text of every agent's report verbatim — useful when you sit down to actually fix a specific area.
- Methodology + agent roster appendix at the bottom lists who looked at what.
Severity is the auditor's judgment, not mine — I have not re-graded findings. Treat anything tagged CRITICAL as a real block on shipping.
Executive summary
CRITICAL findings (must address)
| # | Domain | File | Issue | Status |
|---|---|---|---|---|
| 1 | Security | src/app/api/v1/admin/users/[id]/permission-overrides/route.ts | Admins could grant themselves every permission leaf via self-target | FIXED this session |
| 2 | Security | src/app/api/auth/resolve-identifier/route.ts | Username enumeration via hit/miss response shape + no rate limit | FIXED this session |
| 3 | Services | src/lib/services/users.service.ts (admin email-change) | account.accountId not updated → user can't sign in with either old or new email after admin rotation; sessions also not revoked | FIXED this session |
| 4 | Observability | src/lib/services/search-nav-catalog.ts | 10 NAV_CATALOG entries pointed at routes that don't exist (/admin/audit-log, /admin/error-events, /user-settings, 7× `/settings/<x>`) | FIXED this session |
| 5 | Auth flow | src/middleware.ts | Token-gated email confirm/cancel routes blocked by session 401 | FIXED this session |
| 6 | Email | src/lib/env.ts + src/lib/email/index.ts | EMAIL_REDIRECT_TO has no NODE_ENV=production guard; a stray prod env value silently funnels every email to one inbox | Open |
| 7 | Email | every template | URL interpolations into href="…" and link text are unescaped; a " in any URL breaks out, no scheme rejection | Open |
| 8 | Data model | src/lib/db/migrations/0052_audit_critical_fixes.sql | CREATE INDEX CONCURRENTLY silently never runs because there's no real db:migrate runner; six composite indexes missing in prod | Open |
| 9 | Data model | db:push flow | Two structural constraints (berths.current_pdf_version_id circular FK, system_settings NULLS NOT DISTINCT) not in db:push; fresh-deploy diverges from prod | Open |
| 10 | Services | documents.service.ts: handleDocumentCompleted | Orphan-blob window: failure between storage.put and documents.update leaves the blob and marks status='completed' with no signedFileId | Open |
| 11 | GDPR | src/lib/services/gdpr-bundle-builder.ts | Article-15 export missing portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions | Open |
| 12 | GDPR | src/lib/services/client-hard-delete.service.ts | "Right to be forgotten" doesn't actually erase; verbatim PII survives in email_messages.body_html, files, document_sends.recipient_email forever | Open |
| 13 | GDPR | src/app/api/auth/resolve-identifier/route.ts (post-fix) | Still echoes the real canonical email on a successful username hit (rate-limited but enumerable) | Partial; see Open follow-ups |
| 14 | GDPR | audit_logs.metadata field | Not covered by maskSensitiveFields; raw PII (emails, IPs, names) accumulates unbounded with no retention cron | Open |
| 15 | Observability | src/app/api/webhooks/documenso/route.ts | Webhook handler bypasses the platform-error pipeline entirely; admin/errors silent on Documenso webhook crashes | Open |
| 16 | UI/UX | 16 sites use native window.confirm() | Bypasses ConfirmationDialog / AlertDialog for destructive flows (cancel signing, delete files, archive interest/company/yacht…) | Open |
| 17 | Documenso | documenso-client.ts v1↔v2 routing | (Pending full report) | In progress |
| 18 | Concurrency | (see report) | Various race windows on multi-rep edits + partial-unique-index inserts | Open |
HIGH-priority queue
Listed after CRITICALs in the priority queue section below.
Already fixed in this session
These changes are on the feat/documents-folders branch (post-commit 660553c and onward). Do not re-fix.
Security
- Self-target privilege escalation block: `src/app/api/v1/admin/users/[id]/permission-overrides/route.ts` now refuses `PUT` when `targetUserId === ctx.userId`. Additionally, the body now sanitises against a canonical `ALLOWED_RESOURCE_ACTIONS` allow-list mirroring `RolePermissions`, so unknown resource/action keys are stripped before write. Cross-tenant pollution check added (refuses overrides for users without a `user_port_roles` row in the caller's port).
- Username enumeration kill: `src/app/api/auth/resolve-identifier/route.ts` now (a) shares the `auth` 5-per-15-min rate-limit bucket keyed by client IP, (b) returns a synthetic `@auth.invalid` email on miss so hit and miss are indistinguishable in shape. (Note: GDPR auditor flagged that the hit-path still echoes a real canonical email, which is still an information leak that's worth a deeper redesign; see Open follow-ups.)
- Email-change account/session rotation: `src/lib/services/users.service.ts` now also updates `account.accountId` for the `credential` provider (Better Auth's actual login key) AND revokes every active `session` row when an admin rotates a user's email. Previously the user could not sign in with either old or new email after rotation.
- Middleware unblocks token-gated email routes: `src/middleware.ts` adds `/api/v1/me/email/confirm/` and `/api/v1/me/email/cancel/` to `PUBLIC_PATHS` so the confirm/cancel links work in a fresh browser without an existing session.
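The allow-list sanitising mentioned in the first Security bullet can be sketched as follows. This is an illustration, not the shipped code: the function name `sanitizeOverrides` and the sample entries in `ALLOWED_RESOURCE_ACTIONS` are assumptions, since the real allow-list is described only as mirroring `RolePermissions`.

```typescript
// Hypothetical allow-list: resource -> permitted action keys. The real list
// is said to mirror RolePermissions; these entries are placeholders.
const ALLOWED_RESOURCE_ACTIONS: ReadonlyMap<string, ReadonlySet<string>> = new Map([
  ["clients", new Set(["view", "edit", "delete"])],
  ["admin", new Set(["manage_users", "manage_settings"])],
]);

type Overrides = Record<string, Record<string, boolean>>;

// Keeps only allow-listed resource/action pairs whose value is a boolean.
// Anything else is dropped before the payload would be written.
function sanitizeOverrides(input: unknown): Overrides {
  const clean: Overrides = {};
  if (typeof input !== "object" || input === null) return clean;

  for (const [resource, actions] of Object.entries(input as Record<string, unknown>)) {
    const allowed = ALLOWED_RESOURCE_ACTIONS.get(resource);
    if (!allowed || typeof actions !== "object" || actions === null) continue;

    const kept: Record<string, boolean> = {};
    for (const [action, value] of Object.entries(actions as Record<string, unknown>)) {
      if (allowed.has(action) && typeof value === "boolean") kept[action] = value;
    }
    if (Object.keys(kept).length > 0) clean[resource] = kept;
  }
  return clean;
}
```

Using a `Map` rather than a plain object for the allow-list means keys such as `constructor` or `__proto__` in the request body cannot accidentally match an inherited property.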
Search + navigation
- NAV_CATALOG dead-link sweep: `src/lib/services/search-nav-catalog.ts` corrected 10 entries that pointed to non-existent routes. `/admin/audit-log` → `/admin/audit`, `/admin/error-events` → `/admin/errors`, `/user-settings` → `/settings/profile`, and the 7 phantom `/settings/<x>` entries redirected to their real `/admin/<x>` homes.
- Topbar global search extended: every admin sub-card now indexed in `NAV_CATALOG` with curated `keywords` (client portal, ai scoring, pipeline weights, recommender heat weights, etc.). Results sort to the bottom of the cmd-K dropdown, beneath entity hits.
- Admin sections page search: `src/components/admin/admin-sections-browser.tsx` `AdminSection` gained a `keywords?: string[]` field, populated for System Settings (mirrors `KNOWN_SETTINGS`), AI configuration, OCR, Users, and Website analytics. `filteredMatches` haystack now includes those keywords.
User management
- Disable / enable button: third Power/PowerOff action button on the desktop user list + matching dropdown item on the mobile card. Backed by `userProfiles.isActive` (already enforced by `withAuth` → 403 on disabled accounts).
- UserForm tabs + permissions matrix: UserForm now wraps Profile & role + Permissions in tabs. New `UserPermissionMatrix` component renders the full `RolePermissions` shape with three-state per-leaf toggle (Inherit / Grant / Deny). The matrix is `role="radiogroup"` + `aria-checked` per option, and shows an amber callout explaining that overrides save on their own button. Dirty-state tracked via originalOverrides comparison.
- First/last name + admin email change: UserForm collects first + last name (canonical) alongside displayName. Email change behind an AlertDialog confirmation; on confirm sends an automated notice to the prior address (new template `src/lib/email/templates/admin-email-change.ts`).
- Phone formatting: UserForm swaps the bare tel input for the shared `PhoneInput` (country combobox + AsYouType + E.164 storage).
Optional username sign-in
- Migration `0054_user_profiles_username.sql` adds `username` column (2..30 chars, regex `^[a-z0-9._-]{2,30}$`, partial unique index on `LOWER(username)`).
- Login page now accepts email OR username via `/api/auth/resolve-identifier`.
- Self-service username card on `src/components/settings/user-settings.tsx`.
- `/api/v1/me` PATCH now accepts username with allow-list + reserved-name check + uniqueness check before write.
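The format and reserved-name checks listed above can be sketched as a small pure function. The regex is the one quoted for migration 0054; everything else is an assumption for illustration, including the `checkUsername` name, the lower-casing step, and the contents of `RESERVED_USERNAMES` (the real reserved list is not shown in this document). The uniqueness check needs a database lookup and is left out.

```typescript
// Pattern quoted from the migration: 2 to 30 chars of a-z, 0-9, ".", "_", "-".
const USERNAME_PATTERN = /^[a-z0-9._-]{2,30}$/;

// Hypothetical reserved names, for illustration only.
const RESERVED_USERNAMES: ReadonlySet<string> = new Set(["admin", "root", "support"]);

type UsernameCheck =
  | { ok: true; username: string }
  | { ok: false; reason: "invalid_format" | "reserved" };

// Normalises to lower case (assumed, because the unique index is on
// LOWER(username)), then applies the format and reserved-name checks.
function checkUsername(raw: string): UsernameCheck {
  const username = raw.trim().toLowerCase();
  if (!USERNAME_PATTERN.test(username)) return { ok: false, reason: "invalid_format" };
  if (RESERVED_USERNAMES.has(username)) return { ok: false, reason: "reserved" };
  return { ok: true, username };
}
```

Returning a tagged result instead of throwing lets the PATCH handler map each reason to its own field error.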
Per-user permission overrides
- Migration `0055_user_permission_overrides.sql` adds the table.
- Effective-permissions resolver in `src/lib/api/helpers.ts` now layers user overrides on top of role + port-role overrides + residential toggle.
- `GET / PUT /api/v1/admin/users/[id]/permission-overrides` endpoints.
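The layering performed by the effective-permissions resolver can be pictured with the sketch below. It is a simplified model, not the code in `helpers.ts`: the `resolveEffectivePermissions` name and the sample permission maps are hypothetical, the residential toggle is omitted, and inputs are assumed to be already validated.

```typescript
type PermissionMap = Record<string, Record<string, boolean>>;

// Merges layers left to right. A later layer overwrites individual leaves of
// earlier layers, so an explicit `false` in a later layer acts as a deny.
function resolveEffectivePermissions(...layers: Array<PermissionMap | undefined>): PermissionMap {
  const effective: PermissionMap = {};
  for (const layer of layers) {
    if (!layer) continue;
    for (const [resource, actions] of Object.entries(layer)) {
      effective[resource] = { ...(effective[resource] ?? {}), ...actions };
    }
  }
  return effective;
}

// Hypothetical sample data: base role, then port-role override, then user override.
const rolePermissions: PermissionMap = {
  clients: { view: true, edit: true },
  admin: { manage_users: false },
};
const portRoleOverride: PermissionMap = { clients: { edit: false } };
const userOverride: PermissionMap = { admin: { manage_users: true } };

const effective = resolveEffectivePermissions(rolePermissions, portRoleOverride, userOverride);
```

In this example the port-role layer removes `clients.edit` and the user layer grants `admin.manage_users`, while `clients.view` is inherited unchanged from the role.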
Role + enum normalization
- `formatRole()` + `ROLE_LABELS` in `src/lib/constants.ts`: replaces the ad-hoc `humanizeRole` in `sidebar.tsx` and `prettifyRoleName` in `role-list.tsx`. user-list, user-card, role-list, user-form now render "Sales Agent" instead of "sales_agent".
- `formatOutcome()` + `OUTCOME_LABELS` for interest outcomes. Updated `client-columns.tsx`, `realtime-toasts.tsx`, `interest-detail-header.tsx`, `command-search.tsx`.
- Pipeline stage normalization extended to: `next-in-line-notify.service.ts`, `command-search.tsx` (interest + residential interest bucket), `yacht-tabs.tsx`, `interest-picker.tsx`, `ai.ts` worker email body, `pipeline-report.ts` + `revenue-report.ts` PDF generators.
Auto-memory
- Saved feedback memory: "Be thorough — audit everything that ends in a user-facing notification". (Memory subsystem is /Users/matt/.claude/projects/...)
Cross-cutting priority queue
Tackle in this order. C-prefix = CRITICAL still open; H-prefix = HIGH.
- [C] Wire a real `db:migrate` runner: without it, `0052_audit_critical_fixes.sql` silently never creates 6 composite indexes (data-model C1). Recommended: a tsx script that reads migrations in order, splits on `--> statement-breakpoint`, runs `CREATE INDEX CONCURRENTLY` outside a tx, and tracks state in a `__drizzle_migrations` table. Same script gives you `db:migrate:status` for prod readiness.
- [C] Add `EMAIL_REDIRECT_TO` prod guard: `src/lib/env.ts` should refine to reject when `NODE_ENV === 'production'`, and `src/lib/email/index.ts` should `logger.warn` at boot when set (not debug). 5 minutes of work, prevents an extremely-bad-day class of incident.
- [C] Fix orphan-blob window in `handleDocumentCompleted`: `src/lib/services/documents.service.ts:1100-1253`. Wrap the storage.put + files.insert + documents.update sequence in a transaction or a saga with a compensating delete. The current catch-block path also incorrectly marks `status='completed'` with no `signedFileId`, hiding the failure from reps.
- [C] Escape URLs in email templates: every template in `src/lib/email/templates/*` inlines `${data.link}` etc. into href/text without escaping. Move all template rendering through a shared `escapeUrl` helper and add scheme allow-listing (http(s) only).
- [C] Eliminate the 16 native `window.confirm()` calls: each one is a destructive flow that bypasses `ConfirmationDialog` / `AlertDialog`. ui-ux-auditor lists the sites; high-leverage UX fix.
- [C] GDPR export completeness: `gdpr-bundle-builder.ts` must include portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions. This is a regulator-finding-level gap.
- [C] Right-to-be-forgotten actually erase: `client-hard-delete.service.ts` currently nullifies FKs but leaves verbatim PII in email_messages.body_html, files, document_sends.recipient_email. Add a true wipe path (or document the limitation in the legal text and gate the feature behind a "we cannot fully erase X" warning).
- [C] Add `user_permission_overrides.user_id` FK + onDelete='set null' on nullable client refs: data-model H1+H2. Migration 0056.
- [C] Resolve-identifier hit-path still leaks email: replace the API entirely with a server-side signIn proxy that takes `{identifier, password}` and never returns the canonical email at all. Current rate-limited hit still echoes real emails to anyone with a guessable username.
- [H] Re-audit `audit_logs.metadata` masking: extend `maskSensitiveFields` to cover `audit_logs.metadata`; add a 90-day retention cron (mirroring `error_events`).
- [H] Webhook → error pipeline: `documenso/route.ts` should `captureErrorEvent` on handler crash. Apply the same to every other webhook route.
- [H] Wire admin email-template subject editor: 5 of 8 templates ignore `overrides.subject`; admins see "Saved" with zero effect. `email-auditor` H1+H2.
- [H] Wire admin signature/footer fields: `/admin/email` writes `email_signature_html` + `email_footer_html` which the shell never reads. Either delete or wire.
- [H] PII redaction in audit/error pipeline: `error_events.request_body_excerpt` sanitizer redacts password/token but not email/phone/name/dob/address.
- [H] Notification email worker XSS: `workers/notifications.ts:65-71` interpolates `notif.description` and `notif.link` into HTML unescaped. Apply `escapeHtml` + URL allow-list.
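Two items in this queue (template URL escaping and the notification worker XSS) call for the same pair of helpers. The sketch below shows one possible shape. `escapeHtml` and `escapeUrl` are the names used in the recommendations, but their bodies here are assumptions, as is the `renderLink` helper and the choice of `#` as the fallback for rejected URLs.

```typescript
// Escapes the characters that are significant in HTML text and in quoted
// attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Returns an attribute-safe URL, or the fallback when the scheme is anything
// other than http or https (for example javascript: or data:).
function escapeUrl(rawUrl: string, fallback: string = "#"): string {
  const trimmed = rawUrl.trim();
  if (!/^https?:\/\//i.test(trimmed)) return fallback;
  return escapeHtml(trimmed);
}

// Hypothetical template fragment showing both helpers in use.
function renderLink(url: string, label: string): string {
  return `<a href="${escapeUrl(url)}">${escapeHtml(label)}</a>`;
}
```

The scheme check runs before escaping so that a rejected URL never reaches the markup at all, and the label is escaped separately because link text and attribute values are both injection points.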
Per-domain reports
Each section below is the agent's report verbatim. File:line refs reference the repo as it stands at the start of the audit session — some have already been addressed (see "Already fixed in this session" above).
1. Security + API + auth audit (security-auditor + early api-security run)
Two reports — the team-spawned security-auditor and an earlier standalone run. Both included verbatim.
Report A: security-auditor (team)
Security / API / Auth Audit — feat/documents-folders branch
Read-only audit of the pn-crm repo. Scope: auth wrappers, tenant scoping,
public/webhook endpoints, the just-shipped username-resolve + permission-
overrides + admin email-change flows, CSRF posture, audit-log coverage.
No CRITICAL issues found — auth helpers (withAuth / withPermission /
requireSuperAdmin) are applied consistently across src/app/api/v1/**,
public endpoints all use timing-safe secret compares + per-IP rate limits,
and the Documenso webhook idempotency + per-port secret resolution is sound.
The findings below are HIGH / MEDIUM.
HIGH
H1. resolve-identifier leaks username→email mapping AND has no rate limit
File: src/app/api/auth/resolve-identifier/route.ts (lines 25–58)
The route's own docstring claims it "pairs with the global login-attempt
limiter" — but no enforcePublicRateLimit / checkRateLimit is actually
called in the handler. Unauthenticated attackers can POST {identifier:"matt"}
at unbounded volume; on a hit the response is {email:"matt@letsbe.solutions"},
on a miss the response echoes the raw input. That makes existence
trivially decidable (response contains @ ↔ hit), and on a hit the caller
also learns the actual email address. Usernames are typically far more
guessable than emails (first names, social handles), so this becomes a one-
way username → email harvester usable for downstream phishing / password
spraying. Fix: wrap with enforcePublicRateLimit(req, 'portalSignIn', identifier.toLowerCase()) (or a new loginIdentifier bucket) AND stop
echoing the resolved email — either return {ok:true} and require the
caller to POST (username,password) together to a single sign-in endpoint
that does the lookup server-side, or return an opaque short-lived token that
Better Auth's sign-in step can redeem internally.
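The rate-limit half of this fix can be illustrated with a fixed-window counter. This is a generic in-memory sketch, not the project's `enforcePublicRateLimit` (whose implementation is not shown here); the `createFixedWindowLimiter` name is hypothetical, and the 5-per-15-minute figures are borrowed from the fix manifest purely as example values.

```typescript
interface WindowState {
  count: number;
  windowStart: number;
}

// Builds an in-memory fixed-window limiter. A multi-instance deployment
// would need shared storage instead of a process-local Map.
function createFixedWindowLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, WindowState>();

  // Returns true and records the hit when the key is under its limit;
  // returns false once the allowance for the current window is used up.
  return function allow(key: string, now: number = Date.now()): boolean {
    const state = windows.get(key);
    if (!state || now - state.windowStart >= windowMs) {
      windows.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (state.count >= limit) return false;
    state.count += 1;
    return true;
  };
}

// Example configuration: 5 requests per 15 minutes per key.
const allowResolve = createFixedWindowLimiter(5, 15 * 60 * 1000);
```

A handler would call `allowResolve(key)` before doing any lookup and respond with 429 when it returns false; whether the key is the client IP, the normalised identifier, or both is a policy decision the findings leave open.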
H2. Admin email-change leaves emailVerified true → account takeover via reset
File: src/lib/services/users.service.ts (lines 233–262, 355–387)
updateUser rotates user.email directly when an admin edits the address
(line 246–247) but never resets emailVerified. A hostile or compromised
admin can point any victim's account at an attacker-controlled mailbox, then
trigger the existing "forgot password" flow on the new address and silently
hijack the account; the existing notifyAdminEmailChange notice fires to
the old address fire-and-forget and is documented as non-blocking
("failure to send doesn't roll back"). There is also no createAuditLog
specifically for the email-change — the generic update audit at line 287
buries the change inside newValue: data rather than emitting a dedicated
email_change action that monitoring can alert on. Fix: when
wantsEmailChange, set emailVerified: false in the Better Auth user
update, write a dedicated severity: 'warning' audit row with
{oldEmail, newEmail, changedBy}, and require the recipient to click the
existing /api/v1/me/email/confirm/[token] flow before the rotation
applies — i.e. mint a user_email_changes row rather than direct-UPDATE.
H3. Permission-overrides PUT accepts arbitrary keys → JSONB pollution + deep-merge surprise
File: src/app/api/v1/admin/users/[id]/permission-overrides/route.ts
(lines 31–35, 97–141)
updateOverridesSchema is z.record(z.string(), z.record(z.string(), z.boolean())) — no allow-list against the known RolePermissions resource/action keys. An admin (or a stolen admin session) can persist arbitrary keys into user_permission_overrides.permission_overrides. Two concrete impacts: (a) future deep-merge logic that maps unknown keys into newly added resources promotes the rogue keys silently (silent privilege creep when new permissions ship); (b) the JSONB can be bloated to harm downstream readers. Fix: validate against KNOWN_PERMISSION_LEAVES derived from RolePermissions (resource → action set), reject unknown keys with ValidationError, and bound the merged blob size as /api/v1/me/route.ts already does for preferences. The GET handler is fine — it only reads what was already persisted.
H4. /api/v1/me/email/confirm|cancel/[token] is unreachable for logged-out users (middleware 401)
File: src/app/api/v1/me/email/cancel/[token]/route.ts,
src/app/api/v1/me/email/confirm/[token]/route.ts,
src/middleware.ts (PUBLIC_PATHS list, line 8–20)
The handlers correctly skip withAuth ("the token IS the proof") but
/api/v1/me/email/... is not in PUBLIC_PATHS, so middleware.ts returns
a 401 JSON for any unauthenticated request — exactly the case a user
clicking the confirm link from email on a different device will hit. End
result: every confirm/cancel click from a logged-out browser fails with
"Authentication required". Also, the GET request applies an irreversible
state mutation with no CSRF guard (the origin-check in middleware only fires
for STATE_CHANGING_METHODS). Fix: move these handlers under
/api/auth/email-change/{confirm,cancel}/[token] so they're covered by the
/api/auth/ PUBLIC_PATHS prefix, OR add /api/v1/me/email/ to
PUBLIC_PATHS. Convert the GET mutation to a POST landing page (one-click
confirm form) so cross-site image/prefetch tags can't silently flip state.
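Both routing options in this fix depend on prefix matching against `PUBLIC_PATHS`. A minimal sketch, assuming the middleware compares the request pathname to each entry with `startsWith`; the list is an illustrative subset and `isPublicPath` is a hypothetical name.

```typescript
// Illustrative subset only; the real list lives in src/middleware.ts.
const PUBLIC_PATHS: readonly string[] = [
  "/api/auth/",
  "/api/v1/me/email/confirm/",
  "/api/v1/me/email/cancel/",
];

// True when the pathname falls under any public prefix.
function isPublicPath(pathname: string): boolean {
  return PUBLIC_PATHS.some((prefix) => pathname.startsWith(prefix));
}
```

Keeping the trailing slash on each prefix matters: it lets `/api/v1/me/email/confirm/<token>` through while `/api/v1/me/email` itself, and any sibling such as `/api/v1/me/email/confirmation-settings`, still require a session.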
MEDIUM
M1. Direct Schema.parse(body) instead of parseBody(req, schema)
Files: src/app/api/v1/admin/custom-fields/[fieldId]/route.ts:18-19,
src/app/api/v1/search/route.ts:11,
src/app/api/v1/files/upload/route.ts:21,
src/app/api/v1/companies/[id]/members/[mid]/handlers.ts:29,
src/app/api/public/website-inquiries/route.ts:97-98,
src/app/api/public/residential-inquiries/route.ts:51-52,
src/app/api/public/interests/route.ts:47-48,
src/app/api/portal/auth/{sign-in,forgot-password,reset-password,activate,change-password}/route.ts,
src/app/api/auth/{set-password,resolve-identifier}/route.ts.
CLAUDE.md explicitly requires parseBody so the 400 envelope + field-
errors shape stays uniform (the frontend's toastError hook depends on
it). Most of these are caught by an outer try/catch that routes ZodError
into errorResponse, which masks the issue — but the response shape
diverges (a thrown ZodError becomes a generic 500 unless errorResponse
maps it). Admin route custom-fields/[fieldId] is the worst case: a
malformed PATCH body 500s instead of 400-with-field-errors. Fix: swap
to parseBody(req, schema) in the admin/internal routes; the portal /
public auth routes intentionally use safeParse + manual ValidationError
mapping and can be left as-is.
M2. CSRF origin check disabled in development
File: src/middleware.ts (line 80)
process.env.NODE_ENV !== 'development' gates the origin check. If a
production deployment is ever booted with NODE_ENV=development
accidentally (shell export leakage, container override, "debug deploy"),
all CSRF defense-in-depth is silently off — SameSite=Lax still helps but
isn't enough for legacy browsers / extension contexts. Fix: key the
bypass on an explicit DISABLE_CSRF_FOR_LAN=1 env var that's defaulted to
unset and refused in lib/env.ts when NODE_ENV==='production'.
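The "refused in production" part of this fix can be sketched as a boot-time check. The function below is a plain-TypeScript illustration rather than the schema in `lib/env.ts`; `findUnsafeProductionEnv` is a hypothetical name, and `EMAIL_REDIRECT_TO` is included only because the audit summary calls for the same kind of guard on it.

```typescript
type Env = Record<string, string | undefined>;

// Variables that are acceptable on a developer machine but should never be
// set in production. DISABLE_CSRF_FOR_LAN is the name proposed in M2.
const DEV_ONLY_VARS: readonly string[] = ["DISABLE_CSRF_FOR_LAN", "EMAIL_REDIRECT_TO"];

// Returns one message per violation; an empty array means these checks passed.
function findUnsafeProductionEnv(env: Env): string[] {
  if (env["NODE_ENV"] !== "production") return [];

  const problems: string[] = [];
  for (const name of DEV_ONLY_VARS) {
    const value = env[name];
    if (value !== undefined && value !== "") {
      problems.push(`${name} must not be set when NODE_ENV=production`);
    }
  }
  return problems;
}
```

At startup the caller would pass the process environment and abort if the returned array is non-empty, which turns a silent misconfiguration into a failed deploy.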
M3. Permission-override audit log lacks severity escalation
File: src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:124-134
Changing user permission grants is exactly the action an attacker would
take after compromising an admin; the audit row should be emitted with
severity:'warning' (matching the email_change_cancelled precedent in
src/app/api/v1/me/email/cancel/[token]/route.ts:46) so the audit UI's
default filter surfaces it. Today it's a vanilla action:'update' lost in
the noise.
M4. /api/public/interests audit row stores client phone in metadata
File: src/app/api/public/interests/route.ts:254-271
The audit row's newValue and surrounding metadata capture ip plus
foreign keys, which is fine, but data.phone is held in scope and could
easily slip in during a future edit. Today the row is OK; flag as a place
to add a regression test. (Not a finding to act on, just a watch-list item
for the broader audit team.)
M5. Filesystem storage proxy: token leak via Referer
File: src/app/api/storage/[token]/route.ts:42-119
Cache-Control: private, no-store is set on the response, but the URL
itself (with the HMAC token in the path) leaks via the Referer header
when the downloaded asset is opened inside a browser tab that then
navigates to a third-party link. Single-use replay protection mitigates
reuse, but a token still-in-window is good for one stolen download. Fix:
either rotate to a POST-with-token-in-body form (breaks <a download>),
or set Referrer-Policy: no-referrer on the response and document that
issuers should mint with the shortest possible expiry. Lower-impact
because filesystem mode is single-tenant per the boot guard.
M6. /api/v1/clients/bulk-hard-delete lacks per-IP rate-limit
File: src/app/api/v1/clients/bulk-hard-delete/route.ts (no withRateLimit)
The sibling bulk-hard-delete-request/route.ts is wrapped in withRateLimit
but the actual delete endpoint is not. A compromised admin session could
fan out hundreds of irrevocable hard-deletes in a tight loop with no
limiter to slow it down. Fix: add withRateLimit('destructiveBulk', ...)
or similar with a 5/minute cap; the existing audit row will still be
emitted, but the limiter caps the blast radius.
Verified clean (no finding)
- `withAuth` / `withPermission` / `requireSuperAdmin` applied uniformly: every `route.ts` under `src/app/api/v1/**` was checked; the only files without the wrappers are `me/email/{confirm,cancel}/[token]/route.ts` (covered by H4) which intentionally use bearer-token auth.
- `withAuth` enforces port-context via `X-Port-Id` header / preferences, never from body (`helpers.ts:160–168`).
- Documenso webhook: timing-safe per-port secret resolution, replay guard via `signatureHash` unique index, per-handler `portScope` forwarded so a documensoId reused across ports can't cross-mutate.
- Public website-intake: timing-safe `verifySecret` with length-equal buffer pad, refusal-by-default when `WEBSITE_INTAKE_SECRET` unset, per-IP rate-limit, unknown port slug → generic 400 (no input echo).
- Raw `` sql`...` `` usage scanned across `src/lib/services` and `src/app/api`: every interpolation is via Drizzle's parameter binding (`` sql`... ${foo} ...` ``); no string concatenation gaps found.
- Storage proxy upload (PUT) does HMAC verify + single-use replay + size cap + PDF magic-byte enforcement before disk write.
— security-auditor (read-only audit; no source files edited)
Report B: api-security (standalone earlier run)
API + Auth + Security Audit – Port Nimara CRM
Scope: src/app/api/**, src/lib/api/helpers.ts, src/lib/auth/**, src/middleware.ts,
plus the newly-added permission-overrides and resolve-identifier flows.
CRITICAL
1. Privilege escalation via PUT /api/v1/admin/users/[id]/permission-overrides
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:97-141
The PUT handler gates only on withPermission('admin', 'manage_users', …) and never
verifies that params.id !== ctx.userId. Any user who holds admin.manage_users can
target their own userId and write a userPermissionOverrides row that grants every
leaf ({ admin: { manage_users: true, manage_settings: true, … }, … }). Because
withAuth deep-merges userOverride.permissionOverrides last in the chain
(src/lib/api/helpers.ts:227-238), the row wins over the base role and instantly
escalates the caller to admin-of-everything on the next request. The companion
removeUserFromPort service in src/lib/services/users.service.ts:319 does have a
self-target guard — the same guard is missing here. Fix: in the PUT handler, throw
ForbiddenError when targetUserId === ctx.userId && !ctx.isSuperAdmin, and require
super-admin to flip admin.* leaves (or any leaf that the calling user cannot already
grant). Tier-2 fix: rotate this row to require super-admin outright; admin-of-port
shouldn't be able to mint persistent overrides for peers anyway.
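The first-tier fix proposed above amounts to a small guard at the top of the PUT handler. A minimal sketch under stated assumptions: `ForbiddenError` is defined locally here as a stand-in for the project's own error class, and `AuthContext` and `assertNotSelfTarget` are hypothetical names.

```typescript
// Local stand-in for the project's ForbiddenError.
class ForbiddenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ForbiddenError";
  }
}

// Only the two context fields the guard needs.
interface AuthContext {
  userId: string;
  isSuperAdmin: boolean;
}

// Throws when a non-super-admin targets their own user id; returns normally
// in every other case.
function assertNotSelfTarget(ctx: AuthContext, targetUserId: string): void {
  if (targetUserId === ctx.userId && !ctx.isSuperAdmin) {
    throw new ForbiddenError("You cannot edit your own permission overrides");
  }
}
```

The guard has to run before any read or write of the override row, so that a rejected request leaves no trace in the table.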
2. /api/auth/resolve-identifier has no rate-limit — username enumeration
src/app/api/auth/resolve-identifier/route.ts:25-59
The endpoint is unauthenticated, sits behind /api/auth/* (so the middleware
origin check is skipped per src/middleware.ts:46-49), and does NO rate-limit /
throttling. The header comment claims it "pairs with the global login-attempt
limiter" but that limiter is only triggered when the subsequent sign-in call
runs — an attacker hitting just this endpoint with a wordlist is unconstrained.
While the response shape is the same on hit and miss ({ email: <string> }),
the content differs: a hit returns an @-bearing email, a miss returns the
unchanged raw input. So with one HTTP call per candidate an attacker
deterministically learns which usernames map to real accounts; they then funnel
only the validated emails into the rate-limited sign-in flow, defeating the
per-account brute-force ceiling. Fix: wrap in enforcePublicRateLimit(req, 'portalSignIn', normalized) (or a new bucket like usernameResolve with ~10/15min
per-IP), and consider returning a constant fake-email when the username doesn't
resolve so hit/miss are indistinguishable at the response-body level too.
HIGH
3. permission-overrides PUT does not validate the override shape
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:31-34, 97-141
updateOverridesSchema is z.record(z.string(), z.record(z.string(), z.boolean())) —
any resource name and any action key is accepted. This stores garbage in
user_permission_overrides.permission_overrides forever, and silently typo'd
keys ('clien_ts.view') won't take effect but won't 400 either. More
importantly, there is no allow-list against the RolePermissions shape defined
in src/lib/db/schema/users.ts:6, so a future code path that does
Object.keys(permissions).forEach(…) could be surprised by a foreign resource
appearing in the merged map. Fix: derive a Zod allow-list at module load from
the canonical RolePermissions shape (the same VALID_MERGE_TOKENS pattern the
templates code uses) and reject unknown resource/action keys with 400.
4. permission-overrides PUT writes for users not assigned to the current port
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:97-122
The PUT inserts/updates a (userId, portId) row without first verifying that
targetUserId actually has a user_port_roles row for ctx.portId. An admin at
port A can mint override rows for users belonging only to port B (the row is keyed
on the admin's portId, so it's a "future override that would activate if the user
ever joins this port"). Functionally inert today, but pollutes the override table
across tenants and breaks the implicit "you can only manage users in your port"
invariant the rest of the admin/users routes enforce. The GET path does the
implicit validation by failing the port-role lookup; the PUT should mirror it.
Fix: findFirst on userPortRoles with (targetUserId, ctx.portId) first; 404
if missing, mirroring updateUser at src/lib/services/users.service.ts:216-219.
5. Email-change confirm endpoint cannot be aborted after compromise window
src/app/api/v1/me/email/confirm/[token]/route.ts:42-57
Token-based unauthenticated swap. The flow looks otherwise correct (sha256-
hashed token, expiry, single-use via appliedAt, race-checked uniqueness). What's
missing: when a confirmation completes, all other outstanding userEmailChanges
rows for the same userId should be cancelled, and all existing Better Auth
sessions for that user should be revoked. Today, if an attacker compromises the
account, requests an email change to attacker-owned address, and the victim
spots the cancel email but races against the attacker — once the attacker
confirms, the victim's cancel link still works on the other pending row but
not on the now-applied change, and the attacker's existing CRM session
(pn-crm.session_token) survives the swap. Fix: in the confirm handler, after
the email UPDATE, also db.delete(sessions).where(eq(sessions.userId, pending.userId)) (or whatever the Better-Auth session table is called) and
mark all other open userEmailChanges rows for that user as cancelled. Mirror
the cancel-handler behaviour. Severity is HIGH not CRITICAL because the
attacker needs the session in the first place.
6. Public /api/auth/[...all] audits the attempted email but doesn't bound brute-force timing
src/app/api/auth/[...all]/route.ts:100-146
Better Auth handles sign-in rate-limiting internally (it has a built-in limiter
when configured), but I see no explicit enforcePublicRateLimit wrapper around
this catch-all. The loginAttempt bucket I expected in src/lib/rate-limit.ts
isn't present in the listing; the closest is portalSignIn, which is wired only
to the portal sign-in handler, not the CRM sign-in. If Better Auth's default
limiter isn't actively configured in src/lib/auth/index.ts:55-113 (and I don't
see a rateLimit: block there), the CRM login endpoint is effectively
unrate-limited and the resolve-identifier finding compounds into a real
brute-force window. Fix: add an enforcePublicRateLimit(req, 'crmSignIn', attemptedEmail) call inside withAuthAudit before forwarding to
upstream.POST(forwardReq) when isSignIn, keyed per-email; declare the bucket
in rate-limit.ts mirroring portalSignIn's shape.
MEDIUM
7. CRM updateUser cross-tenant email change has no notification when target is super-admin
src/lib/services/users.service.ts:236-262
When an admin at port A updates a user (including a super-admin who happens to
have a port-role row at port A), the email-change flow flips Better Auth's
identity instantly with only a courtesy email to the prior address. There's no
challenge / token round-trip — the admin acts unilaterally. Self-service email
change (/api/v1/me/email) DOES require token confirmation; admin-initiated
should at least block when the target is a super-admin or require the change to
go through the same confirm-token flow. Fix: gate wantsEmailChange on
!profile.isSuperAdmin || ctx.isSuperAdmin and/or always use the token flow
even for admin-initiated changes.
8. permission-overrides PUT does not write audit log atomically with the DB write
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:111-134
The existing row is read, then conditionally update-or-insert, but two
concurrent PUTs against the same (userId, portId) race: both see existing
as the same value, both call update, second writer wins silently with a
last-write audit log that's missing the intermediate state. Severity is medium
because the audit log still captures both writers' new values and there's no
correctness invariant broken — just a forensic gap. Fix: wrap the read +
update/insert in withTransaction with FOR UPDATE (or use an upsert with
returning('old')-equivalent semantics) and log oldValue from the locked row.
9. Documenso webhook returns 200 on every failure including dedup, which masks crashes
src/app/api/webhooks/documenso/route.ts:264-268
The handler's outermost try/catch logs err but always returns 200. That's
the correct posture for signature-invalid traffic (don't leak signal), but
also masks downstream handler crashes — Documenso will never retry a 5xx
because it never sees one. The handlers are documented as idempotent
(handleDocumentCompleted early-returns on duplicate completion), so a retry
storm wouldn't double-write, but the missing retry signal turns one transient
DB failure into a permanently dropped event. Fix: return 500 on the catch
branch so Documenso retries; keep 200 for secret-invalid (line 100) and
dedup (line 123) since those are intentional no-ops.
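The status policy proposed in finding 9 can be sketched as a small pure function, so the "always 200" catch branch is replaced by an explicit decision per outcome. WebhookOutcome, statusFor and runWebhook are hypothetical names for illustration; the real handler in route.ts is async and returns a NextResponse.

```typescript
// Hypothetical outcome model for the Documenso webhook receiver.
type WebhookOutcome =
  | { kind: "invalid_secret" }
  | { kind: "duplicate" }
  | { kind: "handled" }
  | { kind: "handler_error"; error: unknown };

function statusFor(outcome: WebhookOutcome): number {
  switch (outcome.kind) {
    case "invalid_secret":
      return 200; // intentional no-op: do not leak signal to unauthenticated callers
    case "duplicate":
      return 200; // intentional no-op: event already processed
    case "handled":
      return 200;
    case "handler_error":
      return 500; // lets the sender see a failure and retry
  }
}

// Synchronous stand-in for the route body; real handlers would be awaited.
function runWebhook(secretOk: boolean, alreadySeen: boolean, handler: () => void): number {
  if (!secretOk) return statusFor({ kind: "invalid_secret" });
  if (alreadySeen) return statusFor({ kind: "duplicate" });
  try {
    handler();
    return statusFor({ kind: "handled" });
  } catch (error) {
    return statusFor({ kind: "handler_error", error });
  }
}
```

Keeping the mapping in one function makes the two intentional 200 branches visible next to the one branch that should surface a 5xx.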
10. withAuth deep-merge: permission overrides only ADD permissions, never EXPLICITLY DENY
src/lib/api/helpers.ts:73-98, 233-238
deepMerge does a recursive shallow assignment — userOverride.permissionOverrides
overwrites leaves wholesale. So {clients: {view: false}} works as a deny.
However the override is keyed by resource → action map, and the override row
stores Partial<RolePermissions>. There's no "tri-state" (inherit/grant/deny)
expressed at the DB layer — the comment in the route says "use null at a leaf
to clear an override" but the Zod schema only accepts z.boolean() per leaf,
not null. So the UI cannot actually clear an override leaf via this endpoint
without removing the resource key entirely from the JSON. Worth aligning the
schema with the documented contract. Fix: accept z.union([z.boolean(), z.null()]) and strip null leaves server-side before writing.
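A sketch of the server-side half of the finding 10 fix, assuming the request schema has already been widened to accept boolean or null at each leaf. stripNullLeaves and the two type aliases are hypothetical names. The function drops null leaves (meaning "inherit") and removes any resource whose leaves were all cleared, so only explicit grants and denies are stored.

```typescript
// Hypothetical shapes: resource -> action -> tri-state leaf.
type OverrideInput = Record<string, Record<string, boolean | null>>;
type OverrideStored = Record<string, Record<string, boolean>>;

function stripNullLeaves(input: OverrideInput): OverrideStored {
  const out: OverrideStored = {};
  for (const resource of Object.keys(input)) {
    const actions = input[resource];
    if (!actions) continue;

    const kept: Record<string, boolean> = {};
    for (const action of Object.keys(actions)) {
      const value = actions[action];
      // null means "clear this override and inherit from the role".
      if (value === null || value === undefined) continue;
      kept[action] = value;
    }

    // Do not persist empty resource maps.
    if (Object.keys(kept).length > 0) out[resource] = kept;
  }
  return out;
}
```

With this in place, {clients: {view: false, edit: null}} is stored as {clients: {view: false}}: the explicit deny survives and the cleared leaf falls back to the role default.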
11. Origin check disabled in dev — but process.env.NODE_ENV check is per-process
src/middleware.ts:79-89
CSRF defense-in-depth is skipped when NODE_ENV !== 'production'. The
dev/staging boundary is correct in principle, but staging deployments
typically run with NODE_ENV=production, while CI / preview-builds may not.
Worth confirming the Dockerfile (Dockerfile) sets NODE_ENV=production on
any environment that's reachable from the internet. Note also that the
fallback at src/middleware.ts:68-69 allows a request with neither Origin nor
Referer through — this is correct for server-side fetches but means any HTTP
client that strips both headers (curl with -H "Origin:") bypasses the check.
Combined with SameSite=strict cookies the residual risk is low.
12. me/email confirm/cancel tokens are URL-only — referer leakage risk
src/app/api/v1/me/email/route.ts:88-89, src/app/api/v1/me/email/confirm/[token]/route.ts:24-35
The confirm/cancel URLs are emailed as ${baseUrl}/api/v1/me/email/confirm/${rawToken}.
The user clicks from their inbox; the email client opens the URL in a browser
which then renders /settings?emailChange=confirmed (a redirect). If
/settings makes any third-party request before navigating away, the Referer
header carries the full confirm URL including the token. The token is
single-use and short-lived, so the post-redirect exposure window is small, but
defensively the route should set Referrer-Policy: no-referrer on the redirect

response. Fix: res.headers.set('Referrer-Policy', 'no-referrer') on the
NextResponse.redirect(...) call.
Summary
Two CRITICAL findings: self-targetable permission-overrides escalation
(finding 1) and unlimited username harvesting at /api/auth/resolve-identifier
(finding 2). Both are direct consequences of the recently-added routes that
prompted this audit. The remainder are mostly hardening — the v1/* surface
overall is well-disciplined: nearly every route under /api/v1/** flows
through withAuth(withPermission(...)), body parsing consistently uses
parseBody (only public/auth handlers use raw req.json() for documented
reasons), and the few raw sql`…` usages I sampled (admin/website-submissions, admin/document-sends, search/recently-viewed) all interpolate via the parameterized tag form rather than string concat. Multi-tenant scoping looks consistent: services accept ctx.portId and the
defense-in-depth pattern is well-applied (e.g. the berth-recommender note in
CLAUDE.md). The Documenso webhook receiver has solid replay/dedup/secret
discipline.
2. UI/UX consistency + accessibility audit (ui-ux-auditor)
UI/UX Consistency + Accessibility Audit
Scope: Form patterns, dialog/sheet/drawer choices, mobile parity, enum leakage, empty/loading states, badge tones, a11y, plus the recently added surfaces (UserForm tabs, UserList Power toggle, UserPermissionMatrix, Login identifier field, user settings username card).
CRITICAL
C1 — window.confirm() / confirm() used for destructive flows (>=15 sites)
Files using native browser confirm instead of ConfirmationDialog (which wraps AlertDialog):
- src/components/clients/contacts-editor.tsx:115 - remove contact
- src/components/clients/client-files-tab.tsx:50 - delete file
- src/components/yachts/yacht-list.tsx:187 - archive yacht (bulk)
- src/components/admin/document-templates/template-version-history.tsx:54 - restore older version
- src/components/shared/addresses-editor.tsx:77 - remove address
- src/components/documents/document-detail.tsx:160 - cancel/void signing envelope
- src/components/interests/interest-list.tsx:314 - archive interest
- src/components/interests/interest-tabs.tsx:483 - outcome/archival flow
- src/components/interests/interest-eoi-tab.tsx:299 - cancel EOI
- src/components/interests/interest-reservation-tab.tsx:313 - cancel contract
- src/components/interests/interest-contact-log-tab.tsx:222 - delete contact log
- src/components/interests/interest-contract-tab.tsx:310 - cancel contract
- src/components/interests/interest-documents-tab.tsx:80 - delete file
- src/components/companies/company-files-tab.tsx:50 - delete file
- src/components/companies/company-list.tsx:201 - archive company
- src/components/documents/document-list.tsx:136 - delete document
Why it matters: native confirm cannot be styled, bypasses our <AlertDialog> keyboard semantics, no focus trap, no destructive-action red styling, fails focus-return after dismiss; inconsistent with the rest of the app which uses ConfirmationDialog. Several of these are catastrophic (cancel signing envelope, hard-delete file, archive company).
Fix: replace each with <ConfirmationDialog destructive title=… description=… onConfirm={…}> matching the pattern in user-list.tsx.
C2 — UserForm "Permissions" tab silently drops unsaved overrides
src/components/admin/users/user-form.tsx:204-212 and user-permission-matrix.tsx:175-191.
The matrix has its own "Save overrides" button; the parent Sheet's "Save changes" only persists Profile-tab fields. onSaveStateChange is declared in the matrix props but never passed by user-form.tsx (line 206), so the parent has no idea overrides are dirty. A user who toggles Inherit/Grant/Deny then clicks "Save changes" loses everything when the Sheet closes — no warning, no toast.
Fix: lift overrides state to user-form.tsx, persist both endpoints inside persist(), or track dirty state via onSaveStateChange and block Sheet close with an AlertDialog.
HIGH
H1 — Raw enum render via .replace(/_/g, ' ') outside constants.ts (40+ sites)
Examples (not exhaustive):
- src/components/documents/documents-hub.tsx:292, document-detail.tsx:204,210,386, entity-folder-view.tsx:63, hub-root-view.tsx:69, signing-details-dialog.tsx:123 - status, eventType, documentType
- src/components/reservations/reservation-detail.tsx:230,285,339 - tenureType, agreement status
- src/components/berths/berth-status-suggestion-dialog.tsx:61,65
- src/components/expenses/expense-detail.tsx:229,233, expense-card.tsx:71, expense-columns.tsx:121, expense-form-dialog.tsx:257,278, expense-filters.tsx:16
- src/components/admin/audit/audit-log-list.tsx:234-235, roles/role-list.tsx:223,239, roles/role-form.tsx:123
- src/components/admin/users/user-permission-matrix.tsx:101 - local formatAction duplicates pattern
- src/components/dashboard/source-conversion-chart.tsx:60, activity-feed.tsx:34,44
- src/components/scan/scan-shell.tsx:227,242
- src/components/interests/linked-berths-list.tsx:94, interest-tabs.tsx:40
- src/app/(portal)/portal/{my-yachts,documents,interests}/page.tsx - portal-side enum leakage
- src/components/search/command-search.tsx:939,965 - fallback after STAGE_LABELS
Fix: route through stageLabel, formatRole, formatOutcome, formatSource (already in constants.ts); add formatDocumentStatus, formatTenureType, formatEventType, formatExpenseCategory, formatPaymentMethod, formatBerthStatus, formatPermissionAction to constants.ts and replace call-sites. Removes "manage memberships" / "Eoi Signed" inconsistencies.
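One possible shape for the new formatters proposed in H1, assuming each one is a label table with a shared fallback. makeFormatter, humanizeEnum and the entries in DOCUMENT_STATUS_LABELS are hypothetical; the existing helpers in constants.ts may be built differently.

```typescript
// Hypothetical label table; the real enum values live in the schema.
const DOCUMENT_STATUS_LABELS: Record<string, string> = {
  eoi_signed: "EOI signed",
  pending_signature: "Pending signature",
};

// Fallback for values without a curated label: "under_offer" -> "Under offer".
function humanizeEnum(value: string): string {
  const words = value.split("_").join(" ").trim();
  if (words.length === 0) return "";
  return words.charAt(0).toUpperCase() + words.slice(1);
}

function makeFormatter(labels: Record<string, string>) {
  return function format(value: string | null | undefined): string {
    if (value === null || value === undefined || value === "") return "";
    // hasOwnProperty guards against inherited keys such as "constructor".
    if (Object.prototype.hasOwnProperty.call(labels, value)) {
      const hit = labels[value];
      if (hit !== undefined) return hit;
    }
    return humanizeEnum(value);
  };
}

const formatDocumentStatus = makeFormatter(DOCUMENT_STATUS_LABELS);
```

A curated table is what avoids outputs like "Eoi Signed": acronyms get an explicit label, while unknown values still render readably through the fallback instead of leaking the raw enum.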
H2 — Mobile parity: 18 list components have no cardRender
DataTable already supports cardRender; without it the mobile view falls back to a raw horizontal-scroll table (bad UX on iOS):
- src/components/reservations/reservation-list.tsx, berth-reservations-list.tsx
- src/components/website-analytics/top-list.tsx
- src/components/shared/notes-list.tsx
- src/components/residential/residential-clients-list.tsx, residential-interests-list.tsx
- src/components/documents/document-list.tsx
- src/components/interests/linked-berths-list.tsx, recommendation-list.tsx
- src/components/email/email-accounts-list.tsx, email-threads-list.tsx
- src/components/reports/reports-list.tsx
- src/components/admin/document-templates/template-list.tsx, forms/form-template-list.tsx, roles/role-list.tsx, tags/tag-list.tsx, ports/port-list.tsx
Fix: add cardRender mirroring desktop columns. UserCard/ClientCard/InterestCard are good templates.
H3 — User settings phone field is unbound on load
src/components/settings/user-settings.tsx:69-92 — loadProfile() reads firstName, lastName, email, etc., but never reads phone into state. Yet saveProfile() at line 143 sends phone: phone || null, which clears the user's stored phone on every save. Also country as never cast at line 298 is unsound — when no country is selected the PhoneInput shows a US flag even for European users.
Fix: add phone to MeResponse + setPhone(res.data.profile?.phone ?? ''). Store country alongside phone (the PhoneInput value is {e164, country} — persist the parsed country).
H4 — UserPermissionMatrix three-state toggle has no a11y semantics
user-permission-matrix.tsx:247-267 — three sibling <button> elements with no role="radiogroup"/role="radio"/aria-checked. Screen readers announce "button, Grant" with no indication which is selected, what the options are, or that they're mutually exclusive. Also no focus ring on the active option.
Fix: wrap in <div role="radiogroup" aria-label={${action} permission}> and set role="radio" aria-checked={state === opt} on each. Or use Radix RadioGroup for keyboard arrow navigation.
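If the hand-rolled route is taken instead of Radix RadioGroup, the per-option attributes and arrow-key behaviour could look roughly like this. It is a framework-free sketch: radioProps and nextOption are hypothetical helpers whose results would be spread onto each button and used in an onKeyDown handler. The roving tabIndex is an addition beyond the fix text, following the usual radio-group keyboard pattern.

```typescript
type OverrideState = "inherit" | "grant" | "deny";
const OPTIONS: OverrideState[] = ["inherit", "grant", "deny"];

interface RadioProps {
  role: "radio";
  "aria-checked": boolean;
  tabIndex: 0 | -1;
}

// Props for one option inside a role="radiogroup" container.
function radioProps(option: OverrideState, selected: OverrideState): RadioProps {
  const checked = option === selected;
  // Roving tabindex: only the checked option is reachable with Tab.
  return { role: "radio", "aria-checked": checked, tabIndex: checked ? 0 : -1 };
}

// Arrow keys move the selection and wrap at either end; other keys are ignored.
function nextOption(current: OverrideState, key: string): OverrideState {
  const i = OPTIONS.indexOf(current);
  let next = i;
  if (key === "ArrowRight" || key === "ArrowDown") {
    next = (i + 1) % OPTIONS.length;
  } else if (key === "ArrowLeft" || key === "ArrowUp") {
    next = (i + OPTIONS.length - 1) % OPTIONS.length;
  }
  return OPTIONS[next] ?? current;
}
```

Radix RadioGroup provides the same semantics out of the box, so this is only worth maintaining if the three-button layout has to stay custom.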
H5 — Login page form errors not associated with inputs
src/app/(auth)/login/page.tsx:84-119 — <p className="text-sm text-destructive">{errors.identifier.message}</p> is rendered after the input but the <Input> has no aria-describedby pointing at it, and no aria-invalid={!!errors.identifier}. Same for password. Screen readers won't read the error message when focus lands on the input.
Fix: give each error <p id="identifier-error">, add aria-describedby={errors.identifier ? 'identifier-error' : undefined} and aria-invalid={!!errors.identifier} on the Input.
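The wiring in the H5 fix can be captured in one helper so identifier and password stay consistent. This is a sketch: fieldA11yProps and errorId are hypothetical names, and the error argument is assumed to look like a react-hook-form field error (an object with an optional message).

```typescript
interface FieldA11yProps {
  "aria-invalid": boolean;
  "aria-describedby": string | undefined;
}

// The same id must be set on the <p> that renders the message.
function errorId(fieldName: string): string {
  return fieldName + "-error";
}

function fieldA11yProps(
  fieldName: string,
  error: { message?: string } | undefined,
): FieldA11yProps {
  const hasError = error !== undefined;
  return {
    "aria-invalid": hasError,
    // Only reference the error element while it is actually rendered.
    "aria-describedby": hasError ? errorId(fieldName) : undefined,
  };
}
```

Usage would be along the lines of spreading fieldA11yProps("identifier", errors.identifier) onto the Input and giving the error paragraph id={errorId("identifier")}.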
H6 — Desktop sidebar nav lacks aria-current="page"
src/components/layout/sidebar.tsx:177-201 (NavItemLink) — uses active for visual styling but doesn't set aria-current on the <Link>. Mobile bottom tabs already do this (mobile-bottom-tabs.tsx:85). Screen-reader users cannot identify the current page in the desktop sidebar.
Fix: aria-current={active ? 'page' : undefined} on the <Link>.
MEDIUM
M1 — Berth status pills use ad-hoc Tailwind colors instead of StatusPill
src/components/berths/berth-columns.tsx:117-119, berth-card.tsx:21-23, berth-detail-header.tsx:90 — bg-green-100 text-green-800, bg-yellow-100, bg-red-100. The codebase has StatusPill (src/components/ui/status-pill.tsx) with semantic tokens (success-bg, warning-bg, error-bg) already used by docs/reservations. Berth statuses (available/under_offer/sold) map cleanly to active/expired/rejected pill states.
Fix: replace ad-hoc badges with <StatusPill status={…}> and extend statusPillVariants if a new tone is needed.
M2 — UserList "Active"/"Disabled" badge inconsistent with StatusPill convention
src/components/admin/users/user-list.tsx:104-115 — uses <Badge variant="default" className="bg-green-600"> with inline green override and <Badge variant="destructive">. The Power/PowerOff icons and ShieldCheck/ShieldOff icons (also row 107/112) lack aria-hidden — but text is present so it's not blocking, just inconsistent.
Fix: use <StatusPill status="active">Active</StatusPill> / <StatusPill status="archived">Disabled</StatusPill>; add aria-hidden to all decorative lucide-react icons in the table.
M3 — Only 5 of 73 dashboard routes have a loading.tsx
Only clients/[clientId], invoices, expenses, admin/errors, admin/errors/[requestId] have route-level loading skeletons. The rest fall back to a blank flash. Lists/details that fetch via React Query show a skeleton inside the component, but full-page navigations show nothing.
Fix: add loading.tsx per route segment that returns a <Skeleton> matching the page chrome (sidebar/topbar already render via the layout).
M4 — UserPermissionMatrix loading state uses text, not Skeleton
user-permission-matrix.tsx:193-197 renders "Loading permissions…" text. Other list/detail loaders in the app use <Skeleton> from @/components/ui/skeleton. Adds inconsistency.
Fix: replace with a Skeleton grid mirroring the accordion shape.
M5 — Settings transient messages persist forever instead of toasting
user-settings.tsx lines 167, 184, 197 (usernameMsg, emailMsg, resetMsg) — these useState strings stay rendered as a <span> next to their button indefinitely. Login uses toast.error(); reset-password and other auth surfaces also use sonner.
Fix: swap to toast.success() / toast.error(). Removes stale messages and the inconsistency between auth and settings.
M6 — Email-or-username Login input: visible placeholder collides with sr-only space
src/app/(auth)/login/page.tsx:93 — placeholder="you@example.com or yourname" with two literal spaces. Mac VoiceOver reads "you at example dot com or yourname" — fine; but the double space is just sloppy formatting. Also the placeholder duplicates the Label "Email or username" — placeholder is unreliable for instructions (clears on focus).
Fix: single-space the placeholder, or move the format hint into a <p id="identifier-hint" className="text-xs text-muted-foreground"> and wire aria-describedby.
M7 — User settings username card: client-side pattern validation never surfaces inline
user-settings.tsx:359-386 — pattern="^[a-z0-9._-]{2,30}$" on the input. HTML5 validation only fires on form submit (this isn't inside a <form>); the Save button is a plain <Button onClick>. So invalid input only fails server-side with a generic 400. No aria-describedby pointing at the helper text (line 382-385).
Fix: add a zod-resolved react-hook-form mini-form OR validate on blur and show inline error; wire aria-describedby="username-help".
M8 — UserForm Tabs: focus does not follow tab switch & no dirty-tab indicator
user-form.tsx:194-212 — switching from Profile to Permissions doesn't move keyboard focus to the matrix; switching back loses scroll position. The Permissions tab trigger is disabled for new-user mode (correct) but has no tooltip explaining why.
Fix: Radix Tabs handles focus by default; verify and add a title / aria-describedby on the disabled trigger with explanation. Add a small "•" dot on the trigger label when overrides are dirty (depends on C2 fix).
M9 — Email confirmation AlertDialog in UserForm: default focus + return focus
user-form.tsx:362-387 — opens on submit. Radix returns focus to the submit button after close (good), but the dialog's <AlertDialogAction> triggers persist() without disabling itself during the network call; rapid double-click can fire two PATCHes. Also disabled={loading} is set on action but not on <AlertDialogCancel> re-enable timing.
Fix: add a submitting guard or rely on existing loading state for both buttons; close dialog only after persist() resolves.
M10 — Decorative icons missing aria-hidden
Across user-list.tsx, user-card.tsx, documents-hub.tsx, berth-status-suggestion-dialog.tsx, status pills with <ShieldCheck>, <Power>, <PowerOff>, <Globe>, etc., the icons supplement text — they should carry aria-hidden="true" so screen readers don't double-announce. Mixed across the codebase; some lucide imports get it, most don't.
M11 — Drawer vs Sheet usage drift
src/components/clients/client-interests-tab.tsx:217 uses Vaul <Drawer> for an interest preview, while every other detail-preview surface (yacht preview, company preview, reservation preview) uses <Sheet>. Vaul drawers are intended for mobile bottom-sheets; using it for an inline preview on desktop is inconsistent.
Fix: standardize on <Sheet side="right"> for desktop right-rail previews; reserve <Drawer> for the mobile More menu (more-sheet.tsx).
LOW
L1 — user-permission-matrix.tsx:264 button label cosmetic uppercase done inline
{opt[0]!.toUpperCase() + opt.slice(1)} works, but the non-null ! and the inline transform inside JSX are brittle. Consider an OPTION_LABELS constant.
L2 — UserList action column uses title="…" instead of accessible tooltip
user-list.tsx:135,147,180 — relies on native browser tooltips. They don't appear on touch and don't surface to screen readers; the <span className="sr-only"> carries the label which is correct, but consider Radix Tooltip for parity with the rest of the app.
L3 — Login page brand color hardcoded
src/app/(auth)/login/page.tsx:106,123 — #007bff / #0069d9 hex hardcoded instead of using brand-500 / brand-600 design tokens. Same issue in sidebar.tsx:190,196,379 (#3a7bc8).
L4 — formatAction duplicated locally in matrix instead of in constants.ts
user-permission-matrix.tsx:100-102 re-implements the title-case replace. Move to constants.ts as formatPermissionAction (used in 3+ files: role-list.tsx, role-form.tsx, matrix).
L5 — Hard-coded "border-amber-300 bg-amber-50" warning callouts (15+ sites)
Across bulk-archive-wizard.tsx, hard-delete-dialog.tsx, smart-archive-dialog.tsx, smart-restore-dialog.tsx, pdf-reconcile-dialog.tsx, user-settings.tsx:321, etc. Need a shared <Callout tone="warning|info|danger|success"> primitive that reads from design tokens.
Verified OK
- Form helper coverage: react-hook-form + zodResolver, PhoneInput, CountryCombobox, TimezoneCombobox, InlineEditableField, InlineTagEditor are present and used consistently in client/yacht/company/interest forms.
- parseBody + errorResponse envelope convention holding for new endpoints checked.
- ConfirmationDialog correctly returns focus and traps focus via Radix AlertDialog.
- StatusPill is the right primitive; just under-adopted (M1, M2).
- Mobile bottom tabs handle aria-current correctly (template for H6).
- UserCard already adds aria-label="Actions for ${displayName}" on the icon-only MoreHorizontal trigger.
3. Data model + migrations + relations audit (data-model-auditor)
Data Model + Migrations + Relations Audit
Scope: src/lib/db/schema/*.ts (24 files) and migrations 0000–0055.
~92 tables, multi-tenant on port_id. Drizzle ORM + postgres-js.
CRITICAL
C1 — No prod migration runner; 0052 uses CREATE INDEX CONCURRENTLY
package.json exposes only db:generate / db:push / db:studio. There is no
db:migrate script, no usage of drizzle-orm/postgres-js/migrator, and no
in-repo SQL replay loop. The numbered SQL files are applied by hand via psql
(implicitly). 0052_audit_critical_fixes.sql runs CREATE INDEX CONCURRENTLY
for six composite indexes and its header explicitly forbids wrapping in
BEGIN/COMMIT — anyone running it via Drizzle's default migrator (which wraps
each file in a single tx) or psql -1 will see it abort silently. The
aggregated-projection queries on files/documents then fall back to seq scans
in prod. Action: ship a real prod migrator that respects per-file transaction
hints, or split 0052 into pre/post files, and document the runbook in
CLAUDE.md.
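A sketch of the "per-file transaction hint" idea from C1, assuming the runner inspects each SQL file before executing it. planMigration and the `-- no-transaction` header convention are hypothetical; the header in 0052 may be worded differently. The only hard fact relied on is that PostgreSQL refuses to run CREATE INDEX CONCURRENTLY inside a transaction block.

```typescript
interface MigrationPlan {
  file: string;
  transactional: boolean;
}

// Decide whether a migration file may be wrapped in BEGIN/COMMIT.
function planMigration(file: string, sql: string): MigrationPlan {
  // Assumed convention: a line starting with "-- no-transaction" opts the file out.
  const hasHint = /^\s*--\s*no-transaction\b/im.test(sql);
  // Safety net: concurrent index builds cannot run inside a transaction block.
  const hasConcurrentIndex =
    /\b(CREATE|DROP)\s+(UNIQUE\s+)?INDEX\s+CONCURRENTLY\b/i.test(sql);
  return { file, transactional: !(hasHint || hasConcurrentIndex) };
}
```

The regex check is deliberately crude (it also matches commented-out statements), so the explicit hint should be treated as the primary signal and the pattern match as a guard against a forgotten hint.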
C2 — db:push skips two structural constraints
Both are flagged in source comments:
- berths.current_pdf_version_id → berth_pdf_versions.id FK (circular dep, set up by 0030).
- system_settings_key_port_idx NULLS NOT DISTINCT flag (0047), required so global settings with port_id IS NULL are unique by key alone.
A fresh-deploy or developer onboarding via db:push produces a structurally
divergent DB: dangling pointers on the active berth column, and silent duplicate
global (key, NULL) settings accumulating over time. Action: post-push
reconciler, or kill db:push for prod and rely solely on the SQL files.
HIGH
H1 — New user_permission_overrides.user_id lacks any FK
Migration 0055 declares user_id TEXT NOT NULL with no REFERENCES "user"(id).
Compare portRoleOverrides (cascades on both port_id and role_id). Deleting
a user leaves orphaned override rows; a future user.id collision (e.g.
re-creating a user with the same id via fixture seed) re-applies them. Same
pattern on userPortRoles.userId. The broader codebase treats better-auth user
IDs as opaque strings deliberately (~17 columns), but this is a brand-new
CRM-owned table where a real FK was straightforward. Action: ship 0056
adding FOREIGN KEY (user_id) REFERENCES "user"(id) ON DELETE CASCADE to
user_permission_overrides and user_port_roles.
H2 — Nullable client FKs without set null block hard-delete
documents.clientId (line 72), files.clientId (line 30), email_threads.clientId,
formSubmissions.clientId, documentTemplates.sourceFileId,
generatedReports.fileId — nullable, declared .references(...), no
onDelete. The new admin.permanently_delete_clients permission will fail with
FK violation on any client with attached files/documents. The aggregated
projection already preserves history via FK snapshots, so ON DELETE SET NULL
is the documented intent. Action: add onDelete: 'set null' + a 0056
migration. Same shape applies to berthReservations notNull parents
(berth_id, port_id, client_id, yacht_id) which have no onDelete
declared in Drizzle (Drizzle emits NO ACTION — correct behavior but
inconsistent with the explicit audit pattern in 0042).
H3 — yachts.current_owner_id (and friends) are polymorphic, unconstrained
The current_owner_type discriminator has the 0036 CHECK; the paired
current_owner_id has no guarantee the referenced client/company row exists.
Same hole on yacht_ownership_history.owner_id, invoices.billing_entity_id,
audit_logs.entityId, notifications.entityId. The owner-resolver returns
null for missing rows, but direct reads (audit dossier, ownership history
rendering) trust the id. Action: daily reconciler reading
(owner_type, owner_id) pairs against the discriminator's target table,
surfacing orphan counts in the admin inspector.
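The core of the H3 reconciler is a set-membership check per discriminator. The sketch below assumes the job has already loaded the (owner_type, owner_id) pairs and the ids that exist in each target table; OwnerRef, ExistingIds, findOrphans and countByType are hypothetical names, and the actual queries are out of scope.

```typescript
type OwnerType = "client" | "company";

interface OwnerRef {
  rowId: string; // id of the row holding the polymorphic pointer
  ownerType: OwnerType;
  ownerId: string;
}

interface ExistingIds {
  client: Set<string>;
  company: Set<string>;
}

// A reference is an orphan when its id is absent from the discriminator's table.
function findOrphans(refs: OwnerRef[], existing: ExistingIds): OwnerRef[] {
  return refs.filter((ref) => !existing[ref.ownerType].has(ref.ownerId));
}

// Aggregate for display, e.g. in the admin inspector.
function countByType(orphans: OwnerRef[]): Record<OwnerType, number> {
  const counts: Record<OwnerType, number> = { client: 0, company: 0 };
  for (const orphan of orphans) counts[orphan.ownerType] += 1;
  return counts;
}
```

Checking against the table named by ownerType (rather than "exists in either table") matters: an id that exists as a company but is referenced with owner_type='client' is still an orphan.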
H4 — Migration 0042's billing_entity_id backfill is a tombstone
UPDATE invoices SET billing_entity_id = COALESCE(NULLIF(client_name, ''), id) WHERE billing_entity_id = '' writes a clientName string as if it were an
entity id. The CHECK billing_entity_id <> '' passes, but downstream
billing_entity_type='client' resolution returns null forever for these rows.
The fix is right (won't fail the migration) but no follow-up tooling logs the
tombstones. Action: count post-0042 rows where resolver returns null and
expose in the admin inspector.
H5 — System-folder write protection is service-only
assertNotSystemManaged lives in the folders service. Nothing at the DB level
rejects UPDATE document_folders SET name='x' WHERE system_managed=true. The
0052-tightened chk_system_folder_shape constrains shape but not write-access.
One careless db.update away from breaking the system roots invariant.
MEDIUM
M1 — Missing partial indexes on archived_at
0046 partial-archived indexes covered clients, interests, yachts,
residential_clients, residential_interests. Missing: companies.archivedAt
(filtered in companies.service), document_folders.archived_at (filtered in
hub list queries). Volume is low so it's M, not H.
M2 — userPortRoles allows multiple roles per (user, port)
Unique index is on (userId, portId, roleId) — two role rows for the same
(user, port) are permitted. getEffectivePermissions reads findFirst without
an ORDER BY and silently picks one. Either tighten to (userId, portId) or
union-OR the permissions across all assigned roles.
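If the union-OR route in M2 is chosen over tightening the index, the merge could look like the following. PermissionMap is a simplified stand-in for RolePermissions (resource to action to boolean) and unionPermissions is a hypothetical helper; the point is that the result no longer depends on which role row findFirst happens to return.

```typescript
// Simplified stand-in for RolePermissions.
type PermissionMap = Record<string, Record<string, boolean>>;

// An action is granted if any assigned role grants it.
function unionPermissions(roles: PermissionMap[]): PermissionMap {
  const merged: PermissionMap = {};
  for (const role of roles) {
    for (const resource of Object.keys(role)) {
      const actions = role[resource];
      if (!actions) continue;
      const target = merged[resource] ?? (merged[resource] = {});
      for (const action of Object.keys(actions)) {
        target[action] = target[action] === true || actions[action] === true;
      }
    }
  }
  return merged;
}
```

Because OR is commutative, the merged booleans are the same regardless of row order, which removes the missing ORDER BY as a source of nondeterminism. Per-user overrides would still be applied on top of this result.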
M3 — interest_berths.berth_id is restrict with no UI escape hatch
onDelete: 'restrict' is the right protective behaviour, but admins hard-deleting
a berth hit a raw FK error message. Offer a "detach this berth from N interests"
admin button before delete, or soften to set null with a service-side warning.
M4 — audit_logs.searchText (tsvector) lacks a GIN index in Drizzle
The column is declared but only btree indexes appear in the Drizzle table
definition. Confirm 0044 (or earlier) ships USING gin (search_text) — if
absent, FTS scans linearly. Action: verify and add GIN if missing.
M5 — Username docstring drift
user_profiles.username CHECK is ^[a-z0-9._-]{2,30}$ (matches the validator),
but the TS docstring (src/lib/db/schema/users.ts:249) says "3–30 chars".
Cosmetic.
M6 — Polymorphic CHECK coverage gap on document_folders.entity_type
The CHECK round (0036+0042) covered yachts.current_owner_type,
invoices.billing_entity_type, yacht_ownership_history.owner_type,
document_sends.document_kind. Missing: a constraint that
document_folders.entity_type IN ('root','client','company','yacht') for
user-created folders. chk_system_folder_shape only fires when
system_managed = true.
M7 — JSONB blobs without DB-level validators
system_settings.value, audit_logs.metadata/oldValue/newValue,
notifications.metadata, savedViews.filters/sortConfig/columnConfig,
berth_pdf_versions.parseResults. The permission-overrides PUT route is well
sanitized (ALLOWED_RESOURCE_ACTIONS allow-list before write). userProfiles.preferences is validated and 8KB-capped at the API. The others rely on per-caller validators only.
M8 — scratchpadNotes.linkedClientId crosses ports without enforcement
Notes are user-scoped (no portId), but the linked client lives in a port. A
user reassigned between ports could open stale notes pointing at clients in a
port they no longer access. UI port-scoped queries hide them, but raw API
exposure does not.
M9 — 0027 nationality-ISO backfill is non-idempotent on dirty data
Re-running after manual edits overwrites the nationality_iso column. CLAUDE.md
notes the last_imported_at guard for berths (0024/0034 mooring normalization)
but 0027 has no such guard.
M10 — currency_rates has no retention
(base, target) is the only unique index; daily polling accumulates rows
forever. Low-priority (daily volume is small).
Migration replayability — verdict
Idempotency is strong across 0036+: DO $$ … EXCEPTION WHEN duplicate_object
blocks, IF NOT EXISTS on every CREATE INDEX, NOT VALID + VALIDATE pattern
in 0042/0044/0052. The 0028→0029 split (data move then DROP interests.berth_id)
is correct. 0046 DROP IF EXISTS + CREATE IF NOT EXISTS is correct. The
0050/0051/0052 folder-lifecycle chain forms a clean migration sequence with the
shape CHECK tightened in the right order.
The single replayability cliff is C1 above: 0052 + the absent migrator.
Partial-unique indexes — all verified present
| Constraint | Index | Source |
|---|---|---|
| one primary berth per interest | idx_ib_one_primary WHERE is_primary | interests.ts |
| one default brochure per port (non-archived) | idx_brochures_one_default_per_port WHERE is_default AND archived_at IS NULL | brochures.ts |
| username case-insensitive | idx_user_profiles_username_unique ON LOWER(username) WHERE NOT NULL | 0054 |
| one open alert per fingerprint | idx_alerts_fingerprint_open WHERE resolved_at IS NULL | insights.ts |
| one active yacht owner | idx_yoh_active WHERE end_date IS NULL | yachts.ts |
| one primary contact per (client, channel) | idx_cc_one_primary_per_channel WHERE is_primary | clients.ts |
| one active reservation per berth | idx_br_active WHERE status='active' | reservations.ts |
| one subfolder per entity per port | uniq_document_folders_entity WHERE entity_id IS NOT NULL | documents.ts (0051) |
| one global setting per key (NULLS NOT DISTINCT) | system_settings_key_port_idx | 0047 (see C2) |
| one primary client address | idx_ca_primary WHERE is_primary | clients.ts |
| one primary company address | idx_compa_primary WHERE is_primary | companies.ts |
Summary
- CRITICAL (2): no prod migration runner for 0052's CONCURRENTLY indexes; db:push skips two structural constraints (circular FK, NULLS NOT DISTINCT).
- HIGH (5): missing FK on new user_permission_overrides.user_id; nullable client FKs without set null block hard-delete; polymorphic owner_id un-validated; 0042 billing_entity_id tombstones invisible; system-folder write-protection is service-only.
- MEDIUM (10): missing partial indexes on companies + document_folders; userPortRoles allows duplicate roles; interest_berths restrict has no UI escape; audit_logs FTS GIN to verify; misc docstring drift, polymorphic CHECK gap on folders, JSONB writes without DB validators, scratchpad cross-port, 0027 idempotency, currency_rates retention.
Recommended sequencing: ship a real prod migration runner (C1), then a 0056 follow-up that closes H1 + H2 (the FK on user_permission_overrides.user_id and ON DELETE SET NULL on the nullable client FKs).
4. Services + realtime + queue + storage audit (services-auditor)
Services + Realtime + Queue + Storage Audit
Scope: business-logic correctness, webhook idempotency, BullMQ workers, Socket.IO fan-out, storage backend, cross-entity port isolation, the just-added notifyAdminEmailChange helper.
Repo: new-pn-crm @ feat/documents-folders. Audit window: ~22 min. Read-only.
CRITICAL
C1. updateUserInPort email-change bypasses Better Auth account row
Files: src/lib/services/users.service.ts:236-262, 355-387
db.update(user).set({ email: ... }) writes the new email directly to the Better Auth user table. The Better Auth account table (src/lib/db/schema/users.ts:194-210, providerId='credential') carries an accountId column that is typically the user's email — used by Better Auth's password-login flow to resolve a credential row. The update does NOT touch account.accountId, does NOT invalidate active sessions, does NOT update account.updatedAt, and does NOT use Better Auth's admin API (auth.api.updateUser / setEmail). Failure modes:
- After cutover the user cannot sign in with the new email (Better Auth resolves the credential by old accountId).
- Existing sessions (cookie keyed to userId) continue to work with the new email already showing in profile: confusing UX, no forced re-auth.
- The whole flow runs outside any transaction: userProfiles update (line 230), user update (line 247), userPortRoles update (line 281), audit log, and notification-fire are five independent writes. A failure between them leaves partial state with no rollback.
- No idempotency under retry: there is no guard that the email actually differs from the current account.accountId, and the email-change notification is fire-and-forget, so a retried admin request re-fires the courtesy email and rewrites all rows.
Fix: route through auth.api.updateUser (or write account.accountId + bump session invalidation) and wrap in a transaction.
C2. handleDocumentCompleted orphan blobs on mid-flight failure
File: src/lib/services/documents.service.ts:1100-1253
The idempotency early-return (doc.status === 'completed' && doc.signedFileId) only fires when both flags are set. The sequence is:
1. downloadSignedPdf (line 1120): may throw.
2. storage.put(storagePath, signedPdfBuffer) (line 1134): succeeds → blob exists.
3. ensureEntityFolder (line 1148): best-effort.
4. db.insert(files) (line 1166): succeeds → file row exists pointing at blob.
5. db.update(documents).set({status:'completed', signedFileId}) (line 1185): if this fails (e.g. transient connection loss after the files insert), the document keeps signedFileId = NULL.
On the retry from Documenso, the early-return short-circuit is bypassed (signedFileId still NULL). The function re-downloads, re-generates a new UUID (crypto.randomUUID() at line 1131), re-puts to a new key, inserts a second files row, and only then updates the document. The first blob from step 2 + the first files row are now orphaned (unreachable via document, but the file row still exists and may surface in aggregated listings with no docs link).
Additionally, the catch block (line 1244) marks status='completed' with no signedFileId, so the document is presented to the rep as "complete" while the signed PDF was never persisted. Subsequent webhook deliveries will re-run the handler (no early-return), but if Documenso stops retrying after its Nth attempt, the document is permanently stuck "completed with no file."
Fix options: (a) wrap files.insert + documents.update in one transaction; (b) delete the blob in the catch when the file row insert succeeds but the document update fails; (c) refuse to mark status='completed' in the catch — leave as-is so the next retry / cron poll succeeds.
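Options (b) and (c) can be pictured together as a compensating delete followed by a rethrow. The sketch below is a minimal illustration under stated assumptions: BlobStore, DocumentDb, linkSignedFile, and persistSignedPdf are hypothetical names, not the repo's storage backend or Drizzle calls, and linkSignedFile is assumed to perform the files insert and documents update in one transaction (option a).

```typescript
// Hypothetical interfaces: stand-ins for the real storage backend and DB layer.
interface BlobStore {
  put(key: string, data: Uint8Array): Promise<void>;
  delete(key: string): Promise<void>;
}

interface DocumentDb {
  // Assumed to insert the files row and update the document in ONE
  // transaction, so either both writes land or neither does (option a).
  linkSignedFile(documentId: string, storageKey: string): Promise<void>;
}

async function persistSignedPdf(
  store: BlobStore,
  db: DocumentDb,
  documentId: string,
  storageKey: string,
  pdf: Uint8Array,
): Promise<void> {
  await store.put(storageKey, pdf);
  try {
    await db.linkSignedFile(documentId, storageKey);
  } catch (err) {
    // Compensate: the blob was written but nothing points at it (option b).
    // A failed delete is swallowed so the original error is the one that surfaces.
    await store.delete(storageKey).catch(() => undefined);
    // Rethrow instead of marking the document completed (option c), so the
    // next webhook retry or cron poll starts from a clean slate.
    throw err;
  }
}
```

With this shape a retry never finds a first-attempt blob left behind, and the idempotency gate keeps its meaning because status='completed' is only ever written together with signedFileId.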
HIGH
H1. Notification-email worker HTML injection via notif.link
File: src/lib/queue/workers/notifications.ts:65-71
`<p>${notif.description ?? notif.title}</p>${
notif.link ? `<p><a href="${process.env.APP_URL}${notif.link}">View in CRM</a></p>` : ''
}`;
notif.description, notif.title, and notif.link are interpolated into HTML with no escaping. notif.link is mostly internal-generated (/documents/{id}) but several call sites push user-derived values into description (filenames, client names, custom alert text). A description of <img src=x onerror=...> ships as live HTML to the recipient's inbox. Lower-severity than C1 because most notifications are admin-only and the recipient is internal staff, but still an XSS-via-email primitive. Use the same renderEmailBody (allowlist) helper the send-out flow uses.
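If reusing renderEmailBody is not practical in the worker, plain entity escaping closes the same hole. The following is a minimal sketch, not the send-out flow's allowlist helper: escapeHtml, isInternalPath, renderNotificationHtml, and the NotificationLike shape are hypothetical names introduced for illustration.

```typescript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical shape: only the three fields the worker interpolates.
interface NotificationLike {
  title: string;
  description?: string | null;
  link?: string | null;
}

// Accept only app-relative paths such as "/documents/123";
// reject protocol-relative ("//host") and absolute URLs.
function isInternalPath(link: string): boolean {
  return link.startsWith("/") && !link.startsWith("//");
}

function renderNotificationHtml(notif: NotificationLike, appUrl: string): string {
  const body = `<p>${escapeHtml(notif.description ?? notif.title)}</p>`;
  if (!notif.link || !isInternalPath(notif.link)) return body;
  const href = escapeHtml(`${appUrl}${notif.link}`);
  return `${body}<p><a href="${href}">View in CRM</a></p>`;
}
```

A description of `<img src=x onerror=alert(1)>` then reaches the inbox as inert text rather than markup.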
H2. expense-dedup.markBestDuplicate lost-update race
File: src/lib/services/expense-dedup.service.ts:58-73
scanForDuplicates returns candidates, then markBestDuplicate writes duplicateOf. Two concurrent dedup-engine runs on a pair (A,B) can each mark the other as the duplicate → mutual duplicateOf cycle, both archived later by mergeDuplicate. No advisory lock, no transaction encompassing scan + update. Also: scanForDuplicates does not filter archived_at IS NULL, so already-merged sources can resurface as candidates.
H3. notes.service dead-code dispatch helper
File: src/lib/services/notes.service.ts:80-98
tableForEntity is defined and immediately void-discarded — every CRUD branch inlines its own switch. New entity types (e.g. residential_clients) added to the type union are silently missed by inlined branches because the exhaustive-switch compiler check is absent. This is the actual drift-vector for the polymorphic dispatch CLAUDE.md called out. Either delete the helper or refactor every CRUD operation to go through it.
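The compiler check that the inlined branches lack can be restored with a never-typed default branch. This is a sketch only: the NoteEntity members and the returned table names are illustrative assumptions, not taken from notes.service.ts or the schema.

```typescript
// Hypothetical entity union; the real one lives in notes.service.ts.
type NoteEntity = "client" | "interest" | "yacht";

// Calling this in the default branch makes the compiler reject any
// NoteEntity member that no case handles.
function assertNever(value: never): never {
  throw new Error(`Unhandled entity type: ${String(value)}`);
}

// Table names here are illustrative, not taken from the schema.
function tableForEntity(entity: NoteEntity): string {
  switch (entity) {
    case "client":
      return "client_notes";
    case "interest":
      return "interest_notes";
    case "yacht":
      return "yacht_notes";
    default:
      return assertNever(entity);
  }
}
```

Adding a new member to the union without a matching case turns the assertNever(entity) call into a type error, so the drift is caught at build time rather than in production.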
H4. Socket-server max-connections race
File: src/lib/socket/server.ts:103-106
const userSockets = await io!.in(`user:${session.user.id}`).fetchSockets();
if (userSockets.length >= 10) return next(new Error('Maximum connections reached'));
Between fetchSockets() and the eventual socket.join(user:${userId}) at line 132, another concurrent handshake can pass the same check. Under burst reconnect (e.g. flaky network across many tabs), users get 11+ sockets. The Redis adapter's fetchSockets is multi-pod-aware, but the gating is not atomic. Use a Redis INCR keyed by user:${id}:conn_count with TTL fallback, decrement on disconnect.
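The reserve-then-check pattern behind the INCR suggestion looks roughly like the sketch below. ConnectionCounter is an in-memory stand-in for Redis INCR/DECR, used here only to keep the example self-contained; tryAcquireSlot and releaseSlot are hypothetical names, and a production version would make async calls against the shared Redis and set a TTL.

```typescript
// In-memory stand-in for Redis INCR/DECR; in production these would be
// async calls against a shared Redis so the count is atomic across pods.
class ConnectionCounter {
  private counts = new Map<string, number>();

  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  decr(key: string): number {
    const next = Math.max(0, (this.counts.get(key) ?? 0) - 1);
    this.counts.set(key, next);
    return next;
  }
}

const MAX_CONNECTIONS = 10; // limit taken from the existing check in server.ts

// Reserve a slot first, then check the returned count. Every caller sees a
// distinct value, so two concurrent handshakes cannot both pass on "9".
function tryAcquireSlot(counter: ConnectionCounter, userId: string): boolean {
  const key = `user:${userId}:conn_count`;
  if (counter.incr(key) > MAX_CONNECTIONS) {
    counter.decr(key); // give the slot back before rejecting
    return false;
  }
  return true;
}

// Intended to be called from the socket's disconnect handler.
function releaseSlot(counter: ConnectionCounter, userId: string): void {
  counter.decr(`user:${userId}:conn_count`);
}
```

The TTL fallback matters because a pod that crashes never runs its disconnect handlers, which would otherwise leave the counter permanently inflated.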
H5. Documenso webhook timing side-channel
File: route.ts:60-68 + documenso-webhook.ts:13-21
verifyDocumensoSecret short-circuits on length !== expected.length before timingSafeEqual. Combined with the linear scan across all per-port secrets, response-time deltas leak the number of ports and the length of each secret. Marginal but easy fix: pad to fixed size.
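One variation on "pad to fixed size" is to hash both values first, which yields equal-length inputs for timingSafeEqual without any padding logic. The sketch below uses Node's crypto module; matchWebhookSecret and the SecretEntry shape are hypothetical, not the existing verifyDocumensoSecret signature.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both sides to a fixed 32-byte digest so timingSafeEqual never sees
// mismatched lengths and no up-front length check is needed.
function digest(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hypothetical shape for a per-port secret entry.
interface SecretEntry {
  portId: string | null;
  secret: string;
}

// Compare against every candidate without breaking out early, so the
// response time does not depend on which entry (if any) matched.
function matchWebhookSecret(provided: string, candidates: SecretEntry[]): SecretEntry | null {
  const providedDigest = digest(provided);
  let match: SecretEntry | null = null;
  for (const entry of candidates) {
    const equal = timingSafeEqual(providedDigest, digest(entry.secret));
    if (equal && match === null) match = entry;
  }
  return match;
}
```

This hides secret lengths and the position of the match. Total time still grows with the number of candidates, so the port count is not fully masked.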
H6. Global Documenso secret silently drops events under multi-tenant ambiguity
File: src/lib/services/documents.service.ts:967-996
resolveWebhookDocument correctly refuses to mutate when documensoId matches multiple ports AND no portId was passed. The webhook route now resolves portId from the matched secret (good — see comment at line 138-143). But the global env.DOCUMENSO_WEBHOOK_SECRET fallback entry returns portId: null (port-config.ts:370), and any port still using the global secret falls back to the "ambiguous → refuse" path. Result: if two ports share the global secret, valid completion events get silently dropped instead of routed. The dedupe + dead-letter on the inbound side doesn't surface this — it just looks like Documenso never delivered. Recommend: require per-port secrets for production and warn loudly when more than one port resolves to portId: null.
MEDIUM
M1. Storage migration loads each blob fully into Node memory
File: src/lib/storage/migrate.ts:170-204 (copyAndVerify)
for await (chunk of stream) { chunks.push(chunk) } materializes the full blob in memory twice (source read + verify re-read) per file. A 200MB signed PDF or GDPR export blows the worker. Consider piping through crypto.createHash('sha256') + tee to the target backend instead of Buffer.concat. The pre-flight free-disk check (line 298-310) does Promise.all(refs.map(head)) for every blob in the table — for large files tables that's thousands of round-trips before any copy starts.
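The hash-while-copying idea can be sketched as below, assuming the source exposes an async iterable of chunks. copyWithHash and its write callback are hypothetical; a real backend would supply its own upload stream in place of write.

```typescript
import { createHash } from "node:crypto";

// Forward each chunk to the target as it arrives and hash it on the way,
// so only one chunk is held in memory at a time.
// `write` is a hypothetical sink standing in for the target backend.
async function copyWithHash(
  source: AsyncIterable<Uint8Array>,
  write: (chunk: Uint8Array) => Promise<void>,
): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of source) {
    hash.update(chunk);
    await write(chunk);
  }
  return hash.digest("hex");
}
```

The returned digest can then be compared against a hash computed the same way over the target's read stream, which replaces the second full in-memory materialization during verify.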
M2. archiveInterest next-in-line dossier outside transaction
File: src/lib/services/interests.service.ts:1067-1112
The IIFE that builds next-in-line notifications fires after softDelete(interests, ...) and evaluateRule — both already queued via void. If the IIFE throws after the interest is archived but before notifications send, only a logger.error lands; the archived interest stays archived with no rep notification. Acceptable as best-effort, but the dossier doesn't run inside the same audit-context request (the createAuditLog call happens earlier), so an operator reading the audit trail sees "archived" without seeing what notifications were attempted. Consider attaching the dossier result to the audit metadata.
M3. attachWorkerAudit always records portId: null
File: src/lib/queue/audit-helpers.ts:50-86
Every job-failure audit row is written with portId: null. Multi-port operators querying their port-scoped audit log will not see worker failures that affected their port (e.g. a documenso-void job carrying portId in job.data). The worker has access to job.data.portId for most queues — extract it where present.
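Extracting the port id defensively is a few lines. A minimal sketch, assuming job.data is an arbitrary JSON value; portIdFromJobData is a hypothetical helper name.

```typescript
// Pull portId out of arbitrary job data without trusting its shape.
// Returns null when the field is absent, empty, or not a string.
function portIdFromJobData(data: unknown): string | null {
  if (typeof data !== "object" || data === null) return null;
  const candidate = (data as Record<string, unknown>).portId;
  return typeof candidate === "string" && candidate.length > 0 ? candidate : null;
}
```

The audit helper could pass portIdFromJobData(job.data) instead of a literal null, which keeps the current behaviour for queues whose payloads carry no port.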
M4. RECURRING_JOB_NAMES drift
File: src/lib/queue/audit-helpers.ts:27-48
Hardcoded Set requires manual sync against scheduler.ts. Typos silently demote cron heartbeats to regular completion logs. Either co-locate or compute from the scheduler module at boot.
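Computing the set from the schedule definitions removes the manual sync step. In this sketch SCHEDULED_JOBS, its job names, and its cron patterns are made up for illustration; the real definitions would come from scheduler.ts.

```typescript
// Hypothetical schedule table; names and cron patterns are illustrative only.
const SCHEDULED_JOBS = [
  { name: "currency-rates-refresh", cron: "0 * * * *" },
  { name: "reminder-sweep", cron: "*/5 * * * *" },
] as const;

// Derived once at module load, so the set cannot drift from the schedule.
const RECURRING_JOB_NAMES: ReadonlySet<string> = new Set<string>(
  SCHEDULED_JOBS.map((job) => job.name),
);

function isRecurringJob(jobName: string): boolean {
  return RECURRING_JOB_NAMES.has(jobName);
}
```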
M5. Aggregated workflow listing surfaces draft workflows
File: src/lib/services/documents.service.ts:1888
INFLIGHT_STATUSES = ['draft', 'sent', 'partially_signed'] includes draft. CLAUDE.md describes the UI section as "Signing-in-progress" — drafts have not been sent. Confirm intent.
M6. Documenso secrets stored plaintext in system_settings
File: src/lib/services/port-config.ts:351-373
listDocumensoWebhookSecrets reads systemSettings.value directly — no decryption. SMTP/IMAP passwords are AES-256-GCM-encrypted per CLAUDE.md; the Documenso webhook secret should be too.
M7. import worker is a no-op
File: src/lib/queue/workers/import.ts:13-17
process() body is // TODO(L2). Any job pushed to the import queue silently completes with no work — every CSV import is a silent success if the producer side ships first.
Observations on what is solid
- handleDocumentCompleted idempotency gate (line 1110) is correct when reached. The hazard is the partial-write window above (C2), not the gate itself.
- resolveWebhookDocument correctly refuses to mutate on multi-port ambiguity.
- Socket auth middleware (server.ts:91-124) cross-checks the client-supplied auth.portId against userPortRoles, which closes the prior tenant-room hijack.
- Storage filesystem backend correctly refuses to start when MULTI_NODE_DEPLOYMENT=true (filesystem.ts:218) using the zod-validated env, not raw process.env.
- Magic-byte verification is enforced both for brochures (brochures.service.ts:241-263) and berth PDFs (berth-pdf.service.ts:234-262) with delete-on-mismatch cleanup.
- File-aggregation projection (files.ts:316-379, 526-579) applies port_id at the entry-point assert, on companies.port_id / clients.port_id / yachts.port_id joins, on files.port_id in the predicate, and on the documents LEFT JOIN's residual (line 567). Defense-in-depth is consistent.
- Webhook worker has DNS-rebinding SSRF re-resolution at dispatch (webhooks.ts:18-45) and dead-letter handling with operator notifications.
Headline asks: C1 (Better Auth identity rotation), C2 (orphan-blob window), H1 (notification email XSS), H6 (global webhook secret ambiguity drops events silently).
5. Performance + code-trim + render-smoothness audit (perf-test-auditor)
Performance + Testing-Coverage Audit
Branch: feat/documents-folders · Scope: static analysis only.
Numbers: 116 vitest files / 1293 tests · 33 smoke specs · 68 services
files (15 with a unit-test file → 78 % of services have zero unit tests).
CRITICAL
C1. Zero test coverage for the user-mgmt + permission-override slice just shipped (commit 660553c)
git diff main adds: username sign-in, identifier resolver, per-user
permission-override matrix, role-label rendering, search keyword index,
user disable/email-change paths, dashboard widget toggles.
grep -rn 'username\|resolve-identifier\|permission-overrides' tests/ →
no matches. Not one smoke spec, not one integration test, not one
unit test. The feature ships dark.
Highest-risk slices:
- POST /api/auth/resolve-identifier: public, unauthenticated, rate-limited via a shared auth bucket. Anti-enumeration relies on a synthetic @auth.invalid fall-through. A wrong shape regression here silently re-enables username enumeration. Needs a vitest test with hit/miss/empty/error paths.
- PUT /api/v1/admin/users/[id]/permission-overrides: the schema allow-list (lines 47–80 of the route) is hand-maintained against RolePermissions. A drift here lets an admin grant themselves unlisted leaves. There's already an if (targetUserId === ctx.userId) self-target check; no test ensures it stays.
- The UserPermissionMatrix is the only UI for the new overrides table and is not rendered by any spec.
→ Fix: add at minimum one smoke spec under tests/e2e/smoke/24-admin-features.spec.ts
that logs in with username, opens an admin user, toggles a grant/deny,
and reloads. Add a vitest test against resolve-identifier covering the
four branches.
C2. Documents-hub aggregated projection runs 2 × (N companies + N yachts + N clients) sequential queries
src/lib/services/documents.service.ts:1923-1956 (workflow groups) and
the file-aggregation cousin do this:
for (const {id, name} of related.companies) {
const g = await fetchWorkflowGroupRows(portId, eq(documents.companyId, id));
…
}
fetchWorkflowGroupRows itself issues a SELECT + a separate COUNT
(2 round-trips). For a client with 5 companies + 5 yachts + 3 sibling
clients, opening the Documents tab fires (5+5+3)×2 = 26 sequential
queries on the inflight projection alone, plus another ~26 on the
files-aggregated cousin (mentioned in CLAUDE.md), so ~50 sequential
round-trips for a single tab open.
→ Fix: switch to a single SQL WHERE … IN (UNNEST(:companyIds)) GROUP BY :source_kind returning grouped rows + a count window, or at minimum
Promise.all the per-id calls so latency is parallel.
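The lighter of the two fixes, running the per-id calls concurrently, can be sketched as follows. fetchGroupsInParallel is a hypothetical wrapper, and the fetchGroup parameter stands in for a call such as fetchWorkflowGroupRows(portId, eq(documents.companyId, id)).

```typescript
// Hypothetical per-entity fetcher; stands in for fetchWorkflowGroupRows.
type FetchGroup<T> = (id: string) => Promise<T>;

// Start every per-id fetch before awaiting any of them, so total latency is
// roughly that of the slowest call instead of the sum of all calls.
// Promise.all preserves input order in its result array.
async function fetchGroupsInParallel<T>(ids: string[], fetchGroup: FetchGroup<T>): Promise<T[]> {
  return Promise.all(ids.map((id) => fetchGroup(id)));
}
```

This still issues the same 26 queries, only concurrently, so it competes for connection-pool slots under load. The single grouped query remains the stronger fix.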
HIGH
H1. listUsers is sequential and unbounded (no pagination)
src/lib/services/users.service.ts:16-104 — two sequential SELECTs
(port-role rows then super-admin rows). Should be one query with a
UNION/LEFT JOIN, or at minimum Promise.all. No limit/offset. For
the multi-tenant install where a port could grow to thousands of users
this becomes O(N) memory + payload per admin page open. GET /api/v1/admin/users
also lacks pagination.
→ Fix: collapse to one SQL with LEFT JOIN userPortRoles … OR userProfiles.isSuperAdmin,
add limit/offset, surface { data, total, hasMore }.
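The { data, total, hasMore } envelope and the clamping of caller-supplied paging values might look like the sketch below. Page, clampPaging, toPage, and the default cap of 100 are assumptions for illustration, not the pattern already used by /api/v1/clients.

```typescript
interface Page<T> {
  data: T[];
  total: number;
  hasMore: boolean;
}

// Clamp caller-supplied paging values so a request cannot ask for everything.
// maxLimit = 100 is an assumed cap, not a value taken from the codebase.
function clampPaging(limit: number, offset: number, maxLimit = 100): { limit: number; offset: number } {
  const safeLimit = Number.isFinite(limit) ? Math.floor(limit) : maxLimit;
  const safeOffset = Number.isFinite(offset) ? Math.floor(offset) : 0;
  return {
    limit: Math.min(Math.max(1, safeLimit), maxLimit),
    offset: Math.max(0, safeOffset),
  };
}

// Build the response envelope from one page of rows plus the total count.
function toPage<T>(rows: T[], total: number, offset: number): Page<T> {
  return { data: rows, total, hasMore: offset + rows.length < total };
}
```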
H2. DataTable rebuilds the columns array on every render
src/components/shared/data-table.tsx:109-137 constructs allColumns
on every render with no useMemo. TanStack Table's docs explicitly warn
this resets internal state (sorting, column resizing, virtual scrolling
indices) every render. For the clients/interests lists with 50+ rows
and 10+ columns this stalls every parent state change.
→ Fix: useMemo(() => […selectColumn?, …columns], [bulkActions, columns]).
H3. Recharts is statically imported in widget-registry.tsx — every dashboard chart ships in the initial bundle
src/components/dashboard/widget-registry.tsx:15-25 static-imports 7
chart files which in turn pull recharts (~80–150 KB gzipped). The
registry is the only entry point for the dashboard so the first
dashboard load pays the entire recharts cost even for users whose
widgets are all hidden.
→ Fix: const PipelineFunnelChart = dynamic(() => import('./pipeline-funnel-chart').then(m => m.PipelineFunnelChart), { ssr:false, loading: () => <WidgetSkeleton/> }) per chart. Same fix for website-analytics/pageviews-chart.tsx.
H4. tiptap-to-pdfme.ts (571-line module) ships to the client just for TEMPLATE_VARIABLES
src/components/admin/document-templates/{template-form,template-preview}.tsx
import TEMPLATE_VARIABLES from @/lib/pdf/tiptap-to-pdfme. The named
import drags the whole module (~570 lines of TipTap→pdfme transform
logic) into the client bundle even though only ~60 lines of constant
data are used. The @pdfme/common import is type-only so that part is
stripped, but the runtime code still ships.
→ Fix: split TEMPLATE_VARIABLES into a leaf file (@/lib/pdf/template-variables.ts)
that has no other imports; have tiptap-to-pdfme.ts re-export it for
server-side callers.
H5. notifications.service.ts:updatePreferences runs N sequential upserts in a loop
src/lib/services/notifications.service.ts:368-385 — one INSERT … ON
CONFLICT per preference row. For ~30 notification types that's 30 round
trips per "Save preferences" click. Trivially batchable as a single
db.insert().values(rows).onConflictDoUpdate(…).
H6. GET .../permission-overrides chains 5 sequential round-trips
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:99-138
goes profile → portRole → role → portOverride → userOverride
sequentially. Each is independent on the userId once profile is
loaded; collapse the trailing four into Promise.all.
H7. command-search.tsx invalidates two query keys every time the dropdown opens
src/components/search/command-search.tsx:142-146:
useEffect(() => {
if (!showDropdown) return;
queryClient.invalidateQueries({ queryKey: ['search', 'recently-viewed'] });
queryClient.invalidateQueries({ queryKey: ['search', 'recent-terms'] });
}, [showDropdown, queryClient]);
Each time the user clicks the search box, two queries refire. The
useSearch hook already sets staleTime: 30_000 for these. Invalidating
on every open defeats the staleTime entirely. Use the existing staleTime
or refetchOnMount: 'always' for a single trigger.
MEDIUM
M1. UserPermissionMatrix re-creates setOverrides closures every render
src/components/admin/users/user-permission-matrix.tsx:158-169 defines
setState (non-memoized) and passes it down inside .map() rows. The
component itself is small (~180 leaves) so the impact is modest, but
the 3-state buttons render 540 closures every save. Wrap setState/getState
in useCallback, or pull them out as module-scope pure helpers
taking (overrides, setOverrides).
M2. dashboard.service.ts:getPipelineForecast scans every active interest into memory
Lines 119-156 fetch every non-archived interest + primary berth and
reduce in JS. Push to SQL with SUM(price * CASE WHEN stage = … END) GROUP BY pipeline_stage.
M3. documents.service.ts:listDocuments LEFT-JOINs documents on signed_file_id, but no pagination on folder views
grep indicates listDocuments has no limit/offset when folderId
is set. A folder with 1000+ files would dump the whole set. Verify
with a quick read; if missing, add the same { limit, offset } pattern
used by /api/v1/clients.
M4. service tests gap: 53 of 68 service files have zero unit-test files
Service files with no test include several high-risk surfaces:
- interests.service.ts (multi-berth, primary-flag invariants)
- documents.service.ts (folder soft-rescue, owner-wins chain, system-folder lock)
- document-folders.service.ts (cycle prevention, sibling uniqueness)
- notifications.service.ts (preference dedupe key, watcher fan-out)
- users.service.ts (createUser, role assignment, deactivate path)
- dashboard.service.ts (forecast math, hot-deal rank)
- client-merge.service.ts, client-hard-delete.service.ts, client-archive.service.ts (destructive paths: exactly the surfaces most worth testing)
- interest-berths.service.ts (the "never query from outside this service" rule has no integration test enforcing the partial unique index logic)
- email-accounts.service.ts (AES-256-GCM round-trip, no test ensures the decrypt path stays sound after a key rotation)
- recently-viewed.service.ts, search.service.ts (search bucket expansion already partly tested but service-level branches missing)
Even one happy-path + one edge-case test per file would lift the coverage floor enormously.
M5. Playwright coverage gap matches the new flows verbatim
grep -rn 'username\|permission-overrides\|disable.*user\|email[-_]change\|widget.*toggle' tests/e2e/ → no hits.
Specs that should exist but don't:
- tests/e2e/smoke/01-auth.spec.ts: currently only tests email login; needs a username-only path.
- tests/e2e/smoke/24-admin-features.spec.ts: needs a permission-override three-state toggle path and a user disable path.
- tests/e2e/smoke/10-dashboard.spec.ts: needs a widget visibility toggle + reload assertion (the persistence path in user_profiles.preferences has only the strict-allow-list cap, no UI-level integration test).
M6. realtime-toasts.tsx was modified without a Playwright spec or vitest unit
Modified in this branch; no spec covers toast deduplication / port-id filtering. Realtime fan-out is a textbook noisy-neighbor surface — a regression here floods users with toast spam.
M7. interest-detail-header.tsx, yacht-tabs.tsx, user-card.tsx modifications
These three changed in this branch with no corresponding spec change.
The smoke 02-crud-spine.spec.ts exercises the underlying CRUD but
doesn't assert the new inline-edit visuals shipped in commit 04a5949.
M8. clients.service.ts:986 db.query.clients.findMany with no limit
The function looks like a "find all matching" helper. If it's reached
by any non-internal call path it could dump every client. Worth a
direct read and a limit arg.
M9. command-search.tsx paste handler awaits apiFetch inside onPaste synchronously
Lines 189-206 — onPaste is async and awaits apiFetch before
e.preventDefault(). By the time the fetch resolves, the paste event
has already been processed and the text dropped into the input. The
preventDefault inside the if (res.found && res.href) block
silently no-ops on most pastes. Either preventDefault unconditionally
up-front, or read e.clipboardData and treat as plain "lookup +
navigate" without trying to cancel.
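The "preventDefault unconditionally up-front" option can be sketched without a DOM by narrowing the event to the two members the handler uses. PasteEventLike, LookupResult, handlePaste, and the insertText callback are hypothetical; insertText stands for whatever writes text into the search input.

```typescript
// Minimal slice of ClipboardEvent that the handler needs; keeps the sketch DOM-free.
interface PasteEventLike {
  preventDefault(): void;
  clipboardData: { getData(format: string): string } | null;
}

// Hypothetical lookup result, mirroring the res.found / res.href fields above.
interface LookupResult {
  found: boolean;
  href?: string;
}

function handlePaste(
  event: PasteEventLike,
  lookup: (text: string) => Promise<LookupResult>,
  navigate: (href: string) => void,
  insertText: (text: string) => void,
): Promise<void> {
  // Read the clipboard and cancel the default paste synchronously,
  // before any await gives the browser a chance to insert the text.
  const text = event.clipboardData?.getData("text/plain") ?? "";
  event.preventDefault();

  return lookup(text)
    .then((res) => {
      if (res.found && res.href) navigate(res.href);
      else insertText(text); // no match: put the text into the input ourselves
    })
    .catch(() => insertText(text)); // lookup failed: fall back to a plain paste
}
```

The trade-off is that every paste waits on the lookup before text appears, so this variant only makes sense if the lookup is fast. The other option in the finding, lookup and navigate without cancelling, avoids that delay.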
M10. audit-search.service.ts:80 and gdpr-export.service.ts:266 use findMany without limit
Both are admin-only but a long-running port renders the audit page in tens of seconds.
Good news / verified safe
- serverExternalPackages in next.config.ts keeps pino/bullmq/ioredis/minio/postgres/better-auth/nodemailer off the client.
- NAV_CATALOG (175 entries) is only reached via dynamic import() from search.service.ts. Server-only, not in any client bundle.
- lucide-react, pdfme/generator, pdf-lib, recharts are absent from RSC client boundaries except the dashboard widgets (H3).
- lucide-react imports are all named (tree-shake safe).
- No sync crypto, no sync PDF rendering in request handlers. JSON.parse only on cheap surfaces.
- useSearch debounce + keepPreviousData + staleTime: 30s is correct.
6. Observability + i18n + docs-drift audit (obs-i18n-docs-auditor)
Observability + i18n + Docs-Drift Audit
Auditor: obs-i18n-docs-auditor • Branch: feat/documents-folders • Date: 2026-05-12
Scope: A) createAuditLog coverage, pino discipline, error-event pipeline; B) timezone / currency / country / date-picker; C) CLAUDE.md, BACKLOG.md, numbered specs, admin-search keywords, NAV_CATALOG hrefs.
CRITICAL
C1. NAV_CATALOG has dead links — global topbar search jumps to 404s
src/lib/services/search-nav-catalog.ts — three confirmed-dead entries; cmd-K search routes users to non-existent routes:
| Catalog href | Actual route | Notes |
|---|---|---|
| /:portSlug/admin/audit-log | /:portSlug/admin/audit | Audit log card link |
| /:portSlug/admin/error-events | /:portSlug/admin/errors | Super-admin platform errors |
| /:portSlug/user-settings | /:portSlug/settings/profile | User-menu uses correct path |
Also dead — these :portSlug/settings/<X> paths have no folder under src/app/(dashboard)/[portSlug]/settings/; the only subroute that exists is profile/:
- /:portSlug/settings/email
- /:portSlug/settings/branding
- /:portSlug/settings/templates
- /:portSlug/settings/storage
- /:portSlug/settings/recommender
- /:portSlug/settings/tags
- /:portSlug/settings/notifications
These look like aliases that were intended to deep-link inside /settings tabs but never wired up. Either redirect them to /admin/<x> (which all exist) or render real settings/<x> pages.
C2. Webhooks bypass the platform-error pipeline
src/app/api/webhooks/documenso/route.ts is the only webhook route in the repo and it does NOT call errorResponse(...) / captureErrorEvent(...). The handler always returns 200 with logger.error(...) only, so admin/errors never sees Documenso webhook crashes — the CLAUDE.md/docs imply errors flow into error_events universally but webhooks are silently outside that flow. Recommended: wrap the handler in a try/catch that calls captureErrorEvent({ statusCode: 500, error, metadata: { source: 'webhook', event } }) before returning 200.
HIGH
H1. PDF templates hard-code en-GB date locale (ignores user prefs)
Every numbered PDF template hardcodes toLocaleString('en-GB', …) / toLocaleDateString('en-GB') regardless of the rendering user's locale/timezone:
- src/lib/pdf/templates/interest-summary-template.ts:85,162
- src/lib/pdf/templates/client-summary-template.ts:97,133,143,156
- src/lib/pdf/templates/berth-spec-template.ts:172,187
- src/lib/pdf/templates/invoice-template.ts:116
- src/lib/pdf/templates/reports/{activity,occupancy,pipeline,revenue}-report.ts (all use 'en-GB')
- src/lib/email/templates/document-signing.ts:141 (completedAt.toLocaleString('en-GB', …))
CLAUDE.md and the new dashboard greeting / timezone-drift banner suggest the rep's locale + timezone is honoured end-to-end. It isn't — at the PDF / signing-email surface we silently revert to en-GB. User-preference timezone/locale from user_profiles is plumbed nowhere into these templates.
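Plumbing the preference through comes down to passing a locale and an IANA time zone into Intl.DateTimeFormat rather than a literal 'en-GB'. A minimal sketch: LocalePrefs and formatForUser are hypothetical names, and the actual preference fields on user_profiles are not shown in this audit.

```typescript
// Hypothetical preference shape; the real fields live on user_profiles.
interface LocalePrefs {
  locale: string; // BCP 47 tag, e.g. "en-GB" or "fr-FR"
  timeZone: string; // IANA zone, e.g. "Europe/Paris"
}

// Format an instant for a specific user instead of a hardcoded 'en-GB'.
function formatForUser(date: Date, prefs: LocalePrefs): string {
  return new Intl.DateTimeFormat(prefs.locale, {
    timeZone: prefs.timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
  }).format(date);
}
```

The time zone matters as much as the locale: the instant 2026-05-12T23:30:00Z falls on 12 May in UTC but on 13 May in Asia/Tokyo, so a signing email rendered without the rep's zone can show the wrong calendar day.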
H2. PDF templates hard-code USD price formatting & build raw Number().toLocaleString() strings
- interest-summary-template.ts:112, berth-spec-template.ts:127,172: berth.priceCurrency ?? 'USD' followed by Number(price).toLocaleString() (no Intl currency formatter, no grouping conventions per locale).
- reports/pipeline-report.ts:93, reports/revenue-report.ts:78,86: Number(...).toLocaleString() with no currency code at all in revenue report.
Single-source formatCurrency() exists at src/lib/utils/currency.ts and is used everywhere else — these templates should call it.
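For reference, this is the difference an Intl currency formatter makes over a bare toLocaleString(). formatMoney below is a hypothetical stand-in; the signature of the shared formatCurrency() helper is not reproduced in this audit, so the sketch does not assume it.

```typescript
// Locale-aware currency formatting via Intl.NumberFormat.
// Hypothetical helper; the templates should call the shared formatCurrency().
function formatMoney(amount: number, currency: string, locale: string): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// Same amount, two conventions: symbol position, grouping, and decimal
// separators all follow the locale and currency code.
const usd = formatMoney(1234.5, "USD", "en-US"); // "$1,234.50"
const eur = formatMoney(1234.5, "EUR", "de-DE"); // grouped as 1.234,50 with a euro sign
```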
H3. Dashboard widgets hard-code 'USD' despite per-port berths_default_currency
berths_default_currency is a system_settings key (admin/settings/settings-manager.tsx:223). But:
- src/components/dashboard/kpi-cards.tsx:19 uses formatCurrency(value, 'USD', …)
- src/components/dashboard/revenue-forecast.tsx:25 does the same
- src/components/dashboard/pipeline-value-tile.tsx:45,47 does the same (the inner data field is pipelineValueUsd; the backend converts to USD before sending). The pipeline value tile is fine because its comment declares it "USD-denominated", but the KPI / forecast tiles silently render Euro/GBP ports as USD.
H4. CLAUDE.md missing two new auth surfaces
Migrations 0054 (user_profiles.username) and 0055 (user_permission_overrides) shipped in this branch. CLAUDE.md has zero mention of:
- Username sign-in alternative (login form + resolve-identifier endpoint + src/lib/validators/username.ts).
- Per-user permission overrides (effective-permission chain is now: role → port_role_overrides → user_permission_overrides).
The "Conventions / Auth" section currently implies user_port_roles.role is the leaf authority. New developers won't know to apply user-level overrides when reasoning about effective permissions.
H5. feedback_pwa_assets_pending memory is stale
User memory says PWA assets (icon-192.png, icon-512.png, icon-512-maskable.png) must be added before shipping Phase B scanner. All three exist in public/ plus apple-touch-icon.png. Memory should be cleared.
MEDIUM
M1. archiveBrochure has no createAuditLog call
src/lib/services/brochures.service.ts:191 — service-level archive (archivedAt + isDefault: false) commits without an audit row. Every other archive/delete in this branch (yachts, clients, companies, interests, berths, documents, document-folders, files, invoices, document-templates, email-accounts, users, roles, portal-auth, custom-fields) creates audit logs. Brochures is the outlier — same UX risk as the others (admin can swap default brochure with no trail).
M2. PII risk: portal-auth logs the email address of unknown / disabled-portal users
src/lib/services/portal-auth.service.ts:356,373,423 log email, user.email. Logger redact paths cover passwords / tokens / encrypted blobs but not email / *.email. For most CRM logging this is fine (emails are not secret in this app), but the portal-reset paths specifically log emails of users outside the active session — a quiet PII surface in log aggregators. Recommend either (a) hash-prefix the email (hash6(email)) before logging, or (b) accept-but-document.
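Option (a) could be as small as the sketch below. hash6 is the name suggested above; the choice of SHA-256, lowercase normalization, and a six-character hex prefix are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Short, stable tag for correlating log lines about the same address
// without writing the address itself. Normalizes case and whitespace first
// so "A@x.io " and "a@x.io" produce the same tag.
function hash6(email: string): string {
  return createHash("sha256")
    .update(email.trim().toLowerCase(), "utf8")
    .digest("hex")
    .slice(0, 6);
}
```

An unsalted hash can still be matched against a list of known addresses, so this reduces casual exposure in log aggregators rather than providing anonymity. A keyed HMAC would resist that kind of matching.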
M3. Pino logger.info discipline — Documenso & IMAP chatter
- src/app/api/webhooks/documenso/route.ts:122,156,187,243,258,262: six logger.info(…) per webhook fire (duplicate skip, lifecycle event, unhandled event type). At realistic Documenso traffic + retry pressure this is noisy. Consider downgrading the 'Documenso lifecycle event' line at L258 (fires on every valid event) to logger.debug.
- src/lib/services/email-threads.service.ts:290,298,358: IMAP sync logs mailbox.exists, messageCount, 'No new messages to sync' at info on every poll. At 5-min poll cadence × 24h × N accounts this floods info-level logs. Should be logger.debug.
M4. Timezone-aware reminder dueAt storage looks correct but UI hands off naïve strings
reminders.dueAt is stored as TIMESTAMPTZ (reminders.service.ts:179 — new Date(data.dueAt)). The validator accepts an ISO string. <DateTimePicker> in date-time inputs reads new Date(input) from the browser — interpretation is local-TZ for YYYY-MM-DDTHH:mm strings, UTC for full ISO with Z. Worth a focused look on the picker component to confirm it emits Z-terminated ISO (else "reminder at 9 AM" means 9 AM browser-local on creation, but server's formatInTimeZone against the rep's chosen TZ will misalign). I did not open <DateTimePicker> itself in this audit — flagging as a 30-minute follow-up.
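Until the picker is checked, the validator boundary can refuse naive strings outright. A minimal sketch: hasExplicitOffset and parseDueAt are hypothetical helpers, not the existing reminders validator.

```typescript
// True when an ISO 8601 date-time string pins itself to an absolute instant,
// i.e. it ends in "Z" or a numeric offset such as "+02:00".
function hasExplicitOffset(isoDateTime: string): boolean {
  return /(Z|[+-]\d{2}:\d{2})$/i.test(isoDateTime.trim());
}

// Reject naive strings instead of letting new Date() silently interpret
// them in whatever local zone the parsing process happens to run in.
function parseDueAt(input: string): Date {
  if (!hasExplicitOffset(input)) {
    throw new Error(`dueAt must include a UTC offset or "Z": ${input}`);
  }
  const parsed = new Date(input);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`dueAt is not a valid date-time: ${input}`);
  }
  return parsed;
}
```

With this in place a picker that emits YYYY-MM-DDTHH:mm fails loudly at the API instead of storing a reminder that is off by the difference between the browser's and the rep's chosen time zone.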
M5. CLAUDE.md numbered-spec section frames 01-15 as authoritative
CLAUDE.md says:
"Numbered spec files in repo root (01-…through 15-…) contain detailed architecture decisions, feature specs, DB schema docs, API catalog, and implementation sequence."
These specs document the pre-rebuild Nuxt 3 / NocoDB system being migrated FROM. 01-CONSOLIDATED-SYSTEM-SPEC.md header reads "Compiled: 2026-03-11" and the stack tables describe Nuxt 3 SPA, NocoDB, Keycloak OIDC, etc. — none of which are the live Next.js + Drizzle + better-auth stack. New contributors reading CLAUDE.md will be sent down the wrong path. Recommend reframing to "legacy reference for the rebuild target" or moving them to docs/legacy/.
M6. BACKLOG.md doc-folders entry stale vs reality
docs/BACKLOG.md E. "Hidden / stubbed UI tabs" still lists Company Documents tab as ✅ landed 2026-05-08 but in the same section says Berth Waiting List + Maintenance Log tabs are "Removed entirely; revisit if/when product asks" — yet src/components/berths/berth-tabs.tsx still imports and renders the tabs strip (the comment in CLAUDE.md is silent on this). Not blocker — just a doc/code drift.
M7. Admin sections browser missing two real admin routes
src/components/admin/admin-sections-browser.tsx registers 30 hrefs. Two /[portSlug]/admin/* routes exist but are NOT surfaced in the browser:
- /admin/brochures (full UI exists at src/app/(dashboard)/[portSlug]/admin/brochures/page.tsx)
- /admin/errors (super-admin platform-errors inspector, real route)
The NAV_CATALOG catalog (Cmd-K) covers "Platform errors" via the (dead — see C1) error-events href but no entry for /admin/brochures. Reps cannot discover the brochure admin surface from either the section card grid or global search.
M8. Settings-manager keyword catalog drift
admin-sections-browser keywords list (settings card) is in sync with settings-manager.tsx KNOWN_SETTINGS keys today (21 keys, 39 aliases). However, two settings exist in production that are NOT in either list:
- documenso_signing_order (CLAUDE.md L46): typeable via Documenso admin page, not the generic Settings card; reasonable to omit but flag if you want unified search.
- documenso_redirect_url: same.
Not a bug — just confirming the drift surface is intentional.
What was NOT in scope but worth a quick note
- Audit-coverage spot-check for sensitive mutations: clients/yachts/companies/interests/berths/documents/document-folders/document-templates/invoices/users/roles/portal-auth/files/custom-fields/document-sends/email-accounts ALL call createAuditLog. Only brochures.archiveBrochure is missing (M1). No other gaps spotted in services that I sampled.
- Pino redact paths cover passwords, tokens, secrets, encrypted blobs, Authorization headers, cookie headers, two-level nesting: comprehensive. Only soft gap is the email field (M2).
- Error pipeline: errors.ts → captureErrorEvent is invoked on every 5xx route response; the error_events table is read by admin/errors. Looks complete for API routes; the gap is webhooks (C2).
- Country/nationality: consistently stored as ISO-3166-1 alpha-2 across clients, companies, residential, with validators centralized in src/lib/validators/i18n.ts. Good.
Recommended fix order
- (C1) Fix the 10 dead search-nav-catalog.ts hrefs: pure typo fixes, very high user-visible impact.
- (C2) Wrap webhook handler in captureErrorEvent.
- (H4) Add CLAUDE.md sections for username + user permission overrides.
- (H1, H2, H3) PDF & dashboard locale/currency consistency pass: plumb user prefs through, kill remaining 'en-GB' / 'USD' hardcodes.
- (M1) createAuditLog in archiveBrochure.
- (M5) Reframe or relocate numbered specs 01-15.
- (M3) Demote 2-3 chatty logger.info lines to logger.debug.
- (H5) Clear stale pwa_assets_pending user memory.
~1450 words.
7. Concurrency + race conditions audit (concurrency-auditor)
Concurrency & Race-Condition Audit — pn-crm
Scope: 22-minute read-only sweep of services, queue workers, webhook handlers, and
schema invariants. Findings grouped by severity. Line references against
feat/documents-folders.
CRITICAL
C-1. handleDocumentCompleted TOCTTOU lets two concurrent webhook retries both download + persist the signed PDF
src/lib/services/documents.service.ts:1100-1190
The idempotency gate if (doc.status === 'completed' && doc.signedFileId) return;
is read outside any row lock. The CLAUDE.md note ("idempotent — early-returns
when …") is true for sequential retries but not for concurrent ones.
Real-world hit: Documenso retries DOCUMENT_COMPLETED on a 5xx, and the local poll
worker also reconciles. If both arrive within milliseconds (e.g. the receiver was
slow once then retried while the poll worker also fires), both pass the gate,
both call downloadSignedPdf + storage.put, both db.insert(files), and both
UPDATE documents.signed_file_id. The losing file row stays in files, but its
blob has no documents row pointing at it → permanent orphan blob plus a
duplicate file in the entity folder.
Fix: wrap the whole block in db.transaction + SELECT … FROM documents WHERE id = $1 FOR UPDATE before re-checking the gate. (Or add a partial unique index
(document_id) WHERE document_type='signed_pdf' on files so the DB rejects the
second insert.)
C-2. BullMQ jobs have NO jobId — every webhook retry / duplicate enqueue creates a new job
src/lib/queue/index.ts:24-39 (queue defaults), every call site of queue.add(...)
(see inquiry-notifications.service.ts:51,118, webhook-dispatch.ts:59,
webhooks.service.ts:323,374, invoices.ts:661,790, gdpr-export.service.ts:101,
reports.service.ts:90, email-draft.service.ts:59, notifications.service.ts:165,
expenses.ts:190, documents.service.ts send-out paths).
BullMQ deduplicates only when callers pass { jobId: stableKey }. Nothing in the
repo does. Implications:
- A second Documenso 5xx retry that comes through handleDocumentCompleted doesn't go through BullMQ, but webhook outbound deliveries via webhook-dispatch.ts go through queue.add('deliver', …). If dispatchEvent is called twice for the same event id, both rows get a delivery job and the external endpoint sees the event twice.
- notifications.service.ts:165 enqueues a notification-email job per insert; the dedupeKey collapses DB rows but not jobs. A createNotification that fails after the dedupe collapse but before the job add can leave the queue short; a successful dedupe still adds a fresh job each call.
- The maintenance queue (concurrency 1, attempts 3) backs off on failure, and with no removeOnFail cap on count, a misbehaving job that errors thousands of times can balloon Redis.
Fix: every "logically once-per-X" job needs jobId: `${name}:${entityId}`, plus a removeOnFail: { count: 1000 } cap on the queue defaults.
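A minimal sketch of the stable-jobId idea, not the repo's code. The names stableJobId, onceJobOptions and FakeQueue are hypothetical; the fake only imitates the behaviour this finding relies on (an add whose jobId is already present creates no new job). The sketch joins with `__` instead of `:` because recent BullMQ releases may reject custom ids that contain a colon; that is worth checking against the installed version before settling on a format.

```typescript
// Hypothetical helper: name, separator and sanitising rule are assumptions.
export function stableJobId(name: string, entityId: string): string {
  // Replace anything outside a conservative character set so the id stays
  // safe as part of a Redis key. Ids that differ only in replaced characters
  // would collide, so this assumes UUID-like entity ids.
  const clean = (s: string): string => s.replace(/[^A-Za-z0-9_.-]/g, "_");
  return `${clean(name)}__${clean(entityId)}`;
}

// Mirrors the two job options named in the finding (jobId, removeOnFail).
export interface OnceJobOptions {
  jobId: string;
  removeOnFail: { count: number };
}

export function onceJobOptions(name: string, entityId: string): OnceJobOptions {
  return { jobId: stableJobId(name, entityId), removeOnFail: { count: 1000 } };
}

// In-memory stand-in for a queue that ignores an add whose jobId already exists.
export class FakeQueue {
  private jobs = new Map<string, { name: string; data: unknown }>();

  // Returns true when a new job was stored, false when it was a duplicate.
  add(name: string, data: unknown, opts: OnceJobOptions): boolean {
    if (this.jobs.has(opts.jobId)) return false;
    this.jobs.set(opts.jobId, { name, data });
    return true;
  }

  get size(): number {
    return this.jobs.size;
  }
}
```

With a real queue the call site would look roughly like queue.add('deliver', payload, onceJobOptions('deliver', eventId)); the job name and payload here are placeholders.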
C-3. advanceStageIfBehind double-fires evaluateRule on parallel webhook deliveries
src/lib/services/interests.service.ts:881-908. The current-stage read is plain
db.query.interests.findFirst, no FOR UPDATE. Two concurrent calls (DOCUMENT_SIGNED
+ DOCUMENT_COMPLETED, both routed to eoi_signed, arriving in parallel because
Documenso may send the RECIPIENT_SIGNED + DOCUMENT_COMPLETED pair
near-simultaneously) both observe currentIdx < targetIdx, both call
changeInterestStage, both trigger evaluateRule('eoi_signed', …). The downstream
berth-rule then auto-flips the berth status twice — and if the rule has any side
effect like queue.add (it does — see berth-rules-engine), you get two of them.
The comment at line 1274 ("Guard against double-fire") shows the author noticed
the risk but only added an idempotency check on the eoi-signed-and-beyond
branch, not on the advanceStageIfBehind path itself.
Fix: pull the interest row with FOR UPDATE inside advanceStageIfBehind
and call changeInterestStage in the same transaction, OR move the rule-fire
side effect inside changeInterestStage and gate it on the actual UPDATE
returning a row (i.e. .returning() + check updated.pipelineStage === target).
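The second option (gate the rule on the UPDATE actually returning a row) can be sketched as below. This is an in-memory stand-in, not the service code: the stage list is a placeholder apart from eoi_signed, and advanceIfBehind / onDocumentEvent are hypothetical names. In PostgreSQL the equivalent conditional UPDATE is safe under READ COMMITTED because the second writer blocks on the row and then re-evaluates its WHERE clause against the committed value.

```typescript
// Placeholder stage order for illustration; only "eoi_signed" comes from the report.
const STAGES = ["earlier_stage", "eoi_signed", "later_stage"] as const;
export type Stage = (typeof STAGES)[number];

export interface InterestRow {
  id: string;
  pipelineStage: Stage;
}

// In-memory stand-in for a conditional UPDATE ... RETURNING, e.g.
//   UPDATE interests SET pipeline_stage = $2
//   WHERE id = $1 AND pipeline_stage = ANY($3)  -- $3 = stages before the target
//   RETURNING *;
// Returns the updated row, or undefined when no row matched.
export function advanceIfBehind(
  rows: Map<string, InterestRow>,
  id: string,
  target: Stage,
): InterestRow | undefined {
  const row = rows.get(id);
  if (!row) return undefined;
  if (STAGES.indexOf(row.pipelineStage) >= STAGES.indexOf(target)) return undefined;
  row.pipelineStage = target;
  return row;
}

// Hypothetical caller: the rule fires only for the call whose update matched a row.
export function onDocumentEvent(
  rows: Map<string, InterestRow>,
  id: string,
  fireRule: (stage: Stage) => void,
): void {
  const updated = advanceIfBehind(rows, id, "eoi_signed");
  if (updated) fireRule("eoi_signed");
}
```

Two deliveries for the same interest then produce one stage change and one rule evaluation; the second call sees no updated row and does nothing.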
HIGH
H-1. moveFolder cycle check is not under a row lock — concurrent moves can create a cycle
src/lib/services/document-folders.service.ts:212-275. The cycle check walks
ancestors of the destination outside any transaction; the actual UPDATE happens
in a separate statement. Two reps moving folders simultaneously can each pass
the cycle check against pre-state, then both commit, leaving folder A under B
under A. The system folders are protected by assertNotSystemManaged, but any
two user folders are vulnerable. Subsequent reads (the cursor walks in
listDocuments(..., includeDescendants=true)) would infinite-loop until the
seen guard bails — but the tree is now inconsistent.
Fix: open a transaction at the top, SELECT FOR UPDATE the moving folder, walk
the ancestor chain inside the same transaction, then UPDATE. PostgreSQL's
default READ COMMITTED isolation doesn't see other in-flight updates without
the lock.
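For reference, the ancestor walk itself is small. The sketch below assumes the folder tree has been read into a map of folder id to parent id (null for a root); wouldCreateCycle and that map shape are hypothetical, and the real service presumably walks rows through the database instead.

```typescript
export type FolderId = string;

// Returns true when re-parenting `movingId` under `newParentId` would make the
// folder its own ancestor. `parentOf` maps folder id -> parent id (null = root).
export function wouldCreateCycle(
  parentOf: ReadonlyMap<FolderId, FolderId | null>,
  movingId: FolderId,
  newParentId: FolderId | null,
): boolean {
  const seen = new Set<FolderId>();
  let cursor: FolderId | null = newParentId;
  while (cursor !== null) {
    // The destination is the moving folder itself or one of its descendants.
    if (cursor === movingId) return true;
    // The tree is already inconsistent: refuse instead of looping forever.
    if (seen.has(cursor)) return true;
    seen.add(cursor);
    cursor = parentOf.get(cursor) ?? null;
  }
  return false;
}
```

The check is only as good as the snapshot it reads, so it belongs inside the transaction that holds the locks. If two moves can target each other's folders, locking only the moving row may still let both pass; locking the walked ancestors as well, or serialising moves per port with an advisory lock, is worth considering.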
H-2. Berth-PDF upload writes the blob BEFORE acquiring the advisory lock
src/lib/services/berth-pdf.service.ts:204-294. Step 3 calls backend.put(...);
step 4 takes the per-berth advisory lock + inserts the version row. If the
transaction in step 4 fails for any reason other than the unique-index conflict
(e.g. FK violation, network blip, statement timeout), the blob is already at
its UUID path with no DB row pointing at it → orphan blob. The author noted
the unique-index ⇒ orphan risk is mitigated by the UUID path (the second blob
gets its own path so no overwrite), but didn't address the "tx aborts, blob
stays" branch.
Fix: stage the blob in storage after allocating the version row (or wrap both in a saga that deletes the blob on tx rollback via a finalizer).
H-3. upsertInterestBerth isPrimary=true race demotes nothing then both inserts succeed
src/lib/services/interest-berths.service.ts:181-265. Inside the transaction it
UPDATE …SET isPrimary=false WHERE interestId=$1 AND isPrimary=true then
INSERT … (isPrimary: true). At default READ COMMITTED, two concurrent
setPrimaryBerth(X, A) + setPrimaryBerth(X, B) will: both UPDATE (no rows on
the first call so no lock — but the second's UPDATE may now hit the freshly
inserted row from the first). The partial unique index on (interestId) WHERE isPrimary=true catches the second insert — but only after the first tx
commits. If both txns interleave their UPDATE+INSERT before commit, postgres
serializes the unique-index check and one fails with 23505. Currently that
bubbles up as a generic 500, not as a friendly conflict — and a fast retry
would succeed because the loser saw the winner's row and would simply demote
it. So the data invariant holds, but the UX surfaces a confusing error.
Fix: catch 23505 on idx_interest_berths_one_primary and either retry once or
map to a ConflictError so the toast says "another rep just changed the
primary berth, refreshing".
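A minimal sketch of the 23505 translation, under stated assumptions: the ConflictError class below is a local stand-in (the repo's own class may take different arguments), and the error shape covers the two field names I know of, constraint as exposed by node-postgres and constraint_name as exposed by postgres.js. If the ORM wraps the driver error, the check would need to look at the wrapped cause instead.

```typescript
// Local stand-in; the repo's real ConflictError may differ.
export class ConflictError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ConflictError";
  }
}

// Fields a PostgreSQL driver error may carry (assumed shape).
interface PgLikeError {
  code?: unknown;
  constraint?: unknown;
  constraint_name?: unknown;
}

// True when `err` is a unique violation (SQLSTATE 23505) on the named index.
export function isUniqueViolation(err: unknown, constraint: string): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as PgLikeError;
  const name = e.constraint ?? e.constraint_name;
  return e.code === "23505" && name === constraint;
}

// Re-throws `err` as a ConflictError when it matches, otherwise unchanged.
export function rethrowAsConflict(err: unknown, constraint: string, message: string): never {
  if (isUniqueViolation(err, constraint)) throw new ConflictError(message);
  throw err;
}
```

A call site would wrap the write in try/catch and pass the caught error to rethrowAsConflict with idx_interest_berths_one_primary; the same helper would fit the username case (M-2) with idx_user_profiles_username_unique.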
H-4. Admin email-change leaves orphan sessions on the old email
src/lib/services/users.service.ts:233-262. The admin UI flips
user.email directly on the Better-Auth user row but never deletes the
target user's existing sessions. Concurrent sessions of the affected user
keep working under the new email (because Better-Auth indexes sessions by
userId, not email) — that's fine. **But the _previous_ email is now free**
to be claimed by a fresh signup before the admin sends the "your email was
changed" notice. There's no unique constraint that prevents an attacker
from re-registering as old@example.com and taking over outgoing identity
artefacts (audit logs reference user_id not email, so this is just identity
hygiene; still, the surface exists). Worse — there's no emailVerified = false reset on the swap, so the new email is auto-treated as verified
without ever receiving a confirmation.
Fix: in the same transaction, also revoke the user's sessions if the change
is admin-initiated (db.delete(session).where(eq(session.userId, …))), and
re-set emailVerified = false so the next sign-in goes through the
re-verify flow.
H-5. userEmailChanges has no partial unique index on (userId) WHERE not applied/cancelled
src/lib/db/schema/users.ts:360-379. A user can spam the self-service
email-change endpoint to create unlimited pending rows. Each row mails the
NEW address. Anti-abuse is missing at the DB layer — only application-side
rate limit (which I didn't fully audit) stands between a user and unbounded
email send-out from your domain.
Fix: CREATE UNIQUE INDEX user_email_changes_one_pending ON user_email_changes (user_id) WHERE applied_at IS NULL AND cancelled_at IS NULL;
H-6. Email-confirm token isn't atomically consumed
src/app/api/v1/me/email/confirm/[token]/route.ts:28-57. Three separate
statements: SELECT pending, UPDATE user, UPDATE pending.appliedAt. No
transaction wrapper. A user who double-clicks the email link (or a link
preview pre-fetcher like Outlook SafeLinks) fires two near-simultaneous
GETs. Both pass the appliedAt IS NULL check, both flip user.email
(idempotent — same value), both mark applied. Functional, but the second
audit-log entry is misleading. More importantly: if the second click
arrives 200ms later AND the user re-fired a _different_ change in between
that the first click happened to apply, you've stomped state.
Fix: single transaction, SELECT … FOR UPDATE the pending row, branch on
its post-lock state.
MEDIUM
M-1. Unbounded fan-out on Promise.all per recipient
- src/lib/services/inquiry-notifications.service.ts:116-129 — fans out one emailQueue.add per external recipient with no concurrency cap. With a ports admin who lists 500 emails (no UI cap I saw), one inquiry submission pushes 500 queue inserts concurrently. Redis survives it; the surge in pipelined Redis commands can stall co-tenant queues.
- src/lib/services/notifications.service.ts:344-358 — same shape for document events: one createNotification per recipient, fully parallel. Each createNotification does its own DB insert AND its own queue.add('send-notification-email', …). Big-port notifications can fan out to dozens of users simultaneously per document event.
Fix: use p-limit(10) (already in similar shape elsewhere) or batch with
queue.addBulk. Not data-corrupting; tail-latency / resource concern.
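The batching option can be sketched as below. The job name "send", the payload shape and the jobId format are placeholders I made up for illustration; only the idea (build the jobs up front, then hand them to queue.addBulk in fixed-size batches) comes from the fix above.

```typescript
// Splits `items` into consecutive arrays of at most `size` elements.
export function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical job shape, matching the { name, data, opts } entries addBulk accepts.
export interface BulkEmailJob {
  name: string;
  data: { to: string };
  opts: { jobId: string };
}

// Builds one job per recipient and groups them into batches.
export function toEmailJobBatches(
  inquiryId: string,
  recipients: readonly string[],
  batchSize = 50,
): BulkEmailJob[][] {
  const jobs = recipients.map((to, i) => ({
    name: "send", // placeholder job name
    data: { to },
    opts: { jobId: `inquiry-email__${inquiryId}__${i}` }, // placeholder id format
  }));
  return chunk(jobs, batchSize);
}

// Usage against a real queue (not executed here):
//   for (const batch of toEmailJobBatches(id, recipients)) {
//     await emailQueue.addBulk(batch);
//   }
```

Awaiting each batch before sending the next keeps the number of in-flight Redis commands bounded by the batch size, which is the tail-latency concern raised here.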
M-2. Username uniqueness TOCTTOU surfaces as 500 instead of ConflictError
src/app/api/v1/me/route.ts:137-145. The LOWER(username) SELECT runs outside
any lock, and the partial unique index idx_user_profiles_username_unique is
the actual guard. Two reps claiming dm concurrently: one succeeds, the other
gets a generic 500 (the 23505 is not caught and rewritten). The pre-check
shows "available" right before the failed write, which is a worse UX than a
clean "already taken" message.
Fix: catch 23505 on the unique index name and translate to ConflictError.
M-3. ensureSystemRoots self-heal recursion not bounded
src/lib/services/document-folders.service.ts:512-516. If ensureSystemRoots
throws transiently (e.g. DB hiccup), ensureEntityFolder recurses into itself
with no depth guard. In normal operation the second pass will find the root
and return; in a pathological case (root is created but the post-insert SELECT
fails repeatedly), this can stack. Low-likelihood but trivial to fix with a
"called from self-heal already" flag.
M-4. Optimistic UI rollback drops the user's pending edits
src/components/interests/pipeline-board.tsx:120-143. The optimistic update
overwrites query data without snapshotting the prior value — on error, the
rollback path just invalidateQueries, which refetches from the server. If
two reps drag the same card, the second's drop happens after the first's
server commit; the server then accepts the second drag, but the first rep's
view briefly shows their own change before the next invalidation pulls in
the truth. Last-write-wins semantics with no warning. Acceptable today for
single-rep ports; will get reported as a bug when teams scale. No version
header (If-Unmodified-Since / ETag) anywhere in the API.
M-5. Filesystem storage backend not multi-node — silent corruption if MULTI_NODE_DEPLOYMENT mis-set
CLAUDE.md and src/lib/storage/ say the filesystem backend refuses to start
when MULTI_NODE_DEPLOYMENT=true. If that env var defaults to unset and the
operator forgets to flip it, two app nodes both write to their own local FS,
each thinking they own the only copy. Not a code bug but a configuration
cliff edge — worth re-stating in the deploy runbook.
M-6. BullMQ default removeOnFail retains failed jobs for 7 days without count cap
src/lib/queue/index.ts:33. Same volume risk as C-2: a noisy worker that
fails 10k times per day fills Redis. Add count: 1000.
Notes on what looks GOOD
- archiveClientWithDecisions (client-archive.service.ts:165-300) — proper FOR UPDATE on client + berth rows inside one transaction. Exemplary.
- Berth-PDF version-number allocation under pg_advisory_xact_lock — correct pattern, modulo H-2 above.
- NocoDB import (scripts/import-berths-from-nocodb.ts:137) — stable 64-bit advisory key, scoped to transaction.
- Document-folder backfill (scripts/backfill-document-folders.ts:77) — per-port advisory key via hashtext(portId), idempotent.
- ensureSystemRoots ON CONFLICT DO NOTHING pattern is race-safe.
- ensureEntityFolder correctly distinguishes entity-id race vs sibling-name race and re-SELECTs the winner.
- interestBerths partial unique index on is_primary enforces the invariant at the DB layer (H-3 is only a UX gripe, not a data integrity issue).
- brochures.is_default partial unique index works the same way.
- Documenso verifyDocumensoSecret uses timing-safe equality (good).
Recommended sequencing
- C-1 (signed-PDF orphan) — add row lock or unique index. Highest data-loss risk.
- C-2 (jobIds everywhere) — broad blast radius, mechanical fix.
- H-4/H-5 (admin email change + pending-change unique index) — security-adjacent.
- C-3, H-1, H-2, H-3, H-6 — correctness in real-world retry/burst scenarios.
- Medium tier as time allows.
~1430 words.
8. GDPR + privacy + PII audit (gdpr-auditor)
GDPR + Privacy + PII Audit
Repo: new-pn-crm @ feat/documents-folders. Read-only audit. Findings are grouped by severity. Line numbers are approximate.
CRITICAL
C1. GDPR export bundle is materially incomplete — Article 15 violation
src/lib/services/gdpr-bundle-builder.ts enumerates only a subset of tables that hold the data subject's PII. The following tables reference the client (client_id FK) but are NOT included in the bundle:
- portal_users — the portal account itself (email, name, lastLoginAt, isActive, createdBy). Strictly required: a copy of the account record is core "data we hold about you."
- email_threads / email_messages — full inbound/outbound correspondence including bodyText, bodyHtml, attachment IDs. This is the most PII-dense table in the system.
- document_sends — brochure / send-out audit with recipient_email (brochures.ts).
- reminders — operations table with clientId FK.
- formSubmissions (public form intake) — already collected via documents for the linked path, but rows where client_id is set directly are missed.
- files — files attached directly to a client (files.client_id ≠ via documents). The builder pulls documents only.
- scratchpadNotes.linkedClientId — rep-side free-text notes that reference the client.
- clientMergeLog — historical merge records that survived earlier deduplications.
- contact_log (referenced from operations.ts).
- website_submissions (raw inbound inquiries before they were promoted to interests).
The bundle currently advertises itself as the Article-15 dump. Hand-delivering it would expose the controller to a regulator finding of incomplete disclosure. Fix: widen buildClientBundle to cover every client_id-referencing table (the schema grep below produces the complete list); the audit-log limit of 500 events should also be lifted or paginated for long-tenured clients.
C2. "Right to be forgotten" leaves email correspondence bodies intact
client-hard-delete.service.ts nullifies emailThreads.clientId so the thread (with bodyHtml / bodyText of every inbound + outbound message + the subject's address in from_address / to_addresses) survives the delete in perpetuity. Same pattern for files.clientId, documents.clientId, formSubmissions.clientId, reminders.clientId, documentSends.clientId. The justification in the file ("keep their audit history") is reasonable for audit metadata but the actual PII content (email body, file contents, form answers, recipient emails on brochure sends) is preserved verbatim. A subject who exercised their right to erasure has not, in practice, been erased.
Fix options: (a) cascade-delete email_threads / email_messages on client hard-delete; (b) blank body_text / body_html / address columns inline; (c) require a separate "destructive erasure" mode that the smart-archive flow ladders into.
C3. Username → email enumeration on public endpoint
src/app/api/auth/resolve-identifier/route.ts returns the canonical email when a known username is supplied (line 88: return NextResponse.json({ email: rows[0]!.email })). The miss-path returns a synthetic .invalid address, which protects hit/miss equality, but any successful hit leaks the linked email to an anonymous caller. Rate-limit is 5/15min/IP — sufficient to thwart wordlist brute-force, but it does not stop an attacker from walking a list of known/leaked usernames. Resolving usernames to emails is also exactly why a malicious actor would call this endpoint (compromised-credentials stuffing).
Fix: don't echo the resolved email back to the client. Instead, set a short-lived signed cookie / Redis key keyed by the IP+identifier that the subsequent signIn call consumes, or hand the resolved email straight to Better Auth server-side and return only { ok: true }.
C4. Audit-log metadata is unmasked and stores raw PII forever
src/lib/audit.ts maskSensitiveFields covers oldValue/newValue only — metadata is written raw. Multiple call-sites stuff full email addresses into metadata:
- client-hard-delete.service.ts:135 (metadata: { sentTo: u.email })
- client-hard-delete.service.ts:350 (bulk variant)
- portal-auth.service.ts:123 / 187 / 336 / 363 / 380 / 403 (every portal lifecycle event)
- crm-invite.service.ts:206 / 264 (metadata: { email: invite.email })
- email-accounts.service.ts:76 / 145 (emailAddress)
Compounded with H1 (no audit-log retention), every staff member's, invitee's, and portal user's email lives in audit_logs.metadata indefinitely with ip_address + user_agent next to it. This is GDPR data-minimisation/storage-limitation breach territory.
Fix: extend maskSensitiveFields to walk metadata recursively, or stop emitting full emails into metadata (use user IDs + a join on demand). The masking set also needs emailAddress and sentTo aliases.
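A minimal sketch of the recursive walk, assuming metadata is plain JSON (objects, arrays, primitives). maskMetadataDeep and the key set are hypothetical; the real maskSensitiveFields may use a different key list and replacement value.

```typescript
// Keys to redact, compared case-insensitively. Includes the aliases named in
// this finding (emailAddress, sentTo); extend as needed.
const PII_METADATA_KEYS = new Set(["email", "emailaddress", "sentto"]);

// Returns a copy of `value` with every matching key redacted at any depth.
// Assumes plain JSON input: class instances such as Date are not preserved.
export function maskMetadataDeep(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => maskMetadataDeep(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = PII_METADATA_KEYS.has(key.toLowerCase())
        ? "[redacted]"
        : maskMetadataDeep(inner);
    }
    return out;
  }
  return value;
}
```

Masking at write time only helps new rows; rows already in audit_logs would need a one-off backfill or have to age out under the retention job discussed in H1.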
HIGH
H1. No retention policy on audit_logs
src/lib/queue/scheduler.ts registers retention crons for ai_usage, error_events, website_submissions, but not audit_logs. The schema documents the table as being kept indefinitely (no pruning worker exists). With IP + user-agent on every row, plus PII in metadata (C4), the table grows unbounded and retains PII forever.
Fix: add an audit-log-retention maintenance job. Recommended split: keep severity ∈ {warning, error, critical} and source = auth for 2 years (legal/security), prune everything else after 12 months. Make the window admin-configurable.
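The recommended split can be expressed as one predicate. This sketch reads the rule as a union (rows with an elevated severity OR source = auth get the long window), which is my interpretation; the day counts approximate 2 years and 12 months, and isPrunable plus the row shape are hypothetical.

```typescript
export interface AuditRow {
  severity: string;
  source: string;
  createdAt: Date;
}

export interface RetentionWindows {
  longDays: number; // security-relevant rows
  defaultDays: number; // everything else
}

export const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed defaults: roughly 2 years and 12 months. Meant to be admin-configurable.
export const DEFAULT_WINDOWS: RetentionWindows = { longDays: 730, defaultDays: 365 };

const LONG_LIVED_SEVERITIES = new Set(["warning", "error", "critical"]);

// True when the row is older than the window that applies to it.
export function isPrunable(
  row: AuditRow,
  now: Date,
  windows: RetentionWindows = DEFAULT_WINDOWS,
): boolean {
  const longLived = LONG_LIVED_SEVERITIES.has(row.severity) || row.source === "auth";
  const keepDays = longLived ? windows.longDays : windows.defaultDays;
  return now.getTime() - row.createdAt.getTime() > keepDays * DAY_MS;
}
```

In the actual job the same rule would be two DELETE statements with a created_at cutoff each; the predicate form is mainly useful for unit-testing the policy.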
H2. error_events request-body excerpts redact only secret-shaped keys
error-events.service.ts SENSITIVE_KEYS redacts password/token/apiKey/creditCard/ssn etc. — but NOT email, phone, name, dob, address. Any 5xx on POST /api/v1/clients, POST /api/v1/portal-users, POST /api/v1/clients/[id]/contacts, POST /api/v1/admin/users lands the requester's full client-create payload in error_events.request_body_excerpt. Retention is 90 days (good) but the captured rows are visible to every super-admin via the inspector. Fix: add PII keys to SENSITIVE_KEYS or whitelist-only the body schema per route.
H3. Email recipient address logged at debug level
src/lib/email/index.ts:154 — every outbound email logs { to, originalTo, subject }. In prod this line is emitted only if LOG_LEVEL=debug, so usually safe, but the originalTo field also leaks the redirect-target's real address when EMAIL_REDIRECT_TO is set in dev. Tighten to messageId + portId + bool once redirect path is exercised.
H4. Portal lastLoginAt & email kept after client hard-delete
On client hard-delete the portal_users.client_id cascade fires, so the portal user is removed — good. But portal_users.email has a global unique index (idx_portal_users_email_unique) with no port_id. A previously-deleted portal user blocks a new portal account at a different port from re-using that email until/unless the cascade fires. More importantly, if cascade ever doesn't run (e.g. archive-only, no hard delete), the portal account row survives with the email. Verify the archive path also disables/erases the portal user, or document the asymmetry.
H5. Encryption-key rotation is non-incremental for SMTP/IMAP creds
src/lib/utils/encryption.ts hard-codes a single env var (EMAIL_CREDENTIAL_KEY) with no key-version or KID stored on the ciphertext. Rotating the key requires an offline mass re-encrypt; there is no migration path. The same applies to the S3 secret key (storage/s3.ts:74), webhook secret (workers/webhooks.ts:116), and storage_proxy_hmac (storage/filesystem.ts:415). Each decrypt-failure path falls through silently. Fix: prefix ciphertexts with a kid field, support 2 active keys at once, and ship a rotation script that re-wraps ciphertexts to the new key.
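One way to carry a key id on the ciphertext is sketched below. Everything here is an assumption made for illustration: the kid=<id>; prefix format, the Keyring shape and the function names. It also assumes existing ciphertexts never start with kid=, so that untagged values can be attributed to a legacy key. No actual encryption happens in the sketch; it only shows key selection and the rotation check.

```typescript
// kid -> key material. Key values are opaque strings in this sketch.
export interface Keyring {
  activeKid: string;
  keys: Record<string, string>;
}

const KID_PREFIX = /^kid=([A-Za-z0-9_-]+);/;

// Prefixes a ciphertext with the id of the key that produced it.
export function tagCiphertext(kid: string, ciphertext: string): string {
  if (!/^[A-Za-z0-9_-]+$/.test(kid)) throw new Error("invalid kid");
  return `kid=${kid};${ciphertext}`;
}

// Picks the key for a stored value. Untagged values fall back to `legacyKid`.
// Throws on an unknown kid instead of failing silently.
export function resolveKey(
  stored: string,
  ring: Keyring,
  legacyKid: string,
): { kid: string; key: string; ciphertext: string } {
  const match = KID_PREFIX.exec(stored);
  const kid = match?.[1] ?? legacyKid;
  const ciphertext = match ? stored.slice(`kid=${kid};`.length) : stored;
  const key = ring.keys[kid];
  if (key === undefined) throw new Error(`no key available for kid "${kid}"`);
  return { kid, key, ciphertext };
}

// True when a stored value should be re-encrypted under the active key.
export function needsRewrap(stored: string, ring: Keyring): boolean {
  const match = KID_PREFIX.exec(stored);
  return !match || match[1] !== ring.activeKid;
}
```

A rotation script would then iterate stored values, skip those where needsRewrap is false, and for the rest decrypt with resolveKey(...).key and re-encrypt under ring.activeKid.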
H6. Activation/reset tokens travel in URL query strings
portal-auth.service.ts:147 / 408 and crm-invite.service.ts:71 / 233 ship ?token=… in the activation/reset links. The token hash is stored server-side (good), but URL-borne tokens land in browser history, reverse-proxy access logs, Cloudflare logs, and Referer headers if the activation page links anywhere external (e.g. terms-of-service). Common pattern but worth flagging — consider hash fragments (#token=…) which browsers never put in Referer.
H7. IP address recorded on every audit event without a lawful-basis note
audit_logs.ip_address (system.ts:38) and the legacy second copy (line 305) are populated unconditionally. Storing IPs is lawful under "legitimate interest" for security-relevant events, but for routine update/view/create of a client record the lawful-basis argument is much thinner under recent EU regulator guidance. Fix: only retain ip_address on source ∈ {auth, webhook} and on severity ∈ {warning, error, critical}; null it on routine user-source events at write time.
MEDIUM
M1. Recent-search Redis key holds verbatim free-text queries
search.service.ts:2147 saves the raw search term to Redis under recent-search:<userId>:<portId>. If a rep types a client's email/phone/SSN to find them, that string lives in Redis with a 7-day TTL (per the constant) and is not in the GDPR bundle. Low-volume, but document and add to the bundle.
M2. GDPR-export confirmation email contains client name verbatim
client-hard-delete.service.ts sends a confirmation code email to the requester with the deleted client's fullName in subject + body. Reasonable for human verification, but it means the operator's mailbox (often Gmail/Outlook) holds the to-be-erased client's name after deletion. Document this in the privacy notice or strip to initials.
M3. GDPR export ZIP retention overlaps with subject-erasure
The bundle expires 30 days after generation (EXPIRY_DAYS = 30) — but if a subject requests both export and erasure inside that window, the staged ZIP in MinIO will outlive the database row. The cleanup cron only checks expires_at. Fix: when hardDeleteClient runs, delete any non-expired gdpr_exports blobs for that client immediately.
M4. Documenso webhook & document_sends body leak via audit metadata
docs.documenso webhook handler logs signatureHash only (good), but document_sends rows store full recipient_email on set null cascade — so when the linked client is hard-deleted, the recipient email survives on the send row. Same pattern as C2 but for the brochure channel.
M5. portal_auth_tokens.tokenHash is SHA-256, not constant-time-compared
Tokens are hashed before storage (good) but the lookup where idx_portal_tokens_hash_unique uses normal equality. Since the index lookup is O(1) indexed-equality, timing attacks are not viable here — flagged for documentation only.
M6. error_events.error_stack may contain user-supplied strings
Stack traces are 4 KB capped and only fire on 5xx, but PG-driver errors include the offending statement / parameter values in .message (e.g. duplicate-key violations expose the conflicting email). Already mitigated by error-coding most known cases through CodedError, but a defensive scrub on errorMessage for @-shaped or +\d{6,} substrings would harden the inspector.
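A minimal sketch of that defensive scrub. The two patterns follow the shapes named above (@-shaped tokens and + followed by six or more digits); the function name and the replacement markers are hypothetical, and the email pattern is deliberately loose rather than RFC-accurate.

```typescript
// Any run of non-delimiter characters around an "@". Parentheses, quotes and
// "=" are treated as delimiters so PG messages like (email)=(a@b.c) keep shape.
const EMAIL_LIKE = /[^\s"'<>()=,;]+@[^\s"'<>()=,;]+/g;

// A "+" followed by at least six digits, as suggested in the finding.
const PHONE_LIKE = /\+\d{6,}/g;

// Replaces email-like and phone-like substrings before the message is stored.
export function scrubErrorMessage(message: string): string {
  return message.replace(EMAIL_LIKE, "[email]").replace(PHONE_LIKE, "[phone]");
}
```

Applied to a duplicate-key message, the conflicting value is replaced while the constraint context stays readable for whoever triages the error.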
M7. EMAIL_REDIRECT_TO is enforced only by env, not by a build assertion
The README warns it must be unset in production but there's no runtime guard. A misconfigured prod could silently redirect ALL outbound client mail to a single inbox. Fix: in src/lib/env.ts, refuse EMAIL_REDIRECT_TO when NODE_ENV === 'production'.
M8. Cross-port portal_users.email unique index leaks tenancy
Multi-tenant model says ports are isolated, but the global-unique email index means port A can probe whether email X is already a portal user at any other port by attempting to invite them and reading the conflict error. Tiny enumeration vector, fix by scoping the unique index to (port_id, lower(email)).
Notes / good practices observed
- audit.ts maskSensitiveFields exists and is applied to old/new JSON.
- logger.ts ships a thorough redact.paths list covering auth headers, encrypted credentials, cookies.
- GDPR-export uses presigned 7-day URL (acceptable, behind email auth).
- Hard-delete is two-factor (permission + email code + typed name) and gated on prior archive.
- error_events has a 90-day retention cron (migration 0040).
- AES-256-GCM for encryption at rest is correctly implemented with random IV + auth tag.
Top-3 fix priority
- C1 + C2 — complete the GDPR export bundle and make hard-delete actually erase email/file/document content (not just sever FKs).
- C3 — stop echoing real emails out of the public resolve-identifier endpoint.
- C4 + H1 + H2 — mask PII in audit_logs.metadata and error_events.request_body_excerpt; add an audit_logs retention cron.
9. Email deliverability + template quality audit (email-auditor)
Audit #9 — Email deliverability + template quality
Scope: src/lib/email/**, src/components/admin/email-*, src/app/(dashboard)/[portSlug]/admin/email/page.tsx. Read-only.
Severity legend: CRITICAL = production-breaking / security; HIGH = silent feature bug or data leak; MEDIUM = rendering / brand / spam risk; LOW = polish.
CRITICAL
C1. EMAIL_REDIRECT_TO has no production guard
src/lib/env.ts:41 declares it z.string().email().optional() with no NODE_ENV constraint. src/lib/email/index.ts:131-133 silently rewrites every recipient when it is set. CLAUDE.md says "must be unset in production" but nothing enforces it: a stray prod .env value would funnel every client/portal/EOI invitation to one address with zero alarms. The only signal is a logger.debug line (index.ts:154-156) — debug, not warn, and no startup banner. The companion documenso-client.ts, email-compose.service.ts, and webhooks.ts paths also silently honour it. Add a refinement in env.ts rejecting it (or at least logger.fatal-ing) when NODE_ENV === 'production', and emit a startup logger.warn whenever it is set so it shows up in container logs.
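The guard itself is a few lines. The sketch below is a standalone function rather than a schema refinement so it can be read without the validation library; checkEmailRedirect and its arguments are hypothetical, and in the repo the same rule would presumably live next to the existing env schema.

```typescript
export interface MailEnv {
  NODE_ENV?: string;
  EMAIL_REDIRECT_TO?: string;
}

// Throws when a redirect is configured in production; otherwise emits one
// startup warning whenever a redirect is active. The address is not logged.
export function checkEmailRedirect(env: MailEnv, warn: (message: string) => void): void {
  const target = env.EMAIL_REDIRECT_TO?.trim();
  if (!target) return;
  if (env.NODE_ENV === "production") {
    throw new Error("EMAIL_REDIRECT_TO must be unset when NODE_ENV=production");
  }
  warn("EMAIL_REDIRECT_TO is set: all outbound mail is being redirected");
}
```

Calling it once at boot with the parsed environment and the logger's warn method makes a stray production value fail the deploy instead of silently rerouting client mail.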
C2. Unescaped URL interpolation into href attributes (XSS-able in browser previews)
Every template inlines ${data.link} / ${data.signingUrl} / ${data.loginUrl} / ${item.link} / ${data.inboxLink} / ${data.crmDeepLink} / ${data.signingUrl} / ${crmUrl} (the last is escaped in inquiry-sales-notification.ts:34 — sole exception) directly into href="…" and into the visible link text. escapeHtml is applied to every recipient name and copy string but never to URLs:
- crm-invite.ts:42, 48
- portal-auth.ts:50, 56, 106, 110
- admin-email-change.ts:48
- document-signing.ts:92, 98, 211, 216
- notification-digest.ts:58, 74
- residential-inquiry.ts:76
Most URLs come from server-built strings (token endpoint + base URL), but notification-digest.items[].link is sourced from notification rows whose deep links can include user-typed entity titles / search queries depending on the producer. A single " in any of those will break out of the attribute. Email clients (Apple Mail, Outlook Web) render the resulting HTML and attribute injection becomes click-jacking / open-redirect. Cheapest fix: pass every URL through encodeURI or an escapeAttribute helper before interpolation, and reject javascript: / data: schemes at the helper level. None of the templates currently verify https:// prefix.
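A minimal sketch of such a helper, combining scheme validation with attribute escaping. escapeAttribute is the name suggested above; safeHref is hypothetical. The scheme check here accepts http and https; tightening it to https only is a one-character change if plain http links should never appear in mail.

```typescript
// Escapes the characters that can break out of a quoted HTML attribute.
export function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Returns a value safe to place inside href="…", or throws for any URL that
// does not start with http:// or https:// (javascript:, data:, relative, etc.).
export function safeHref(url: string): string {
  const trimmed = url.trim();
  if (!/^https?:\/\//i.test(trimmed)) {
    throw new Error("refusing non-http(s) URL in email template");
  }
  return escapeAttribute(trimmed);
}
```

Templates would interpolate safeHref(data.link) in both the href and the visible link text. Throwing keeps a bad link from being sent at all; a template that must always render could catch the error and fall back to the CRM base URL.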
HIGH
H1. Template-subject override mechanism is silently disconnected for ~half the catalog
src/lib/email/template-catalog.ts:39-98 advertises 8 templates as customisable, and src/components/admin/email-templates-admin.tsx exposes a subject editor. But several templates don't accept overrides.subject and so the admin's edit is silently ignored:
- inquiry-client-confirmation.ts:23-26 — no overrides.subject path
- inquiry-sales-notification.ts:24 — same
- residential-inquiry.ts:19, 61 — both functions, no override path
- crm-invite.ts:26 — no override path
Only portal-auth.ts (activation/reset) and document-signing.ts honour overrides.subject. Admins editing the "Inquiry — client confirmation" subject in /admin/email-templates see the green "Saved" toast and nothing changes. This is the kind of bug users don't report; they assume their override worked.
H2. Catalog defaultSubject strings DO NOT MATCH the literal subjects the code emits
The admin UI shows Default: <catalog string> so users can tell whether they have customised, but every comparison is broken because the strings diverge from the actual templates:
| Template | Catalog default | Code-emitted subject |
|---|---|---|
| crm_invite | You have been invited to {{portName}} CRM | You're invited to the ${portName} CRM |
| inquiry_client_confirmation | We received your inquiry — {{portName}} | Thank You for Your Interest in Berth ${mooringNumber} (or …in a ${portName} Berth) |
| inquiry_sales_notification | New berth inquiry — {{clientName}} | New Interest - ${portName} |
| residential_inquiry_client_confirmation | We received your residential inquiry — {{portName}} | Thank You for Your Interest - ${portName} Residences |
| residential_inquiry_sales_alert | New residential inquiry — {{clientName}} | New Residential Inquiry - ${data.fullName} |
| portal_activation | Activate your {{portName}} client portal account | matches |
| portal_reset | Reset your {{portName}} client portal password | matches |
Combined with H1 this means the entire admin-customisation surface is half wired. Pick one of: (a) wire overrides.subject through every template and remove the divergence, or (b) drop the catalog rows for templates that can't be customised yet.
H3. Email Settings page exposes dead form fields
src/app/(dashboard)/[portSlug]/admin/email/page.tsx:34-48 lets admins set email_signature_html and email_footer_html. getPortEmailConfig (port-config.ts:153-154,179-180) reads them into PortEmailConfig.signatureHtml / footerHtml. But sendEmail (src/lib/email/index.ts) never reads or injects them, and shell.ts only reads the unrelated branding_email_footer_html / branding_email_header_html keys (via getBrandingShell → getPortBrandingConfig, lines 39-40 of shell.ts).
Result: the "Default signature (HTML)" and "Email footer (HTML)" controls on /admin/email are write-only sinks. Admins customise the footer; outbound emails never include it. There is a real customer-confidence hit here — a port admin will set a legal disclaimer expecting it on every send. Either (a) wire cfg.footerHtml/signatureHtml into renderShell or sendEmail, or (b) delete the fields from the admin page and consolidate on the Branding-page keys.
H4. residential-inquiry.ts returns NO plaintext fallback
residentialClientConfirmation (line 41) and residentialSalesAlert (line 78) return { subject, html } only — no text. Every other template returns text. Lack of a text part materially hurts spam scoring and breaks plain-text-only readers (some BlackBerry, screen-reader bridges, legacy MTAs). sendEmail (index.ts:144-152) honours text when present, so the consequence is "no plain-text MIME part is attached" — Gmail will still render it but SpamAssassin's MIME_HTML_ONLY rule adds points.
H5. inquiry-sales-notification includes crmUrl (which the admin may set) without scheme validation
inquiry-sales-notification.ts:34 does escapeHtml(crmUrl) then drops it into both href and visible text. Escaping prevents attribute breakout but a javascript: or data:text/html scheme survives entity-encoding. Validate the scheme is https?: server-side before passing in (the producer probably already does; defence-in-depth here is one regex).
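A minimal sketch of the scheme check described in H5. The helper name isSafeHttpUrl is hypothetical (it is not taken from the audited codebase), and it uses the standard URL parser rather than a regex so that odd casing and whitespace are normalised before the comparison.

```typescript
// Hypothetical helper: returns true only for absolute http(s) URLs.
function isSafeHttpUrl(candidate: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(candidate);
  } catch {
    return false; // relative or malformed input is rejected outright
  }
  // escapeHtml() does not neutralise javascript: or data: schemes, so check explicitly.
  return parsed.protocol === "https:" || parsed.protocol === "http:";
}
```

A caller would run this before escapeHtml(crmUrl) and omit the link (or fall back to a known-good base URL) when it returns false.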
MEDIUM
M1. Admin-authored emailHeaderHtml / emailFooterHtml injected raw
shell.ts:39-40, 64-66 interpolates branding HTML directly: ${headerHtml ? \
. Source is system_settings.branding_email_header_html (admin-only write). An admin account compromise → arbitrary HTML in every outbound email (tracking pixels exfiltrating recipient IPs, phishing forms in some clients). Email clients largely strip script, but other embedded markup and CSS-position overlays still work in Apple Mail / Outlook desktop. Mitigation: run admin input through a server-side sanitiser (DOMPurify / sanitize-html with an email-safe allowlist).
M2. No dark-mode safety
shell.ts:42-74 ships no <meta name="color-scheme" content="light dark"> / <meta name="supported-color-schemes"> and no @media (prefers-color-scheme: dark) rules. Apple Mail and Gmail auto-invert backgrounds: the #ffffff card stays white but body text (#333333) gets pseudo-darkened in some clients, and the #666 muted copy can drop below contrast threshold. Cards with box-shadow: 0 2px 4px rgba(0,0,0,0.1) render as halos in dark mode.
M3. No MSO / Outlook fallbacks
The shell has no <!--[if mso]> conditionals. The CTA buttons are CSS-padded <a> — Outlook 2016/2019 on Windows renders them as tiny underlined text (the link works but the button shape is gone). Recommended: VML rect fallback inside <!--[if mso]> for every CTA, or switch to bulletproof-button pattern (mso-padding-alt, text-decoration:none etc.).
M4. Background image won't render where it matters most
shell.ts:55 sets background-image: url('…Overhead_1_blur.png') on the outer <table>. Outlook strips background-image entirely (no VML fallback supplied); Gmail mobile sometimes too. The background-color:#f2f2f2 fallback works but the brand impression is lost in the highest-volume client. Either drop the bg image (CLAUDE.md flags moving the asset off s3.portnimara.com anyway) or add VML rect for Outlook.
M5. No preheader
No hidden inbox-preview text. Currently the first visible line ("Welcome to {portName} CRM" or "Just a quick reminder") leaks into the preview pane after the subject. A 1px hidden preheader (<div style="display:none;max-height:0;overflow:hidden;">…</div>) is one of the highest-ROI deliverability tweaks; missing here.
M6. Logo width="100" without 2x source / explicit height
shell.ts:62. Apple Mail on retina renders this scaled — acceptable since the source PNG is 250px wide. But height is unset which forces some clients to recompute mid-render (jank). Add height="100" (assuming square) and the asset is fine.
LOW
- L1. Hardcoded en-GB date locale. document-signing.ts:141 calls toLocaleString('en-GB', …). CRM is positioned as multi-port; once a non-UK port arrives this is wrong. Read locale from port-config.
- L2. No List-Unsubscribe / List-Unsubscribe-Post headers. sendEmail adds none. Gmail's Feb-2024 bulk-sender requirements made List-Unsubscribe: <mailto:> table-stakes for any sender exceeding 5k/day. Transactional senders technically exempt, but with notification digests + inquiry confirmations flowing through the same SMTP path, hitting that threshold is plausible. One-line fix.
- L3. No Message-ID / In-Reply-To threading for digest-style mails. Each digest is a new thread; users will hate this once volume rises.
- L4. logger.debug in sendEmail (index.ts:154) emits recipient address. PII in log lines. At debug level so prod typically masks, but worth pino-redacting to and originalTo.
- L5. crmInviteEmail, adminEmailChangeEmail, notificationDigestEmail not in TEMPLATE_KEYS. Means the admin can't customise their subjects at all, which is inconsistent with the rest. Either add them or document the omission.
- L6. Hardcoded English copy. Every template: buttons ("Sign in", "Activate account", "Set up your account"), greetings ("Dear …", "Hi …"), legal-ish boilerplate ("If you didn't request this …"). No i18n hook. Out-of-scope for v1 but flag for the Phase-7 cutover note in project_email_ownership_at_cutover.md.
- L7. SMTP_FROM fallback in sendEmail builds noreply@${env.SMTP_HOST}. If SMTP_HOST is smtp.gmail.com the From becomes noreply@smtp.gmail.com: invalid sender, instant SPF fail. Acceptable because production sets SMTP_FROM explicitly, but worth a logger.warn when this fallback is hit.
- L8. Subject prefix when redirecting (index.ts:132-134): [redirected from x@y] appears verbatim and is fine in dev, but if EMAIL_REDIRECT_TO ever slips into prod (see C1) this is the only forensic trail.
What's good
- All admin-supplied content (names, descriptions, custom messages, notes) is consistently escapeHtml-ed before interpolation. The only escape gaps are URLs (C2).
- Per-port branding shell is well-isolated; getBrandingShell falls back to defaults cleanly.
- resolveAttachments enforces portId cross-tenant isolation (index.ts:94-96).
- SMTP timeouts are explicit (SMTP_TIMEOUTS, index.ts:20-24), which averts the BullMQ-slot starvation the comments warn about.
- EMAIL_REDIRECT_TO plumbing is consistent across sendEmail, documenso-client, email-compose, webhooks workers; when set, every outbound channel honours it.
Suggested fix order
- C1 (production guard for EMAIL_REDIRECT_TO)
- C2 (URL escaping / scheme allowlist)
- H3 (delete or wire up the dead Email Settings fields — fastest unblocker for admins)
- H1 + H2 (fix catalog / wire override paths)
- H4 (add plaintext to residential templates)
- M1 (sanitise admin HTML)
- M2 / M3 / M5 (dark mode + MSO + preheader)
- The Lows whenever convenient.
Total: ~1460 words.
10. Error UX + failure-mode resilience audit (error-ux-auditor)
Error UX + Failure-Mode Resilience Audit
Repo: /Users/matt/Repos/new-pn-crm — branch feat/documents-folders.
Scope: route-segment error/not-found/loading coverage, error-boundary placement,
toast quality, leak surface, degradation when Redis / SMTP / Documenso / MinIO
are down.
CRITICAL
C1. Only ONE error.tsx for the entire app, no not-found.tsx per group, no global-error.tsx
find src/app -name "error.tsx" returns exactly src/app/(dashboard)/error.tsx.
Plus src/app/not-found.tsx (root) and a single loading.tsx under
clients/[clientId]. 73 dashboard pages, 0 portal/auth/scanner error files.
Consequences:
- A throw inside /portal/* (dashboard, invoices, documents, profile) has no boundary: Next default unstyled page, no branding, no requestId.
- (auth) (login, reset-password, set-password): same exposure.
- (scanner)/[portSlug]/scan: receipt scanner on a phone, throws on Tesseract/OpenAI failure, no fallback.
- notFound() calls inside portal routes fall through to the root not-found.tsx, which links to /dashboard: wrong destination for portal users (lands them at CRM login).
- No global-error.tsx: if RootLayout throws (it reads cookies + ALS), user gets Next default.
Add at minimum:
- src/app/(portal)/error.tsx + src/app/(portal)/portal/not-found.tsx
- src/app/(auth)/error.tsx (wrapped in BrandedAuthShell)
- src/app/(scanner)/[portSlug]/scan/error.tsx
- src/app/(dashboard)/[portSlug]/not-found.tsx (port-aware link target)
- src/app/global-error.tsx
C2. 14+ naked toast.error(err.message) call sites bypass toastError()
grep "toast.error.*err.message" returns 14 hits: client-list.tsx,
hard-delete-dialog.tsx, bulk-archive-wizard.tsx, smart-restore-dialog.tsx,
smart-archive-dialog.tsx, bulk-hard-delete-dialog.tsx,
portal/change-password-form.tsx, more. These drop:
- the stable error code line
- the Reference ID line
- the "Copy ID" action button
On a 500 they show "Internal server error" with nothing copyable. On a
network failure they show "Failed to fetch" (raw TypeError). Several of
these files already import toastError for other call sites — the swap is
mechanical: toast.error(err instanceof Error ? err.message : 'X failed')
→ toastError(err, 'X failed').
C3. apiFetch collapses 502/504 with non-JSON body to "Bad Gateway", no requestId
src/lib/api/client.ts:75:
const error = await res.json().catch(() => ({ error: res.statusText }));
Reverse-proxy error pages (nginx, Cloudflare) deliver HTML, not JSON. The
user gets ApiError{ message: "Bad Gateway", code: null, requestId: null }
— no "Copy ID" action. When proxy fails, the user has nothing to paste to
support. Synthesize a client-side correlation ID + a "The server is
unreachable. Please try again." string when status >= 500 && JSON.parse fails.
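The fallback described in C3 could look roughly like the sketch below. It is written as a pure function over the raw response text so it can be tested without a network; parseErrorBody, makeClientCorrelationId, the "cid-" prefix and the server body field names (error, code, requestId) are all assumptions, not the actual apiFetch internals.

```typescript
// Mirrors the ApiError fields quoted above; the helper itself is hypothetical.
interface ParsedApiError {
  message: string;
  code: string | null;
  requestId: string | null;
  clientGenerated: boolean; // true when requestId was minted in the browser
}

// Assumption: a "cid-" prefix lets support distinguish client-minted IDs from server ones.
function makeClientCorrelationId(): string {
  return `cid-${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 8)}`;
}

function parseErrorBody(status: number, statusText: string, rawBody: string): ParsedApiError {
  try {
    const body: unknown = JSON.parse(rawBody);
    if (typeof body === "object" && body !== null) {
      const b = body as Record<string, unknown>;
      return {
        message: typeof b.error === "string" ? b.error : statusText,
        code: typeof b.code === "string" ? b.code : null,
        requestId: typeof b.requestId === "string" ? b.requestId : null,
        clientGenerated: false,
      };
    }
  } catch {
    // Non-JSON body (for example a proxy's HTML error page): fall through.
  }
  const unreachable = status >= 500;
  return {
    message: unreachable ? "The server is unreachable. Please try again." : statusText,
    code: null,
    requestId: unreachable ? makeClientCorrelationId() : null,
    clientGenerated: unreachable,
  };
}
```

In apiFetch this would be fed from await res.text(), so a single read of the body serves both the JSON and the non-JSON case.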
C4. Redis outage wedges every rate-limited route — including login
src/lib/redis.ts uses maxRetriesPerRequest: 3. After exhaustion every
call from checkRateLimit() (rate-limit.ts:44) throws ioredis errors.
withRateLimit doesn't try/catch, so it bubbles to errorResponse() as 500. /api/auth/* is wrapped in withRateLimit('auth') — a Redis
blip 500s login. Same exposure for portal sign-in (portalSignIn
limiter on /api/portal/auth/sign-in).
Fix: in checkRateLimit, catch redis errors and fail-open for auth /
portal-signin (log "rate-limit subsystem unavailable, allowing request")
or fall back to a local in-memory limiter.
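One way to sketch the "local in-memory limiter" variant of that fix. Everything here is illustrative: LocalWindowLimiter, checkWithFallback, the 20-per-minute budget and the primary callback (standing in for the Redis-backed check) are assumptions, and a per-process counter is weaker than the shared limiter it replaces.

```typescript
type LimitResult = { allowed: boolean; degraded: boolean };

// Fixed-window counter used only while the shared store is unavailable.
class LocalWindowLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();
  private max: number;
  private windowMs: number;

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  check(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}

const fallbackLimiter = new LocalWindowLimiter(20, 60000); // assumed budget

// `primary` resolves to "allowed?" or rejects when the backing store is down.
async function checkWithFallback(
  key: string,
  primary: (key: string) => Promise<boolean>,
  onDegraded: (err: unknown) => void = () => {},
): Promise<LimitResult> {
  try {
    return { allowed: await primary(key), degraded: false };
  } catch (err) {
    onDegraded(err); // e.g. log "rate-limit subsystem unavailable"
    return { allowed: fallbackLimiter.check(key), degraded: true };
  }
}
```

Compared with a plain fail-open, this keeps some brute-force protection on login during the outage, at the cost of limits being per instance.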
Same audit needed for BullMQ getQueue().add() calls — confirm
user-blocking enqueues (sendForSigning, requestGdprExport) degrade to
"we'll process this later" instead of 500.
HIGH
H1. SMTP failure semantics differ across callers
sendEmail() has 10s/10s/30s timeouts (email/index.ts:20) — good. Callers
diverge:
- users.service.ts:381 (admin email-change notify): logger.warn + swallow. ✓
- me/email/route.ts:93: Promise.allSettled. ✓
- document-signing-emails.service.ts:169,211,247: throws → 500. Documenso already sent; user sees a 500 even though the workflow succeeded. Wrap in try/catch + mark delivery_status: 'failed' so the inbox panel surfaces a retry button.
- queue/workers/email.ts: BullMQ retries 5× then permanent failure. No DLQ admin surface (webhooks have one at webhooks.ts:281; mirror it).
H2. Storage timeout error lacks semantic name → bad classifier hint
src/lib/storage/s3.ts:52:
throw new Error(`S3 ${label} timed out after ${ms}ms`);
error-classifier.ts ERROR_NAME_HINTS looks for TimeoutError; this
throws plain Error. The path-based classifier catches "Storage backend"
first but loses the timeout-vs-misconfig distinction. Define a
class TimeoutError extends Error and throw it from withTimeout.
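A sketch of that change, assuming withTimeout takes a label, a budget in milliseconds and the promise to guard (the real signature in s3.ts is not shown in this report). The message format is copied from the throw quoted above.

```typescript
class TimeoutError extends Error {
  readonly label: string;
  readonly ms: number;

  constructor(label: string, ms: number) {
    super(`S3 ${label} timed out after ${ms}ms`);
    this.name = "TimeoutError"; // the name a classifier keyed on error names would match
    this.label = label;
    this.ms = ms;
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on old targets
  }
}

// Assumed signature; rejects with TimeoutError if `work` has not settled within `ms`.
function withTimeout<T>(label: string, ms: number, work: Promise<T>): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(label, ms)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer); // avoid a dangling timer on success
  });
}
```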
H3. Documenso outage: error codes good, UI feedback poor
documenso-client.ts:42-60 maps to DOCUMENSO_TIMEOUT, _AUTH_FAILURE,
_UPSTREAM_ERROR. Toasts render cleanly. Missing:
- The signing page doesn't show "Documenso is unreachable, your draft is saved." Users refresh and assume the draft is gone.
- The webhook receiver has no per-port rate-limit on 5xx. Documenso retry storms can land if our handler regresses.
H4. Heavy components have no error boundaries except /dashboard/page.tsx widgets
WidgetErrorBoundary is used in 4 dashboard widgets. NOT wrapped around:
- command-search.tsx (1177 lines): mounted in header; one render throw kills the entire shell.
- invoice-pdf-preview.tsx (pdfme, known to throw on malformed font/image).
- pageviews-chart.tsx, pipeline-funnel-chart.tsx, charts outside /dashboard/page.tsx.
- The signed-PDF iframe inside documents/[id]/page.tsx: when MinIO is down, chrome-internal error renders in-place with no retry.
Wrap each in WidgetErrorBoundary with a sensible fallback.
H5. New + public routes bypass errorResponse()
grep "errorResponse" shows 691 hits. The exceptions don't propagate
X-Request-Id and produce inconsistent shapes:
- src/app/api/storage/[token]/route.ts: bare NextResponse.json({error:'Invalid or expired token'}), no requestId.
- src/app/api/public/website-inquiries/route.ts:75,122: bare {error:'Unauthorized'} / {error:'Unknown port'}.
- src/app/api/webhooks/documenso/route.ts:100: {ok:false, error:'Invalid secret'} with status 200. Returning 200 is correct (no Documenso retry storms), but the literal string "Invalid secret" confirms the endpoint expects a secret. Drop the string.
- src/app/api/auth/resolve-identifier/route.ts:91: defensive 200 returning synthetic email. By design; keep.
MEDIUM
M1. Root not-found.tsx link target wrong for non-CRM users
Links to /dashboard. Portal users hit /portal/dashboard, unauth users
need /login or /portal/login. Detect cookie/route prefix.
M2. Suspense boundaries are sparse — 9 across src/app, 0 in components
Only set-password, portal/activate, portal/reset-password wrap
useSearchParams in Suspense. Every detail page (yacht, company,
interest, berth, document, invoice, expense, reservation) flashes empty
header on direct URL visits because there's no loading.tsx. Only
clients/[clientId]/loading.tsx exists — replicate the pattern across
detail routes.
M3. error_events capture is fire-and-forget — DB write failure swallowed silently
void captureErrorEvent({…}) (errors.ts:146, 170). If the DB is up but
the insert fails (FK to a deleted user, etc.) the row is lost forever and
the super-admin can't trace the original error. Add a fallback that
writes to pino with tag:'error_event_capture_failed' so the super-admin
can grep server logs as a last resort.
M4. PG-error 23505 sometimes leaks as 500 instead of ConflictError
berth-reservations.service.ts:29-43 and document-folders.service.ts:18-31
explicitly map 23505 → ConflictError. Confirm clients.service.ts,
companies.service.ts, yachts.service.ts write paths do the same — at
least one or two likely bubble a raw PG error and 500 on duplicate email
/ duplicate mooring instead of a 409 with a friendly "this name is
already in use" message.
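The mapping M4 asks for can be sketched as below. The ConflictError class here is a stand-in for the codebase's own, and the sketch assumes the driver error exposes the SQLSTATE on a code property (an ORM wrapper may nest it, in which case the check would need to look one level down).

```typescript
// Stand-in for the existing ConflictError; shape is assumed.
class ConflictError extends Error {
  readonly status = 409;

  constructor(message: string) {
    super(message);
    this.name = "ConflictError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

const PG_UNIQUE_VIOLATION = "23505"; // SQLSTATE for unique_violation

function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === PG_UNIQUE_VIOLATION
  );
}

// Returns a ConflictError for duplicates and passes every other error through untouched.
function mapWriteError(err: unknown, friendly: string): unknown {
  return isUniqueViolation(err) ? new ConflictError(friendly) : err;
}
```

A service write path would then use catch (err) { throw mapWriteError(err, 'This name is already in use'); } so duplicates surface as 409 rather than 500.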
M5. /api/ready doesn't exist
/api/health (liveness) returns 200 unconditionally — correct. The
comment promises /api/ready for deep checks, but find shows no such
file. /api/public/health does deep checks gated by WEBSITE_INTAKE_SECRET
— wire k8s readiness probe to it or stub /api/ready.
M6. formatErrorBanner (admin inline forms) doesn't render "Copy ID" action
Lives in toast-error.ts next to toastError(). The toast version has a
button; the banner is plain text. Admin users hitting a 500 from inline
forms get the reference ID printed but can't click-to-copy. Either build
<ErrorBanner err={…}> as a React component or accept the gap.
M7. Worker BullMQ failures have no user-visible surface beyond webhooks
logger.error({jobId, err}, '<queue> job failed') is uniform across all
workers (email, documents, notifications, reports, export, ai, webhooks).
Only webhooks.ts:281 plumbs a dead_letter notification on permanent
failure. Notification/email/export workers should follow suit — for
example, a stuck GDPR export should email the user "your export failed,
retry from /settings/data."
M8. Portal auth pages would lose brand on render error
Portal-auth pages wrap content in BrandedAuthShell. A throw inside the
shell or form lands at Next default page (no (portal)/error.tsx). Add
(portal)/portal/error.tsx that renders <BrandedAuthShell> around the
error so the brand survives.
Summary
| Severity | Count |
|---|---|
| CRITICAL | 4 |
| HIGH | 5 |
| MEDIUM | 8 |
Highest leverage: ship the missing route-segment files (C1), sweep the
14 bare toast.error(err.message) sites to toastError() (C2), make
checkRateLimit fail-open when Redis is down (C4). Together these mean
every user-visible degradation is branded + every 5xx surfaces a
copy-pasteable reference ID.
14. Documenso integration depth audit (documenso-auditor)
Documenso Integration Depth Audit — Task #14
Scope: documenso-client.ts, documenso-payload.ts, eoi-context.ts, app/api/webhooks/documenso/route.ts.
Read-only. Severity: CRITICAL / HIGH / MEDIUM.
CRITICAL
C1. In-app EOI pathway bypasses per-port Documenso config
generateAndSignViaInApp in document-templates.ts calls documensoCreate(...) and documensoSend(...) without portId (lines 831–843). resolveCreds then returns the global env triple (DOCUMENSO_API_URL, DOCUMENSO_API_KEY, DOCUMENSO_API_VERSION).
Consequences on multi-tenant deployments:
- Per-port apiVersion ignored → a v2 port silently hits v1 endpoint paths (or vice versa); createDocument / sendDocument pick the wrong branch.
- Per-port apiKey ignored → auth fails on tenants whose key is only in system_settings.documenso_api_key_override.
- redirectUrl and signingOrder SEQUENTIAL/PARALLEL settings never plumbed: the in-app pathway passes no meta arg. Signers always land on Documenso's default thank-you page and v2 ports always sign PARALLEL regardless of admin choice.
Fix: thread portId and a CreateDocumentMeta built from getPortDocumensoConfig(portId) into both calls — mirror generateAndSignViaDocumensoTemplate at lines 894–910.
C2. handleDocumentCompleted idempotency has a real cross-channel race
The early-return at documents.service.ts:1110 (if (doc.status === 'completed' && doc.signedFileId) return;) is necessary but not sufficient. Two write paths can race:
- Webhook receiver → handleDocumentCompleted.
- Background poll worker jobs/processors/documenso-poll.ts:63 → same call (same args).
The route-level documentEvents.signatureHash dedup only catches webhook→webhook repeats. It does not catch webhook + poll, because the poll worker bypasses the webhook entry point and has no signatureHash row. Both can:
- Resolve doc (status=sent, signedFileId=null).
- Pass the gate.
- downloadSignedPdf → storage.put → db.insert(files) → db.update(documents).set({ status:'completed', signedFileId }).
Outcome: two files rows, two MinIO blobs; the second UPDATE overwrites signedFileId, orphaning the first row + blob (no DB pointer, never GC'd).
Fix: wrap gate-and-write in a transaction with SELECT ... FOR UPDATE on the documents row, or a pre-claim UPDATE documents SET status='completing' WHERE id=? AND status != 'completed' RETURNING * that atomically reserves the row.
C3. Webhook silently swallows handler errors → permanent event loss
route.ts:264–266 catches every handler throw, logs, returns 200 (intentional — "always 200"). But a transient storage/DB failure inside handleDocumentCompleted is lost forever — Documenso records the event as delivered and never retries. Poll worker is the only safety net.
Fix: on handler throw, return non-200 so Documenso retries (bounded budget); or push the raw body onto a BullMQ replay queue.
HIGH
H1. Multi-berth Berth Range Documenso template field still pending
buildDocumensoPayload writes formValues['Berth Range']: context.eoiBerthRange (line 157), and eoi-context.ts:128–135 populates it from interest_berths.is_in_eoi_bundle=true via formatBerthRange(). The live Documenso v1 template does not yet have this field (CLAUDE.md confirms). Documenso v1's templates/{id}/generate-document silently drops unknown formValues keys — multi-berth EOIs currently render with only the primary mooring in Berth Number.
The in-app pathway (pdf/fill-eoi-form.ts:62–67) fails loudly when its AcroForm field is missing; the Documenso pathway fails silently. Add a startup GET /api/v1/templates/{id} preflight that warns when Berth Range is absent.
H2. placeFields v2 path is unverified against a live Documenso 2.x instance
documenso-client.ts:636 has an explicit "must be confirmed against a live Documenso 2.x instance — top v2 risk" comment. Concerns:
- Body uses recipientId: String(f.recipientId); v2 may want numeric ID or string token (unverified).
- Geometry name mapping (positionX/positionY/width/height vs v1 pageX/...) is correct in shape, unverified in field naming.
- fieldMeta shipped verbatim; v2's create-many schema unpinned.
Any port flipped to apiVersion='v2' using upload-and-place is rolling the dice until realapi run is green.
H3. v1 fallback for CHECKBOX/DROPDOWN/RADIO is broken — silently
fieldTypeNeedsMeta permits CHECKBOX/DROPDOWN/RADIO. On v1, placeFields strips fieldMeta (lines 663–671 omit it) and v1's /documents/{id}/fields doesn't accept option metadata. A CHECKBOX placed on a v1 port renders as an unconfigured input with no options.
Code comment acknowledges ("falls back to blank-input behaviour"), but the placement UI gives no signal. Add a v1-aware preflight that disables these field types when apiVersion='v1'.
H4. sendReminder v2 redistribute recipient scoping is unverified
sendReminder v2 (lines 391–407) ships { envelopeId, recipientIds: [signerId] } to /api/v2/envelope/redistribute. The leading comment contradicts the body: "redistributes to all pending recipients on the envelope. Single-recipient targeting requires admin-side filtering."
If v2 ignores recipientIds, every "remind one signer" click resends to everyone, including already-completed signers — embarrassment risk on multi-signer EOIs. Realapi verification needed; reconcile comment with implementation either way.
MEDIUM
M1. apiVersion='v1' template-flow caveat correct but locks out v2 features
generateDocumentFromTemplate is hard-coded to /api/v1/templates/{id}/generate-document regardless of apiVersion. v2 instances accept this via backward-compat. Risk: a v2-native admin who built a template in the v2 UI may have field IDs but no stable field names — formValues keyed by name won't match. If Documenso drops v1 compat, every template-flow EOI breaks atomically. Plan now to capture per-template field-ID metadata in admin settings.
M2. getPageDimensions cache + A4 assumption
documenso-client.ts:597 returns DEFAULT_PAGE_DIMENSIONS = { 595, 842 } (A4 portrait, pt) unconditionally — the cache is dead code. Fine for the A4 EOI source PDF; for admin-uploaded contracts in Letter/A3/landscape, percent→pixel conversion is wrong by 5–30%, placing fields off-page or in the wrong band. Capture real page size via pdf-lib at upload time.
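To make the size of the error concrete, here is a small worked example. It assumes field placements are stored as percentages and converted with the page size, as the finding describes; percentToPoints is a hypothetical name, and US Letter (612 × 792 pt) is used as the non-A4 case.

```typescript
interface PageSize {
  width: number;  // PDF points (1/72 inch)
  height: number;
}

const A4_PAGE: PageSize = { width: 595, height: 842 };     // the default quoted above
const LETTER_PAGE: PageSize = { width: 612, height: 792 }; // US Letter portrait

// Converts a percentage-based placement into absolute points for a given page.
function percentToPoints(xPct: number, yPct: number, page: PageSize): { x: number; y: number } {
  return { x: (xPct / 100) * page.width, y: (yPct / 100) * page.height };
}

// A signature box placed at 50% across, 90% down:
const assumedOnA4 = percentToPoints(50, 90, A4_PAGE);      // what an A4-only conversion yields
const actualOnLetter = percentToPoints(50, 90, LETTER_PAGE); // where it belongs on Letter
const verticalDrift = assumedOnA4.y - actualOnLetter.y;    // roughly 45 pt too low
```

A drift of about 45 pt near the bottom of a 792 pt page is enough to push a field past the page edge, which is the failure M2 warns about.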
M3. normalizeDocument recipient id collapses to '' on missing fields
Line 75: id: String(rec.recipientId ?? rec.id ?? ''). When both keys are absent (malformed response), id becomes ''; downstream maps keyed by recipient id collapse all phantoms into one bucket. Throw or filter when id is empty.
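A sketch of the "filter when id is empty" option. RawRecipient and normalizeRecipients are illustrative names and the response shape is assumed; the point is only that a missing identifier becomes an explicit drop that can be counted and logged, instead of an empty-string key.

```typescript
interface RawRecipient {
  recipientId?: number | string | null;
  id?: number | string | null;
  email?: string;
}

// Returns null instead of '' when neither identifier is present.
function resolveRecipientId(rec: RawRecipient): string | null {
  const raw = rec.recipientId ?? rec.id;
  if (raw === undefined || raw === null || String(raw) === "") return null;
  return String(raw);
}

function normalizeRecipients(recs: RawRecipient[]): {
  kept: Array<{ id: string; email?: string }>;
  dropped: number;
} {
  const kept: Array<{ id: string; email?: string }> = [];
  let dropped = 0;
  for (const rec of recs) {
    const id = resolveRecipientId(rec);
    if (id === null) {
      dropped += 1; // caller can log or throw when this is non-zero
      continue;
    }
    kept.push({ id, email: rec.email });
  }
  return { kept, dropped };
}
```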
M4. applyPayloadRedirect /email$/i regex is fragile
documenso-client.ts:148 matches keys ending in email. A future field like notificationEmailAddress or cc_email_2 would be missed and could leak past EMAIL_REDIRECT_TO. Either widen the heuristic, or declare email fields explicitly in DocumensoTemplatePayload and rewrite only those.
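The two example keys really do slip past the suffix regex, which the snippet below demonstrates alongside a sketch of the explicit-list alternative. REDIRECTABLE_EMAIL_FIELDS and redirectEmails are hypothetical, the field names are taken from the examples above rather than from the real payload type, and the sketch only handles a flat object.

```typescript
const EMAIL_SUFFIX = /email$/i; // the heuristic quoted above

// Hypothetical explicit list; in practice this would be derived from DocumensoTemplatePayload.
const REDIRECTABLE_EMAIL_FIELDS = ["email", "notificationEmailAddress", "cc_email_2"] as const;

// Rewrites only the declared fields, so a new email-bearing key must be added deliberately.
function redirectEmails(
  payload: Record<string, unknown>,
  redirectTo: string,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...payload };
  for (const field of REDIRECTABLE_EMAIL_FIELDS) {
    if (typeof out[field] === "string") out[field] = redirectTo;
  }
  return out;
}

const missedByRegex = ["notificationEmailAddress", "cc_email_2"].filter(
  (key) => !EMAIL_SUFFIX.test(key),
);
```

The trade-off is that an explicit list fails closed only if it is kept in sync with the payload type, so a type-level link between the two is worth having.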
M5. voidDocument 404-idempotency loses tenant signal
On 404, log + return silently. The local doc may still have status='sent', so a retry re-attempts. Mostly benign — but set local status='voided' on 404 so DB converges with remote-not-found reality.
M6. EOI hard-gate error code
eoi-context.ts:206 produces Cannot generate EOI - missing required client details: .... Labels are clean (good), but no structured code/field array — UI can't deep-link to the missing tab. Add code: 'EOI_GATE' + missing fields array.
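A possible shape for that structured error. EoiGateError and assertEoiReady are hypothetical names, and the three keys (name, address, email) follow the hard-gate requirements listed elsewhere in this report rather than the real client model.

```typescript
class EoiGateError extends Error {
  readonly code = "EOI_GATE";
  readonly missing: string[];

  constructor(missing: string[]) {
    // Message text kept identical to the existing one so current toasts do not change.
    super(`Cannot generate EOI - missing required client details: ${missing.join(", ")}`);
    this.name = "EoiGateError";
    this.missing = missing;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Throws when any required detail is empty; field keys are assumptions.
function assertEoiReady(client: { name?: string; address?: string; email?: string }): void {
  const missing = (["name", "address", "email"] as const).filter((key) => !client[key]);
  if (missing.length > 0) throw new EoiGateError([...missing]);
}
```

With code and missing on the error, the UI can branch on code === 'EOI_GATE' and link each entry in missing to the tab that owns it.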
M7. Webhook signatureHash covers replay but not v2 timestamp drift
Confirmed body-sha256 dedups same-payload retries. If v2 ever varies signedAt on retry, the per-recipient ${signatureHash}:signed:${email} keys differ → repeat processing. The per-recipient document_events index protects writes there, but handleRecipientSigned likely also advances interest stage — verify that side-effect is idempotent too.
What's solid
- normalizeDocument id↔documentId symmetric; downstream consumes the legacy id form consistently; no stray reads of documentId / recipientId.
- canonicalizeEvent correctly maps DOCUMENT_SIGNED ↔ document.signed and routes v2 aliases (RECIPIENT_SIGNED, RECIPIENT_VIEWED) to v1 equivalents with a telemetry log line.
- verifyDocumensoSecret timing-safe, iterates per-port + global env, rate-limits bad-secret IPs.
- handleDocumentCompleted early-return is the right shape for the common same-channel retry case. Cross-channel race (C2) is separate.
- eoi-context.eoiBerthRange plumbing correctly walks interest_berths.is_in_eoi_bundle=true and produces the compact range. Gap is template-side (H1).
- SEQUENTIAL/PARALLEL signingOrder correctly wired in generateAndSignViaDocumensoTemplate (document-templates.ts:909). Gap is the in-app pathway (C1).
- buildDocumensoPayload.meta.distributionMethod = 'NONE': distribute invoked separately by sendDocument. Correct on both versions.
- EOI hard-gate matches Section 2 requirements (name/address/email); yacht + berth correctly optional.
Pending — all complete
All 19 audit tasks finished. Every report is inlined above.
Appendix: methodology + agent roster
Audit was run as a single pn-crm-audit Claude Code team. Each teammate was a separate Claude Opus 4.7 instance with read-only static-analysis scope (no file edits permitted by the brief). Time budget: 22 minutes per agent. Reports were written to /tmp/audit-*.md and consolidated here.
Team members
| Agent | Task | Output |
|---|---|---|
| security-auditor | #1 Security + API + auth | /tmp/audit-security.md |
| ui-ux-auditor | #2 UI/UX + a11y | /tmp/audit-ui-ux.md |
| data-model-auditor | #3 Data model + migrations | /tmp/audit-data-model.md |
| services-auditor | #4 Services + realtime + storage | /tmp/audit-services.md |
| perf-test-auditor | #5 Performance + code-trim + render | /tmp/audit-perf-test.md |
| obs-i18n-docs-auditor | #6 Observability + i18n + docs | /tmp/audit-obs-i18n-docs.md |
| concurrency-auditor | #7 Concurrency + races | /tmp/audit-concurrency.md |
| gdpr-auditor | #8 GDPR + PII | /tmp/audit-gdpr.md |
| email-auditor | #9 Email deliverability | /tmp/audit-email.md |
| error-ux-auditor | #10 Error UX + failure modes | /tmp/audit-error-ux.md |
| reporting-auditor | #11 Reporting math | pending |
| onboarding-auditor | #12 Onboarding UX | pending |
| pdf-auditor | #13 PDF + brand assets | pending |
| documenso-auditor | #14 Documenso depth | /tmp/audit-documenso.md |
| copy-auditor | #15 Copy + terminology | pending |
| deps-auditor | #16 Deps + supply chain | pending |
| build-auditor | #17 Build + prod readiness | pending |
| recommender-auditor | #18 Berth recommender | pending |
| search-auditor | #19 Search relevance | pending |
11. Reporting + analytics math correctness (reporting-auditor)
Task #11 — Reporting + Analytics Math Correctness
Scope: dashboard widgets, kanban "active deals", pipeline-report PDF, revenue-report PDF, dashboard.service.ts, report-generators.ts, analytics.service.ts. Read-only audit.
Canonical pipeline stages live in src/lib/constants.ts → PIPELINE_STAGES:
open, details_sent, in_communication, eoi_sent, eoi_signed, deposit_10pct, contract_sent, contract_signed, completed. STAGE_WEIGHTS matches.
CRITICAL
C1. Hot-deals card ranks/labels on non-existent stage names
src/lib/services/dashboard.service.ts:198-208 (getHotDeals) builds a CASE that references 'in_comms' and 'deposit_10'. The DB column interests.pipeline_stage stores 'in_communication' and 'deposit_10pct'. Both real stages fall through to ELSE 0, collapsing the rank ladder so any eoi_sent deal outranks every in_communication/deposit_10pct deal, and ordering inside the top tier becomes "newest updatedAt wins" instead of "furthest along."
The frontend mirror in src/components/dashboard/hot-deals-card.tsx:26-36 (STAGE_LABELS) uses the same wrong keys (deposit_10, in_comms), so the badge for those two stages renders the raw enum string deposit_10pct / in_communication instead of "Deposit 10%" / "In Comms." Fix both files; prefer importing STAGE_LABELS from @/lib/constants rather than re-declaring it.
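One way to stop the two files drifting again is to derive the rank from the canonical list instead of spelling stage names a second time. The sketch below redeclares PIPELINE_STAGES locally so it is self-contained (the real code would import it from @/lib/constants), and stageRank is a hypothetical helper; whether getHotDeals should rank every stage or only a subset is not something this report settles.

```typescript
const PIPELINE_STAGES = [
  "open", "details_sent", "in_communication", "eoi_sent", "eoi_signed",
  "deposit_10pct", "contract_sent", "contract_signed", "completed",
] as const;
type PipelineStage = (typeof PIPELINE_STAGES)[number];

// Rank is the 1-based position in the canonical list; unknown names get 0,
// which is what the CASE's ELSE branch currently does to the two real stages.
function stageRank(stage: string): number {
  return PIPELINE_STAGES.indexOf(stage as PipelineStage) + 1;
}
```

The same array can generate the SQL CASE branches, so a renamed stage changes the ladder in one place.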
C2. Revenue PDF "TOTAL COMPLETED REVENUE" silently includes lost & cancelled deals
setInterestOutcome in interests.service.ts:919-943 forces pipelineStage = 'completed' for every outcome (won, lost_*, cancelled). fetchRevenueData in report-generators.ts:126-140 then sums berth prices for pipelineStage='completed' AND archivedAt IS NULL with no outcome filter, and the PDF prints the result as TOTAL COMPLETED REVENUE (revenue-report.ts:97). Result: a marina with 1 won + 10 lost deals at €1M berths reports €11M completed revenue. Add eq(interests.outcome, 'won') to the completedRevenue query (and probably to the per-stage breakdown).
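The arithmetic in that example, reproduced as an in-memory model. Deal and completedRevenue are illustrative only (the real fix is one extra condition in the Drizzle query), and 'lost_price' is a made-up member of the lost_* family.

```typescript
interface Deal {
  stage: string;
  outcome: string | null;
  archivedAt: Date | null;
  berthPrice: number;
}

// wonOnly=false mirrors the current query; wonOnly=true mirrors the proposed filter.
function completedRevenue(deals: Deal[], wonOnly: boolean): number {
  return deals
    .filter((d) => d.stage === "completed" && d.archivedAt === null)
    .filter((d) => !wonOnly || d.outcome === "won")
    .reduce((sum, d) => sum + d.berthPrice, 0);
}

const sampleDeals: Deal[] = [
  { stage: "completed", outcome: "won", archivedAt: null, berthPrice: 1000000 },
  // Ten lost deals that setInterestOutcome has also moved to 'completed'.
  ...Array.from({ length: 10 }, (): Deal => ({
    stage: "completed", outcome: "lost_price", archivedAt: null, berthPrice: 1000000,
  })),
];

const reportedToday = completedRevenue(sampleDeals, false); // 11,000,000
const withWonFilter = completedRevenue(sampleDeals, true);  // 1,000,000
```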
C3. Pipeline PDF stageCounts query has no GROUP BY
report-generators.ts:54-60:
db.select({ stage: interests.pipelineStage, count: count() })
.from(interests)
.where(...); // ← no .groupBy()
Postgres rejects a non-aggregated column without GROUP BY (42803). The check happens when the query is parsed, regardless of row count, so the pipeline PDF report has most likely been failing in the worker queue on every run. If it has not, the query that actually executes differs from the one quoted and its output should be checked for a single-row result. Add .groupBy(interests.pipelineStage).
HIGH
H1. "Active interest" means four different things across surfaces
| Surface | Filter |
|---|---|
| getKpis / getPipelineCounts / getRevenueForecast (dashboard tiles + forecast) | archivedAt IS NULL AND (outcome IS NULL OR outcome='won') |
| computePipelineFunnel (analytics funnel) | same, but additionally bounded by createdAt BETWEEN range |
| listInterestsForBoard (kanban), interests.service.ts:194 | archivedAt IS NULL only ⇒ lost & cancelled cards still appear on the board (they all sit in the completed column because of C2) |
| getHotDeals | archivedAt IS NULL AND outcome IS NULL (also excludes won; intentional per comment but worth flagging) |
| fetchPipelineData / fetchRevenueData (PDF reports) | archivedAt IS NULL only ⇒ includes lost & cancelled |
| computeRevenueBreakdown (invoices) | unrelated definition, by invoice status |
A rep who reads "12 Active Deals" on the tile then opens the kanban can see 17 cards, because the kanban silently includes 5 lost deals routed to the completed column. Consolidate into a single activeInterestsWhere(port) helper and reuse everywhere.
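The 12-versus-17 mismatch can be modelled in memory as follows. isActiveInterest is a predicate-form stand-in for the proposed activeInterestsWhere(port) helper (which would return a query condition, not a function over rows), and 'lost_price' is a made-up outcome value.

```typescript
interface InterestRow {
  archivedAt: Date | null;
  outcome: string | null;
}

// The dashboard-tile definition: not archived, and either still open or won.
function isActiveInterest(row: InterestRow): boolean {
  return row.archivedAt === null && (row.outcome === null || row.outcome === "won");
}

const boardRows: InterestRow[] = [
  ...Array.from({ length: 12 }, (): InterestRow => ({ archivedAt: null, outcome: null })),
  ...Array.from({ length: 5 }, (): InterestRow => ({ archivedAt: null, outcome: "lost_price" })),
];

const tileCount = boardRows.filter(isActiveInterest).length;               // shared definition
const kanbanCount = boardRows.filter((r) => r.archivedAt === null).length; // archived-only filter
```

Routing every surface through one definition makes tileCount and kanbanCount agree by construction.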
H2. Occupancy rate uses two different sources, same dashboard
- getKpis (KPI tile) + fetchOccupancyData (PDF) compute occupancy from berths.status IN ('sold','under_offer').
- computeOccupancyTimeline (chart on the analytics page) computes occupancy from berth_reservations overlap with each day, with total = COUNT(berths).
The two are unrelated: a berth marked sold with no active reservation contributes to the tile but not the timeline; a berth marked available with an active reservation contributes to the timeline but not the tile. Reps will see the tile read 64% and the chart's right-most point read 12% on the same day. Pick one definition (status-based is the documented one in CLAUDE.md) and align the timeline.
H3. Revenue PDF stage breakdown is unweighted; dashboard forecast is weighted
fetchRevenueData.stageRevenue (report-generators.ts:107-118) does SUM(berths.price) per stage with no pipeline_weights multiplier. The dashboard RevenueForecast widget multiplies by pipeline_weights[stage]. So:
- Tile shows €420K (weighted).
- Revenue PDF "Revenue by Pipeline Stage" for the same data shows €1.6M (unweighted).
The two are reconcilable in principle but no rep will guess that. Either weight the PDF the same way, or rename the PDF column to "Berth Price by Stage (gross)".
H4. pipeline_weights defaults duplicated in two source files
src/lib/constants.ts:68 (STAGE_WEIGHTS) and src/components/admin/settings/settings-manager.tsx:76-86 hard-code the same object. Drift between the two means admins editing settings could see different defaults than the forecast actually uses. The settings form should import { STAGE_WEIGHTS } from '@/lib/constants' and spread it as defaultValue.
H5. getRevenueForecast silently zeroes out stages with missing weight keys
dashboard.service.ts:139 does weights[stage] ?? 0. If an admin saves pipeline_weights as { "in_comms": 0.2, ... } (legacy key) or simply omits a stage, every active interest at the missing stage contributes €0 to the forecast — no warning, no fallback to STAGE_WEIGHTS[stage]. Validate the saved JSON against PIPELINE_STAGES at write time, OR fall back to the constant per-key (weights[stage] ?? STAGE_WEIGHTS[stage]).
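Both options can be sketched together. The default values below are placeholders for illustration only (the real STAGE_WEIGHTS numbers are not quoted in this report), and resolveWeight / unknownWeightKeys are hypothetical names.

```typescript
type Weights = Record<string, number>;

// Placeholder defaults; NOT the real STAGE_WEIGHTS values.
const PLACEHOLDER_DEFAULTS: Weights = { in_communication: 0.2, eoi_sent: 0.4, completed: 1 };

// Read-time option: fall back to the constant per key before giving up with 0.
function resolveWeight(saved: Weights, stage: string, defaults: Weights = PLACEHOLDER_DEFAULTS): number {
  return saved[stage] ?? defaults[stage] ?? 0;
}

// Write-time option: report keys that are not canonical stage names so the save can be rejected.
function unknownWeightKeys(saved: Weights, stages: readonly string[]): string[] {
  return Object.keys(saved).filter((key) => !stages.includes(key));
}
```

The write-time check catches legacy keys such as in_comms at the moment an admin saves them; the read-time fallback protects forecasts against rows already stored.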
MEDIUM
M1. Interests with no primary berth disappear from "pipeline value"
getKpis and getRevenueForecast use INNER JOIN interest_berths ON isPrimary=true. An interest without a primary-berth link (legitimate while the rep is still sourcing) contributes 0 to pipelineValueUsd and to totalWeightedValue, but is still counted in activeInterests and on the kanban. Mismatch between deal count and value. Surface a footnote (e.g. "5 deals not yet matched to a berth") or LEFT JOIN with a price-coalesce.
M2. "Top Interests by Value" PDF includes lost deals
fetchPipelineData.topInterestsRows (lines 68-83) orders by berths.price DESC NULLS LAST with no outcome filter. A €4M lost deal will sit at the top of the report. Add (outcome IS NULL OR outcome='won').
M3. PDF stage order hardcoded inside both templates
pipeline-report.ts:58-68 and revenue-report.ts:55-65 redeclare the canonical stage order. Renaming a stage in constants.ts will leave the renamed stage appended to the "unknown stages" tail block instead of in its proper position. Import and iterate PIPELINE_STAGES.
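A minimal sketch of ordering report rows from the shared constant rather than a redeclared list. The stage names are placeholders and `orderStageRows` is a hypothetical helper; rows whose stage is not in the canonical list keep their input order at the tail.

```typescript
// Placeholder for the canonical list exported from constants.ts.
const CANONICAL_STAGES: readonly string[] = ['enquiry', 'eoi_sent', 'contract'];

function orderStageRows<T extends { stage: string }>(rows: T[]): T[] {
  const rank = new Map<string, number>();
  CANONICAL_STAGES.forEach((stage, index) => rank.set(stage, index));
  // Unknown stages all share the same rank, so the stable sort
  // leaves them in their original relative order after known stages.
  const tailRank = CANONICAL_STAGES.length;
  return [...rows].sort(
    (a, b) => (rank.get(a.stage) ?? tailRank) - (rank.get(b.stage) ?? tailRank),
  );
}
```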
M4. selectDistinct in pipelineValueUsd is correct but fragile
dashboard.service.ts:39-47 selectDistinct({ berthId, price }) happens to dedupe correctly because berthId is unique. If a future schema lets two interest_berths rows reference the same berth as primary (the partial unique index permits this if the other row has isPrimary=false), the join would still emit one row per primary-only match. Today's behaviour is fine; a comment in the code claims correctness but doesn't explain why. Add a one-line note tied to the partial unique index.
M5. getHotDeals ordering tiebreaker uses updatedAt while UI shows lastContact
The query orders by desc(rank), desc(updatedAt) (dashboard.service.ts:234) but the card surfaces last touched X ago from dateLastContact. When a stage rank ties, the card with the most recent edit (rename, tag change, stage move) wins, not the most recent contact. Reps will be confused why an interest with 30-day-old lastContact sits above one with 2-day-old contact. Either order by coalesce(dateLastContact, updatedAt) or drop the "last touched" copy.
M6. Source-conversion denominator counts still-open deals, not only closed ones
getSourceConversion denominator is "every non-archived interest of that source" (dashboard.service.ts:262). For a source with 100 leads / 5 won / 0 lost / 95 still open, conversion = 5%. A source with 5 leads / 5 won shows 100%. The metric isn't wrong, but the description text "Won deals as a percentage of leads per source" implies a closed funnel; consider switching denominator to won + lost for the "true" rate, or rename the label.
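To make the two denominators in M6 concrete, here is a small sketch that computes both rates from the same counts. `conversionRates` and its field names are illustrative, not taken from `dashboard.service.ts`.

```typescript
interface SourceCounts {
  won: number;
  lost: number;
  open: number; // non-archived interests with no outcome yet
}

interface ConversionRates {
  ofAllLeads: number | null; // current behaviour: won / every lead
  ofClosedDeals: number | null; // "closed funnel" rate: won / (won + lost)
}

function conversionRates({ won, lost, open }: SourceCounts): ConversionRates {
  const total = won + lost + open;
  const closed = won + lost;
  return {
    ofAllLeads: total === 0 ? null : won / total,
    ofClosedDeals: closed === 0 ? null : won / closed,
  };
}
```

For the 100 leads / 5 won / 0 lost / 95 open example above, the first rate is 5% and the second is 100%, which is why the label needs to say which one is shown.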
Summary
3 CRITICAL bugs (hot-deals stage typos, lost-revenue mislabelled "completed", missing GROUP BY in pipeline PDF), 5 HIGH inconsistencies (active-deal definition splits 4 ways; occupancy split 2 ways; weighted vs unweighted revenue; duplicated weight defaults; silent zero-weighting), and 6 MEDIUM polish issues. The single most leveraged fix is consolidating one activeInterestsWhere() helper used by every surface, plus adding an outcome='won' filter to the revenue PDF and a GROUP BY to the pipeline PDF.
13. PDF + brand-asset correctness (pdf-auditor)
PDF + brand-asset correctness — audit
Scope: src/lib/pdf/**, src/lib/templates/{merge-fields,berth-range}.ts,
src/lib/services/{documenso-payload,brochures,berth-pdf}.service.ts,
docs/eoi-documenso-field-mapping.md, assets/eoi-template.pdf.
Severity bands: CRITICAL = customer-visible silent data loss / crash; HIGH = visible quality regression / wrong number on a customer-facing artefact; MEDIUM = polish + future-proofing.
CRITICAL
C-1. Live Documenso template still missing Berth Range field
- `src/lib/services/documenso-payload.ts:157` always emits the `Berth Range` formValue, and `formatBerthRange()` produces compact range strings for the multi-berth bundle.
- `docs/eoi-documenso-field-mapping.md:34` flags that the live template (id `8`) does not yet have the `Berth Range` field. Documenso silently ignores unknown `formValues` keys.
- Net effect: every multi-berth EOI shipped via the Documenso pathway currently renders only the primary `Berth Number`. The expanded range (e.g. `A1-A3, B5-B7`) is dropped end-to-end, with no warning on the Documenso side, so the bundle context is lost from the signed PDF.
- Same field is also addressed defensively in the in-app pathway (`src/lib/pdf/fill-eoi-form.ts:60-72`), which logs a warning, but only when the in-app template is the one being used.
- Action: Add the `Berth Range` text field to Documenso template `8` (mirror the AcroForm field name + size on the source PDF). Once added, single-berth EOIs are unaffected because `formatBerthRange` collapses a single mooring to its raw form.
C-2. tiptap→pdfme page break is wrong for letter / mixed for A4
- `src/lib/pdf/tiptap-to-pdfme.ts:51-54`: `PAGE_WIDTH_MM = 170` is correct for A4 (210 − 2×20) but is treated as the only page format. `PAGE_BREAK_THRESHOLD = 250` is hard-coded; A4 page height is 297 mm and the threshold of 250 leaves 47 mm of unused space at the bottom and ignores the real bottom margin (≈ 20 mm).
- `eoi-standard-inapp.ts:67` declares `@page { size: letter; ... }`, i.e. the seeded HTML template is Letter-sized while the serialiser is working in A4 millimetre coordinates. The template body is authored at a different page size than the engine that lays it out.
- Net effect: long custom templates either truncate (overflow into the bottom margin, content clipped by pdfme when fields run past page height) or break at the wrong vertical position. The bug is invisible in the seeded template because its content is short, but any port that edits the template to add a few clauses sees clipped output.
- Action: Make page format a per-template attribute (Letter vs A4), drive both the page width and the break threshold from it (Letter content height ≈ 254 mm, A4 ≈ 277 mm with 10 mm bottom margin), and reject HTML-template `@page size:` values that disagree with the per-template setting.
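One possible shape for the per-template page format in C-2 is sketched below. The 20 mm margin default, the helper names (`pageGeometry`, `declaredPageSize`) and the regex-based `@page` check are assumptions for illustration; the paper dimensions are the standard A4 and US Letter sizes.

```typescript
type PageFormat = 'A4' | 'Letter';

const PAGE_SIZES_MM: Record<PageFormat, { width: number; height: number }> = {
  A4: { width: 210, height: 297 },
  Letter: { width: 215.9, height: 279.4 },
};

interface PageGeometry {
  contentWidthMm: number; // page width minus left and right margins
  breakThresholdMm: number; // y position past which content moves to the next page
}

// Derive both values from the format instead of hard-coding 170 / 250.
function pageGeometry(format: PageFormat, marginMm = 20): PageGeometry {
  const { width, height } = PAGE_SIZES_MM[format];
  return {
    contentWidthMm: width - 2 * marginMm,
    breakThresholdMm: height - marginMm,
  };
}

// Extract the size keyword from an HTML template's @page rule, if any,
// so a save-time validator can compare it with the per-template setting.
function declaredPageSize(html: string): string | null {
  const match = /@page\s*\{[^}]*?size\s*:\s*([A-Za-z0-9]+)/.exec(html);
  return match ? match[1].toLowerCase() : null;
}
```

With these assumptions A4 yields a 170 mm content width and a 277 mm break threshold, so the existing width constant is preserved and only the threshold changes.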
C-3. tiptap→pdfme silently drops inline italic / underline + the whole image node
- `extractParagraphContent` (`tiptap-to-pdfme.ts:146-164`) only records `bold` and ignores `italic` and `underline` marks. The validator accepts these marks (they're not in `UNSUPPORTED_NODES`) so an admin saves a template with italics, the preview renders bold-only, and they ship the wrong artefact to a client.
- `processNode` for `image` (line 354) does `state.y += 20` and never adds a field. The serialiser reserves 20 mm of whitespace and drops the image entirely. The "Insert image" affordance in the template editor (if exposed) is non-functional today.
- The validator does NOT list the visible mark names it supports, so admins cannot reason about what's safe to use.
- Action: Either honour italic/underline via per-segment fields, or reject them at validation time the same way `blockquote` is rejected. For images, either implement the `image` pdfme schema or reject `image` nodes outright.
HIGH
H-1. No font registration → unsupported glyph silent fallback
- `src/lib/pdf/generate.ts` calls `pdfme/generator.generate({ template, inputs })` with no `options.font`. pdfme ships only Roboto by default.
- The tiptap serialiser sets `fontName: 'Helvetica' | 'Helvetica-Bold'` (`tiptap-to-pdfme.ts:205-237`). pdfme without a registered Helvetica font silently falls back to its embedded Roboto; the bold variant is also a substitution. This is invisible in dev because Roboto has full Latin + Latin-1 coverage, but non-Latin glyphs (Greek, Cyrillic, Arabic for AED-tagged clients, the `د.إ` AED symbol from `currency.ts:14`) tofu out to `□`.
- The currency dropdown advertises AED and JPY. `¥` is fine, but the Arabic `د.إ` is a glyph that Roboto does NOT cover.
- Action: Register a Unicode-coverage font (Noto Sans + Noto Sans Arabic + Noto Sans CJK) once and pass it to `generate()`. Mirror the same font on the in-app EOI when that pipeline is built. Until then, the AED currency code in `SUPPORTED_CURRENCIES` is a footgun on every PDF that renders price.
H-2. Locale inconsistency in money + date formatting
- Mixed locale strategy in the reporting + summary templates:
  - `revenue-report.ts:78,86` → `Number(...).toLocaleString(undefined, ...)` (default locale; in Node 20 inside Docker this is `en-US.UTF-8` via the standalone image's `LANG`; on the dev mac it picks the OS locale → different decimal/thousands separators server-side).
  - `pipeline-report.ts:93` → `Number(...).toLocaleString()` (default locale, no formatting opts).
  - Almost every other template hard-codes `'en-GB'` for dates.
- The interest-summary and berth-spec templates render `Price` as `${currency} ${Number(price).toLocaleString()}`. They bypass `formatCurrency()` and therefore drop the proper currency symbol formatting (`USD 45,000` instead of `$45,000.00`). `invoice-template.ts` uses `formatCurrency()` correctly; the inconsistency is a UX bug.
- Pipeline report renders "Berth Price" with no currency at all (`pipeline-report.ts:92-94`): a 45 000 figure is meaningless without it.
- Action: Route every money render in `src/lib/pdf/templates/**` through `formatCurrency()` (`src/lib/utils/currency.ts:37`), with an explicit `locale: 'en-GB'` to match the dates. Same for the reports' date stamps.
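The explicit-locale formatting recommended in H-2 can be illustrated with the standard `Intl.NumberFormat` API. This is not the existing `formatCurrency()` (its signature is not shown in this report); `formatMoney` and its `'N/A'` fallback are hypothetical stand-ins.

```typescript
// Format a monetary amount with a fixed locale so server-side output
// does not depend on the container's LANG or the developer's OS locale.
function formatMoney(
  amount: number | string,
  currency: string, // ISO 4217 code, e.g. 'EUR'
  locale = 'en-GB',
): string {
  const value = typeof amount === 'string' ? Number(amount) : amount;
  if (!Number.isFinite(value)) return 'N/A';
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(value);
}
```

Note that the symbol rendered for a given currency depends on the locale's data, so it is worth checking the output for each entry in the supported-currency list before standardising on one locale.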
H-3. Page overflow in fixed-height schemas
- Every template in `src/lib/pdf/templates/` uses fixed `position` and `height` slots:
  - `client-summary-template.ts` reserves 80 mm for the interests list (line 51) and 60 mm for recent activity (line 60). pdfme truncates text that exceeds the slot height; there is no "overflow → next page" mechanism in the template definition.
  - `interest-summary-template.ts:65-69` reserves 85 mm for the timeline; with 30 events at 8 pt that's ~3 lines/event = clipped after ~10 events.
  - `activity-report.ts:46` reserves 120 mm for `activityDetails`, and the data layer slices to `data.logs.slice(0, 30)` (line 70). The slice masks the bug, but if the report layer sends more logs the bottom rows are clipped.
  - `pipeline-report.ts:38-50` allocates 100 mm for summary and 100 mm for details; both can spill on ports with many stages and many top interests.
- pdfme's failure mode is silent clipping, not visible truncation with `…` or a "continued on next page" marker.
- Action: Either move large lists onto multi-page schemas (push fields onto subsequent `schemas[i]`) or add explicit pagination inside `build*Inputs` with a deterministic "showing N of M" tail.
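The "showing N of M" tail from the H-3 action could be produced by a small helper inside the input builders. `clampLines` is a hypothetical name, and the line budget per slot is something each template would have to supply from its own slot height and font size.

```typescript
// Keep a list within a fixed line budget. When it does not fit, the last
// line is replaced with an explicit marker instead of being clipped silently.
function clampLines(lines: string[], maxLines: number): string[] {
  if (lines.length <= maxLines) return lines;
  const shown = Math.max(0, maxLines - 1); // reserve one line for the marker
  return [...lines.slice(0, shown), `Showing ${shown} of ${lines.length}`];
}
```

The marker makes truncation visible on the page, which is the property pdfme's own clipping lacks; moving overflow onto additional schema pages remains the more complete fix.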
H-4. Numeric/date inputs pass undefined/null through new Date()
- `invoice-template.ts:117` renders `Due: ${invoice.dueDate}` raw. When `dueDate` is null the field reads `Due: null`. Other templates use `formatDate()`-style helpers that return `'N/A'`, but the invoice template doesn't.
- `client-summary-template.ts:97,143` and `interest-summary-template.ts` call `new Date(client.createdAt as string | Date)` without guarding against `undefined`. `new Date(undefined)` yields `Invalid Date` whose `toLocaleDateString` returns `'Invalid Date'`; that string ends up in the PDF.
- Action: Add a single `formatDate(value, fallback='—')` helper in `src/lib/utils/date.ts`, reuse across all templates; the existing private one in `interest-summary-template.ts:83-86` should be hoisted.
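A minimal sketch of the shared helper described in the H-4 action, assuming `en-GB` output to match the other templates. Pinning `timeZone: 'UTC'` is an extra assumption made here so the rendered day does not depend on the server's time zone; the real helper may want a per-port zone instead.

```typescript
type DateInput = string | number | Date | null | undefined;

// Return a formatted date, or the fallback for null, undefined, empty
// or unparseable input, so "null" and "Invalid Date" never reach a PDF.
function formatDate(value: DateInput, fallback = '—'): string {
  if (value === null || value === undefined || value === '') return fallback;
  const date = value instanceof Date ? value : new Date(value);
  if (Number.isNaN(date.getTime())) return fallback;
  return date.toLocaleDateString('en-GB', { timeZone: 'UTC' });
}
```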
MEDIUM
M-1. No accessibility / tagged PDF output
- All PDFs we produce are untagged (pdfme uses raw pdf-lib under the hood; `generate.ts` does not call `setTitle`, `setLanguage`, `setProducer`, or anything to enable `StructTreeRoot`).
- WCAG 2.1 7.1 / PDF/UA-1 compliance is unmet. For a port that contracts with a public-sector tenant or runs accessibility reviews on outbound EOIs, this is a procurement blocker.
- The in-app EOI HTML template has zero `aria-*` attributes and a table-based layout (`eoi-standard-inapp.ts:184-209`).
- Action: At minimum set `Title`, `Author`, `Subject`, `Lang=en-GB` metadata in the in-app EOI fill path (`fill-eoi-form.ts`); pdf-lib supports `doc.setTitle()` etc. without adding accessibility tags. Track tagged-PDF / PDF/UA as a follow-up item.
M-2. EOI in-app source PDF — silent field-name drift
- `fill-eoi-form.ts:42-50` swallows every `getTextField()` / `getCheckBox()` exception so a re-cut template whose AcroForm field names changed (e.g. `Berth Number` → `Berth_Number`) will produce a "successful" PDF with empty fields. Only `Berth Range` is special-cased to log when missing.
- Action: Promote the silent-skip pattern to also log a warning per missing field (already done correctly for the new `Berth Range` field; apply the same treatment to `Name`, `Email`, `Address`, `Yacht Name`, `Length`, `Width`, `Draft`, `Berth Number`, `Lease_10`, `Purchase`). Without it, the only way to notice a corrupted template is QA on a signed PDF.
M-3. Form not flattened → signer can edit pre-filled fields
- `fill-eoi-form.ts:124` saves the doc unflattened. The comment on line 94 explicitly justifies this ("recipient can still tweak fields if needed before signing"). For an EOI/LOI this is risky: the signer can edit the address, yacht dimensions, or berth number after the fact, and the unflattened PDF carries the edits without the developer/approver re-acknowledging.
- Documenso pathway is fine (Documenso flattens server-side before producing the signed artefact), but the in-app pathway emits the raw filled AcroForm to the storage backend as-is.
- Action: Flatten the AcroForm (`form.flatten()` before `doc.save()`) for the in-app pathway, OR mark the relevant fields as read-only via `field.enableReadOnly()`. The "tweak before signing" justification belongs to a draft preview, not the production artefact.
M-4. formatBerthRange warning is noisy at warn-level
- `berth-range.ts:64` logs `WARN` per non-canonical mooring. The CLAUDE.md mooring spec (`^[A-Z]+\d+$`) was data-normalised in Phase 0, but historical archived rows + the `(deleted)` / `(archived)` suffix scheme on entity folders can leak into the bundle. Every multi-berth EOI containing a legacy mooring spins a stack of warnings.
- Action: Downgrade the per-mooring warning to `debug`; emit a single `warn` summary per `formatBerthRange()` call when the passthrough list is non-empty.
M-5. berth-spec-template.ts waitingList truncation
- 50 mm × 8 pt ≈ 12 lines (`berth-spec-template.ts:67-70`); the waiting-list join key is `position` ordered 1..N and there is no data-side cap. Ports with > 12 waitlisted clients silently lose the tail of the list on the spec PDF.
- Same shape problem as H-3 but lower impact (berth-spec is internal).
M-6. assets/eoi-template.pdf — opacity / single source
- The whole in-app pathway depends on a single committed binary at `assets/eoi-template.pdf`. There is no sha256 pinned in `assets/README.md`, no script that regenerates it from a known good source, and the AcroForm field shape is documented only in the mapping doc + the JSDoc of `loadEoiTemplatePdf`. A swap of this file by anyone with repo access changes legal output silently.
- Action: Add `EXPECTED_SHA256` to `assets/README.md` + a startup-time check (or test) that the source PDF's sha matches before falling back to `EOI_TEMPLATE_PDF_PATH`. Same applies to any shipped brochure default.
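The hash check in the M-6 action needs only Node's built-in `node:crypto`. The sketch below is an assumption about shape: `sha256Hex` and `assertTemplateIntegrity` are hypothetical names, and where the expected hash is stored and when the check runs are left to the implementer.

```typescript
import { createHash } from 'node:crypto';

// Hex-encoded SHA-256 of a byte buffer.
function sha256Hex(bytes: Uint8Array): string {
  return createHash('sha256').update(bytes).digest('hex');
}

// Throw if the loaded template does not match the pinned hash, so a swapped
// binary fails loudly at startup or in a test rather than changing output.
function assertTemplateIntegrity(bytes: Uint8Array, expectedSha256: string): void {
  const actual = sha256Hex(bytes);
  if (actual !== expectedSha256.trim().toLowerCase()) {
    throw new Error(`EOI template hash mismatch: expected ${expectedSha256}, got ${actual}`);
  }
}
```

In practice the bytes would come from reading `assets/eoi-template.pdf` once at load time, with the pinned value kept next to the asset so a legitimate template re-cut updates both in the same commit.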
M-7. Reports / pdfme schemas — no portName brand asset
- Every report template hard-codes `'Port Nimara'` as the fallback in `build*Inputs`. The CRM is multi-tenant; an admin generating a report for a different port falls back to the wrong brand if the port lookup fails (e.g. report job runs without a hydrated port). Default should be the empty string or `'(port)'`, not a competitor port's brand.
M-8. Brochures + per-berth PDF — no upload-time render audit
- These are user-uploaded PDFs, not engine-rendered, so the template-quality items above don't apply. The relevant integrity controls (magic-byte check, sha256, size cap, version snapshot) are in place in `berth-pdf.service.ts:217-264` and the brochure upload flow. No findings for these two flows.
Summary
| # | sev | file | item |
|---|---|---|---|
| C-1 | CRIT | Documenso template (live) | `Berth Range` field missing; multi-berth ranges dropped end-to-end |
| C-2 | CRIT | `tiptap-to-pdfme.ts` | A4 vs Letter page mismatch + hard-coded 250 mm break threshold |
| C-3 | CRIT | `tiptap-to-pdfme.ts` | italic/underline marks and image nodes silently dropped |
| H-1 | HIGH | `generate.ts` + tiptap serialiser | no font registration → AED/JP/Greek/Cyrillic glyphs missing |
| H-2 | HIGH | reports + summaries | locale-default `toLocaleString` server-side + currency bypass |
| H-3 | HIGH | every pdfme template | fixed-height slots clip overflow with no pagination |
| H-4 | HIGH | `invoice-template.ts` + summaries | raw null/undefined date passthrough renders "Invalid Date" |
| M-1 | MED | all PDFs | no tagged-PDF / PDF/UA metadata |
| M-2 | MED | `fill-eoi-form.ts` | silent field-name drift in source EOI PDF |
| M-3 | MED | `fill-eoi-form.ts` | in-app EOI ships unflattened AcroForm |
| M-4 | MED | `berth-range.ts` | noisy per-mooring warn log |
| M-5 | MED | `berth-spec-template.ts` | waitingList overflow |
| M-6 | MED | `assets/eoi-template.pdf` | no sha pinning of source binary |
| M-7 | MED | report templates | wrong-port fallback brand 'Port Nimara' |
| M-8 | MED | brochure + per-berth uploads | no issues; upload integrity controls in place |
Approx word count: ~1380.
15. Customer-facing copy + terminology audit (copy-auditor)
Task #15 – Customer-facing copy + terminology audit
Scope: CRM (src/components, src/app/(dashboard)), client portal (src/app/(portal), src/components/portal), branded email templates (src/lib/email/templates), PDF templates (src/lib/pdf/templates), public marketing site (website/). Read-only audit; no edits.
CRITICAL
C1. Four interchangeable nouns for the same domain entity
The same record is called interest, lead, prospect, and deal across surfaces. Sales reps and clients see all four within a single session.
- Entity / schema / URL: `interest` (everywhere: DB, `/interests`, portal nav, page titles).
- "Lead":
  - `src/components/clients/client-interests-tab.tsx:30` `LEAD_CATEGORY_LABELS = { hot_lead: 'Hot lead', … }` and the column header literally rendered as `<dt>Lead</dt>`.
  - `src/components/interests/interest-tabs.tsx:~736` section heading `<h3>Lead</h3>` + `<EditableRow label="Lead Category">`.
  - `src/components/berths/berth-interests-tab.tsx:44` `hot_lead: 'Hot Lead'` (Title Case mismatch with sibling above).
  - `src/components/dashboard/lead-source-chart.tsx` + `source-conversion-chart.tsx` widget title "Lead Source Attribution".
- "Prospect":
  - `src/components/berths/berth-detail-header.tsx:~275` form label `Linked prospect (optional)` + helper `Link this status change to the prospect (interest) it relates to.`, which explicitly parenthesises the canonical name as a synonym.
  - Residential uses `prospect` as a stage value (`Prospect` chip in `residential-clients-list.tsx`, residential-client tabs), which is confusing because elsewhere "prospect" means the record itself.
- "Deal":
  - `src/components/berths/berth-tabs.tsx` tab label `Deal Documents`, API path `/api/v1/berths/[id]/deal-documents`.
  - `src/components/clients/bulk-archive-wizard.tsx` placeholder `Why are you archiving this late-stage deal?` and `smart-archive-dialog.tsx` heading `Late-stage deal — confirmation required`.
  - `src/components/dashboard/hot-deals-card.tsx`, widget label "Hot deals".
  - Pervasive in code comments inside `interest-tabs.tsx`, `inline-stage-picker.tsx`; comments will leak into future copy.
Recommendation: pick one client-facing noun (the domain choice is interest; "deal" is fine as marketing/internal shorthand for hot interests but should never appear in fields/labels). Rename Deal Documents → Interest Documents, "Linked prospect" → "Linked interest", <dt>Lead</dt> / <h3>Lead</h3> → Buyer profile or Category. Residential prospect is a stage so leave alone but consider renaming to enquiry or new to free up the word.
C2. Raw machine status strings leak to the client portal
`src/app/(portal)/portal/interests/page.tsx:80` renders `<span>EOI: {interest.eoiStatus.replace(/_/g, ' ')}</span>` and line 65 uses the same pattern for `leadCategory`. Clients see `EOI: waiting for signatures`, `EOI: partially signed`, `hot lead`, etc. The underscores are stripped but the enum vocabulary is not translated. "hot lead" exposed to the client is also a privacy/optics issue (we are telling the prospect we classified them).
Fix: add a PORTAL_EOI_STATUS_LABEL map (e.g. waiting_for_signatures → "Awaiting your signature", signed → "Signed"); never render leadCategory in the portal at all.
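A possible shape for the portal-side map, using only the two labels suggested above. The `'In progress'` fallback for unmapped values and the helper name `portalEoiStatusLabel` are assumptions; the point is that an unknown enum value degrades to neutral copy instead of leaking the raw string.

```typescript
// Client-facing labels. Only values with agreed copy are listed.
const PORTAL_EOI_STATUS_LABEL = new Map<string, string>([
  ['waiting_for_signatures', 'Awaiting your signature'],
  ['signed', 'Signed'],
]);

// Returns null when there is no status to show, a mapped label when one
// exists, and neutral fallback copy for any value without agreed wording.
function portalEoiStatusLabel(status: string | null | undefined): string | null {
  if (!status) return null;
  return PORTAL_EOI_STATUS_LABEL.get(status) ?? 'In progress';
}
```

`leadCategory` deliberately has no equivalent map here, in line with the recommendation never to render it in the portal.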
C3. Signing-status labels diverge across three surfaces
For the same enum (draft | sent | partially_signed | completed | expired | cancelled):
| Surface | Label set |
|---|---|
| `interest-eoi-tab.tsx` / `interest-contract-tab.tsx` / `interest-reservation-tab.tsx` | Draft / Awaiting signatures / Partially signed / Signed / Expired / Cancelled |
| `documents-hub.tsx` `STATUS_PILL_MAP` + `document-list.tsx` | Renders raw enum (`<Badge>{doc.status}</Badge>`): `sent`, `partially_signed`, `completed` displayed verbatim |
| `signing-progress.tsx` `STATUS_LABELS` | Only Pending / Signed / Declined; missing Sent, Expired, Cancelled |
| `notification-digest.ts` email | `eoi_signed: 'EOI signed'`, `eoi_completed: 'EOI completed'`: "signed" vs "completed" used as if different events |
| Realtime toast (`realtime-toasts.tsx`) | `EOI fully signed` (yet another phrase) |
A user clicks a status pill in Documents Hub (shows partially_signed), opens the interest EOI tab (shows Partially signed), gets a toast that says EOI fully signed, and an email that says EOI completed — four phrasings for one document. Centralise via src/lib/labels/document-status.ts (already a pattern for seed-data etc.) and import everywhere.
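The centralised module could be as small as the sketch below. The enum values come from the list above and the labels from the interest-tab row of the table; pairing them positionally (for example `sent` with "Awaiting signatures" and `completed` with "Signed") is an assumption that should be checked against the tab components.

```typescript
type DocumentStatus =
  | 'draft'
  | 'sent'
  | 'partially_signed'
  | 'completed'
  | 'expired'
  | 'cancelled';

// Typing the map as Record<DocumentStatus, string> makes the compiler
// flag any status that is added to the union without a label.
const DOCUMENT_STATUS_LABELS: Record<DocumentStatus, string> = {
  draft: 'Draft',
  sent: 'Awaiting signatures',
  partially_signed: 'Partially signed',
  completed: 'Signed',
  expired: 'Expired',
  cancelled: 'Cancelled',
};

function documentStatusLabel(status: DocumentStatus): string {
  return DOCUMENT_STATUS_LABELS[status];
}
```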
HIGH
H1. "Save" button verbiage has six forms
Inventory of submit buttons across src/components/:
- `Save`: inline editors, addresses-editor, contacts-editor, inline-phone-field, settings-manager, image-cropper.
- `Save Changes` (Title Case): client-form, yacht-form, berth-form, expense-form, company-form, interest-form, reminder-form, role-form, tag-form, custom-field-form, port-form, webhook-form, template-form.
- `Save changes` (sentence case): admin/users/user-form, interest-contact-log-tab.
- `Save profile`, `Save username`, `Save preferences`, `Save overrides`, `Save template`, `Save view`: descriptive variants.
- `Saving...` (ASCII three dots) vs `Saving…` (single ellipsis char): both appear, ~50/50 split.
Decide: sentence case (Save changes) and standardise the loader as Saving… (Unicode ellipsis matches Prettier-friendly UTF-8 elsewhere in the codebase). The same form-pattern with a different casing in adjacent admin sections (user-form Save changes vs role-form Save Changes) is a likely playwright-visual diff source too.
H2. "New X" vs "Create X" mismatch on the same surface
Empty-state CTA and form submit button often disagree:
- Clients list action: `label: 'New Client'` → opens sheet titled `New Client` → submit button `Create Client`.
- Same pattern for Yacht, Expense, Company, Role, Tag, Template, Webhook, Port.
- `aria-label="New interest"` on `interest-list.tsx`, button text `Create Interest`.
Pick one verb per action lifecycle (New … for affordances, Create … for confirm, OR unify to a single Create … throughout). The current pattern teaches the user two words for the same action.
H3. Public marketing form CTAs are all Submit
All five website forms (website/components/pn/specific/website/{form,contact,berths-item,supplement-eoi,register,news-item}/form.vue) use a bare Submit button. The matching confirmation email subjects say "Thank You for Your Interest" and the PDF title is "Expression of Interest". The CTA doesn't mention what the user is submitting.
Recommendation: replace with action-specific verbs — Register interest, Send enquiry, Request a call back (already used as helper text above the Submit on berths-item/form.vue). Loading state Submitting form... is also redundant — Sending… is shorter and matches the CRM email send loader.
H4. EOI vs Expression of Interest abbreviation discipline
Both forms appear, but the split is currently inverse to what client-facing surfaces should do:
- Client-facing surfaces (portal `/portal/documents` page, email template `documentLabel`, website pages, PDF body, Documenso template-form select option) correctly spell out Expression of Interest.
- But the portal interests page (`/portal/interests`) and the portal documents page header both say `EOIs` alongside `Expression of Interest` (element styled `text-sm text-gray-500 mt-1`: "Your contracts, EOIs, and signed agreements").
- Realtime toast to staff says `EOI fully signed`. That is fine for staff, but does the same toast also fire for portal users if they have a session? (Worth verifying; if so, full form needed.)
- PDF body (`eoi-standard-inapp.ts:177`) introduces the abbreviation correctly: This Expression of Interest (the "EOI"). Other PDFs (`interest-summary-template.ts`) use raw `EOI status: …` without that introduction.
Rule of thumb: portal/email/marketing → full form; CRM internal UI → EOI. Audit the portal pages to remove all EOI mentions.
H5. Email greeting + sign-off tone drift
Across src/lib/email/templates/:
- Greeting: `Dear {name},` (portal-auth, crm-invite, inquiry-client-confirmation, residential-inquiry, document-signing in all three modes), `Hello {name},` (admin-email-change), `Hi {name},` (notification-digest), `Welcome,` (fallback in crm-invite/portal-auth), `Dear Administrator,` (inquiry-sales-notification).
- Sign-off: `Best regards,` (inquiry-client, residential-inquiry), `Thanks,` (admin-email-change), `Thank you,` (document-signing), `The {portName} team` (most), `{senderName}` (signing-invitation when provided).
Pattern: client-touching emails should land on one greeting (Dear {name},) and one sign-off (Best regards, / The {portName} team). The casual Hi {name}, on the notification-digest is fine because it's internal to staff, but Hello on admin-email-change is just a third style for the same internal audience.
MEDIUM
M1. "Signing envelope" jargon leaks into a user-facing dialog
smart-archive-dialog.tsx exposes:
- `<option value="leave">Leave envelope pending</option>`
- `<option value="void_documenso">Void the signing envelope</option>`
"Envelope" is Documenso/DocuSign internal vocabulary. Replace with Leave signing request pending / Cancel the signing request. The Documenso admin page is OK to keep envelope (dev-facing settings).
M2. Override / Confirm overloaded action verb
interest-stage-picker.tsx:179 shows {overrideEffective ? 'Override stage' : 'Confirm'}. The non-override label is too generic; users land on a stage-change dialog and the primary button just says "Confirm". Suggest Move to {stage} (parameterised) or Update stage.
M3. Loading state punctuation inconsistency
Saving... (ASCII) vs Saving… (Unicode …). Easy global codemod; matters for Playwright visual diffs and for screen-reader pronunciation (three dots gets read out as "dot dot dot").
M4. Reminder/alert verb spread
Acknowledge / Dismiss / Mark complete / Resolve (audit log) — four near-synonyms for "I dealt with it". Reminders use Mark complete, Alerts use Acknowledge + Dismiss. Acceptable if the semantics differ (acknowledge = seen, complete = done) but the current copy doesn't make that distinction clear.
M5. "Hot Lead" / "Hot lead" casing within the same domain
- `client-interests-tab.tsx` and `interest-card.tsx`: `Hot lead`.
- `berth-interests-tab.tsx` and `interest-filters.tsx`: `Hot Lead`.
- `dashboard/hot-deals-card.tsx`: `EOI Signed`, `EOI Sent` (Title Case).
- General CRM trend is sentence case; Title Case in these three files is the outlier.
Suggested follow-ups
- Add `src/lib/labels/document-status.ts`; refactor `documents-hub`, `document-list`, `interest-{eoi,contract,reservation}-tab`, `signing-progress`, `notification-digest` to import it. (C3)
- Portal: never render `eoiStatus` / `leadCategory` raw; map first. (C2)
- Rename `Deal Documents` tab + `/deal-documents` route to `Documents`. (C1)
- Codemod `Save Changes` → `Save changes`, `Saving...` → `Saving…`, unify `New X` vs `Create X`. (H1, H2, M3)
- Website: replace bare `Submit` on five forms with action-specific verbs. (H3)
- Portal/email/PDF: drop bare `EOI` abbreviation in favour of `Expression of Interest`. (H4)
- Standardise email greeting/sign-off pair per audience tier. (H5)
- Replace `envelope` jargon in `smart-archive-dialog.tsx`. (M1)
Verified clean: Inquiry spelling consistent (American); crm-invite.ts use of "CRM" is staff-only and intentional; reports PDFs only use enum strings internally.
16. Dependency + supply-chain hygiene audit (deps-auditor)
Dependency + Supply-Chain Hygiene Audit
Repo: new-pn-crm @ feat/documents-folders · Date: 2026-05-12 · Auditor: task #16
Inputs: pnpm audit, pnpm outdated, pnpm licenses list [--prod], pnpm why,
pnpm install --frozen-lockfile, package.json, pnpm-lock.yaml, Dockerfile*.
Headline: No known CVEs (pnpm audit → 0 across info/low/moderate/high/critical),
no GPL/AGPL anywhere in the tree, lockfile is intact and reproducible.
Real risk concentrates in two places: a Node 20 base image at/past EOL, and
a @types/node major-version mismatch that lets the type-checker greenlight
runtime APIs that don't exist in Node 20. Everything else is incremental.
CRITICAL
C1 — @types/node@^25.6.2 against Node 20 runtime
- What: `package.json` line 111 pins `@types/node` to `^25.6.2`; resolved version is `25.6.2`. Every Dockerfile and the esbuild target (`--target=node20`) ships Node 20.
- Why it bites: Node 25 is the Current release line; it includes APIs added after Node 20 (e.g. recent `node:sqlite` evolution, `node:test` additions, `process.permission`, newer `fs.glob` shapes, updated `Web*` globals). TypeScript cannot tell you you've called something that won't exist on the runtime: the build passes, the prod worker crashes at first call.
- Severity: CRITICAL. Silent landmine, no compile warning, no audit signal.
- Fix: Downgrade to `@types/node@^20`. If you genuinely want to consume Node-22 APIs, also bump the base images to `node:22-alpine` (see C2) and `--target=node22`.
C2 — Node 20 LTS at end-of-life
- What: All three Dockerfiles (`Dockerfile`, `Dockerfile.dev`, `Dockerfile.worker`) use `FROM node:20-alpine` (no minor pin). Node 20 entered Maintenance LTS in Oct 2024 and reached EOL on 2026-04-30, i.e. ~2 weeks before today (2026-05-12). The image will still build, but Node 20 no longer receives security patches from upstream. Alpine's package security advisories will continue for OS libs only.
- Severity: CRITICAL. The base image is the largest surface in the SBOM and is now unpatched against new V8/Node CVEs.
- Fix: Move to `node:22-alpine` (Maintenance LTS, supported through Apr 2027). Pin the minor digest (`node:22.11-alpine@sha256:…`) for reproducibility. Bump esbuild `--target=node22` in `build:server` / `build:worker` scripts. No app-code change expected; the codebase already uses ESM-native idioms.
HIGH
H1 — @types/pdfkit mis-classified as a runtime dependency
- What:
package.jsonline 62 puts@types/pdfkit@^0.17.6underdependenciesalongsidepdfkit. Type packages are compile-time only. - Impact: Slightly bloats prod
node_modulesand the Docker prod image; more importantly it's a classification smell — anyone reasoning about supply-chain surface will look atdependenciesand assume it's executed. - Fix: Move to
devDependenciesalongside the other@types/*.
H2 — Deprecated transitive: glob@10.5.0
- What: `pnpm-lock.yaml` carries `glob@10.5.0` with the upstream notice "Old versions of glob are not supported and contain widely publicized security vulnerabilities … please update." `pnpm why glob` traces it to `archiver-utils@5.0.2 ← archiver@7.0.1` (a direct dependency, used by GDPR exports per CLAUDE.md).
- Impact: `glob` < 11 has known prototype-pollution-class issues. `pnpm audit` doesn't flag them because the advisories require an exploitable callpath, but the deprecation notice is the upstream signal to upgrade.
- Fix: Bump `archiver` to `^8.0.0` (already shown in `pnpm outdated`). Archiver 8 pulls a current `glob` and the API is source-compatible for the way `src/lib/services/gdpr-export.service.ts` uses it. Verify with the GDPR export Playwright case after upgrade.
H3 — Deprecated transitive: @esbuild-kit/{core-utils,esm-loader}
- What: Both are marked "Merged into tsx: https://tsx.is" by upstream. They come from `drizzle-kit@0.31.10` and `better-auth@1.6.9`, so they are not directly fixable here.
- Severity: HIGH (visibility only; no known exploit). The packages still function but receive no upstream maintenance.
- Fix: Track `drizzle-kit` and `better-auth` releases; both maintainers have open PRs migrating to bare `tsx`. No local change today; file as a watch item.
H4 — pnpm.overrides uses floating ranges
- What: `package.json` `pnpm.overrides`:
  - `vite: "8.0.5"` (pinned ✓)
  - `esbuild: ">=0.25.0"` (floating ✗)
  - `postcss: ">=8.5.10"` (floating ✗)
- Impact: The `>=` overrides re-resolve on every `pnpm install --no-lockfile` / `pnpm update`. They were added as CVE-fix safety nets, but their floating shape defeats the lockfile's reproducibility guarantee on the very transitives that prompted the override in the first place.
- Fix: Replace `">=0.25.0"` / `">=8.5.10"` with the actual resolved versions (currently `0.27.7` and `8.5.14`), or use exact pins. Re-evaluate whenever you bump esbuild/postcss.
MEDIUM
M1 — Major-version upgrades available
Captured from pnpm outdated (today vs latest):
| Package | Current | Latest | Risk |
|---|---|---|---|
| next | 15.5.18 | 16.2.6 | App Router breaking changes in 16; defer until React/Next stabilise together |
| eslint + eslint-config-next | 9 / 15 | 10 / 16 | Lint-only; do alongside Next 16 |
| zod | 3.25.76 | 4.4.3 | Wide blast radius: every src/lib/validators/*.ts + createTemplateSchema VALID_MERGE_TOKENS allow-list logic. Plan as its own task |
| tailwindcss | 3.4.19 | 4.3.0 | Config migration (Tailwind 4 = Lightning CSS); schedule with design tokens work |
| @hookform/resolvers | 3.10.0 | 5.2.2 | API change for zod resolver; pair with zod 4 |
| react-day-picker | 9 | 10 | Verify in calendar/date pickers |
| archiver | 7.0.1 | 8.0.0 | Clears H2; do first, it's narrow |
| esbuild (dev) | 0.27.7 | 0.28.0 | Patch-y; trivial |
Minor upgrades (bullmq, better-auth, @tanstack/react-query, vitest,
@playwright/test, @types/node patch, libphonenumber-js, tailwind-merge,
lint-staged, react-grab) are all single-digit-bumps, no risk.
M2 — dotenv lives in devDependencies but is imported by production-runnable scripts
- What: dotenv is in devDependencies (line 117) but is imported by scripts/backfill-document-folders.ts (documented in CLAUDE.md as a deploy step), scripts/import-berths-from-nocodb.ts, scripts/db-reset.ts, src/lib/db/seed.ts, etc.
- Impact: Anyone who runs pnpm install --prod and then pnpm db:backfill:doc-folders (a documented deploy command) fails at module resolution. Not a problem today because deploy runs pnpm install without --prod for those steps, but the contract is implicit.
- Fix: Either (a) move dotenv to dependencies, or (b) document in CLAUDE.md that the backfill must be run from a full-deps image / dev workstation. (a) is the smaller foot-gun.
M3 — node:20-alpine is unpinned (floats on minor + digest)
- What: No minor or digest pin on the FROM lines.
- Impact: Two builds an hour apart can land on different base layers; SBOM drifts without code changes; upstream CVE-fix bumps reach prod unnoticed (mostly a good thing, but it has caught me out in audits).
- Fix: Use node:22.11-alpine@sha256:<digest> once you move to 22. Re-pin monthly as part of dependency hygiene.
M4 — No engines field in package.json
- What: package.json has no engines.node / engines.pnpm. The packageManager field pins pnpm to 10.33.2, but Node is implicit.
- Impact: pnpm install doesn't enforce the runtime; a contributor on Node 18 will install successfully and only fail later. CI hides this because Docker is the source of truth.
- Fix: Add "engines": { "node": ">=22 <23", "pnpm": ">=10" } and turn on engine-strict=true in .npmrc if you want hard enforcement.
LICENSE AUDIT (prod tree)
No GPL or AGPL anywhere. Non-permissive licenses found:
| Package | License | Disposition |
|---|---|---|
| @img/sharp-libvips-darwin-arm64 (and other arches) | LGPL-3.0-or-later | OK: dynamic link, native binding; LGPL §4 (Combined Works) covers this use |
| dompurify | MPL-2.0 OR Apache-2.0 | OK: dual; you may rely on Apache-2.0 |
| @zone-eu/mailsplit (transitive of mailparser) | MIT OR EUPL-1.1+ | OK: dual; MIT chosen |
| caniuse-lite | CC-BY-4.0 | OK: data only, attribution satisfied by upstream notices |
| postgres (driver) | Unlicense | OK: public-domain-style |
| axe-core (dev only) | MPL-2.0 | OK: dev/test, not redistributed |
| lightningcss, lightningcss-darwin-arm64 (dev/build) | MPL-2.0 | OK: build-time, MPL is file-scoped |
| tslib | 0BSD | OK |
No UNLICENSED / "Custom" / SSPL packages.
LOCKFILE + AUDIT INTEGRITY
- pnpm audit → No known vulnerabilities. pnpm audit --json metadata shows 989 deps, 0 vulns, 0 dev (because the audit metadata reports dependencies after filtering; clean).
- pnpm install --frozen-lockfile → "Lockfile is up to date", no warnings, no peer-dep Unmet / Conflict lines. Husky prepare hook ran clean.
- pnpm-lock.yaml has 139 peerDependencies: entries, all satisfied (no peer:missing markers in the resolved graph).
- No "phantom" deps detected; only the two deprecated chains in H2/H3.
RECOMMENDED FIX ORDER
1. C1 + C2 together: bump base image to node:22.11-alpine, drop @types/node to ^22, esbuild --target=node22. Smoke test build + worker.
2. H1: move @types/pdfkit to devDependencies (1-line PR).
3. H2: archiver@^8.0.0; run GDPR-export Playwright case.
4. H4: replace floating overrides with exact pins.
5. M2: promote dotenv to dependencies, OR document deploy contract.
6. M4: add engines field.
7. M1 majors: schedule one-at-a-time, starting with archiver (done in #3) and esbuild. Next 16 / Zod 4 / Tailwind 4 are each their own project.
Total touch points: ~6 single-line PRs + 3 scheduled major-bump tracks.
17. Build + deploy + prod readiness audit (build-auditor)
Audit #17 — Build + Deploy + Prod Readiness
Scope: Dockerfile, Dockerfile.dev, Dockerfile.worker, docker-compose.yml, docker-compose.prod.yml, docker-compose.dev.yml, next.config.ts, src/lib/env.ts, .env.example, plus the entry points src/server.ts / src/worker.ts and health endpoints.
Branch at audit time: feat/documents-folders.
CRITICAL
C1 — No .dockerignore in repo root
cat .dockerignore returns no such file. Build context at audit time:
- node_modules = 4.9 GB
- .next = 2.7 GB
- .git = 41 MB
- Plus storage/, playwright-report/, test-results/, tests/, scripts/, screenshots, .env*.
Every docker build ships ~7.6 GB to the daemon. Worse, Dockerfile.dev and the builder stage of Dockerfile / Dockerfile.worker all do COPY . ., which means:
- .env, .env.local, .env.dev (if present) end up in build-layer history of the builder stage. The runner stage doesn't re-copy them, but intermediate layers are cacheable and pushable; a careless --target builder push leaks secrets.
- Local node_modules (built on macOS) get shipped to an Alpine builder and then ignored: silent waste, and node_modules/sharp darwin binaries collide with the musl install.
- Test snapshots / fixtures get baked into the trace.
Fix: add .dockerignore covering at minimum: node_modules, .next, .git, .env*, dist, storage, playwright-report, test-results, tests, coverage, *.log, .DS_Store, .vscode, .idea, .husky, docker-compose.*.yml (not needed inside the image).
C2 — EMAIL_REDIRECT_TO has no production refusal guard
CLAUDE.md is explicit: "must be unset in production". The Zod schema in src/lib/env.ts:41 accepts it unconditionally as z.string().email().optional(). If a staging .env leaks into a prod deploy (a very common ops mistake with the current env_file: .env setup; see M4), every outbound client email and EOI signing invite silently routes to the staging mailbox, webhook deliveries are suppressed, and the production user sees… nothing.
Evidence of the blast radius — EMAIL_REDIRECT_TO short-circuits:
- src/lib/email/index.ts:131 (all SMTP recipients rewritten).
- src/lib/services/documenso-client.ts:118-180 (all Documenso recipient lists + template formValues overridden).
- src/lib/queue/workers/webhooks.ts:94-107 (webhook deliveries fully suppressed).
Fix: add a superRefine (or schema-level cross-check) that hard-fails when NODE_ENV === 'production' && EMAIL_REDIRECT_TO is set. Belt-and-braces: log a logger.fatal and process.exit(1) from src/lib/email/index.ts boot if the condition is reached.
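A minimal sketch of the refusal guard, written without dependencies so it runs on its own. The names EnvLike, checkEmailRedirect and assertNoEmailRedirectInProd are hypothetical; in the real schema the same condition would presumably live inside the superRefine on the env object.

```typescript
// Hypothetical shape: only the two variables relevant to this check.
interface EnvLike {
  NODE_ENV?: string;
  EMAIL_REDIRECT_TO?: string;
}

// Returns a list of problems instead of throwing, so the caller can either
// add schema issues or log-and-exit.
function checkEmailRedirect(env: EnvLike): string[] {
  const issues: string[] = [];
  const redirect = env.EMAIL_REDIRECT_TO?.trim();
  if (env.NODE_ENV === "production" && redirect) {
    issues.push("EMAIL_REDIRECT_TO must be unset when NODE_ENV is production");
  }
  return issues;
}

// Boot-time wrapper: throw so the process cannot start half-configured.
function assertNoEmailRedirectInProd(env: EnvLike): void {
  const issues = checkEmailRedirect(env);
  if (issues.length > 0) {
    throw new Error(issues.join("; "));
  }
}
```

Returning issues from the pure check keeps it usable from both a schema refinement and the belt-and-braces boot check in the email module.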
C3 — Custom server depends on socket.io that may not be in the standalone trace
Dockerfile runner stage copies only .next/standalone/, .next/static/, public/, and dist/server.js (renamed server-custom.js). There is no separate pnpm install --prod in the runner — every runtime dep must arrive via Next's output: 'standalone' tracer.
src/server.ts imports @/lib/socket/server, which import { Server } from 'socket.io' and @socket.io/redis-adapter. esbuild bundles server.ts with --packages=external, so at runtime server-custom.js does require('socket.io') against /app/node_modules. Neither socket.io nor @socket.io/redis-adapter is in next.config.ts:66 serverExternalPackages, and no Next route ever imports @/lib/socket/server (the socket server is only instantiated by the custom entry point), so the Next tracer has no reason to include them in .next/standalone/node_modules.
If this has been working in prod it's only because the packages get pulled in transitively via something Next does see. The dependency is invisible to the build system — a Next minor upgrade could drop them from the trace tomorrow.
Fix: add both to serverExternalPackages, and extend outputFileTracingIncludes for the custom-server bundle, or COPY them explicitly into the runner from the deps stage:
COPY --from=deps --chown=nextjs:nodejs /app/node_modules/socket.io ./node_modules/socket.io
COPY --from=deps --chown=nextjs:nodejs /app/node_modules/@socket.io ./node_modules/@socket.io
(Same risk applies to anything else only the custom server imports — audit src/server.ts import graph.)
HIGH
H1 — CSP keeps 'unsafe-inline' on script-src in production
next.config.ts:31 — script-src 'self' 'unsafe-inline' regardless of isProd. Only 'unsafe-eval' is gated. With 'unsafe-inline' on, the entire XSS defence of CSP is defanged — any reflected/stored XSS still executes inline. The comment claims it's for Tailwind/Radix runtime styles, but those affect style-src, not script-src. Move to nonce or hash-based script policy in prod.
H2 — NEXT_PUBLIC_APP_URL not in Zod schema, but baked at build time
.env.example:67 lists NEXT_PUBLIC_APP_URL but src/lib/env.ts does not validate it. The builder stage runs pnpm build with SKIP_ENV_VALIDATION=1, so Next inlines an empty string when the var is missing. src/providers/socket-provider.tsx:67 then runs io(process.env.NEXT_PUBLIC_APP_URL!, {...}) → io('', {...}) → browser falls back to window.location.origin, which silently works in most cases but breaks the moment the CRM is fronted by a different origin than the socket gateway. src/lib/auth/client.ts:12 has the same risk for the auth base URL during SSR.
Fix: add NEXT_PUBLIC_APP_URL: z.string().url() to the schema and pass it into the builder stage (via --build-arg + ARG NEXT_PUBLIC_APP_URL in Dockerfile). Drop SKIP_ENV_VALIDATION=1 from the builder stage, or at least surface a build-time warning for missing NEXT_PUBLIC_* vars.
H3 — Dockerfile.dev runs as root and re-installs on every layer rebuild
- No USER directive → dev container is root inside the bind-mounted /app. Any pnpm dev-spawned child can write to host-mounted files as root.
- Combined with C1 (no .dockerignore), the COPY package.json pnpm-lock.yaml ./ followed by pnpm dev over a bind mount means the host node_modules shadow the in-container install on macOS (different platform), and dev images frequently break on sharp / tesseract.js until rebuilt.
Fix: create a node user (or reuse uid 1001), chown /app, drop privileges, and ship a working .dockerignore so the build context isn't 7.6 GB.
H4 — docker-compose.prod.yml has no resource limits, no log rotation
crm-app, crm-worker, postgres, redis all run with default unlimited memory and the default json-file log driver. On a small VPS one runaway worker OOMs the host. The default log driver has no rotation, so disks fill silently. Add deploy.resources.limits (or top-level mem_limit in non-swarm mode) and logging: driver: json-file, options: { max-size: "10m", max-file: "5" } to every service.
H5 — Compose healthcheck targets localhost:3000, but env.PORT is configurable
docker-compose.prod.yml:45 and .yml:43 hardcode http://localhost:3000/api/health. If a deploy sets PORT=8080 via .env, the container listens on 8080, the healthcheck stays on 3000 → permanent "unhealthy" → restart loop. Either drop PORT from env.ts (the schema validates it but compose ignores it) or templatize the healthcheck (wget … http://localhost:${PORT:-3000}/api/health).
MEDIUM
M1 — Worker healthcheck only pings Redis
Dockerfile.worker:38-39 checks Redis.ping(). A wedged BullMQ consumer (silent disconnect from the queue stream but TCP alive) passes this probe while jobs queue forever. Upgrade to read a sentinel BullMQ heartbeat key the worker writes on each job loop, or expose a tiny HTTP /healthz from the worker that asserts queue.client.status === 'ready' on the named queues.
M2 — Worker re-installs deps in the runner stage
Dockerfile.worker:31-32 does pnpm install --frozen-lockfile --prod in the runner — network round-trip on every build, even though the deps stage already has the full tree. Move to COPY --from=deps /app/node_modules ./node_modules then pnpm prune --prod, or use pnpm deploy --prod --filter <pkg>. Save ~30–60s per build and removes a network failure mode.
M3 — next.config.ts serverExternalPackages likely incomplete
socket.io, @socket.io/redis-adapter, imapflow, mailparser, pdf-lib, pdfme, sharp, tesseract.js are all heavy native/CJS-leaning deps used server-side. Only 8 are listed. Anything missing risks bundling into the Next route trace (slower cold start, larger lambda/standalone size, possible runtime require failures for native bindings). Audit the import graph and add the rest.
M4 — env_file: .env puts every secret into the container env
docker-compose.prod.yml:36,59. Anyone with docker inspect or /proc/<pid>/environ access on the host reads BETTER_AUTH_SECRET, EMAIL_CREDENTIAL_KEY, DOCUMENSO_API_KEY, DOCUMENSO_WEBHOOK_SECRET in plaintext. Switch to docker secrets (/run/secrets/...) or a sidecar mount and have env.ts read from file paths when the _FILE suffix is present.
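One way the _FILE convention could look inside env.ts, as a hedged sketch. resolveSecret and the injected FileReader are hypothetical names; the reader is passed in so the function stays pure and testable.

```typescript
type Env = Record<string, string | undefined>;
type FileReader = (path: string) => string;

// Resolve NAME from NAME_FILE (a mounted secret path) when present,
// otherwise fall back to the plain NAME variable.
function resolveSecret(name: string, env: Env, readFile: FileReader): string | undefined {
  const filePath = env[`${name}_FILE`];
  if (filePath) {
    // Secret files commonly end with a trailing newline; strip it.
    return readFile(filePath).trim();
  }
  return env[name];
}
```

In Node the reader would typically be `(p) => readFileSync(p, "utf8")` from node:fs, with the resolved values fed into the existing schema parse.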
M5 — .env.example missing schema entries
Not in .env.example: MULTI_NODE_DEPLOYMENT, WEBSITE_INTAKE_SECRET, EMAIL_REDIRECT_TO (intentional per docs but the doc note exists; add a commented # EMAIL_REDIRECT_TO= line so devs know it's an option), DOCUMENSO_CLIENT_RECIPIENT_ID / DOCUMENSO_DEVELOPER_RECIPIENT_ID / DOCUMENSO_APPROVAL_RECIPIENT_ID (env.ts has all three with defaults, but they should still be documented), PORT. The EMAIL_CREDENTIAL_KEY placeholder is 64 zeros — fine for dev but worth a comment that prod must rotate.
M6 — Node 20-alpine, no PID-1 init
Both Dockerfiles use node:20-alpine (still LTS, but the node:22-alpine LTS is current). Neither installs tini/dumb-init. Node handles SIGTERM itself in these entrypoints so it's not broken, but if any child process is ever spawned (e.g. tesseract worker pool) zombie reaping is on Node. Cheap upgrade: RUN apk add --no-cache tini, followed by ENTRYPOINT ["/sbin/tini", "--"] as its own instruction (ENTRYPOINT is a Dockerfile directive and cannot be chained onto RUN with &&).
M7 — Dockerfile runner has no HEALTHCHECK directive (only compose has one)
Image-level HEALTHCHECK makes the image self-describing — useful for non-compose orchestrators (swarm, nomad, k8s readinessProbe via exec). Add the same wget … /api/health line to the app Dockerfile as the worker Dockerfile already does for Redis.
M8 — CSP connect-src https: / img-src https: are wide
Tighten to an allow-list once per-port branding exposes the configured S3 host.
M9 — Builder stage never sets NODE_ENV=production
Dockerfile:14-15 sets NEXT_TELEMETRY_DISABLED=1 + SKIP_ENV_VALIDATION=1 but not NODE_ENV. next.config.ts:3 branches on isProd for CSP — make this deterministic with ENV NODE_ENV=production above RUN pnpm build.
Quick-win checklist
- Add .dockerignore (C1).
- Refuse-to-start when EMAIL_REDIRECT_TO set in prod (C2).
- Pin socket.io into the standalone trace (C3).
- Remove 'unsafe-inline' from script-src in prod CSP (H1).
- Validate NEXT_PUBLIC_APP_URL at build (H2).
- Add compose resource + log limits (H4).
- Templatize healthcheck PORT (H5).
18. Berth recommender quality audit (recommender-auditor)
Audit — src/lib/services/berth-recommender.service.ts
Read-only audit. Scope per task #18: tier ladder, heat weights, max-oversize cap, fallthrough policy paths, port-isolation defense-in-depth, CTE correctness, cooldown / late-stage settings, n+1 risk, edge cases.
Code as of feat/documents-folders @ 660553c.
CRITICAL
None blocking ship. The recommender's entry-point port guard
(interestInput.portId !== args.portId → CodedError) and the feasible
CTE's b.port_id = $portId correctly fence cross-tenant queries at the top
level. The remaining issues are correctness / defense-in-depth.
HIGH
H1. active_interest_count lacks i.id IS NOT NULL defense-in-depth filter
aggregates CTE (lines 475–479):
COUNT(*) FILTER (
WHERE i.archived_at IS NULL
AND i.outcome IS NULL
AND ib.is_specific_interest = true
) AS active_interest_count
The LEFT JOIN on interests i ON i.id = ib.interest_id AND i.port_id = $portId
intentionally sets i.id = NULL when an interest_berths row points at a
cross-port interest (orphan / legacy data). For an ib row with
is_specific_interest = true whose i.id was nulled by the port-filter,
the FILTER evaluates archived_at IS NULL → TRUE, outcome IS NULL → TRUE,
is_specific_interest = true → TRUE — and the row is counted as an active
interest against the feasible berth, mis-classifying it as Tier C (or D if
combined with the H2 issue below).
total_interest_count correctly guards with FILTER (WHERE i.id IS NOT NULL)
and the inline comment promises "FILTER also enforces port isolation
defense-in-depth," but only total_interest_count carries that guard. The
documented project precedent for the documents-hub aggregator is "defense-in-
depth port_id filter at every join — entry-point check alone is rejected."
The recommender should mirror that.
Fix: Add AND i.id IS NOT NULL to the active_interest_count filter
(also worth adding to max_active_stage for consistency — see M3).
H2. max_active_stage not filtered by is_specific_interest = true
Lines 483–496:
COALESCE(
MAX(CASE i.pipeline_stage ...) FILTER (
WHERE i.archived_at IS NULL AND i.outcome IS NULL
),
0
) AS max_active_stage,
The inline comment on active_interest_count is explicit: "An EOI-bundle-only
link (is_specific_interest=false, is_in_eoi_bundle=true) is legal coverage,
not a pitch, and shouldn't demote the berth." That intent is honoured by
active_interest_count but violated by max_active_stage, which takes the MAX
over all open ib rows regardless of the is_specific_interest flag.
Concrete failure: berth X is part of an EOI bundle for interest A
(at deposit_10pct, EOI-bundle-only — legal coverage, not a pitch). No
specific-interest link on X. The recommender computes
active_interest_count = 0 (correct) but max_active_stage = 6 (deposit_10pct).
classifyTier looks at activeInterestCount > 0 && maxActiveStage >= 6. The
first clause is false → tier A (correct). So in this specific case the bug is
masked.
But mixed case: berth X has both an EOI-bundle-only deep-stage link AND a
specific-interest link at details_sent. active_interest_count = 1,
max_active_stage = 6 (from the bundle link) → Tier D. Per the documented
semantics it should be Tier C (the only pitch is at details_sent = 2).
This falsely sends late-stage warnings into the UI and, when
tier_ladder_hide_late_stage = true, hides the berth that should still be
recommendable.
Fix: Add AND ib.is_specific_interest = true to the max_active_stage
FILTER, to match active_interest_count.
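The mixed case can be replayed with a small in-memory model. This is a sketch of the aggregate and tier rules as this audit describes them, not the service code; LinkRow and the helper names are hypothetical, and the stage numbers (details_sent = 2, deposit_10pct = 6) are taken from the text above.

```typescript
// Simplified stand-in for one interest_berths row joined to its interest.
interface LinkRow {
  stage: number;               // numeric pipeline stage
  isSpecificInterest: boolean; // ib.is_specific_interest
  open: boolean;               // archived_at IS NULL AND outcome IS NULL
}

const LATE_STAGE_THRESHOLD = 6;

function activeInterestCount(rows: LinkRow[]): number {
  return rows.filter((r) => r.open && r.isSpecificInterest).length;
}

// requireSpecific = false models the current FILTER; true models the fix.
function maxActiveStage(rows: LinkRow[], requireSpecific: boolean): number {
  const stages = rows
    .filter((r) => r.open && (!requireSpecific || r.isSpecificInterest))
    .map((r) => r.stage);
  return stages.length > 0 ? Math.max(...stages) : 0; // COALESCE(..., 0)
}

function classifyTier(active: number, maxStage: number, lostCount: number): "A" | "B" | "C" | "D" {
  if (active > 0) return maxStage >= LATE_STAGE_THRESHOLD ? "D" : "C";
  return lostCount > 0 ? "B" : "A";
}

const berthX: LinkRow[] = [
  { stage: 6, isSpecificInterest: false, open: true }, // EOI-bundle-only link
  { stage: 2, isSpecificInterest: true, open: true },  // the only real pitch
];

const active = activeInterestCount(berthX);
const tierCurrent = classifyTier(active, maxActiveStage(berthX, false), 0); // "D"
const tierFixed = classifyTier(active, maxActiveStage(berthX, true), 0);    // "C"
```

With the extra predicate the bundle-only link no longer contributes a stage, so the berth stays in Tier C as the documented semantics require.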
MEDIUM
M1. Tier-B heat suppressed when berth has any active interest
recommendBerths (line 587–598): heat is only computed when tier === 'B'.
A berth with strong fall-through history plus a single fresh tire-kicker
active interest is classified C (active > 0, no late stage), heat = null,
and all the recovery signal (recency / furthest stage / interest count /
EOI count) becomes invisible in the UI. The pipeline reason chip degrades to
"1 active interest in early stage" and the rep loses context about whether
the berth has a history of falling through at contract_signed.
Defensible as a design choice — the tier already encodes "needs attention" — but documenting the trade-off (or surfacing a "history" indicator independent of tier) would close the gap.
M2. pipeline_stage = 'completed' (stage 9) absent from CASE expressions
Both max_active_stage and fallthrough_max_stage CASE blocks enumerate
open … contract_signed (1–8) with ELSE 0. The schema comment at
src/lib/db/schema/interests.ts:18 lists completed as the ninth stage.
An interest at pipeline_stage='completed' with outcome IS NULL (defective
but possible) falls into the ELSE 0 branch, producing the same maxStage as
"no data." Practically not harmful because won outcomes drop the row from
the active filter, but the silent collapse to 0 is fragile if the data
ever drifts. Either add the completed arm explicitly or replace the CASE
with a join against a stage-order lookup so the JS constant and the SQL
arm stay in lock-step.
M3. CTE LEFT JOIN allows null-side rows into all aggregates
Same root cause as H1, narrower impact:
- lost_count: filter requires i.outcome IS NOT NULL → safe.
- latest_fallthrough_at, fallthrough_max_stage: same outcome IS NOT NULL guard → safe.
- eoi_signed_count: i.eoi_status = 'signed' → null-on-null → safe.
- max_active_stage: filter is i.archived_at IS NULL AND i.outcome IS NULL → both NULLs match → row is included with CASE returning 0 → COALESCE(..., 0) masks it. Safe in practice but only by accident.
Adding the i.id IS NOT NULL predicate to every active-side filter is
cheap, matches the documents-hub precedent, and makes the intent
self-documenting.
LOW
L1. Negative / zero admin values not validated
asNumber accepts any finite number. topNDefault = 0 returns an empty
recommendation list; maxOversizePct = -50 produces a multiplier of 0.5
that combines with the length_ft >= desiredLengthFt filter to make every
berth infeasible; fallthroughCooldownDays = -30 puts the cutoff in the
future and silently disables the cooldown (every fall-through is "before"
the future cutoff). Consider clamping at parse time (Math.max(0, n) for
non-negative settings, Math.max(1, n) for topN).
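A possible shape for parse-time clamping, as a sketch. clampSetting is a hypothetical helper, not the existing asNumber, and the fallback values in the usage lines are placeholders rather than the service's real defaults.

```typescript
// Coerce an admin setting to a finite number and clamp it to a floor,
// falling back to a default when the input is not numeric.
function clampSetting(raw: unknown, floor: number, fallback: number): number {
  const n =
    typeof raw === "number"
      ? raw
      : typeof raw === "string" && raw.trim() !== ""
        ? Number(raw)
        : NaN;
  if (!Number.isFinite(n)) return fallback;
  return Math.max(floor, n);
}

// Placeholder defaults, for illustration only.
const maxOversizePct = clampSetting(-50, 0, 25);          // negative input is raised to 0
const topNDefault = clampSetting(0, 1, 5);                // zero is raised to 1
const fallthroughCooldownDays = clampSetting(-30, 0, 30); // negative input is raised to 0
```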
L2. outcome::text cast is a no-op
interests.outcome is declared text(...) (not an enum) — the explicit
::text cast inside LIKE 'lost%' is redundant. Harmless; safe to drop.
L3. Hard-coded heat normalisation constants
computeHeat uses 5 (interest count) and 3 (EOI count) as the
"saturate-at" caps and 30 / 365 days for the recency curve. These are
not admin-tunable. Per-port behaviour expectations may differ — a port
that sees 20+ interests on hot berths will have interestCount
saturating early. Promote to settings if tuning lands as a real need;
otherwise document the assumption.
L4. Width-only feasibility cap uses 8× L/W heuristic
When desiredLengthFt is null but desiredWidthFt is set, the upper
length cap is width * 8 * (1 + oversizePct/100). Inline comment owns
this as a pragmatic guard. Worth a unit test pinning the ratio so a
future tweak doesn't silently widen the cap.
Architecture / structure — clean
- Tier ladder (classifyTier): A/B/C/D mapping is correct and matches the doc-string. Tier C/D requires activeInterestCount > 0; D needs maxActiveStage >= LATE_STAGE_THRESHOLD (= 6, deposit_10pct). Tier B requires lostCount > 0. Tier A is the fall-through default. Verified.
- Heat defaults (30 / 40 / 15 / 15) sum to 100, and computeHeat re-normalises via norm = 100 / weightSum so admin tuning that doesn't sum to 100 still produces a 0..100 score. Final Math.max(0, Math.min(100, ...)) clamps. Verified.
- Max-oversize cap arithmetic: oversizeMultiplier = 1 + pct/100, applied as length_ft <= desired * multiplier. Inclusive upper bound; the lower bound length_ft >= desired is also inclusive. Symmetric and correct.
- Fallthrough policy paths: immediate_with_heat → no cooldown filter, heat surfaces immediately. cooldown → tier B berths whose latestFallthroughAt > now - cooldownDays are skipped; non-B berths unaffected. never_auto_recommend → tier B berths skipped entirely (heat still computed but never reaches the output). All three paths correct.
- tier_ladder_hide_late_stage: default true → showLateStage = false → tier D rows dropped at line 564. Caller can override via the showLateStage arg. Correct.
- N+1 risk: scoring loop is pure JS over the pre-fetched rowset. The three-query shape (loadRecommenderSettings, loadInterestInput, main CTE) is constant. No issue.
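The re-normalisation described for computeHeat can be illustrated as follows. This is a sketch under assumptions: the four signals are taken as already scaled to 0..1, the mapping of the 30 / 40 / 15 / 15 defaults onto recency / stage / interests / EOIs is a guess, and computeHeatSketch is a hypothetical name.

```typescript
interface HeatWeights {
  recency: number;
  stage: number;
  interests: number;
  eois: number;
}

// Each signal is assumed to be pre-scaled into the 0..1 range.
type HeatSignals = HeatWeights;

function computeHeatSketch(signals: HeatSignals, weights: HeatWeights): number {
  const weightSum = weights.recency + weights.stage + weights.interests + weights.eois;
  if (weightSum <= 0) return 0; // degenerate tuning: avoid dividing by zero
  const norm = 100 / weightSum; // rescales any weight total back to 100
  const raw =
    signals.recency * weights.recency +
    signals.stage * weights.stage +
    signals.interests * weights.interests +
    signals.eois * weights.eois;
  return Math.max(0, Math.min(100, raw * norm));
}

const defaults: HeatWeights = { recency: 30, stage: 40, interests: 15, eois: 15 };
const doubled: HeatWeights = { recency: 60, stage: 80, interests: 30, eois: 30 };
const half: HeatSignals = { recency: 0.5, stage: 0.5, interests: 0.5, eois: 0.5 };

const heatDefault = computeHeatSketch(half, defaults);
const heatDoubled = computeHeatSketch(half, doubled);
```

Both calls land on the same score because only the ratio between weights matters once norm is applied.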
Edge cases — verified
- No history: LEFT JOIN yields one null-side row, all FILTER predicates short-circuit, counts = 0, COALESCEs return 0 → Tier A. ✓
- All-lost history: active = 0, lost > 0 → Tier B; cooldown / never paths each gate correctly; heat computes from fall-through fields. ✓
- Mixed open + lost: active > 0 dominates → Tier C/D, heat = null (see M1 trade-off). ✓ (with caveat)
- Won outcome: not matched by outcome LIKE 'lost%' OR outcome = 'cancelled', doesn't inflate lost_count or contaminate fallthrough stage. ✓
- Cross-port leakage: prevented at the entry point and the feasible CTE; partial defense-in-depth gap at the aggregates layer (H1, M3).
19. Search relevance audit (search-auditor)
Search relevance audit — task #19
Scope: src/lib/services/search.service.ts, src/lib/services/search-nav-catalog.ts, src/components/search/command-search.tsx, src/hooks/use-search.ts, plus the resolve-id route used by paste detection.
Method: Read each file in full, traced the ranking formulas, simulated the three test queries against scoreEntry, audited the graph-expansion merge for permission leakage, and spot-checked the catalog for duplicates.
Spot-check results (the three required queries)
All three pass — but with a duplicate-result wrinkle (see HIGH-2).
| Query | Top entry | Score | Why |
|---|---|---|---|
| ai | /admin/ai "AI configuration" | 80 | label.startsWith("ai") |
| smtp | /settings/email "Email accounts (SMTP / IMAP)" | 60 | label.includes('smtp') beats keyword-exact (50) on the /admin/email twin |
| client portal | /admin/settings "System Settings" | 50 | exact keyword match |
Runners-up for ai: System Settings (35, ai interest scoring keyword prefix) and Profile (20, "avatar" substring). Acceptable noise floor.
CRITICAL
None. The system is solid overall — sanitization is correct, port isolation is consistent, the affinity boost is bounded, and paste detection is port-scoped via the resolve-id endpoint (good — prevents cross-tenant navigation on super-admin paste).
HIGH
HIGH-1 — Graph expansion bypasses per-bucket permission gates (authorization leak)
search() (line 1809–1865) gates each direct-match bucket via can(opts, '<x>.view'). Then expandGraph runs unconditionally on whichever direct matches survived, and its output is pushed into mergedClients / mergedInterests / mergedYachts / mergedCompanies / mergedBerths via mergeWithExpansion (lines 1911–1915) — without re-checking the destination bucket's permission.
Concrete leak: a user with berths.view but no interests.view who searches A12:
- direct: berth A12 surfaces
- expansion: interestsFromBerths → populates expandedInterests → merged into mergedInterests → returned to the client
- The dropdown renders rows with the client's full name + pipeline stage from interests the user cannot otherwise read
Similar leaks: berth name via yacht-direct match → expandedBerths; client names via company-direct match → expandedClients; etc.
Fix: gate the expansion writes — only push expanded.X into mergedX when can(opts, '<X>.view'). Cleanest: pass the can(...) results into mergeWithExpansion as a "destination allowed" boolean.
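The "destination allowed" variant might look like this. It is a sketch only: the Row shape and the name mergeWithExpansionGated are hypothetical, and the real mergeWithExpansion signature may differ.

```typescript
interface Row {
  id: string;
  relatedVia?: string;
}

// Direct rows first, then expansion rows not already present. When the
// destination bucket is not viewable, nothing is returned for it at all.
function mergeWithExpansionGated(direct: Row[], expanded: Row[], destinationAllowed: boolean): Row[] {
  if (!destinationAllowed) return [];
  const seen = new Set(direct.map((r) => r.id));
  const merged = [...direct];
  for (const row of expanded) {
    if (!seen.has(row.id)) {
      seen.add(row.id);
      merged.push(row);
    }
  }
  return merged;
}
```

A call site would pass the same permission result used for the direct bucket, for example the outcome of the interests.view check for mergedInterests.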
HIGH-2 — Six catalog labels are duplicated under different hrefs
The catalog has both /settings/X and /admin/X entries with near-identical labels, so common queries return two visually-similar rows pointing at different pages:
| Query | Hits |
|---|---|
| tags | /settings/tags "Tags" + /admin/tags "Tags" |
| branding | /settings/branding + /admin/branding |
| templates | /settings/templates "Document templates" + /admin/templates "Document templates" |
| storage | /settings/storage + /admin/storage |
| analytics/umami | /website-analytics + /admin/website-analytics |
| email/smtp | /settings/email + /admin/email |
For users who have both manage_settings permissions, the dropdown shows two indistinguishable rows. Recommendation: either (a) collapse to one canonical entry per concept, or (b) disambiguate the label suffix (e.g., "Email accounts (admin)" vs "Email accounts (self-serve)"). The duplication reflects the underlying double-page structure, which deserves its own product decision.
HIGH-3 — wantEmail / wantPhone are computed then discarded
Lines 1804–1807 compute wantEmail and wantPhone, then lines 1885–1886 do void wantEmail; void wantPhone; with a TODO-style comment. Dead code paid for on every request. Either delete or wire it into the bucket reordering the comment promises.
MEDIUM
M-1 — applyAffinity re-sorts AFTER mergeWithExpansion, breaking the direct-first guarantee
mergeWithExpansion (line 1754) carefully puts direct matches before expansion rows. Then apply() (line 1905) re-sorts the merged list by recently-touched membership — a recently-touched related-via row can leapfrog a direct (non-touched) match. Either intentional (and should be documented) or a bug (and the merge ordering is wasted work). The current behavior surprises me: I expect direct matches to always win at the top.
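If direct-first turns out to be the intended rule, the affinity pass could be restricted to reordering within each group. A sketch, with RankedRow and applyAffinityDirectFirst as hypothetical names:

```typescript
interface RankedRow {
  id: string;
  direct: boolean; // true for direct matches, false for related-via rows
}

// Touched rows move ahead of untouched ones, but only inside the direct
// group and inside the expansion group, so direct rows always stay on top.
function applyAffinityDirectFirst(rows: RankedRow[], touched: ReadonlySet<string>): RankedRow[] {
  const rank = (r: RankedRow): number => (r.direct ? 0 : 2) + (touched.has(r.id) ? 0 : 1);
  return rows
    .map((row, index) => ({ row, index }))
    // The index tie-break keeps the original order among equal ranks.
    .sort((a, b) => rank(a.row) - rank(b.row) || a.index - b.index)
    .map(({ row }) => row);
}
```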
M-2 — searchOtherPorts mixes tsvector + trigram + ILIKE inconsistently
Clients section uses tsvector OR ILIKE; berths section uses b.mooring_number % ${query} (pg_trgm operator with the default 0.3 threshold). Berths are short codes — trigram on them is unreliable ("A12" trigram similarity to "B12" is ~0.5, both surface). Standardize: berths should match via prefix only (consistent with the in-port searchBerths).
M-3 — searchNotes interest-branch source_label drops when no primary berth
Line 1166: b.mooring_number AS source_label is null when the interest has no primary berth, so the row's sourceLabel falls back to the generic "Interest" via labelForSource. The interest's client name would be a far more useful label (the interests bucket uses it). Patch: COALESCE with the client name via an extra JOIN.
M-4 — Paste-detection regex hard-codes invoice numbering shape
INVOICE_RE = /^INV-\d{6}-\d+$/i (line 92) assumes the legacy 6-digit prefix. The resolve-id endpoint also accepts invoice_number lookup, so non-matching shapes silently fall through to free-text search. Not a security issue, but if invoice numbering changes the paste shortcut breaks invisibly. Consider expanding to /^INV[-_/].+$/i and letting the resolve-id endpoint be the source of truth.
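The difference between the two patterns is easy to check in isolation. The strict pattern is the one quoted above; the relaxed one is the proposed alternative, and looksLikeInvoice is a hypothetical wrapper.

```typescript
const INVOICE_RE_STRICT = /^INV-\d{6}-\d+$/i; // shape quoted in this audit
const INVOICE_RE_RELAXED = /^INV[-_\/].+$/i;  // proposed looser prefix check

// Neither pattern uses the g flag, so test() carries no state between calls.
function looksLikeInvoice(input: string, re: RegExp): boolean {
  return re.test(input.trim());
}

const legacy = "INV-202605-17";
const reshaped = "INV-2026-0017"; // hypothetical future numbering

const strictLegacy = looksLikeInvoice(legacy, INVOICE_RE_STRICT);      // true
const strictReshaped = looksLikeInvoice(reshaped, INVOICE_RE_STRICT);  // false
const relaxedReshaped = looksLikeInvoice(reshaped, INVOICE_RE_RELAXED); // true
```

The relaxed pattern only decides whether to try the paste shortcut; the port-scoped resolve-id lookup stays the authority on whether the invoice exists.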
M-5 — Non-ASCII characters in names are stripped by tsquery sanitizer
buildPrefixTsquery (line 278) strips [^a-z0-9_], so Šibenik, Łukasz, Müller all reduce to empty tokens. The trigram fallback similarity() saves most of these (it's diacritic-tolerant for >0.3 similarity), but exact-prefix matching on accented names is lost. For Croatian / Polish / German tenant names this matters. Consider unaccent() before sanitization or relax the regex to \p{L}.
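A sketch of the relaxed sanitizer using Unicode property escapes. sanitizeTokenUnicode and foldAccents are hypothetical names; the folding helper only approximates what Postgres unaccent() would do server-side.

```typescript
// Keep Unicode letters and digits plus underscore; drop everything else,
// including tsquery operators such as & | ! : ( ).
function sanitizeTokenUnicode(token: string): string {
  return token.toLowerCase().replace(/[^\p{L}\p{N}_]/gu, "");
}

// Client-side accent folding: decompose, then strip combining marks.
function foldAccents(token: string): string {
  return token.normalize("NFD").replace(/\p{M}/gu, "");
}

const sibenik = sanitizeTokenUnicode("Šibenik"); // "šibenik"
const muller = sanitizeTokenUnicode("Müller!");  // "müller"
const folded = foldAccents(muller);              // "muller"
```

NFD folding leaves letters without a canonical decomposition (such as ł) untouched, which is one reason a database-side unaccent() may be the more complete option.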
M-6 — expandGraph issues N+1-style queries for each direct ID set
The LIMIT ${perBucketCap * direct.<X>Ids.length} pattern (e.g., line 1387, 1463, 1486) scales the row cap by direct-match count. With limit=5 and 5 direct berth matches, that's 25 expansion rows fetched, then merged into the same 10-row limit * 2 cap downstream — most fetched rows are thrown away. Minor cost; cap globally instead.
M-7 — searchDocuments JOIN on document_signers has no port_id filter
Defense-in-depth: ds.signer_email ILIKE match is filtered through d.port_id, but the JOIN itself doesn't carry the port filter. Documents are FK-scoped, so no leak today, but the recommender pattern in this codebase (per CLAUDE.md) says "defense-in-depth port_id filter at every join." Apply the same here.
M-8 — import() of searchNavCatalog inside search() is sync wrapped in two awaits
Line 1867 — await Promise.resolve((await import('@/lib/services/search-nav-catalog')).searchNavCatalog(...)). The dynamic import is fine (avoids a circular dep), but Promise.resolve wrapping a sync result then awaiting it is dead ceremony. Inline or await import(...).then(...).
M-9 — Bucket ordering matches spec: notes second-to-last, navigation last ✓
BUCKETS in command-search.tsx (lines 60–81) — confirmed. Notes is index 14, Navigation is index 15. buildFlatRows preserves this order, and the comments at lines 75–79 and 1135–1138 document the rationale.
What works well
- scoreEntry ladder (label-exact 100 → label-prefix 80 → label-substring 60 → kw-exact 50 → kw-prefix 35 → kw-substring 20) is correct and matches the spec.
- Paste detection: regex narrowness is fine because resolve-id is port-scoped and the fallback is normal search.
- The NEVER_TSQUERY / NEVER_PHONE sentinels (line 385–386) correctly avoid Postgres-evaluation-order surprises that would otherwise break NULL guards in WHERE.
- searchBerths exact-match short-circuit (line 757) is the right UX call: typing "A1" when A1 exists should not also dump A10–A19.
- Catalog requires is permission-gated correctly and searchNavCatalog respects both requires and superAdminOnly.
- mergeWithExpansion uses a Set dedupe: direct match wins, no duplicate rows.
- applyAffinity is stable wrt original order (line 327) when the touched-set is empty.
Recommendations, ranked
- Fix HIGH-1 immediately — graph-expansion permission leak. One-line gate per bucket merge.
- Resolve HIGH-2 catalog duplicates — product decision needed.
- Decide on M-1 — direct-first vs affinity-first. Document the chosen rule in the service docstring.
- Clean up HIGH-3 dead code or wire it up to actually reorder buckets for email/phone-shaped queries.
- Sweep through M-2 / M-5 / M-7 in a single pass — all are SQL-shape fixes in the same file.
12. Onboarding + first-run UX audit (onboarding-auditor)
Audit · Onboarding + first-run UX (task #12)
Scope: src/app/(dashboard)/[portSlug]/admin/onboarding, ensureSystemRoots,
seed-bootstrap.ts, the required-settings gates (SMTP / branding / EOI
signers / recommender), empty-state copy on the main lists, and the
"what works out of the box" path after POST /api/v1/admin/ports.
Bottom line: the checklist is the right shape but three of its nine auto-
checks read the wrong setting key, the forms step links to nowhere,
fresh ports ship with zero domain data (no berths, no tags, no signers),
and nothing prompts a freshly-invited admin to even open the checklist.
A new port is technically usable for clients/companies but cannot
generate an EOI without manual SQL or several blind admin visits.
CRITICAL
C1. Three checklist auto-checks read keys that no admin page ever writes
src/components/admin/onboarding-checklist.tsx STEPS declares
autoCheckSettingKey values that don't match what the linked admin
pages actually persist:
| Step | Checklist reads | Admin page actually writes |
|---|---|---|
| email | sales_email_smtp_host | smtp_host_override (email page) / sales_smtp_host (sales-email card) |
| documenso | documenso_api_url | documenso_api_url_override |
| settings | recommender_top_n_default | nothing: DEFAULT_RECOMMENDER_SETTINGS covers all keys, admin never has to save |
Effect: a port that has actually been fully configured will still show those three steps as incomplete. The "manual mark done" fallback is hidden behind an extra click, and the percentage bar is permanently stuck below 70 %. This makes the checklist actively misleading — operators stop trusting it.
Fix: point each auto-check at the key the admin page actually writes (the
_override variant, or either key where two pages write) and drop the
recommender auto-check (or check heat_weight_*, whose presence
genuinely means "admin tuned it").
C2. forms step href is broken
STEPS[8].href = '../' resolves through the Link template to
/${portSlug}/admin/../ → /${portSlug}/ (the dashboard).
The intended target (src/app/(dashboard)/[portSlug]/admin/forms/page.tsx)
exists and is what the description references. Should be 'forms'.
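The resolution can be reproduced with the standard URL parser. This is a sketch that assumes the Link template is `/${portSlug}/admin/${href}` as described above; the helper name and the origin are placeholders.

```typescript
// Mirrors the Link template described above: `/${portSlug}/admin/${href}`.
// The origin is a placeholder; only the resolved pathname matters.
function resolveStepHref(portSlug: string, href: string): string {
  return new URL(`/${portSlug}/admin/${href}`, 'https://crm.example').pathname;
}

const broken = resolveStepHref('marina', '../'); // dot segment climbs out of /admin
const fixed = resolveStepHref('marina', 'forms');
```

Here broken resolves to the dashboard path and fixed to the forms admin page, which is the behaviour the finding describes.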
C3. No gate on EOI signer identity
The checklist treats documenso_api_url (sic — see C1) as proof of
Documenso readiness, but the EOI pathway also requires
documenso_developer_name, documenso_developer_email,
documenso_approver_name, documenso_approver_email, and
documenso_eoi_template_id. Without these, buildDocumensoPayload
sends recipients with empty names/emails or the template-generate call
404s on a missing template id. There is no visible warning until a rep
tries to send the first EOI and Documenso bounces it. Add an
autoCheckSettingKey (or a derived multi-key check) for each so the
step doesn't go green until the developer + approver + template are all
populated.
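A derived multi-key check could look roughly like this. The setting keys are the ones named in C1 and C3; the helper name, the return shape, and the "blank string counts as missing" rule are assumptions.

```typescript
// Setting keys taken from C1 and C3; the helper name and shape are hypothetical.
const DOCUMENSO_REQUIRED_KEYS = [
  'documenso_api_url_override',
  'documenso_developer_name',
  'documenso_developer_email',
  'documenso_approver_name',
  'documenso_approver_email',
  'documenso_eoi_template_id',
] as const;

type SettingsMap = Record<string, unknown>;

function checkDocumensoReady(settings: SettingsMap): { ready: boolean; missing: string[] } {
  const missing = DOCUMENSO_REQUIRED_KEYS.filter((key) => {
    const value = settings[key];
    // Treat absent, null, and blank strings as "not configured".
    return value === undefined || value === null || String(value).trim() === '';
  });
  return { ready: missing.length === 0, missing: [...missing] };
}
```

Returning the missing keys lets the checklist step say which field is still empty instead of only staying red.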
C4. ensureSystemRoots is awaited but its failure mode poisons port creation
src/lib/services/ports.service.ts:46 awaits ensureSystemRoots(...)
after the INSERT INTO ports has committed (no surrounding tx).
The inline comment claims "non-fatal if this throws" — but a throw
propagates out of createPort, the route returns 500, and the operator
sees a failure even though the port row is live. The next admin action
self-heals through ensureEntityFolder's fallback, but the failed
response leaves the operator suspicious and re-POSTing produces a
409 slug already exists. Either wrap port + folders in a transaction
or catch + log + continue here so the error message matches the
comment's promise.
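The catch + log + continue option, sketched with injected stand-ins so it runs without the real database. The function name and all three callbacks are hypothetical; only the control flow is the point.

```typescript
type Port = { id: string; slug: string };

// All three callbacks are stand-ins for the real insert, folder seeding, and logger.
async function createPortSafely(
  insertPort: () => Promise<Port>,
  ensureSystemRoots: (portId: string) => Promise<void>,
  logWarning: (message: string, error: unknown) => void,
): Promise<Port> {
  const port = await insertPort(); // the row is committed after this point
  try {
    await ensureSystemRoots(port.id);
  } catch (error) {
    // Non-fatal: a later admin action recreates the folders, so report and carry on.
    logWarning(`ensureSystemRoots failed for port ${port.id}`, error);
  }
  return port;
}
```

The transactional alternative gives a stronger guarantee (no port without folders) at the cost of failing the whole request; this variant only makes the behaviour match the inline comment.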
HIGH
H1. createPort seeds nothing beyond folders
createPort only writes the port row and the three system folders. It
does not seed:
- Default tags (the checklist asks for "starter tags" but offers no one-click default set)
- Default brochure (rep can't send the "send brochure" flow until one is uploaded; nothing flags this)
- Berths (no UI to add berths; the only path is scripts/import-berths-from-nocodb.ts)
- berth_rules (defaults vary per trigger and are off for berth_unlinked; fine, but the absence isn't surfaced)
- email_from_address / branding_app_name (used in emails but not validated; sending mail with a blank from address fails silently on most providers)
- Recommender weight rows (defaults work but the onboarding step reads the absence as "incomplete"; see C1)
Net effect: an admin can finish every onboarding step and still have a port that can't generate an EOI (no berths, no developer/approver, possibly no template) or send a brochure (no brochure exists). The checklist needs either (a) a "Seed defaults" button on port creation that writes recommended starter rows, or (b) explicit failing gates per domain.
H2. Storage step has no in-app action
autoCheckSettingKey: 'storage_backend' only flips green when a row
exists in system_settings — but the default backend (s3) is
inferred in code from loadStorageConfig() when no row is present, so
a perfectly functional s3-backed install never writes that row and the
step stays red forever. /admin/storage is read-only (status panel +
test connection); switching backends still requires a manual
UPDATE system_settings + pnpm tsx scripts/migrate-storage.ts.
Either add the writer UI or change this step to verify
getStorageBackend() round-trips a probe object.
H3. Roles step auto-ticks immediately
/api/v1/admin/roles → listRoles() returns all roles unfiltered
by portId, so the six global system roles created by seedBootstrap
make the count > 0 on the freshest possible port. The step turns
green without the admin doing a thing, and the description "Create
roles & assign users" implies they did. Auto-check should be a
per-port subset, e.g. count rows in user_port_roles for portId.
H4. No first-run prompt anywhere outside the buried nav link
The onboarding checklist lives under Admin → Tenancy → "Onboarding
checklist" (admin-sections-browser.tsx:300), described as
"read-only references" (which it is not — it has working manual
checkboxes). A freshly invited port-admin who logs in lands on
/{portSlug} and sees empty stat cards, with no banner, toast, or
"Finish setup" CTA pointing at the checklist. Discoverability is
effectively zero unless they know the URL. At minimum: dashboard banner
when < X of the auto-checks are passing, dismissible per user.
H5. Berth list empty state misleads fresh ports
src/components/berths/berth-list.tsx:
title="No berths found", description="Berths are imported from external sources. Adjust your filters...".
On a port with zero berths there is nothing to filter — the copy
implies the data exists but is hidden. Should branch on
totalCount === 0 && noFiltersActive and link to /admin/import with
the exact pnpm tsx scripts/import-berths-from-nocodb.ts command, or
to a future in-app importer.
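The branch could be as small as the following sketch. The helper name and the exact wording of both copies are assumptions; only the totalCount === 0 && noFiltersActive condition and the import command come from the finding.

```typescript
type EmptyStateCopy = { title: string; description: string };

// Hypothetical helper; the copy strings are placeholders for the real wording.
function berthEmptyState(totalCount: number, noFiltersActive: boolean): EmptyStateCopy {
  if (totalCount === 0 && noFiltersActive) {
    return {
      title: 'No berths yet',
      description:
        'This port has no berths. Import them with pnpm tsx scripts/import-berths-from-nocodb.ts.',
    };
  }
  return {
    title: 'No berths found',
    description: 'No berths match the current filters. Adjust your filters and try again.',
  };
}
```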
H6. Two competing EmptyState components
src/components/ui/empty-state.tsx uses {body, actions}, while
src/components/shared/empty-state.tsx uses {description, action}.
Different list pages consume different ones (e.g. clients/yachts use
shared/, documents-hub uses ui/). Same visual but divergent props
will trip up any future "improve onboarding copy" pass. Consolidate.
MEDIUM
M1. Branding auto-check anchors on logo only
branding_logo_url is the proxy for "branding done", but
branding_app_name and branding_primary_color are more functionally
load-bearing (app name shows in email subjects, color in CTAs).
Consider branding_app_name as the gate — or any-of.
M2. Tags step has no "Apply default set" affordance
/admin/tags starts blank. Onboarding tells the operator to "define
starter tags" but offers no recommended palette. Add a one-click "Apply
recommended set (Hot / Warm / Cold / VIP / Press)" or similar so
operators have an opinionated baseline they can edit.
M3. Settings auto-check confuses "value exists" with "operator chose it"
Once the admin opens /admin/settings and saves without changing
anything, settings-manager.tsx writes the default back as a real row
and the checklist turns green. That's a side effect, not informed
consent. Use a sentinel ("admin saw this page") rather than a
defaultable knob.
M4. admin-sections-browser description is wrong
"Setup checklist for fresh ports (read-only references)" —
OnboardingChecklist has working toggleManual + persisted state.
Update the copy or it discourages clicking in.
M5. Vocabularies are global-code-constant
Interest sources / statuses / contact reasons come from
VOCABULARIES in src/lib/vocabularies.ts, not from per-port settings.
Fine for MVP, but the onboarding doc says "vocabularies" implying
configurability. Either expose per-port overrides or remove the
mention.
M6. Documents hub root view doesn't tell admins why Clients/, Companies/, and Yachts/ exist
On first visit to /{portSlug}/documents, the system roots are
present (from ensureSystemRoots) but with zero children. Empty-state
copy ("Upload a file...") doesn't explain that the three locked
system folders will auto-populate as deals progress.
What works well
- seedBootstrap is genuinely idempotent and safe to re-run.
- ensureSystemRoots race semantics are clean; the partial-unique index pattern is exemplary.
- DEFAULT_RECOMMENDER_SETTINGS plus loadRecommenderSettings's layered (port > global > default) lookup means recommender is the one subsystem that genuinely works zero-config.
- The checklist UI affordances (progress bar, auto-detected hint, manual-override button) are solid; only the wiring is wrong.
27. Type-safety + drizzle leak audit (types-auditor)
Type-Safety + Drizzle Leak Audit — Task #27
Branch: feat/documents-folders · 2026-05-12
Top-line counts (src/, ts+tsx)
| Pattern | Count |
|---|---|
| as unknown as | 72 |
| as any (raw, mostly route hrefs) | 69 |
| // eslint-disable @typescript-eslint/no-explicit-any | 73 |
| // @ts-ignore / // @ts-expect-error | 0 |
| as Route (typed-routes cast) | 17 |
| $inferSelect / $inferInsert direct exports | 0 |
| Bare : any parameter (not eslint-disabled) | 2 functional + 2 declarations |
Good news up front: no @ts-ignore / @ts-expect-error anywhere, and no $inferSelect type leaked through the API boundary as a public response contract. Service return shapes go through { data } envelopes; drizzle row types stay internal.
CRITICAL
1. tx: any in client-restore service — bypasses Drizzle's transaction type contract
src/lib/services/client-restore.service.ts:361
tx: any,
This parameter receives a Drizzle transaction client and threads writes through 12+ downstream tables in a multi-step restore. A typo'd table or wrong column type goes undetected at compile time. Type it as Parameters<Parameters<typeof db.transaction>[0]>[0], i.e. the first parameter of the transaction callback (Parameters<typeof db.transaction>[0] on its own is the callback, not the client); see src/lib/db/utils.ts:17 for the same shape applied via as unknown as.
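A sketch of deriving the transaction client type from db.transaction itself. The db object here is a small synchronous stand-in, not Drizzle (whose transaction is async); FakeTx, restoreClientRow, and the table name are hypothetical. The type extraction is the part that carries over.

```typescript
// Stand-in for the project's Drizzle db; only the shape of transaction matters here.
type FakeTx = { insert(table: string, row: Record<string, unknown>): void };

const writes: string[] = [];
const db = {
  transaction<T>(run: (tx: FakeTx) => T): T {
    return run({ insert: (table) => { writes.push(table); } });
  },
};

// Parameters<typeof db.transaction>[0] is the callback; its first parameter is the tx client.
type DbTx = Parameters<Parameters<typeof db.transaction>[0]>[0];

function restoreClientRow(tx: DbTx, clientId: string): string {
  tx.insert('clients', { id: clientId }); // a misspelled method would now fail to compile
  return clientId;
}

const restoredId = db.transaction((tx) => restoreClientRow(tx, 'c1'));
```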
2. useQuery<any> + apiFetch<{ data: any }> on berth detail page
src/components/berths/berth-detail.tsx:20–25, 60
const { data, isLoading } = useQuery<any>({...});
apiFetch<{ data: any }>(`/api/v1/berths/${berthId}`)
const berth = data as any;
Three escape hatches stacked on the highest-traffic detail page. Every field access downstream is unchecked: a service-side rename of mooringNumber to mooring_number would silently render undefined. Replace with a BerthDetailResponse type co-located with the service.
3. Portal-auth and public routes bypass parseBody
6 portal + 3 public-intake routes use raw await req.json() instead of the project-standard parseBody(req, schema):
- src/app/api/portal/auth/{forgot-password,reset-password,sign-in,activate,change-password}/route.ts
- src/app/api/auth/set-password/route.ts
- src/app/api/public/{residential-inquiries,website-inquiries,interests}/route.ts
- src/app/api/v1/admin/custom-fields/[fieldId]/route.ts (intentional; comment explains)
CLAUDE.md mandates parseBody so 400 errors have field-level shape the toast hook recognizes. ZodErrors from schema.parse after raw req.json() become generic 500s. Custom-fields one is justified; the other 9 are not.
HIGH
4. mergePerms double-cast in new permission-overrides route
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:254, 259
const out = { ...(base as unknown as Record<string, Record<string, boolean>>) };
return out as unknown as RolePermissions;
Comment acknowledges this duplicates withAuth's deepMerge. Either reuse deepMerge from helpers.ts (lines 202–205, 234–237 already use the same pattern) or extract a typed helper mergePermsTyped(base, patch): RolePermissions. Two implementations of permission merge is a divergence risk.
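A typed merge without the double cast might look like this. RolePermissions is modelled as the Record<string, Record<string, boolean>> shape that the route already casts to; the real type may be narrower, in which case the signature would need adjusting.

```typescript
// Stand-in for the project's RolePermissions, shaped like the cast in the route.
type RolePermissions = Record<string, Record<string, boolean>>;

function mergePermsTyped(base: RolePermissions, patch: Partial<RolePermissions>): RolePermissions {
  const out: RolePermissions = {};
  // Copy each resource so the caller's base object is never mutated.
  for (const [resource, actions] of Object.entries(base)) {
    out[resource] = { ...actions };
  }
  for (const [resource, actions] of Object.entries(patch)) {
    if (!actions) continue; // an undefined entry means "no override for this resource"
    out[resource] = { ...out[resource], ...actions };
  }
  return out;
}
```

Because both the route and the withAuth chain could call the same function, there is one merge rule to audit instead of two.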
5. Audit-log as unknown as Record<string, unknown> epidemic
21 occurrences across services that write oldValue / newValue to audit_logs:
invoices.ts × 7, expenses.ts × 6, documents.service.ts × 2, berths.service.ts × 2, companies.service.ts, company-memberships.service.ts, yachts.service.ts, document-templates.ts, ocr-config.service.ts × 2, ai-budget.service.ts × 2
Wide repetition of the same widening cast is a smell — every service does the same dance to fit Drizzle row types into the audit JSONB column. Fix: introduce toAuditJson<T>(row: T): Record<string, unknown> once in src/lib/services/audit.ts (same pattern gdpr-bundle-builder.ts already uses — toJsonRow, line 152 comment explicitly cites avoiding this). Removes 21 unsafe casts in one shot.
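One possible shape for the helper, assuming a JSON round-trip is acceptable for audit snapshots. This is not necessarily how toJsonRow in gdpr-bundle-builder.ts works; it is a sketch of the idea of widening in a single place.

```typescript
// Hypothetical helper; the real one would live in src/lib/services/audit.ts.
// A JSON round-trip yields what a JSONB column can store:
// Dates become ISO strings and undefined fields are dropped.
function toAuditJson<T extends object>(row: T): Record<string, unknown> {
  return JSON.parse(JSON.stringify(row)) as Record<string, unknown>;
}

const snapshot = toAuditJson({
  id: 'inv_1',
  total: 10,
  issuedAt: new Date(0),
  note: undefined,
});
```

A caveat of this approach: JSON.stringify throws on bigint values, so rows with bigint columns would need a replacer or a different conversion.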
6. next/typedRoutes defeated by 49 as any href casts
router.push(..) and <Link href={..}> with template-literal dynamic URLs get widened to string, which isn't assignable to Route<string>. Components compensate via as any (49 sites) or as Route (17 sites). Hotspots: command-search.tsx (10), topbar.tsx (10), user-menu.tsx (8), reservation-list.tsx (6), residential lists/headers (8).
This nullifies the value of experimental.typedRoutes everywhere it matters most — dynamic navigation in shells, search, and detail headers. Fix: introduce route(path: string): Route helper in src/lib/routes.ts that does the cast in one audited place; ban as any/as Route for href via ESLint rule. Bonus: makes it possible to migrate to a real typed-routes wrapper later.
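A sketch of the single audited cast. AppRoute is a local stand-in for Route from 'next' so the block runs on its own; the runtime guard against external URLs is an optional extra that the finding does not ask for.

```typescript
// Stand-in for Route from 'next'; in the app the real type would be imported instead.
type AppRoute = string & { readonly __brand: 'AppRoute' };

function route(path: string): AppRoute {
  // Accept only same-origin absolute paths: rejects "https://..." and protocol-relative "//host".
  if (!path.startsWith('/') || path.startsWith('//')) {
    throw new Error(`Not an internal route: ${path}`);
  }
  return path as AppRoute; // the one place where the cast happens
}

const berthHref: string = route(`/${'marina'}/berths/${42}`);
```

Call sites would then read router.push(route(...)) and an ESLint rule can flag any remaining as any / as Route on href values.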
7. RolePermissions ↔ Record<string, unknown> round-trip in withAuth chain
src/lib/api/helpers.ts:203, 235 — two layers of permission merge cast both directions to satisfy deepMerge's untyped signature. deepMerge should be generic over <T extends Record<string, unknown>> and accept RolePermissions directly. Same problem as #4; same fix.
MEDIUM
8. Record<string, unknown> JSONB writes without zod re-parse at write time
Server-side blobs stored to system_settings.value, userProfiles.preferences, audit JSONB columns:
- ocr-config.service.ts:79, 85: value: value as unknown as Record<string, unknown>. Upstream zod parse exists, so safe in practice, but the cast hides the relationship.
- ai-budget.service.ts:88, 94: same pattern.
- me/route.ts:148–168: good model, with an explicit ALLOWED_PREF_KEYS allow-list + 8KB size cap + zod via parseBody. Use as the template for the other two.
- components/admin/settings/settings-manager.tsx:237, 250 and settings-form-card.tsx:79–93: client-side Record<string, unknown> state. Admin-only surfaces, low risk, but no per-key shape check before PUT.
9. Dynamic-sort key cast in invoices list
src/lib/services/invoices.ts:199
column: invoices[query.sort as keyof typeof invoices] as unknown as PgColumn,
The query.sort zod schema should already enum-restrict sort keys to actual columns; if so, the inner as keyof typeof invoices is redundant. If query.sort is a free string, this is also a SQL-shape risk surface (mitigated only because Drizzle column proxy will throw on unknown keys). Verify the validator enum is exhaustive.
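If the validator turns out not to be exhaustive, an explicit allow-list map removes the need for the keyof cast. The sort keys and column names below are illustrative only; the real map would point at Drizzle columns on invoices rather than strings.

```typescript
// Keys and column names are illustrative; the real map would reference invoices columns.
const INVOICE_SORT_COLUMNS = {
  createdAt: 'created_at',
  dueDate: 'due_date',
  total: 'total',
} as const;

type InvoiceSortKey = keyof typeof INVOICE_SORT_COLUMNS;

function isInvoiceSortKey(value: string): value is InvoiceSortKey {
  // hasOwnProperty keeps inherited names such as "constructor" out.
  return Object.prototype.hasOwnProperty.call(INVOICE_SORT_COLUMNS, value);
}

function resolveSortColumn(sort: string, fallback: InvoiceSortKey = 'createdAt'): string {
  // Unknown keys fall back instead of indexing the table object with arbitrary input.
  return INVOICE_SORT_COLUMNS[isInvoiceSortKey(sort) ? sort : fallback];
}
```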
10. Template-preview accepts arbitrary content as TipTap
src/app/api/v1/admin/templates/preview/route.ts:32
const doc = body.content as unknown as TipTapNode;
Admin-gated, so blast radius limited, but the renderer downstream assumes well-formed TipTap JSON. Add a minimal tipTapNodeSchema zod check at the boundary — a malformed node tree would otherwise throw deep in the renderer with a useless stack trace.
11. Node stream ↔ Web stream casts
5 sites cast between NodeJS.ReadableStream, Readable, and ReadableStream<Uint8Array> via as unknown as:
- src/app/api/storage/[token]/route.ts:103
- src/lib/services/expense-pdf.service.ts:507–510
- src/lib/services/document-sends.service.ts:374
- src/lib/services/brochures.service.ts:255, berth-pdf.service.ts:350
Type system gap is genuine (Node Readable ↔ Web ReadableStream don't have a structural match in lib.dom + @types/node). Centralize in src/lib/storage/stream-bridge.ts with named helpers toWebStream(readable) / toNodeStream(web) — removes the casts from feature code.
12. as unknown as { destroy: () => void } stream cleanup
brochures.service.ts:255 and berth-pdf.service.ts:350 reach into stream internals because the storage backend's return type doesn't expose destroy. Add destroy?(): void to the StorageBackend.get() return type so cleanup is part of the contract.
13. as unknown as string for pdfme BLANK_PDF sentinel
9 PDF templates carry basePdf: 'BLANK_PDF' as unknown as string. This is a known pdfme upstream type-def bug — the string literal 'BLANK_PDF' is accepted at runtime but typed as Uint8Array | string. Wrap once: const BLANK_PDF = 'BLANK_PDF' as unknown as string; exported from src/lib/pdf/constants.ts. Removes 9 casts.
14. Drizzle self-FK uses : any
src/lib/db/schema/system.ts:43
revertOf: text('revert_of').references((): any => auditLogs.id),
Standard Drizzle workaround for forward-references, but the official typing is (): AnyPgColumn. Swap.
15. phone-parse.ts metadata require
src/lib/dedup/phone-parse.ts:25 — const metadata: any = require(...) for libphonenumber-js/metadata.min.json. CommonJS interop hack; replace with import metadata from 'libphonenumber-js/metadata.min.json' + resolveJsonModule: true (already on in tsconfig.json).
Drizzle leak check — clean
Searched for $inferSelect / $inferInsert exports crossing the API boundary: zero hits in src/. Services return Drizzle row types internally, but every API route wraps them in { data } envelopes (confirmed by spot-check across invoices, berths, clients, documents). The Record<string, unknown> widenings flagged above happen at write time into JSONB columns, not at read time across the API surface. No PII columns or internal-only fields slip through.
Recommended sequence
- Critical first: fix #1 (tx: any), #2 (berth detail <any>), #3 (parseBody for portal auth). Estimated 3–4h.
- One helper, big win: toAuditJson<T> (#5) removes 21 casts.
- Route helper: route() (#6) removes 49+ as any and unblocks future real typed-routes adoption.
- Stream bridge: centralize Node↔Web conversion (#11), which removes 5 casts.
- PDF constant: extract BLANK_PDF (#13), which removes 9 casts.
Net effect: ~85 of 145 escape hatches removed with five focused refactors, and the remaining ones become small enough to justify case-by-case.
31. Auth flow polish audit (auth-flow-auditor)
Auth Flow Polish Audit — Task #31
Scope: CRM (auth)/ pages (login, reset-password, set-password), portal (portal)/ pages (login, activate, reset-password, forgot-password), email-change confirm/cancel landing, /api/auth/resolve-identifier, withAuth gates and Better Auth config.
Severities: CRITICAL = silent security/data risk · HIGH = real user can hit a dead-end · MEDIUM = polish/copy that erodes trust.
CRITICAL
C1 — Password reset does not revoke existing sessions on either flow
Better Auth's sendResetPassword (src/lib/auth/index.ts:73) is configured with no onPasswordReset / revokeAllSessions hook; the same is true for resetPassword in portal-auth.service.ts:428. Outcome: a compromised cookie keeps working forever after the legitimate owner does the "forgot password" dance. This is the canonical "session-bumping on reset" guarantee users assume and we're not delivering it. Add a step that deletes every row in sessions (CRM) and portal_auth_tokens + active portal_sessions (portal) for the affected user inside the same transaction that writes the new password hash.
C2 — Disabled CRM user retains an active session cookie
withAuth rejects with 403 "Account disabled" when userProfiles.isActive === false (src/lib/api/helpers.ts:152), but:
- auth.api.signInEmail itself doesn't know about isActive: a disabled user can still complete /login and be redirected to /dashboard, where every API call then 403s.
- Setting isActive=false in updateUser (users.service.ts:227) never deletes the existing sessions row, so an already-logged-in disabled user keeps every page that doesn't hit /api/v1 working, and any cached SSR page loads.
Fix: (a) on signIn add a profile lookup and reject before issuing the cookie; (b) on isActive=false flip, delete from sessions for that userId; (c) middleware should treat a 403 from an API as a global redirect to /login?reason=disabled.
HIGH
H1 — No dedicated "this link expired / already used / your account is disabled" landing pages
Every token failure today is surfaced as a toast on a still-functional form, or as a 400 JSON error that the user only sees if they actually submit the form.
- set-password/page.tsx (CRM) handles !token (the "Link is missing or invalid" branch, line 73-88) but does NOT distinguish "token present but expired" from "token present but already used": both surface as a toast.error(body.message) and leave the form interactive, inviting an infinite retry loop.
- The portal password-set-form.tsx is identical (line 63-67): expired/used tokens render as a red <p> under the form.
- There is no /account-disabled page; the user just sees 403 Account disabled text from the JSON response in DevTools and gets stuck on /dashboard rendering nothing (or rendered SSR shell with broken API calls).
Recommend: a single <TokenStateMessage state="expired" | "used" | "invalid" /> component that the page server-renders by doing a HEAD-style validate call on mount, plus a /disabled route the middleware redirects to.
H2 — Email-change /settings?emailChange=confirmed|cancelled query param is never consumed
The confirm/cancel redirect URLs at api/v1/me/email/{confirm,cancel}/[token]/route.ts:71/50 set ?emailChange=confirmed|cancelled. Grep shows ZERO consumer in src/app or src/components. The redirect succeeds, but the user lands on the bare /settings page with no banner, no toast, no confirmation — for a security-sensitive action this looks broken / makes users wonder if it took. Wire a banner in user-settings.tsx keyed off useSearchParams().get('emailChange').
H3 — Cancel-email-change link is GET-only with no friction
api/v1/me/email/cancel/[token]/route.ts is a one-click GET that wipes the pending change. Gmail/Outlook link prefetchers, antivirus URL scanners, and corporate proxies will auto-fetch links and cancel a legitimate request without the user ever clicking. Pattern for safety: GET renders a confirmation page (Are you sure? button), POST executes. Same fix needed on confirm if a link-scanner could pre-confirm an attacker's address-change before the real user sees the cancel link.
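The GET-renders / POST-executes split, reduced to a pure function so the behaviour is easy to see. The function name, the result shape, and the injected cancelPendingChange callback are all hypothetical; a real route handler would render a page and return a Response.

```typescript
type CancelResult =
  | { status: 200; action: 'render-confirmation' }
  | { status: 200; action: 'cancelled' }
  | { status: 405; action: 'rejected' };

// cancelPendingChange is a stand-in for the service call that wipes the pending change.
function handleCancelLink(
  method: string,
  token: string,
  cancelPendingChange: (token: string) => void,
): CancelResult {
  if (method === 'GET') {
    // Safe for prefetchers and scanners: nothing is mutated, a confirm button is shown.
    return { status: 200, action: 'render-confirmation' };
  }
  if (method === 'POST') {
    cancelPendingChange(token);
    return { status: 200, action: 'cancelled' };
  }
  return { status: 405, action: 'rejected' };
}
```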
H4 — set-password (CRM) success path has no auto-sign-in
set-password/page.tsx:64 toasts "Password set successfully" then routes to /login. The user has to type their email and the password they just chose, again. For invite flows this is the worst conversion point. Either (a) auto-sign-in via auth.api.signInEmail after the consume call returns, or (b) at minimum prefill the email field on /login. (Portal's activate flow has the same problem in password-set-form.tsx:97).
H5 — Reset-password expiry shows no time estimate; users hit "expired" cold
CRM (auth)/reset-password/page.tsx:62 says "we have sent a password reset link" with no TTL. Better Auth default reset token expiry is 1 hour (the email body on line 79 mentions "expires in 1 hour" but the success page doesn't echo this). Portal forgot-password (forgot-password/page.tsx:43) correctly says "expires in 30 minutes". Make the CRM message say "Link expires in 1 hour" so users at the airport know whether to wait.
H6 — resolve-identifier returns 429 with { email: '' } which bypasses the synthetic miss path
api/auth/resolve-identifier/route.ts:56 returns { email: '' } on rate-limit. Client (login/page.tsx:56) does payload.email?.trim() || identifier — so the original username (without @) is passed into authClient.signIn.email. Better Auth rejects it as "invalid email format" instead of "invalid credentials", which is a distinguishably-different error from the normal miss case and re-opens the enumeration channel the synthetic-email defence was built to close. Return { email: syntheticEmail(raw) } on the 429 path too (status code can stay 429).
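The fall-through can be seen in isolation below. emailForSignIn reproduces the client expression quoted above; the syntheticEmail body is a hypothetical stand-in, since the real derivation is not shown in this report.

```typescript
// Client-side fallback as quoted from login/page.tsx:56.
function emailForSignIn(payloadEmail: string | undefined, identifier: string): string {
  return payloadEmail?.trim() || identifier;
}

// Hypothetical stand-in for the project's syntheticEmail(); the real derivation may differ.
function syntheticEmail(raw: string): string {
  const local = raw.toLowerCase().replace(/[^a-z0-9]/g, '') || 'user';
  return `${local}@invalid.example`;
}

// Rate-limited today: the empty string is falsy, so the bare username reaches Better Auth.
const leaked = emailForSignIn('', 'jdoe');
// With a synthetic address on the 429 path, the value is email-shaped like a normal miss.
const masked = emailForSignIn(syntheticEmail('jdoe'), 'jdoe');
```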
MEDIUM
M1 — Login error toast leaks Better Auth wording
login/page.tsx:65 uses result.error.message ?? 'Invalid credentials'. Better Auth surfaces strings like "User not found" / "Invalid password" / "Email or password is invalid" depending on the path — the first two are an enumeration leak that bypasses the resolve-identifier defence. Always overwrite to a fixed 'Email or password is incorrect.' and only log the underlying reason server-side.
M2 — Portal sign-in error message is friendlier than CRM
Portal: 'Invalid email or password'. CRM: raw Better Auth message OR 'Invalid credentials'. Unify on "Email or password is incorrect" everywhere (matches CRM (auth)/login/page.tsx:65 and portal (portal)/portal/login/page.tsx:37) — the CRM phrasing "Invalid credentials" is jargon.
M3 — set-password divergence: form-validation TTL mismatch
CRM set-password/page.tsx requires min 9 chars (line 16). Portal password-set-form.tsx also 9 (line 23). But the activation/CRM invite TTL diverges silently: CRM invite = 72h (crm-invite.service.ts:17), portal activation = 72h (portal-auth.service.ts:25), portal reset = 30min, CRM reset = 60min. The "request a new link" copy in the invalid-token branch should embed the actual TTL so admins debugging "why doesn't this work" don't have to read the schema.
M4 — set-password (CRM) error fallback is inconsistent shape
(auth)/set-password/page.tsx:60 reads body.message ?? body.error — but api/auth/set-password/route.ts uses errorResponse(err) which emits { error }. The message key is dead code, fine, but the legacy comment on set-password/route.ts:24 says envelopes were normalised in commit "auditor-F §32" — the page should match: body.error ?? 'Failed to set password.'.
M5 — Portal forgot-password 30-min TTL is short for international clients
30 minutes is aggressive when emails routinely sit in spam quarantine for 5-15 minutes before clearing. CRM reset's 60min is a sensible floor. Either lift to 60min or surface the 30min countdown more aggressively in the email + landing page.
M6 — login Suspense fallback for set-password renders empty shell
set-password/page.tsx:143 falls back to <BrandedAuthShell>{null}</BrandedAuthShell> — a flash of empty branded card while useSearchParams resolves. Replace with a skeleton or "Verifying link…" microcopy; the empty state reads as "page broken" for ~100ms on slow networks.
M7 — /portal/activate Suspense fallback is unbranded grey div
portal/activate/page.tsx:8 falls back to a plain Loading… div — jarring after the branded email. Mirror the CRM set-password pattern with <BrandedAuthShell>. Same on portal/reset-password/page.tsx:8.
M8 — "Request a new link" target on portal set-password invalid-token is wrong for activation flow
password-set-form.tsx:86 always points to /portal/forgot-password. For activation the user has no password yet — /portal/forgot-password returns the silent 200 and the admin has to manually resendActivation. Branch on endpoint, or give portal users a self-service "Resend activation".
M9 — No "Remember me" / shared-device control
Better Auth session expiresIn: 24h (auth/index.ts:94); portal token also 24h. No checkbox to shorten on a shared device, no copy saying so. Add a session-only cookie path.
M10 — Portal login next param is unvalidated
portal/login/page.tsx:42: router.replace(next as never) where next = search.get('next'). Open redirect: /portal/login?next=https://evil.example navigates cross-site after sign-in. Validate next.startsWith('/portal/') before using.
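A small guard along these lines would close the redirect. The helper name and the '/portal' fallback destination are assumptions; the backslash check is an extra precaution beyond the startsWith test suggested above.

```typescript
// Hypothetical helper; '/portal' as the fallback destination is an assumption.
function safePortalNext(next: string | null, fallback = '/portal'): string {
  if (!next) return fallback;
  // Require the portal prefix and reject backslashes, which some browsers treat as "/".
  if (!next.startsWith('/portal/') || next.includes('\\')) return fallback;
  return next;
}
```

The call site would then become router.replace(safePortalNext(search.get('next'))), with whatever route cast the project settles on.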
Summary
- 2 CRITICAL (no session-revoke on password reset; disabled-user keeps session)
- 6 HIGH (no expired/used/disabled landing pages; emailChange success param consumed by nobody; GET-cancellation prefetch risk; no auto-sign-in after set-password; missing TTL copy; rate-limit branch leaks enumeration)
- 10 MEDIUM (copy inconsistencies, branded-shell drift, open redirect in portal next, no shared-device session control)
Token mechanics themselves are sound (32-byte CSPRNG, SHA-256 storage, single-use markers, dual rate-limit buckets, anti-enumeration silent-200 on forgot-password, dummy-hash timing equalisation in portal signIn). The polish gaps are in what happens after a token succeeds or fails — landing pages, banners, session lifecycle.
30. Image + asset hygiene audit (asset-auditor)
Image + Asset Hygiene Audit (Task #30)
Scope: uploaded-image handling across avatar, brochure, berth-PDF, generic file uploader, receipt scanner, and the new portrait avatar cropper. EXIF, MIME spoof, polyglots, server-side resize, dimension caps, SVG/GIF risk, filename sanitisation, Content-Disposition, per-surface size caps.
Files reviewed (highlights):
- src/lib/constants/file-validation.ts (allow-list + magic bytes)
- src/lib/services/files.ts (uploadFile + previews)
- src/lib/services/storage.ts (sanitizeFilename)
- src/app/api/v1/me/avatar/route.ts + src/components/shared/image-cropper-dialog.tsx
- src/app/api/v1/files/upload/route.ts
- src/app/api/storage/[token]/route.ts (filesystem-backend proxy)
- src/lib/services/berth-pdf.service.ts, brochures.service.ts, expense-pdf.service.ts
- src/app/api/v1/documents/[id]/download/[...slug]/handlers.ts
CRITICAL
C1. No server-side image normalisation on avatar / generic image uploads — EXIF (GPS) is persisted and served verbatim
Where: src/app/api/v1/me/avatar/route.ts:46-68, src/lib/services/files.ts:45-72.
The /api/v1/me/avatar handler takes the multipart body, checks size (≤2 MB),
runs bufferMatchesMime (first-bytes-only) and writes the bytes straight to
storage. The "cropper" (image-cropper-dialog.tsx) does run a Canvas re-encode
client-side, which incidentally drops EXIF — but a malicious user (or simply
any user with curl) can bypass the cropper by POSTing the raw image directly
to the same endpoint. The route accepts whatever JPEG/PNG/WebP/GIF arrives and
the generic uploader (/api/v1/files/upload) has the same property: no
sharp().rotate().toBuffer() normalisation, no EXIF strip, no ICC profile
reset, no re-encode.
Result: every photo uploaded from a phone — receipt scans (/expenses/scan),
client/yacht photo attachments via FileUploadZone, manual avatar PUTs — is
served from MinIO with full EXIF: GPS latitude/longitude, device serial,
photographer name, original capture timestamp.
GDPR/PII exposure (audit #8 already flagged related issues). The previewer just hands a presigned URL straight to the browser, so any rep / client portal visitor with a download URL gets the metadata.
Fix: run every accepted image/* payload through
sharp(buf).rotate().toBuffer() (optionally with .toFormat(jpeg|png|webp))
in uploadFile() before backend.put(...). sharp drops input metadata on
output by default, so do not chain .withMetadata(): that call carries the
EXIF block through. Sharp is already a dependency (used by
expense-pdf.service.ts). Same wrapper enforces a max-pixel cap (see C2).
C2. No max-dimension / decompression-bomb gate on uploaded images
Where: src/lib/services/files.ts:38-72, src/app/api/v1/me/avatar/route.ts.
MAX_FILE_SIZE is 50 MB (avatar: 2 MB). Neither path inspects width/height.
A 2 MB highly-compressed PNG can decode to >300 megapixels (e.g. a 30000×30000
palette PNG). Any downstream consumer that decodes:
- the <AvatarImage> in the React UI,
- pdf-lib / pdfme embedding the image into a generated client/interest PDF,
- sharp resize in expense-pdf.service.ts,
will OOM or pin a worker. The receipt-PDF service runs sharp but only when raw.byteLength > 500 KB, so a 400 KB decompression-bomb PNG skips the threshold and sharp is called on the embed path with no dimension cap, and PDFKit attempts to embed the raw bytes.
Fix: in the normalisation step from C1, cap output to MAX_DIMENSION = 4096
(or 2048 for avatar) using sharp(buf).resize({fit:'inside',withoutEnlargement:true})
and reject any source whose metadata().width × metadata().height exceeds LIMIT
before allocating the decode buffer.
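The "reject before decode" step can be sketched without any image library for the PNG case, because PNG stores its dimensions in the first chunk. This is a minimal illustration under stated assumptions: readPngSize, exceedsPixelLimit and MAX_PIXELS are hypothetical names, not existing code in files.ts, and other formats would need their own header readers (or sharp's metadata()).

```typescript
// Hypothetical helper: read PNG dimensions from the IHDR chunk without decoding pixels.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

export interface ImageSize {
  width: number;
  height: number;
}

export function readPngSize(bytes: Uint8Array): ImageSize | null {
  // Signature (8) + chunk length (4) + "IHDR" (4) + width (4) + height (4) = 24 bytes.
  if (bytes.length < 24) return null;
  if (!PNG_SIGNATURE.every((b, i) => bytes[i] === b)) return null;
  // The first chunk of a valid PNG must be IHDR.
  const chunkType = String.fromCharCode(...bytes.subarray(12, 16));
  if (chunkType !== 'IHDR') return null;
  // PNG integers are big-endian, which is DataView's default.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// Assumed ceiling: 4096 x 4096, matching the MAX_DIMENSION suggested above.
export const MAX_PIXELS = 4096 * 4096;

export function exceedsPixelLimit(size: ImageSize, limit: number = MAX_PIXELS): boolean {
  return size.width * size.height > limit;
}
```

A caller would run this on the upload buffer and return a 400 when exceedsPixelLimit is true, so the 30000×30000 example never reaches a decoder.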
HIGH
H1. Magic-byte check is prefix-only — polyglots pass
Where: src/lib/constants/file-validation.ts:48-87.
bufferMatchesMime checks the leading 3–8 bytes. PNG/JPEG/GIF/WebP/ZIP-based
office formats all share short, well-known prefixes. A file beginning
FF D8 FF ... <PDF body> ... <HTML> <script>...</script> passes as
image/jpeg, lives in storage as image/jpeg, and gets served from a
presigned MinIO URL with Content-Type: image/jpeg. With nosniff set this
is mostly inert in modern browsers, but:
- The S3-presigned download URL is hit directly by the browser (the proxy with X-Content-Type-Options: nosniff is only on the filesystem backend at /api/storage/[token]). MinIO/S3 does not add nosniff automatically.
- The signed URL is on the same origin's CDN for portal users when MinIO is fronted by the marketing site, raising same-origin sniff risk.
The avatar/general path has no trailing-byte gate. The PDF path has the same
shape weakness: it checks the %PDF- prefix (good) but doesn't enforce a
trailing EOF marker.
Fix:
- After the prefix check, run the sharp(buf).toFormat(declared) re-encode (from C1), which strips any non-image trailer.
- Force ResponseContentDisposition / ResponseContentType on the presigned download (minio-js supports both via respHeaders) so MinIO emits X-Content-Type-Options: nosniff regardless of object metadata.
H2. Filesystem backend proxy enforces stronger checks than the S3 path
Where: src/app/api/storage/[token]/route.ts:217-225 vs src/lib/storage/s3.ts:249-262.
The filesystem PUT proxy does a magic-byte check on the streamed body when
the token's declared content-type is application/pdf. The S3 presigned PUT
(used in prod) lets the browser stream straight to MinIO — the only
post-upload verification is in berth-pdf.service.ts:234-262 and
brochures.service.ts:230-263. Generic image uploads via S3 presigned PUT
have no post-upload verification at all because no caller currently mints
presigns for arbitrary images — but the abstraction allows it. If a future
caller ever presigns a non-PDF, the S3 path will accept anything.
Fix: make presignUpload accept a verifyMagicBytes: true flag and require
every caller to opt in/out explicitly. Or wrap S3 presigns in a one-shot
post-upload head + first-5-bytes verifier (the brochure path already does
this; lift it into getStorageBackend().registerUpload(...)).
H3. Animated GIF is allowed with no frame cap
Where: ALLOWED_MIME_TYPES includes image/gif. No upstream consumer
inspects metadata().pages or metadata().delay.
A 50 MB animated GIF with 5000 frames at 5 ms delay will burn CPU on every rep's client list view and on PDF embed. Also a known browser DoS vector.
Fix: during the sharp normalisation (C1), pass { animated: false } so
only the first frame is kept, or set pages: 1. Or drop GIF from
ALLOWED_MIME_TYPES entirely — the CRM has no real reason to accept it
(reps share PNG/JPEG, brochures are PDF).
H4. Avatar Content-Type echoes browser-declared MIME — preview endpoint trusts blindly
Where: src/app/api/v1/me/avatar/route.ts:53 — mimeType: fileEntry.type || 'image/jpeg'.
fileEntry.type is the browser-declared type. Magic bytes are checked, but
the stored content-type is still the declared one. If an uploader claims
image/webp but sends something else (a plain JPEG fails the WebP signature
check, but a crafted polyglot can pass WebP's RIFF check while embedding extra
bytes), the stored mimeType is wrong. The downstream PREVIEWABLE_MIMES check
reads files.mimeType, so the server's content-type lies.
Fix: after the magic-byte check, derive the canonical MIME from the matched signature (one entry per signature) and store that, not the browser-declared form.
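A minimal sketch of "derive the MIME from the matched signature", assuming a small signature table for the four image types named in C1. SIGNATURES and canonicalMime are hypothetical names; the real bufferMatchesMime table in file-validation.ts may be shaped differently.

```typescript
interface Signature {
  mime: string;
  matches: (bytes: Uint8Array) => boolean;
}

// True when `bytes` contains `prefix` starting at `offset`.
const hasBytes = (bytes: Uint8Array, prefix: number[], offset = 0): boolean =>
  bytes.length >= offset + prefix.length && prefix.every((v, i) => bytes[offset + i] === v);

// One entry per signature; the stored MIME comes from here, never from the request.
const SIGNATURES: Signature[] = [
  { mime: 'image/png', matches: (b) => hasBytes(b, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]) },
  { mime: 'image/jpeg', matches: (b) => hasBytes(b, [0xff, 0xd8, 0xff]) },
  { mime: 'image/gif', matches: (b) => hasBytes(b, [0x47, 0x49, 0x46, 0x38]) },
  // WebP is a RIFF container: "RIFF" at offset 0 and "WEBP" at offset 8.
  {
    mime: 'image/webp',
    matches: (b) => hasBytes(b, [0x52, 0x49, 0x46, 0x46]) && hasBytes(b, [0x57, 0x45, 0x42, 0x50], 8),
  },
];

// Returns the MIME implied by the bytes, or null when nothing matches.
export function canonicalMime(bytes: Uint8Array): string | null {
  return SIGNATURES.find((s) => s.matches(bytes))?.mime ?? null;
}
```

The avatar route would then store canonicalMime(buffer) and reject the upload when it returns null or disagrees with the allow-list, instead of persisting fileEntry.type.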
MEDIUM
M1. Content-Disposition for /api/v1/documents/[id]/download/... lacks RFC 5987
Where: src/app/api/v1/documents/[id]/download/[...slug]/handlers.ts:68,92-94.
sanitizeFilenameForHeader replaces "/\/CRLF with _ but emits only
filename="..." — Unicode filenames render as mojibake / get truncated by
Firefox + Safari. The /api/storage/[token] proxy gets this right
(filename*=UTF-8''<encoded>); the doc download doesn't. The other
ad-hoc PDF exports (clients/[id]/export-pdf, berths/[id]/export-pdf,
interests/[id]/export-pdf) hard-code ASCII filenames and skip RFC 5987 too
— acceptable because they're constant, but the doc download is dynamic.
Fix: mirror the storage-proxy form:
attachment; filename="<sanitised-ascii>"; filename*=UTF-8''<encoded> and
switch the disposition from inline to attachment for non-previewable
MIMEs (the current inline lets a malicious file open in-page even with
nosniff).
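The header shape described in this fix can be sketched as follows. contentDisposition is a hypothetical helper, not the existing sanitizeFilenameForHeader, and the previewable flag stands in for whatever PREVIEWABLE_MIMES lookup the handler already performs.

```typescript
// Builds a Content-Disposition value with an ASCII fallback plus an RFC 5987 filename*.
export function contentDisposition(filename: string, previewable: boolean): string {
  const type = previewable ? 'inline' : 'attachment';
  // ASCII fallback: anything outside printable ASCII, plus quote and backslash, becomes "_".
  const ascii = filename.replace(/[^\x20-\x7e]|["\\]/g, '_');
  // encodeURIComponent percent-encodes UTF-8 but leaves ' ( ) * untouched,
  // and those are not valid unescaped in an RFC 5987 ext-value.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase(),
  );
  return `${type}; filename="${ascii}"; filename*=UTF-8''${encoded}`;
}
```

Note that encodeURIComponent throws on lone surrogate halves, so this assumes the name has already passed through a sanitiser that removes them (see M2).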
M2. sanitizeFilename doesn't strip RTL/zero-width Unicode
Where: src/lib/services/storage.ts:15-22.
Strips [/\\:], NUL, and \x01-\x1f\x7f. Doesn't touch:
- U+202E RIGHT-TO-LEFT OVERRIDE: classic Windows-icon-spoof vector (invoice_fdp.exe displays as invoice_exe.pdf).
- U+200B/U+200C/U+FEFF zero-widths: collision spoofs in folder listings.
- Surrogate halves.
Fix: Unicode-normalise (name.normalize('NFC')) then drop the
Cf/bidi-control category, e.g. via /\p{Cf}/gu.
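As a sketch of that fix, assuming the Unicode general categories Cf (format characters, which include the bidi controls and zero-widths listed above) and Cs (lone surrogates) are the right things to drop. stripInvisible is a hypothetical name and would run alongside the existing sanitizeFilename rules rather than replace them.

```typescript
// Normalises to NFC, then removes format characters (Cf) and lone surrogates (Cs).
// Requires the `u` flag so \p{...} property escapes are available.
export function stripInvisible(name: string): string {
  return name.normalize('NFC').replace(/[\p{Cf}\p{Cs}]/gu, '');
}
```

One design caveat: Cf also covers U+200D ZERO WIDTH JOINER, so emoji sequences in filenames would be split into their component emoji. That seems acceptable for storage filenames but is worth a conscious decision.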
M3. No per-surface size caps beyond avatar/PDF
Where: file-validation.ts (50 MB), avatar (2 MB), berth PDFs (admin
setting), brochure (admin setting). The generic uploader has only the
50 MB ceiling — applies equally to a yacht photo, a maintenance-log
attachment, a client document scan. Reps could legitimately upload a 49 MB
phone-camera PNG and it would be embedded into PDFs without resize.
Fix: uploadFile should branch on category (avatar, yacht_photo,
maintenance, attachment, …) and apply per-category byte + dimension caps,
not a flat 50 MB.
M4. text/plain / text/csv have no signature verification
Where: file-validation.ts:71-72 (intentionally unconstrained), served
through the same presigned URL path as binary files. A user can upload
evil.html claiming text/plain; with nosniff plus the stored
Content-Type: text/plain modern browsers display it as text, but stale
links that get loaded via <iframe src="..."> will render as the declared
type. Lower risk than C1/H1 but worth tightening.
Fix: sniff: reject when the first 512 bytes contain <script, <html,
<!DOCTYPE, or non-printable bytes outside common encodings. Or require an
explicit category: 'text' for these MIMEs and refuse them on the avatar /
attachment surfaces.
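A rough sketch of the 512-byte sniff, under the assumption that "non-printable" means ASCII control bytes other than tab, LF and CR. looksLikePlainText and HTML_MARKERS are hypothetical names, and the marker list is only the three strings named in the fix.

```typescript
const HTML_MARKERS = ['<script', '<html', '<!doctype'];

// Returns false when the first 512 bytes look like HTML or contain control bytes.
export function looksLikePlainText(bytes: Uint8Array): boolean {
  const head = bytes.subarray(0, 512);
  for (const b of head) {
    // Allow tab (0x09), LF (0x0a) and CR (0x0d); reject other control bytes and DEL.
    if ((b < 0x20 && b !== 0x09 && b !== 0x0a && b !== 0x0d) || b === 0x7f) return false;
  }
  // Case-insensitive marker scan over the decoded prefix.
  const text = new TextDecoder('utf-8').decode(head).toLowerCase();
  return !HTML_MARKERS.some((marker) => text.includes(marker));
}
```

This is a heuristic, not a parser: markup that starts after byte 512 passes, which is why the fix also suggests refusing text MIMEs on the avatar and attachment surfaces.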
M5. Image cropper outputs only 512 px JPEG @0.85 — no enforcement that the upload matches
Where: src/components/shared/image-cropper-dialog.tsx:51,70.
outputWidth = 512 is the only client-side cap. Once the cropped JPEG hits
the server, the server does not verify that the avatar is square or under
some pixel ceiling — the server just sees a 2 MB image. A scripted client can
ship a 4000×4000 JPEG straight to /api/v1/me/avatar because the cropper is
client-side. Tied to C1's fix (normalise + resize server-side).
M6. No HEIC/HEIF support — iOS share-sheet uploads silently fail
Where: ALLOWED_MIME_TYPES. The iPhone's default photo format is image/heic;
the receipt scanner (scan-shell.tsx:494) uses accept="image/*", so the
browser allows the pick and the upload then fails with a 400 "type not allowed". UX
regression more than security.
Fix: add image/heic + image/heif to the allow-list and transcode to
JPEG in the same normalisation pass (sharp 0.34 supports HEIC via libvips
+ libheif, but check the deploy image first).
Already strong (no action)
- PDF magic-byte gate on both in-server and presigned-PUT paths (berth-pdf.service.ts, brochures.service.ts, filesystem proxy).
- SVG excluded from ALLOWED_MIME_TYPES: no SVG-XSS surface in user uploads. The only SVG generation is the chart-card data-URI which is produced by the CRM, not user-controlled.
- Filesystem-backend proxy sets X-Content-Type-Options: nosniff + Cache-Control: private, no-store + single-use HMAC tokens.
- Storage key derivation is UUID-based (generateStorageKey) so the original filename never controls a path; no path-traversal surface from filenames.
- uploadFile allow-list + size cap + magic-byte composition.
Recommended order
- C1 + C2 + H1 together: introduce a normalizeAndStoreImage() helper in src/lib/services/files.ts that runs every accepted image through sharp().rotate().resize().toFormat() before backend.put(), without calling .withMetadata() (which would carry EXIF into the output). Drops EXIF, kills polyglot trailers, caps pixels.
- H4 + M5: derive canonical MIME from the matched signature; treat the mimeType field as server-authoritative.
- M1 + M2: tighten filename + disposition headers.
- H2: lift the post-upload verify out of berth-pdf / brochure into the storage abstraction.
- M3 / M6: per-surface caps + HEIC transcode (deploy-image work).
- H3 / M4: drop GIF or freeze to first frame; sniff text payloads.
21. Mobile + PWA + iOS quirks audit (mobile-pwa-auditor)
Mobile + PWA + iOS quirks audit
Branch: feat/documents-folders · Scope: src/app/(scanner), src/components/layout/mobile/*, src/components/search/mobile-search-overlay.tsx, src/components/shared/drawer.tsx, src/middleware.ts, public/manifest.json, public/icon-*.png, root layout viewport/metadata, tailwind.config.ts safe-area utilities.
CRITICAL
None blocking ship.
HIGH
H1. No service worker is registered — /scan PWA has zero offline capability
Grep serviceWorker|navigator.serviceWorker|workbox|next-pwa returns nothing across src/ + public/. The per-port manifest declares display: 'standalone' and the scanner's whole product premise is "rep walks the marina with a phone capturing receipts", i.e. exactly the situation where Wi-Fi drops to nothing between pontoons. Consequences:
- iOS Add-to-Home Screen installs succeed but cold-launch with no signal fails at the first network call (Next.js page chunks 404 in WebView).
- The OCR + upload + create-expense chain in ScanShell (src/components/scan/scan-shell.tsx) has no offline-queue / retry. kind: 'error' is rendered and the only artifact is the in-memory blob; closing the PWA loses the photo and the manually typed fields.
- Android Chrome will refuse to fire beforeinstallprompt without a service worker, so the install prompt never auto-surfaces.
Fix (in priority order): (1) ship a minimal Workbox/next-pwa SW that precaches the scanner route + Tesseract WASM + lucide icons, (2) wrap the expense submit in an outbox (IndexedDB queue → background sync), (3) capture beforeinstallprompt and surface an "Install" CTA inside the idle-state scanner card.
H2. Two manifests overlap — root /manifest.json and dynamic /[portSlug]/scan/manifest.webmanifest
- Root layout (src/app/layout.tsx:47) declares manifest: '/manifest.json' with start_url: '/', theme_color: #0f172a (slate-900). Root viewport says themeColor: '#1e2844' (navy). Two different theme colors → Chrome will pick the <meta name="theme-color"> from <head> (navy) but the manifest install splash will use #0f172a. Cosmetic mismatch on install.
- Scanner manifest overrides scope to /<portSlug>/scan with theme_color: #3a7bc8 (brand blue) and viewport themeColor: '#3a7bc8'; internally consistent. ✓
- Issue: if a rep visits /<portSlug>/dashboard and hits "Add to Home Screen" (rare but possible), they get a PWA whose start_url is / which redirects to /login on every cold-launch because the root <head> resolves the unscoped manifest first. There is no <link rel="manifest"> swap between the two surfaces; Next.js's generateMetadata on the scanner route DOES override the root metadata (verified at src/app/(scanner)/[portSlug]/scan/layout.tsx:28), but root /manifest.json still defines a competing PWA.
Fix: either narrow the root manifest's scope and start_url to /login (so non-scanner installs land on auth), or remove root manifest: and lean solely on the per-port scoped scanner manifest. Add start_url: '/<portSlug>/dashboard' per-port via a second dynamic manifest for the main app, if installable main-app is even desired.
H3. iOS standalone status-bar / safe-area mismatch in the scanner
- Scanner layout declares appleWebApp.statusBarStyle: 'default' (src/app/(scanner)/[portSlug]/scan/layout.tsx:32): that's the white-bar-with-black-text style that iOS draws OPAQUELY above the WebView, NOT under it.
- viewport.viewportFit: 'cover' is set (line 46) which tells iOS to let content extend under safe areas.
- ScanShell (src/components/scan/scan-shell.tsx:449) renders <main className="mx-auto ... min-h-[100dvh] w-full max-w-xl ... px-4 py-6 sm:py-10"> with no pt-safe-top, no pb-safe-bottom, no safe-left/safe-right.
- Result on iPhone 14/15 with home indicator + standalone install: the "Capture receipt" / "Save expense" buttons sit flush against (or under) the home-indicator stripe. The brand logo at the top is fine because py-6 sm:py-10 happens to clear the notch, by accident rather than by design.
Fix: add pb-[calc(env(safe-area-inset-bottom)+1rem)] to <main>, switch statusBarStyle to 'black-translucent' so the brand-blue theme paints over the status area (or to 'default' AND remove viewportFit: 'cover'), and add pl-safe-left pr-safe-right for landscape edge-case.
H4. Dashboard mobile shell uses min-h-screen (100vh) instead of 100dvh
src/components/layout/mobile/mobile-layout.tsx:24,29 uses min-h-screen twice. On iOS Safari (not standalone) 100vh is the LARGE viewport height (URL bar collapsed), so on first paint the page renders ~75–100px taller than visible. The bottom tab bar is position: fixed so it lands correctly, but <main>'s min-h-screen means content scrolls below the visible viewport on initial load — reps see a blank strip past the tab bar until the URL bar collapses on first scroll.
Fix: swap both min-h-screen for min-h-[100dvh] (Tailwind 3 supports dynamic viewport units). The scanner layout already does this correctly (src/app/(scanner)/[portSlug]/scan/layout.tsx:68).
MEDIUM
M1. Touch targets below 44pt in the mobile search overlay
src/components/search/mobile-search-overlay.tsx:
- "Cancel" button (line 273) is plain text: no min-height, hit-area ≈ 16px tall. Thumb-prone position next to the keyboard.
- Clear-X button (line 260) is size-7 = 28px. Below Apple HIG 44pt.
- Bucket chips (line 344) are px-3 py-1.5 text-xs → ~28px tall. Apple HIG 44pt fail; they're scrollable so misses are recoverable, but each chip needs min-h-[44px] or a transparent expanded hit-box (before:absolute before:inset-0 before:-my-2).
M2. Inline-editable-field hit-areas too small for marina-glove use
src/components/shared/inline-editable-field.tsx:133,172,257 uses h-8 (32px) and h-7 (28px) for the edit-mode inputs and select triggers. Detail pages on mobile share this pattern. Apple HIG fail; reps with wet/salty fingers on a pontoon will mis-tap. Bump to h-11 (44px) on mobile or guard with a min-h-[44px] md:h-8 mobile-first override.
M3. visualViewport.offsetTop ignored in search overlay positioning
src/components/search/mobile-search-overlay.tsx:76–86 subscribes to visualViewport.resize + scroll and reads vv.height. The drawer uses top: 12px + computed height. But vv.offsetTop (the visual-viewport's vertical offset within the layout viewport) is not consulted. On iOS Safari with keyboard up + rubber-band scroll, the visual viewport can shift relative to layout; the drawer's top: 12px is layout-viewport-relative, so the top of the drawer can briefly clip up under the URL/status bar. Minor visual artifact; only affects scrolled-during-typing states.
Fix: top: ${(vv?.offsetTop ?? 0) + 12}px.
M4. Mobile bottom tabs lack safe-left / safe-right insets
src/components/layout/mobile/mobile-bottom-tabs.tsx:42–47 uses pb-safe-bottom only. The dynamic manifest forces orientation: 'portrait' ONLY when installed as a PWA. In Safari (pre-install) on iPhone landscape, the bottom tab bar tucks under the notch. Add pl-safe-left pr-safe-right (Tailwind pl-safe-left resolves to padding-left: env(safe-area-inset-left)).
M5. Stale memory + suspiciously small PNGs
project_pwa_assets_pending.md claims icons must be added; all four exist in /public (icon-192 = 688B, icon-512 = 2411B, 512-maskable = 2411B, apple-touch = 654B; dated 2026-05-03). Memory note is stale — delete it. However: 688B / 2411B is small for a real branded PWA icon — these look like placeholders. Swap in production artwork before launch.
M6. apple-touch-icon at /apple-touch-icon.png not referenced by the scanner manifest
The root metadata icons block (src/app/layout.tsx:40–46) declares apple: '/apple-touch-icon.png' (180×180). The scanner layout only sets manifest: + appleWebApp, so it inherits the root icons.apple because Next.js shallow-merges metadata. ✓ This works only through inheritance; an explicit comment confirming it would prevent future regressions if someone overrides icons: in the scanner layout.
M7. No apple-mobile-web-app-status-bar-style mismatch detection between routes
Root layout: 'black-translucent' (matches navy theme + safe-area inset). Scanner: 'default' (white opaque bar). When a rep navigates from /scan into the main CRM via a deep link inside the same PWA install, iOS uses the install-time status bar style and ignores per-page overrides — so depending on which surface they installed FROM, every other surface looks wrong. Pick one style and apply globally; recommend 'black-translucent' plus consistent safe-area-inset usage on every shell.
M8. Vaul drawer repositionInputs={false} defaults are correct, but iOS keyboard layoutViewport vs visualViewport edge case
src/components/shared/drawer.tsx:20–22 defaults shouldScaleBackground: false and repositionInputs: false. The comments in mobile-search-overlay.tsx:106–118 describe the iOS reasoning correctly. Verified ✓. However, the MoreSheet's <DrawerContent> uses default bottom: 0 anchoring (no visualViewport-based height override). If MoreSheet ever gains a text input, it'll exhibit the same scroll-then-jump the search overlay had to special-case. Currently MoreSheet is link tiles only — non-issue unless inputs are added.
M9. No <NoScript> or offline fallback page anywhere
If the scanner PWA cold-launches with no network and no service worker (H1), Next.js's standalone-mode router will fail-soft to a blank screen. There is no not-found.tsx, error.tsx, or offline.tsx in src/app/(scanner)/[portSlug]/scan/. Goes hand-in-hand with H1.
M10. The legacy /expenses/scan page coexists with the new /scan PWA flow
src/app/(dashboard)/[portSlug]/expenses/scan/page.tsx is a desktop-flavored scan-receipt page inside the dashboard shell — different from the standalone PWA at /[portSlug]/scan. Both upload to the same /api/v1/expenses/scan-receipt and /api/v1/expenses endpoints, but the user-facing flows diverge (the dashboard one has both camera + file picker buttons; the PWA one is camera-first). Confusion risk; pick one or clearly label the dashboard surface as "Upload receipt (desktop)" vs the PWA "Scan receipt".
M11. interests/interest-list.tsx FAB safe-area offset is hand-rolled
Line 350 hardcodes bottom-[calc(env(safe-area-inset-bottom)+86px)] where 86 = tab-bar height (56) + 30px gap. If tab-bar height changes, FAB collides. Extract MOBILE_TAB_BAR_HEIGHT to a shared constant or CSS var.
Quality nits
- Scanner manifest short_name: 'Scanner' vs appleWebApp.title: 'PN Scanner' → installed-app label differs across iOS/Android. Unify on "PN Scanner".
- safe-left / safe-right Tailwind utilities are declared (tailwind.config.ts:150–154) but never referenced anywhere in src/.
- must-revalidate on manifest Cache-Control is redundant alongside max-age=300.
What's solid
- Per-port dynamic manifest with proper scope + start_url.
- viewportFit: 'cover' + safe-area-inset utilities in topbar/bottom-tabs.
- -webkit-tap-highlight-color: transparent global (globals.css:98).
- Vaul defaults shouldScaleBackground: false, repositionInputs: false (drawer.tsx:20–22) match iOS+Vaul known-issue guidance.
- visualViewport.height tracking for above-keyboard sizing (modulo M3).
- Drawer GPU-compositing hints (globals.css:261–267).
- HEIC-safe capture (accept="image/*" + capture="environment").
- Tesseract.js on-device first, AI optional: privacy-respecting fallback.
- Middleware correctly exempts /scan/manifest.webmanifest + /scan from auth (middleware.ts:17,33).
Top 3 to fix before launch: H1 (service worker + offline queue), H2 (manifest scope overlap), H3 (scanner safe-area bottom-button collision). Everything else is polish.
26. Multi-currency + FX correctness audit (currency-auditor)
Multi-currency + FX correctness audit — task #26
Scope: USD-vs-port-currency across berths/invoices/reports/expenses, FX
snapshotting, currency_rates retention, rounding, mixed-currency
dashboard totals, PDF math, berths_default_currency, hardcoded USD,
formatCurrency. Read-only. Branch feat/documents-folders.
CRITICAL
C1. Dashboard "Pipeline Value" sums mixed currencies as USD
dashboard.service.ts:39-51 and :95-160 reduce berths.price into
pipelineValueUsd without reading berths.priceCurrency, then the
UI labels the result 'USD'
(pipeline-value-tile.tsx:45-47, kpi-cards.tsx:19,
revenue-forecast.tsx:25). Same bug in getRevenueForecast
(weighted pipeline) and the stage-weights total. A single non-USD berth
poisons the headline KPI; masked today only because Port Nimara is USD-
only. With the new per-port berths_default_currency setting this will
detonate as soon as a second port chooses EUR/GBP.
Fix shape: either (a) refuse to aggregate mixed currencies and render a
grouped figure like the Revenue Breakdown chart already does, or (b)
convert per row via convert(price, priceCurrency, 'USD') and surface
the conversion timestamp. (a) is safer — (b) hides FX risk in one number.
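Option (a) amounts to grouping before summing. A minimal sketch, assuming rows expose price as a number and priceCurrency as an ISO code; PricedRow and totalsByCurrency are hypothetical names, and the real service reads numeric columns that would need parsing first.

```typescript
interface PricedRow {
  price: number;
  priceCurrency: string;
}

// Sums prices per currency instead of collapsing everything into one "USD" figure.
export function totalsByCurrency(rows: PricedRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const { price, priceCurrency } of rows) {
    totals.set(priceCurrency, (totals.get(priceCurrency) ?? 0) + price);
  }
  return totals;
}
```

The tile can then render a single figure when the map has one key and a grouped breakdown otherwise, so a lone EUR berth can no longer be silently added to a USD total.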
C2. Revenue / Pipeline PDF reports drop currency entirely
pdf/templates/reports/revenue-report.ts:78-97 and
pipeline-report.ts:91-100 render amounts with
Number(...).toLocaleString(undefined, …) — no currency code, no
symbol, no formatCurrency. The generator
(report-generators.ts:106-147) sums berths.price across all
currencies, again ignoring priceCurrency. PDF output reads
TOTAL COMPLETED REVENUE: 1,234,567.00 with no unit. Plus the implicit
undefined locale means the same PDF renders differently between US-en
and de-DE nodes — non-deterministic under Next.js standalone runtime.
Combined with C1 these are the highest-risk financial artefacts in the
app — they ship to ownership.
C3. expenses.amountUsd snapshot is brittle and date-misaligned
expenses.ts:117-135 and :227-249 snapshot amountUsd +
exchangeRate on the row at create/update — good. But:
- Frankfurter unreachable at create time → amountUsd = null, exchangeRate = null. The PDF (expense-pdf.service.ts:235-246) falls back to 1:1 with a footnote but no aggregate-total guard; totals silently undercount the foreign-currency portion.
- The snapshot uses the rate at edit time, not expenseDate. An expense from 6 months ago, edited today, gets today's FX. The correct anchor is expenseDate.
Expenses is the only table that snapshots FX. Invoices, berths, yacht maintenance costs, and EOIs store amount + ISO code only and re-resolve FX live at display — see H1.
HIGH
H1. currency_rates has no history / retention
db/schema/system.ts:207-222 — one row per (base, target).
refreshRates() (currency.ts:36-68) upserts in place; only the
latest rate ever exists. Consequences:
- Cannot value an old invoice at its issue-date rate.
- No FX audit trail — if Frankfurter returns bad data the prior value is gone.
- The 6-hourly cron (queue/scheduler.ts:31) overwrites silently.
Fix: append-only table (fetchedAt in PK), getRate(from, to, asOf?)
selects the most recent row ≤ asOf. Pairs with M8.
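The as-of lookup over an append-only table can be sketched in memory as below. RateRow and getRateAsOf are hypothetical; in the real fix this would be a SQL query (filter on fetchedAt ≤ asOf, order descending, limit 1) rather than an array scan.

```typescript
interface RateRow {
  base: string;
  target: string;
  rate: number;
  fetchedAt: Date;
}

// Returns the most recent rate fetched at or before `asOf`, or null when none exists.
export function getRateAsOf(
  rows: RateRow[],
  from: string,
  to: string,
  asOf: Date = new Date(),
): number | null {
  if (from === to) return 1; // same-currency short-circuit
  let best: RateRow | null = null;
  for (const row of rows) {
    if (row.base !== from || row.target !== to) continue;
    if (row.fetchedAt.getTime() > asOf.getTime()) continue; // ignore rates from the future
    if (best === null || row.fetchedAt.getTime() > best.fetchedAt.getTime()) best = row;
  }
  return best ? best.rate : null;
}
```

Passing expenseDate or an invoice's issue date as asOf is what makes historical valuation possible; omitting it keeps today's "latest rate" behaviour.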
H2. Rounding policy is undocumented and currency-blind
currency.ts:23 does Number((amount*rate).toFixed(2)) — pins to 2
decimals regardless of currency. JPY has 0 fractional digits, so a USD
→ JPY conversion stores .45 JPY which is unspendable; a JPY → USD
conversion floors at 1 cent precision when 1 yen ≈ $0.0066. No banker's-
rounding helper exists, no Math.round policy, no doc.
Invoice math (services/invoices.ts:251-276, :435-466) does
(subtotal * discountPct) / 100 and subtotal - discountAmount + feeAmount with no rounding before String()-ing into numeric
columns. A 2% discount on a $100.10 subtotal stores '2.002' and
'98.098'. The displayed total (Intl truncates at 2dp) and the stored
total diverge by sub-cent amounts for every percentage-discounted
invoice.
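One possible shape for a currency-aware rounding helper, using the minor-unit count that Intl.NumberFormat reports for each ISO code. This is a sketch with hypothetical names (minorUnitDigits, roundToCurrency, discountedTotal); it uses Math.round, which rounds halves up rather than to even, so it is not a banker's-rounding implementation, and the rounding mode remains a policy decision.

```typescript
// Number of fractional digits the currency uses (2 for USD, 0 for JPY).
export function minorUnitDigits(currency: string): number {
  const opts = new Intl.NumberFormat('en', { style: 'currency', currency }).resolvedOptions();
  return opts.maximumFractionDigits ?? 2;
}

// Rounds an amount to the currency's minor unit.
export function roundToCurrency(amount: number, currency: string): number {
  const factor = 10 ** minorUnitDigits(currency);
  return Math.round(amount * factor) / factor;
}

// Rounds the discount first, then the total, so stored and displayed values agree.
export function discountedTotal(subtotal: number, discountPct: number, currency: string): number {
  const discount = roundToCurrency((subtotal * discountPct) / 100, currency);
  return roundToCurrency(subtotal - discount, currency);
}
```

With the 2% discount on a $100.10 subtotal from above, the discount is stored as 2.00 and the total as 98.10 instead of '2.002' and '98.098'. Binary floating point still limits exactness for large amounts; integer minor units or a decimal library would be the stricter option.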
H3. formatCurrency cents-clamp hides fractions on berth pricing
utils/currency.ts:55-56 clamps minFractionDigits to 0 when
maxFractionDigits: 0 is passed — correct for headline tiles but also
the default for berth-card / berth-columns / berth-tabs price
(berth-card.tsx:91, berth-columns.tsx:185, berth-tabs.tsx:410).
€1,250,000.50 renders as "€1,250,001" with no tooltip. Low impact today;
will confuse yacht-show buyers once non-round prices land.
H4. Berth recommender ranks prices currency-blind
berth-recommender.service.ts scores by berths.price with no FX
normalization. Multi-currency tier ranking is meaningless. Heat weights
in system_settings are tuned per-port; admins have no way to spot the
skew. Same root as C1 but isolated to the recommender.
H5. /api/v1/currency/convert swallows rate-unavailable as data: null
/api/v1/currency/convert/route.ts:19 does not differentiate "rate
unavailable" from "amount was zero" — both return { data: null } with 200. Callers that distinguish these need a separate error envelope.
expenses.ts and expense-pdf.service.ts handle null correctly; the
API surface does not.
MEDIUM
M1. No cold-start bootstrap for currency_rates
queue/scheduler.ts:31 runs every 6h; on a fresh db:seed the table is
empty for up to 6h and every convert() returns null. Seed initial
rates in seed-bootstrap.ts or self-trigger on cron registration. Masked
today because seeded ports are all USD (USD→USD short-circuits at
currency.ts:9).
M2. seed-bootstrap.ts hardcodes USD for every port
seed-bootstrap.ts:42,49 — both demo ports default to USD. The schema
admits per-port currency but no EUR/GBP demo port exists. Multi-currency
correctness has zero seed/fixture coverage. Adding one non-USD demo port
would surface C1/C2/H4 in smoke output.
M3. Hardcoded "Rates (USD)" column header
berth-columns.tsx:324 — header reads 'Rates (USD)' regardless of
the row's priceCurrency. Column body is currency-aware; header lies
for non-USD rows.
M4. EOI / interest-summary PDFs use prefix code instead of formatCurrency
pdf/templates/interest-summary-template.ts:112,
berth-spec-template.ts:127,172 — 'USD 1,234,000' rather than
$1,234,000. Inconsistent with the invoice template and in-app UI.
Surfaces to clients in EOI bundles.
M5. OCR receipt parser maps $ → USD unconditionally
ocr/parse-receipt-text.ts:17. CAD/AUD/HKD/SGD all print $. Force
confirmation when the port's defaultCurrency isn't USD.
M6. Expense form/scan defaults hardcode USD rather than port default
expense-form-dialog.tsx:61,85,215,227,
expenses/scan/page.tsx:63,314, scan-shell.tsx:102. A rep at a
EUR-default port changes the dropdown on every expense.
M7. Synthesized inverse rates drift
refreshRates() stores 1/rate rounded to 6dp (currency.ts:60).
USD→EUR→USD round-trips diverge from identity by basis points; matters
for the expense-pdf USD→EUR chain. Fetch base=USD and base=EUR
separately from Frankfurter rather than synthesizing.
M8. Unique index blocks the H1 fix
currency_rates_base_target_idx makes append-only history a breaking
migration. Flagged so the H1 fix is planned with the index drop.
Notes / non-issues
- formatCurrency is well-defended; consolidate the ad-hoc toLocaleString({ style: 'currency' }) in expense-columns.tsx / expense-detail.tsx onto it.
- getRate caching in expense-pdf.service.ts:215-231 is the right shape; reuse for any other batch conversion path.
- Documenso payloads carry currency through unchanged; no FX in that path.
Top 3 to fix first: C1 (dashboard mixed-currency totals), C2 (report PDFs drop currency entirely), H1 (no FX history/retention).
29. Outbound webhook delivery audit (outbound-webhook-auditor)
Outbound Webhooks — Audit (Task #29)
Scope: src/app/(dashboard)/[portSlug]/admin/webhooks/,
src/app/api/v1/admin/webhooks/**, src/lib/services/webhooks.service.ts,
src/lib/services/webhook-dispatch.ts, src/lib/services/webhook-event-map.ts,
src/lib/queue/workers/webhooks.ts, src/lib/validators/webhooks.ts,
src/lib/db/schema/system.ts, src/lib/utils/encryption.ts,
src/lib/queue/index.ts. Read-only.
CRITICAL
C1 — Signature has no replay protection
workers/webhooks.ts:120-134 — HMAC covers only the JSON body
(sha256=HMAC(secret, bodyString)). The body contains a timestamp
field, but it's not separately authenticated/headered in a way the
receiver can verify in a freshness window. A captured request replays
verbatim, signature still valid. No X-Webhook-Timestamp, no nonce,
no documented receiver dedup contract.
Fix: Stripe-style signature = HMAC(secret, ${ts}.${body}) with
X-Webhook-Timestamp header, and document that receivers must reject
|now − ts| > 5 min. Also document X-Webhook-Delivery (already
sent) as the receiver-side idempotency key.
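The timestamped scheme can be sketched for both sides with Node's crypto module. signWebhook and verifyWebhook are hypothetical names, the timestamp is assumed to be epoch milliseconds, and the sha256= prefix mirrors the existing signature format; none of this is the current worker code.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

const TOLERANCE_MS = 5 * 60 * 1000; // reject anything older or newer than 5 minutes

// Sender side: the MAC covers "<timestamp>.<body>", so the timestamp cannot be swapped.
export function signWebhook(secret: string, body: string, timestampMs: number): string {
  const mac = createHmac('sha256', secret).update(`${timestampMs}.${body}`).digest('hex');
  return `sha256=${mac}`;
}

// Receiver side: check freshness first, then compare signatures in constant time.
export function verifyWebhook(
  secret: string,
  body: string,
  timestampHeader: string,
  signatureHeader: string,
  nowMs: number = Date.now(),
): boolean {
  const ts = Number(timestampHeader);
  if (!Number.isFinite(ts) || Math.abs(nowMs - ts) > TOLERANCE_MS) return false;
  const expected = Buffer.from(signWebhook(secret, body, ts));
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so guard it.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A captured request still verifies inside the five-minute window, which is why receivers should also dedupe on X-Webhook-Delivery.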
C2 — webhook_deliveries grows unbounded
schema/system.ts:107-126 — no reaper anywhere; searches for
the table outside writers returned zero hits. Every event, retry,
test, and redeliver writes a row with full payload JSONB plus up
to 1 KB response_body. BullMQ's removeOnComplete/removeOnFail
only prunes Redis, not Postgres. On a port subscribed to high-volume
events (berth.status_changed, interest.stage_changed,
invoice.*) this is unbounded write-amplification.
Fix: maintenance job pruning by status + age (e.g. 30 d success,
90 d dead_letter), gated by a system_settings retention key. Add
(status, createdAt) index for the scan.
C3 — Worker dispatches with empty signature when secret is NULL
workers/webhooks.ts:111-134, schema :97 — secret is
nullable; the worker silently sends header X-Webhook-Signature: ''
when missing. Compliant receivers reject, mis-coded ones accept.
Creation always generates a secret, so NULL implies DB tampering or a
future migration mistake — defence-in-depth still warrants a hard
fail.
Fix: dead-letter with reason missing_signing_secret; ideally make
webhooks.secret NOT NULL.
HIGH
H1 — DNS-rebinding TOCTOU
workers/webhooks.ts:18-45, 147-167 — resolveAndCheckHost()
does its own dns.lookup, then hands the hostname back to
fetch, which resolves again before connecting. The validator's
comment (validators/webhooks.ts:69-70) defers rebind to the worker,
but the worker's check is independent of the actual connect; a rebind
between lookup and fetch still hits internal IPs.
Fix: pin the connect to the resolved IP via an undici Agent with
connect: { lookup: () => allowedIp }, keep the original hostname as
the Host header so TLS SNI works. Or inspect socket.remoteAddress
post-connect and abort on mismatch.
H2 — Retry policy too short / no jitter
queue/index.ts:14, 29-34 — maxAttempts: 3, exponential 1000 ms.
Real schedule ≈ 1 s / 2 s / 4 s — a 30 s receiver outage during a
deploy permanently dead-letters every in-flight event; super-admins
get a notification storm and events are lost unless redelivered
manually. Industry norm (Stripe, GitHub) is ≥5 attempts over hours
with jitter.
Fix: bump to 8–10 attempts, exponential base 30_000 ms with jitter, surface the next-retry-time in the admin detail view.
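As a rough illustration of the proposed schedule, the sketch below computes exponential delays from a 30_000 ms base with "equal jitter" (half fixed, half random). The jitter variant, the 6 h cap, and the names retryDelayMs and retrySchedule are assumptions made for the example, not something the audit or the queue code prescribes.

```typescript
interface BackoffOptions {
  baseMs?: number; // first retry ceiling; 30 s per the fix above
  capMs?: number; // assumed upper bound for a single delay
  random?: () => number; // injectable for deterministic tests
}

// Delay before the next retry, given how many attempts have already run.
// Equal jitter: the delay is uniform in [ceiling / 2, ceiling].
function retryDelayMs(attemptsMade: number, options: BackoffOptions = {}): number {
  const { baseMs = 30_000, capMs = 6 * 60 * 60 * 1000, random = Math.random } = options;
  const ceiling = Math.min(capMs, baseMs * 2 ** (attemptsMade - 1));
  const half = ceiling / 2;
  return Math.round(half + random() * half);
}

// Delays between consecutive attempts for a job allowed maxAttempts tries.
function retrySchedule(maxAttempts: number, options: BackoffOptions = {}): number[] {
  const delays: number[] = [];
  for (let attemptsMade = 1; attemptsMade < maxAttempts; attemptsMade++) {
    delays.push(retryDelayMs(attemptsMade, options));
  }
  return delays;
}
```

With 10 attempts the nine delays sum to at most 30 s × 511 ≈ 4.3 h (and at least half that), which is the "hours, not seconds" window the finding asks for.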
H3 — No circuit-breaker on chronically failing endpoints
workers/webhooks.ts:231-287 — after a dead_letter the webhook
stays is_active=true. Five global worker slots (concurrency: 5)
get saturated by a broken subscriber's 3×10 s retry cycle, starving
other ports' webhooks. The dead_letter notification dedupes per
delivery, so 1000 events → 1000 alerts.
Fix: rolling failure counter on webhooks; auto-set
is_active=false after N consecutive dead_letters and alert once.
Coalesce notification dedupeKey by webhook+day.
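The auto-disable rule can be expressed as a small state transition. The following is a sketch under assumptions: the threshold of 5, the BreakerState shape, and the dedupe-key format are all hypothetical, and the real counter would live on the webhooks row rather than in memory.

```typescript
interface BreakerState {
  consecutiveDeadLetters: number;
  isActive: boolean;
}

type DeliveryOutcome = "success" | "dead_letter";

interface BreakerResult {
  state: BreakerState;
  shouldAlert: boolean; // true only on the transition that disables the webhook
}

// Applies one final delivery outcome to the webhook's rolling failure counter.
function recordOutcome(
  state: BreakerState,
  outcome: DeliveryOutcome,
  threshold = 5,
): BreakerResult {
  if (outcome === "success") {
    return { state: { ...state, consecutiveDeadLetters: 0 }, shouldAlert: false };
  }
  const count = state.consecutiveDeadLetters + 1;
  const tripped = state.isActive && count >= threshold;
  return {
    state: { consecutiveDeadLetters: count, isActive: state.isActive && !tripped },
    shouldAlert: tripped,
  };
}

// One notification key per webhook per UTC day, so repeated failures coalesce.
function deadLetterDedupeKey(webhookId: string, at: Date): string {
  return `webhook:${webhookId}:dead_letter:${at.toISOString().slice(0, 10)}`;
}
```

Because shouldAlert is tied to the active-to-inactive transition, 1000 failing events produce one alert rather than 1000.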
H4 — EMAIL_REDIRECT_TO short-circuit writes status dead_letter
workers/webhooks.ts:94-109 — semantically wrong;
"exhausted retries" is what dead_letter means in admin UI.
The SSRF-blocked path at :148-166 shares the same status.
Doesn't fire alerts (alert path requires isFinalAttempt), but
pollutes the deliveries list.
Fix: introduce a skipped (or paused) status, use it for both
paths.
H5 — Payloads are ID-only; redeliver re-sends stale data
All 18 dispatchWebhookEvent(...) callsites pass only
{ clientId, berthId, interestId, ... }. Receivers must call back
for anything beyond the ID — yet webhooks fire for archived /
merged / deleted entities (client.archived, client.merged,
yacht.ownership_transferred). Worse, redeliverWebhookDelivery
(webhooks.service.ts:283-343) clones the payload verbatim, so a
replay after GDPR erasure resurrects the deleted ID and a replay
after a client merge resurfaces the pre-merge identity.
Fix: snapshot a minimal {id, name, status, archived} at dispatch
time; on redeliver, re-check entity existence and short-circuit to
skipped if the row is gone.
H6 — Test endpoint has no rate limit
webhooks.service.ts:347-383 — combined with H3 a rapid-fire test
can stall the queue. Add a per-webhook test throttle (e.g. 1/sec).
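A per-webhook throttle of the kind suggested can be very small. The sketch below is in-memory and therefore per-process only; createPerKeyThrottle is a hypothetical name, and a multi-instance deployment would need a shared store (the existing rate limiter, if it fits, would be the more natural home).

```typescript
// Returns a function that allows at most one call per key per minIntervalMs.
// The clock is passed in so the behaviour is deterministic under test.
function createPerKeyThrottle(minIntervalMs: number) {
  const lastAllowedAt = new Map<string, number>();
  return function tryAcquire(key: string, nowMs: number): boolean {
    const last = lastAllowedAt.get(key);
    if (last !== undefined && nowMs - last < minIntervalMs) {
      return false; // too soon after the previous allowed call for this key
    }
    lastAllowedAt.set(key, nowMs);
    return true;
  };
}
```

Keying on the webhook id keeps one admin's rapid-fire test from blocking tests on other webhooks.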
MEDIUM
M1 — SSRF denylist gap
validators/webhooks.ts:18-43 covers RFC1918, loopback, link-local
(incl. AWS IMDS 169.254/16), CGNAT 100.64/10 (catches Alibaba's
100.100.100.200), IPv6 ULA / link-local, GCP/Azure named metadata
hosts. Missing: Oracle Cloud metadata 192.0.0.192. Add the
literal.
M2 — HTTPS check is create-time only
webhook.url isn't re-validated at dispatch. A bad migration or DB
edit could let http:// through. Add url.startsWith('https://')
in the worker before fetch.
M3 — secretMasked decrypts on every list/get
webhooks.service.ts:80-99, 103-123 runs decrypt() per row to
compute a 5+3-char mask. The mask is deterministic from the
plaintext; cache it in a new column secret_masked so the read path
doesn't exercise the encryption key per webhook.
M4 — Shared encryption key naming
utils/encryption.ts:7 reads EMAIL_CREDENTIAL_KEY but encrypts
webhook secrets, SMTP creds, and IMAP creds with it. Rotation
spans multiple tables; the name implies email-only and invites config
drift. Rename to APP_CREDENTIAL_KEY (alias the old name) and
document the rotation runbook.
M5 — No event-name versioning
webhook-event-map.ts exports a flat list. When data inside
interest.stage_changed changes shape, every receiver breaks
silently. Add either an X-Webhook-Version header or an event-name
suffix (.v2) before this surfaces to external integrators.
M6 — responseBody may carry third-party PII
workers/webhooks.ts:191, 197 stores up to 1 KB of the receiver's
response and surfaces it through the admin deliveries list. If a
receiver echoes data in 4xx bodies it lands in
webhook_deliveries.response_body and the pino warn line. Flag for
the GDPR DPIA; consider redacting headers / scrubbing on storage.
M7 — SSRF-blocked delivery isn't audit-logged
workers/webhooks.ts:148-166 updates the row but skips
createAuditLog. Success and final-fail paths both write audit rows.
Add one here — these are the deliveries you most want forensics on.
M8 — No port-id assertion on the BullMQ payload at worker entry
workers/webhooks.ts:69 trusts portId from the job and uses it for
notifications + audit. The producer is internal and writes consistent
data, so this is defence-in-depth, but the worker could fetch the
webhook row and assert webhook.portId === payload.portId before
proceeding.
What's good
- AES-256-GCM at rest; secret returned to admin once only on create / regenerate (webhooks.service.ts:71-75, 225-229).
- HTTPS-only create-time validation, comprehensive IPv4 + IPv6 private-range denylist, named cloud metadata hosts.
- DNS re-resolution at dispatch time; intent is right (TOCTOU gap noted in H1).
- Idempotent delivery row created before enqueue (webhook-dispatch.ts:48-57); worker crash leaves a recoverable pending row.
- BullMQ retries + dead_letter handling + super-admin notification; redeliver path preserves the original failed row + tags the replay payload with retried_from / retried_at.
- Multi-tenant guard at every read/write (webhooks.service.ts:108, 137, 174, 200, 244, 292, 352) and on the dispatcher subscription query (webhook-dispatch.ts:33-39).
- EMAIL_REDIRECT_TO dev kill-switch.
- 10 s fetch timeout via AbortController.
- Permission gating: every admin route wraps in withPermission('admin', 'manage_webhooks', …); redeliver + regenerate-secret included.
Priority
Land C1 (timestamp-in-signature) + C2 (deliveries reaper) + H2 (retry policy) + H3 (auto-disable) before exposing webhooks to external integrators. C3, H1, H5 are smaller patches but should ship in the same release. MEDIUM items can batch behind a single "webhook hardening" follow-up.
20. Authorization model integrity audit (authz-auditor)
Authorization model integrity audit
Branch: feat/documents-folders · Date: 2026-05-12 · Auditor: authz-auditor
Scope: every API route's permission gate, port-scope SQL filters, per-user override
merge semantics, isSuperAdmin bypass paths, residential toggle, Documenso webhook
port resolution. Read-only.
CRITICAL
C-1 — Privilege escalation via user-permission-overrides PUT
File: src/app/api/v1/admin/users/[id]/permission-overrides/route.ts lines 153–244
(plus the parallel issue in src/lib/services/users.service.ts updateUser line
285–292).
The PUT endpoint gates on withPermission('admin', 'manage_users') and refuses
self-target (line 163), but it does NOT verify that the caller already holds the
permission they are granting to the target user. A port admin holding ONLY
admin.manage_users can therefore:
- Mint a colleague with admin.permanently_delete_clients = true, admin.system_backup = true, admin.manage_settings = true, documents.delete = true, interests.override_stage = true, etc.
- Have the colleague execute those actions on their behalf, or
- Re-flip leaves on the colleague's record at will because nothing in the override-merge path knows the granting admin was unprivileged.
The same path exists in updateUser (role reassignment) — roleId is validated
to exist (line 289) but there is no "you can only assign a role whose effective
permission set ⊆ your own" check. Because admin/roles POST is super-admin-only,
role-creation is safe, but role assignment is the privilege-escalation surface
since a sales_director-equivalent could promote a peer to a super-admin-flavoured
role.
The audit log records the change so the activity is detectable, but detection is not prevention. Self-target block on the override route is necessary but not sufficient — the admin can just bounce the elevated permission off a sock-puppet account.
Fix: before writing permissionOverrides, compute the caller's effective
permission map and reject any leaf in the new override that is true while the
caller's matching leaf is false. Same check on the roleId change in
updateUser — compare the new role's effective permission set against the
caller's and refuse on any superset.
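The subset check in the fix can be sketched as follows. It assumes a two-level resource → action permission map (the shape implied by leaves such as admin.manage_users); PermissionMap and findEscalatedLeaves are hypothetical names, and the caller's map is assumed to be the fully resolved effective set.

```typescript
type PermissionMap = { [resource: string]: { [action: string]: boolean } };

// Returns every "resource.action" leaf that the requested override sets to true
// while the caller's own effective permissions do not include it.
function findEscalatedLeaves(caller: PermissionMap, requested: PermissionMap): string[] {
  const escalated: string[] = [];
  for (const [resource, actions] of Object.entries(requested)) {
    for (const [action, granted] of Object.entries(actions)) {
      // Revoking (false) is always allowed; only grants need the caller to hold the leaf.
      if (granted === true && caller[resource]?.[action] !== true) {
        escalated.push(`${resource}.${action}`);
      }
    }
  }
  return escalated;
}
```

A route would reject the PUT with a 403 when the returned list is non-empty; the same function can compare a target role's effective set against the caller's for the roleId change.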
HIGH
H-1 — Listing endpoints without an explicit withPermission gate
grep -L "withPermission|requireSuperAdmin|requirePermission" against withAuth
routes turns up 31 files. Most are legitimate self-service surfaces
(/me, /notifications, /currency/*, /users/me/preferences,
/alerts/*/dismiss|acknowledge, /saved-views/[id] — ownership-checked) or
correctly do an in-handler check (/clients/bulk, /companies/bulk,
/yachts/bulk, /interests/bulk — all gate on ctx.permissions?.<resource>?.<action>).
The outliers worth flagging:
- /api/v1/alerts/route.ts GET: no permission gate. Anyone in the port with valid auth can read every alert row (audit blockers, GDPR alerts, permission-denied alerts, etc.). Service listAlertsForPort scopes on portId so cross-tenant leakage is contained, but the alert payload exposes internal-only signals (e.g. who triggered a permission_denied). Either gate on admin.view_audit_log or filter the payload by sensitivity tier.
- /api/v1/vocabularies/route.ts GET: intentionally permissionless per the comment (vocabularies feed pickers across the app). Fine; port-scoped.
- /api/v1/settings/feature-flag/route.ts GET: port-scoped, returns a single boolean for a key the client names. Acceptable.
- /api/v1/search/route.ts GET: relies on the service's can() helper to skip buckets the caller can't see (search.service.ts line 305). Good. includeOtherPorts correctly gates on ctx.isSuperAdmin (line 20).
H-2 — Search service: residential buckets
src/lib/services/search.service.ts line 305 (can()) honours permissions.residential_clients.view
and .residential_interests.view. The withAuth resolver sets these to true when
portRole.residentialAccess is true (helpers.ts line 209–221) BEFORE the per-user
override layer runs (line 227–238). So a per-user override with
residential_clients.view = false will take effect — verified by tracing
deepMerge (helpers.ts line 73–98): the source false boolean replaces the target
true at the leaf because the recursion only triggers when both sides are
objects. Per-user false correctly bubbles through. Pass.
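The false-propagation behaviour traced above can be reproduced with a stand-in merge. This is not the helpers.ts implementation, only a sketch with the same recursion guard (recurse only when both sides are objects, otherwise assign); the Tree type and isTree helper are assumptions.

```typescript
type Tree = { [key: string]: boolean | Tree };

function isTree(value: unknown): value is Tree {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recurses only when both sides are objects; any other source value,
// including a boolean false, replaces the target leaf directly.
function deepMerge(target: Tree, source: Tree): Tree {
  const out: Tree = { ...target };
  for (const [key, value] of Object.entries(source)) {
    const existing = out[key];
    if (isTree(existing) && isTree(value)) {
      out[key] = deepMerge(existing, value);
    } else {
      out[key] = value;
    }
  }
  return out;
}

// Role baseline after the residential toggle, then a per-user override on top.
const baseline: Tree = { residential_clients: { view: true, delete: true } };
const userOverride: Tree = { residential_clients: { view: false } };
const effective = deepMerge(baseline, userOverride);
```

Here effective keeps delete = true from the baseline while view becomes false, which is the outcome the trace describes.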
H-3 — withAuth userOverride fetch costs a round-trip on every request
Not a security issue but a perf+coupling note: every authenticated request now
runs three sequential queries when the user is not a super-admin
(userPortRoles → portRoleOverrides → userPermissionOverrides). Hot routes
inherit the latency tax. Consider a Promise.all for the override pair, or
a per-request memoize keyed on (userId, portId): withAuth itself runs once
per request, but middleware-adjacent paths can repeat the same lookups.
MEDIUM
M-1 — withAuth residential toggle bypasses portRoleOverrides for residential.*
helpers.ts line 209–221: when portRole.residentialAccess === true, the resolver
replaces permissions.residential_clients and permissions.residential_interests
with a hardcoded all-true map. If a port-role-override set
residential_clients.delete = false (e.g. "this port lets reps see but not delete
residential rows"), the residential toggle silently overrides that. By design? Maybe
— the toggle is documented as "full residential access" — but it would be
surprising if an admin set up the port-role-override expecting it to constrain
toggled users. Document or compose more carefully.
The per-user permission override still wins (it runs after, line 227), so a deliberate admin can recover, but the precedence is subtle.
M-2 — parseBody-vs-req.json consistency on bulk routes
All four bulk routes (clients, yachts, companies, interests) use
parseBody correctly. The bulk permission check pattern is repeated four times
with the same shape — extract a requireOneOf(ctx, [{resource, action}, ...])
helper to avoid drift when a new bulk route ships.
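One possible shape for the suggested helper is sketched below. The AuthContext fields mirror the ctx.permissions?.<resource>?.<action> pattern quoted above, but the type names, the ForbiddenError class, and the super-admin bypass are assumptions for illustration rather than the repo's actual types.

```typescript
type PermissionMap = { [resource: string]: { [action: string]: boolean } };

// Hypothetical minimal view of the auth context passed to route handlers.
interface AuthContext {
  isSuperAdmin: boolean;
  permissions?: PermissionMap;
}

interface PermissionRef {
  resource: string;
  action: string;
}

class ForbiddenError extends Error {
  status = 403;
}

// True when the caller holds at least one of the listed permissions.
function hasOneOf(ctx: AuthContext, required: PermissionRef[]): boolean {
  if (ctx.isSuperAdmin) return true; // assumed bypass, matching the isSuperAdmin paths
  return required.some(
    ({ resource, action }) => ctx.permissions?.[resource]?.[action] === true,
  );
}

// Throws instead of returning, so bulk routes can call it as a one-line guard.
function requireOneOf(ctx: AuthContext, required: PermissionRef[]): void {
  if (!hasOneOf(ctx, required)) {
    const wanted = required.map((r) => `${r.resource}.${r.action}`).join(" | ");
    throw new ForbiddenError(`Missing permission: ${wanted}`);
  }
}
```

A bulk route would then call requireOneOf(ctx, [...]) once at the top with the actions that the requested bulk operation maps to.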
M-3 — documents.feature-flags and documents.wizard
Both routes wrap with withAuth + withPermission('documents', 'view'). The
feature-flags route returns Documenso/template feature toggles — fine. The
wizard route fetches drafts. Spot-check passed; both scope to ctx.portId
in the service.
M-4 — Documenso webhook port resolution: verified correct
src/app/api/webhooks/documenso/route.ts line 58–101: secret enumeration over
listDocumensoWebhookSecrets() with verifyDocumensoSecret (timing-safe). The
matched portId threads through portScope (line 143) to every per-recipient
and per-document handler. resolveWebhookDocument (documents.service.ts
line 967–996) refuses to mutate when the lookup is ambiguous across ports without
a portId. Pass. No cross-tenant write surface.
One small nit: the webhook returns 200 on invalid-secret to avoid leaking
signal (line 100) but the audit row records a webhook_failed with
portId: null. Rate-limited per IP (line 77). Fine.
M-5 — requireSuperAdmin always requires a portId in userPortRoles?
No — super admins skip the userPortRoles lookup entirely (helpers.ts line
174 condition), but still need portId set somewhere (header or
profile.preferences.defaultPortId, line 161–164) unless they're hitting a
no-port endpoint. The gate on line 166 only fires when !portId && !isSuperAdmin.
A super admin without a portId in the request will have ctx.portId = '' and
ctx.portSlug = ''; any route that uses ctx.portId in a SQL filter will
match nothing, which is a fail-safe but produces confusing empty UIs. Worth
documenting that super-admin requests SHOULD always carry an X-Port-Id.
M-6 — requireSuperAdmin audit-logs denials with empty entityId
helpers.ts line 298: entityId: ''. The audit row is functional but harder
to query later. Set to attemptedAction or the route path for forensics.
Pass / verified
- deepMerge false-propagation: false at any leaf correctly overwrites true in the role baseline because the recursion guard requires both sides to be objects (helpers.ts line 81–88). Boolean → boolean falls into the else branch and assigns directly.
- Override layer ordering: role → port-role-override → residential toggle → user-permission-override. User override wins last. Self-target on PUT rejected (route line 163).
- Listing port_id SQL filter (sampled): clients, interests, yachts, companies, documents, files, berths, invoices, expenses, reminders, residential_clients, residential_interests, alerts, error_events, audit_logs, notes (all four polymorphic types). Every service list* function constrains on portId in the WHERE clause. Search service goes further with defense-in-depth port_id filters inside each per-bucket query (search.service.ts lines 361, 443, 490, 539, 600, 667, 717, 772, 805, 856, 902, 952, 995, 1035, 1069, 1107, 1156, 1172, 1186, 1200, 1384).
- /admin/ports/[id]: explicit assertPortInScope blocks cross-tenant access by non-super-admins (route line 15–20). Pass.
- /admin/error-events: super admins see all, regular admins scoped to ctx.portId; the [requestId] route additionally re-checks the row's portId and returns 404 (not 403) on mismatch to avoid existence leak.
- isSuperAdmin writability: not in createUserSchema / updateUserSchema. Only settable via the invitation flow with an explicit if (body.isSuperAdmin && !ctx.isSuperAdmin) throw guard (/admin/invitations/route.ts line 40). Pass.
- Documenso webhook: secret enumeration is timing-safe; ambiguous cross-port documensoId lookups refuse to mutate; portScope threaded to every handler. Pass.
Summary punch list
| Sev | Item | File / fix |
|---|---|---|
| CRIT | Privilege escalation via permission-overrides PUT and role reassignment | permission-overrides/route.ts, users.service.ts updateUser — refuse to grant any leaf the caller doesn't hold |
| HIGH | /api/v1/alerts GET ungated | Add withPermission('admin','view_audit_log') or filter payload |
| MED | Residential toggle silently overrides port-role-override on residential_* | Document precedence in helpers.ts or compose via deepMerge |
| MED | withAuth runs three sequential override queries per request | Parallelize override fetches |
| MED | Bulk-route permission check duplicated 4× | Extract requireOneOf helper |
| MED | requireSuperAdmin audit row carries empty entityId | Set to attemptedAction |
| INFO | Super-admin request without X-Port-Id produces empty ctx.portId and silently empty queries | Document; consider 400 |
25. Inquiry → CRM funnel correctness audit (funnel-auditor)
Inquiry → CRM Funnel Audit
Scope: src/app/api/public/{interests,residential-inquiries,website-inquiries}, src/app/api/v1/admin/website-submissions/*, src/lib/services/inquiry-notifications.service.ts, src/components/admin/inquiry-inbox.tsx, src/lib/validators/{interests,residential}.ts, settings keys inquiry_contact_email / inquiry_notification_recipients / residential_notification_recipients.
Read-only; no edits.
CRITICAL
C1. "Convert to client" prefill goes nowhere — every conversion is double-typed
inquiry-inbox.tsx:127-135 flips the row to converted, then pushes
/clients?prefill_name=…&prefill_email=…&prefill_phone=…&prefill_source=website&prefill_inquiry_id=….
A repo-wide grep for any of those five keys returns only the writer — no client form / page / hook ever reads prefill_*. So Convert: (a) flushes the inbox row to "converted" eagerly, (b) drops the operator on a blank New Client form, (c) loses the inquiry_id ↔ client/interest linkage permanently because nothing persists it. The triage state is now lying ("converted" with no downstream entity), and operators retype the payload from the inbox card. Either consume the params in client-form.tsx (with a hidden inquiry_id that the create endpoint persists into clients.metadata or a new inquiry_origin_id FK), or revert the eager state flip so the inbox stays honest until the client is actually saved.
C2. Two parallel intake pipelines with no correlation → duplicate interests + zombie inbox rows
/api/public/interests directly creates clients + yacht + interest rows and queues notifications. /api/public/website-inquiries is the new "dual-write capture" that stores the raw payload in website_submissions for triage. The website is expected to call both (the docstring on website-inquiries/route.ts says "AFTER its existing NocoDB write succeeds"). Nothing links them. Result for a single berth-form POST:
1. interests row created automatically with notifications fired.
2. website_submissions row inserted with triage_state='open'.
3. Operator opens the inbox, sees the "open" card, clicks Convert → second interests row.
4. Inbox UI never sees that step 1 already happened; the heat-scored interest from step 1 is silently shadowed.
Either the public form path should write submission_id onto the created interests row (and the inbox should auto-mark converted whenever a matching interest exists), or the two pipelines need to be merged into one. Right now they coexist and contradict each other.
C3. Email dedup is case-sensitive — capital-letter resubmission spawns duplicates
public/interests/route.ts:91 matches clientContacts.value === data.email (no lower()). The supporting idx_cc_email index (clients.ts:91) is also raw-value, not lower(value). Two POSTs as Matt@Example.com and matt@example.com produce two separate clients, yachts, interests, and now the recommender has split history for the same human. The companies branch (route.ts:122) gets this right (sql`lower(${companies.name}) = lower(${data.company.name})`); the email branch must match: lowercase on insert and on lookup, plus a partial unique index on lower(value) WHERE channel='email'.
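The "lowercase on insert and on lookup" rule is easiest to keep consistent when both paths share one normalizer. The sketch below uses an in-memory map in place of the clientContacts table; normalizeEmail, ContactRow and createContactIndex are hypothetical names. Lowercasing the local part is a pragmatic assumption: mail providers almost universally treat it as case-insensitive even though the SMTP spec does not require that.

```typescript
// Single normalizer used by both the insert path and the lookup path.
function normalizeEmail(raw: string): string {
  return raw.trim().toLowerCase();
}

interface ContactRow {
  clientId: string;
  value: string; // stored already normalized
}

// In-memory stand-in for the contact table plus its unique index on lower(value).
function createContactIndex() {
  const byEmail = new Map<string, ContactRow>();
  return {
    findOrInsert(clientId: string, rawEmail: string): { row: ContactRow; reused: boolean } {
      const key = normalizeEmail(rawEmail);
      const existing = byEmail.get(key);
      if (existing !== undefined) return { row: existing, reused: true };
      const row: ContactRow = { clientId, value: key };
      byEmail.set(key, row);
      return { row, reused: false };
    },
  };
}
```

The reused flag is the same signal the audit-log finding (M6) asks to record as dedup metadata.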
HIGH
H1. Residential clients have no dedup at all
residential-inquiries/route.ts:73-93 always inserts a fresh residential_clients row. There's no email/phone match, no unique index on residential_clients.email (verified — schema/residential.ts has no uniqueIndex). Every resubmit = new prospect. Sales gets a bloated list of phantoms. Mirror the berth-path dedup (lowercased-email lookup → reuse → open a new residential_interests row only).
H2. findUsersWithInterestsPermission ignores user_permission_overrides (migration 0055)
inquiry-notifications.service.ts:139-158 reads only roles.permissions. The request-time auth path in lib/api/helpers.ts:227-238 correctly layers role → port-role-override → user-override, but this fan-out helper does not. Symptoms:
- User granted interests.view via override only → never gets new-inquiry pings.
- User had interests.view in role but their override removed it → still gets pinged.
Either (a) collect every user on the port and run the same deepMerge chain per user, or (b) move permission-resolution into a single service helper both callers use.
H3. Bogus portId fails as a 500 (FK violation) instead of 400
public/interests/route.ts:51-52 accepts the portId query/header but never verifies the port exists before the transaction. An invalid id surfaces as a Postgres FK error from clients.port_id, returned as a generic 500. The residential endpoint (residential-inquiries/route.ts:58-61) validates upfront via db.query.ports.findFirst — make the berth route do the same.
H4. Cross-port email collisions are non-deterministic
public/interests/route.ts:90-114: when a client contact with the same email exists on a different port, the code creates a new client. But tx.query.clientContacts.findFirst returns "any matching row" with no ORDER BY, so subsequent submissions may pick either port's row first. Net: same email used cross-port, then resubmitted to the original port, can spawn 2nd/3rd same-port clients. Fix: filter the lookup by joining to clients.port_id, or scope the contact lookup to clients owned by the target port from the start.
H5. portName hardcoded as 'Port Nimara' in four call sites
- inquiry-notifications.service.ts:57, 126 (client confirmation + sales alert)
- residential-inquiries/route.ts:158, 207 (subject tokens)
The author left a // future: resolve from getPortBrandingConfig comment. The moment a second port has a marketing site, every email reads "Port Nimara" regardless of recipient. Wire through getBrandingShell(portId).portName (already loaded for the HTML body via branding-resolver.ts).
H6. Residential confirmation ignores inquiry_contact_email
residential-inquiries/route.ts:150 hardcodes contactEmail: 'sales@portnimara.com' in the client confirmation email. The berth path reads the per-port inquiry_contact_email setting (inquiry-notifications.service.ts:44). Settings UI (settings-manager.tsx:96) advertises this setting controls both — but it doesn't. Admins can't reroute residential replies.
H7. Residential sales alert bypasses the email queue
residential-inquiries/route.ts:214 calls await sendEmail(recipients, …) synchronously inside sendResidentialNotifications. The berth path enqueues via BullMQ (inquiry-notifications.service.ts:51,118). The call is wrapped in .catch and fires after the response is returned, so a slow or down SMTP server does not hang or 500 the residential POST itself; the worker's event loop is the only victim. The notification, however, is lost with no retry. Move to the email queue like the berth path.
MEDIUM
M1. No UTM / referrer / attribution capture anywhere
publicInterestSchema and publicResidentialInquirySchema have no utm_source, utm_medium, utm_campaign, referrer, landing_page fields. source is hardcoded 'website' (interests.ts:231). Berth-recommender heat scoring and lead-source dashboards (audit #11) cannot differentiate organic vs paid vs broker referral. The website_submissions.payload JSONB at least preserves whatever the website chooses to forward — but interests itself stores only the literal string 'website'. Add an attribution block to both validators + columns (interests.utm_*, residential_interests.utm_*) and persist what the website hands us.
M2. Public routes use req.json(); schema.parse(body) instead of parseBody
public/interests/route.ts:47-48 and public/residential-inquiries/route.ts:51-52. CLAUDE.md explicitly flags this: "Always use parseBody(req, schema) from @/lib/api/route-helpers" so the error envelope is field-level 400 instead of a generic 500.
M3. Company / yacht / phone matching missing trim + phone-E164 dedup
- companies match (route.ts:121-124) is case-insensitive but not whitespace-trimmed: "Acme Ltd " ≠ "Acme Ltd".
- Phone contact dedup uses raw clientContacts.value, never valueE164. The same number formatted differently is a duplicate row.
- Yachts always insert; resubmissions create a fresh yacht every time even if the hull/registration is identical.
M4. Response envelope inconsistency
- Berth route: { data: { id, message } } at 201, close to the canonical envelope.
- Residential route: { success: true, clientId, interestId } at 201, the legacy success:true shape that CLAUDE.md says was normalized away on 2026-05-07.
Pick one and update consumers.
M5. Inquiry inbox payload-key extractor is brittle
pickName/pickEmail/pickPhone in inquiry-inbox.tsx:58-78 use a small set of candidate keys but never compose first_name + last_name. Website payloads that send {first_name, last_name} without a name/fullName field render as (no name supplied). Two-line addresses and contact-form payloads silently lose the operator's first hint of who submitted.
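A more forgiving name extractor could fall back to composing first and last name, roughly as sketched here. The candidate key lists are assumptions (only name, fullName, first_name and last_name are mentioned above), and firstString is a hypothetical helper, not the existing inbox code.

```typescript
type InquiryPayload = Record<string, unknown>;

// Returns the first non-empty string found under any of the candidate keys.
function firstString(payload: InquiryPayload, keys: string[]): string | undefined {
  for (const key of keys) {
    const value = payload[key];
    if (typeof value === "string" && value.trim() !== "") return value.trim();
  }
  return undefined;
}

// Prefers a single full-name field, then composes first + last name,
// and only then falls back to the placeholder the inbox shows today.
function pickName(payload: InquiryPayload): string {
  const direct = firstString(payload, ["name", "fullName", "full_name"]);
  if (direct !== undefined) return direct;
  const first = firstString(payload, ["first_name", "firstName"]);
  const last = firstString(payload, ["last_name", "lastName"]);
  const composed = [first, last]
    .filter((part): part is string => part !== undefined)
    .join(" ");
  return composed !== "" ? composed : "(no name supplied)";
}
```

The same candidate-key approach extends to pickEmail and pickPhone if the website starts sending differently named fields.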
M6. Audit log misses dedup decisions
createAuditLog (public/interests/route.ts:254-271) records the interest creation but not whether the client+yacht were created fresh vs reused. Forensics ("did this lead come from the form or get manually entered?") become guesswork. Add metadata.dedup = { clientReused: boolean, companyReused: boolean }.
M7. Yacht inserted with status: 'active' even on speculative form leads
route.ts:179. There's no "prospect" yacht state, so every unconverted interest still leaves an "active" yacht. Active-yacht counts in reports become inflated. Consider a 'prospect' status or a deferred-insert pattern keyed on interests.outcome != 'lost_*'.
M8. Admin website-submissions list permission mismatch
The inbox at /[portSlug]/admin/inquiries is the marketing-funnel triage surface, but /api/v1/admin/website-submissions/route.ts:23 gates GET on admin.view_audit_log. A sales lead reviewing submissions doesn't conceptually need audit-log access. Introduce a dedicated inquiries.view / inquiries.triage permission (consistent with the rest of the permission matrix) so this can be granted independently.
Settings application — verified flow
- inquiry_contact_email (string, per-port): consumed by berth client-confirmation email (inquiry-notifications.service.ts:44); not consumed by residential confirmation (H6). Falls back to sales@portnimara.com literal.
- inquiry_notification_recipients (JSON array, per-port): consumed by berth sales-alert fan-out (inquiry-notifications.service.ts:106). Empty array = no external alert. No de-dup against role-based recipients (a user listed here who also has interests.view gets two pings).
- residential_notification_recipients (JSON array, per-port): consumed by residential alert; falls back to [inquiry_contact_email] if empty (residential-inquiries/route.ts:174-179). Correct envelope.
Three settings are surfaced on the admin Settings page (settings-manager.tsx:96-117) so admins can edit them; default values match the service-side fallbacks.
32. Improvements + nice-to-haves + genuine AI integration opportunities (improvements-auditor)
This is a forward-looking proposal report, not a defect audit. Grouped HIGH-VALUE / MEDIUM / EXPLORE with effort estimates and a "what NOT to AI-ify" critical pass.
Audit #32 — Improvements, Nice-to-Haves & AI Opportunities
Scope: Forward-looking proposals, not a defect audit. Every proposal grounded in real surfaces seen in this repo (file paths cited). For each: user benefit, implementation sketch, effort estimate (S/M/L), and risk note where it matters.
Effort key: S = ≤½ day, M = 1–3 days, L = >3 days / cross-cutting.
Section A — UX / Feature Improvements
A · HIGH-VALUE
A1. Bulk actions on Berths, Companies, Yachts
Bulk archive/tag/move flow exists in src/components/clients/client-list.tsx + src/components/interests/interest-list.tsx (single /bulk endpoint per domain), but Berths, Companies, and Yachts use the same data-table.tsx shell with BulkAction[] support and never pass any. Reps regularly need to retag a batch of yachts after import or move 30 berths to a new pricing band.
- Sketch: Add bulkActions=[...] wired through the existing data-table.tsx API; mirror the /api/v1/clients/bulk and /api/v1/interests/bulk endpoint pattern for berths, companies, yachts. interest-list.tsx lines 124–280 are the reference implementation.
- Effort: M
- Risk: low. Pattern already tested for two domains; ensure permission gate per action mirrors single-entity gates.
A2. Smart undo banner for archive / outcome / stage-change
Already have client-restore.service.ts + a smart-restore-dialog component, and stage rollback would be supported by audit logs. Reps lose minutes every time they fat-finger an archive or set an outcome on the wrong card on the pipeline board.
- Sketch: After any archive / outcome-set / interest_archived / interest_completed trigger, raise a Sonner toast with an "Undo" action for 8s, calling the existing restore service or a tiny reverse-mutation endpoint. Hook into the mutation onSuccess in interest-list.tsx, client-list.tsx, pipeline-board.tsx, and interest-outcome-dialog.tsx.
- Effort: M
- Risk: berth-rules-engine has already fired side-effects (berth_unlinked, interest_completed cascade). Undo must replay the reverse rule or explicitly skip rule-engine via a skipRules flag; otherwise undo leaves stale berth status.
A3. "What changed since I last looked" digest on detail pages
The entity-activity.service.ts + use-track-entity-view.ts infrastructure is already in place — every detail view is tracked. Reps open a deal they haven't touched in a week and have to manually scroll the activity feed.
- Sketch: On detail page load, query activity items with createdAt > lastViewedAt (from recently-viewed.service.ts) and render a dismissable "3 new things since 5 days ago: signed EOI, +€2k deposit, new note from María" strip above entity-activity-feed.tsx.
- Effort: M
- Risk: none meaningful; purely additive.
A4. j/k row navigation + o open + e edit + / focus filter on list pages
Cmd-K is already wired in command-search.tsx; reps still mouse-hop between rows in data-table.tsx. Power users on busy pipeline days are the loudest beneficiaries.
- Sketch: Add a useListKeyboardNav(rows, activeIndex) hook used inside data-table.tsx. j/k move active row, o/Enter opens detail, e triggers inline-edit on the first inline-editable cell, / focuses the filter input. Respect e.target being an input.
- Effort: S
- Risk: must be globally disabled inside dialogs/forms; use the same document.activeElement instanceof HTMLInputElement guard already in command-search.
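The key handling inside such a hook can be isolated as a pure function, which keeps it testable without a DOM. This is a sketch of one possible mapping; NavAction and resolveListKey are hypothetical names, and the caller is assumed to compute targetIsEditable from the focused element using the guard mentioned in the risk note.

```typescript
type NavAction =
  | { type: "move"; index: number }
  | { type: "open"; index: number }
  | { type: "edit"; index: number }
  | { type: "focusFilter" }
  | { type: "none" };

// Maps a key press to a list action. activeIndex is -1 when no row is active.
function resolveListKey(
  key: string,
  activeIndex: number,
  rowCount: number,
  targetIsEditable: boolean,
): NavAction {
  // Never hijack keys while the user is typing in an input, dialog or form field.
  if (targetIsEditable) return { type: "none" };
  if (key === "/") return { type: "focusFilter" };
  if (rowCount === 0) return { type: "none" };
  switch (key) {
    case "j":
      return { type: "move", index: Math.min(activeIndex + 1, rowCount - 1) };
    case "k":
      return { type: "move", index: Math.max(activeIndex - 1, 0) };
    case "o":
    case "Enter":
      return activeIndex >= 0 ? { type: "open", index: activeIndex } : { type: "none" };
    case "e":
      return activeIndex >= 0 ? { type: "edit", index: activeIndex } : { type: "none" };
    default:
      return { type: "none" };
  }
}
```

The hook itself then only needs a keydown listener that calls this function and dispatches the returned action.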
A5. Quick-create overlay (cmd-K → "+ New …")
Command-search currently navigates but doesn't create. Reps regularly want to drop a client/interest/reminder without leaving the current page (e.g. a quick call comes in while reviewing a berth).
- Sketch: Extend command-search.tsx palette with + New client, + New interest, + New reminder, + Log call. Each opens a drawer-mounted minimal form (3 fields max) using the existing forms wrapped in Drawer instead of Dialog. Re-use client-form.tsx, reminder-form.tsx in a "compact" mode prop.
- Effort: M
- Risk: low; entirely additive UI.
A · MEDIUM
A6. Smarter defaults from "my last used"
Today client-form.tsx, interest-form.tsx, expense-form-dialog.tsx, and reminder-form.tsx reset every field. A rep doing 12 interests in a row re-types the same source / currency / lead source.
- Sketch: Persist last-submitted values per form per user under user_profiles.preferences.formDefaults (same shape used for dashboardWidgets per widget-registry.tsx comments). On form open, prefill from preferences, mark prefilled fields with subtle "(last used)" hint. Provide a "Reset defaults" link in the form footer.
- Effort: S
- Risk: leaks tag/source preference into the wrong port for super-admins switching ports; scope key by (userId, portId, formName).
A7. Pipeline board: drag-to-stage with confirm on "won/lost"
pipeline-board.tsx exists. Today reps must click a card → open the interest → open outcome dialog. Drag-to-stage is the natural kanban gesture.
- Sketch: Add @dnd-kit/sortable (check whether it is already in the tree; if not, it is a very light add). Wire onDragEnd to inline-stage-picker.tsx's mutation. Dropping into won/lost columns opens interest-outcome-dialog.tsx instead of a silent set.
- Effort: M
- Risk: berth-rules-engine fires on eoi_sent / contract_signed triggers; make sure stage drag uses the same advanceStageIfBehind codepath, not a raw stage update.
A8. Saved-view sharing within a port
saved-views.service.ts is per-user. Sales teams want a shared "Hot leads — March" view.
- Sketch: Add visibility: 'private' | 'shared' column to savedViews; service list() returns own + shared. Permission gate: savedViews.share (new). Show a "Share" toggle in save-view-dialog.tsx.
- Effort: M
- Risk: low — additive; ensure shared views can't expose entity rows the viewer lacks permission for (filter happens server-side on data fetch, not view definition, so already safe).
A9. Bulk "Move to folder" in documents hub
Documents hub (hub-root-view.tsx, entity-folder-view.tsx, flat-folder-listing.tsx) supports single-item move via move-to-folder-dialog.tsx. No multi-select. Admins post-importing 200 docs spend 200 clicks.
- Sketch: Add row-checkboxes to document-list.tsx, surface Move to folder as a bulk action. Reuse existing move-to-folder-dialog.tsx accepting an array. Service already supports the operation per-item; wrap in a single transaction.
- Effort: S
- Risk: system-managed folders already reject mutations via assertNotSystemManaged; bulk move must respect this per-item and report per-item errors (partial success).
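The per-item error reporting described in the risk note could look roughly like this. It is a sketch only: moveOne stands in for the existing single-item service call, and the MoveResult shape is hypothetical. How this interacts with the single-transaction idea (for example via per-item savepoints) is left open here.

```typescript
// Hypothetical per-item result for a bulk "Move to folder" call.
type MoveResult =
  | { documentId: string; ok: true }
  | { documentId: string; ok: false; error: string };

// `moveOne` stands in for the existing single-item service call; it is expected
// to throw (e.g. from assertNotSystemManaged) when a move is not allowed.
async function bulkMove(
  documentIds: string[],
  targetFolderId: string,
  moveOne: (documentId: string, targetFolderId: string) => Promise<void>,
): Promise<MoveResult[]> {
  const results: MoveResult[] = [];
  for (const documentId of documentIds) {
    try {
      await moveOne(documentId, targetFolderId);
      results.push({ documentId, ok: true });
    } catch (err) {
      // One rejected item must not abort the rest of the batch.
      const error = err instanceof Error ? err.message : String(err);
      results.push({ documentId, ok: false, error });
    }
  }
  return results;
}
```

The dialog can then show "197 moved, 3 skipped" and list the skipped rows with their error text.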
A10. Reminder snooze presets in a single hotkey
snooze-dialog.tsx exists with a Date picker. Reps want "tomorrow morning", "next Mon", "in 1 week" one-tap.
- Sketch: Add quick-buttons row to snooze dialog. Same options as Gmail's snooze. Pre-compute target dates relative to user timezone (already wired via inline-timezone-field.tsx).
- Effort: S
- Risk: DST; use the existing formatInTimezone helpers, don't add raw ms.
A11. Dashboard widget: "My open EOIs — needs nudge"
13 widgets in widget-registry.tsx; none surface "EOIs sent ≥ 5 days ago, not yet signed, no reminder set". This is the single most actionable rep widget — the deal that's slipping.
- Sketch: New widget eoi_followups querying documents where status='sent', computed sent_age_days > N (from system_settings.eoi_nudge_days, default 5), grouped by client. Include "Send reminder" action calling existing sendReminder Documenso wrapper.
- Effort: M
- Risk: none.
A12. Dashboard widget: "Berths I'm watching"
Multiple reps end up specialising on berth subsets. Today no way to pin.
- Sketch: Add a watchedBerths array under user preferences, "watch" toggle in berth-detail-header.tsx, widget rendering status changes since last view.
- Effort: S–M
A · EXPLORE
A13. Pipeline "what's due this week" board view
A second pipeline-board view mode that columns by next-action-date instead of stage. Useful when stage is similar across many deals but timing varies.
- Sketch: Toggle in pipeline-board.tsx header switching between stage-mode and date-mode. Bin into "Today / This week / Next week / Later".
- Effort: M
A14. Inline-editable pipeline-board cards
pipeline-card.tsx is read-only; double-click → edit value/notes in place, mirroring the <InlineEditableField> pattern already used everywhere on detail pages.
- Effort: S
A15. "Open in new tab" cmd-click on any entity row
data-table.tsx row click navigates. Need to make every row a real <a href> so cmd-click + middle-click behave natively. Power users coming from Linear / Notion will expect this.
- Effort: S
- Risk: keyboard-nav handler from A4 must not interfere with native link semantics.
Section B — Subtle Ergonomic Wins
B · HIGH-VALUE
B1. Auto-save indicator on <InlineEditableField>
Inline-editable fields blur-save silently. Reps occasionally close the tab thinking their edit didn't take.
- Sketch: Tiny "Saved · just now" timestamp ghost-text near the field for 2s after mutation success; "Saving…" spinner while pending. Surface in inline-editable-field.tsx and inline-tag-editor.tsx.
- Effort: S
B2. Empty-state CTAs everywhere
empty-state.tsx exists but several lists fall back to "No results" plain text (e.g. interest-eoi-tab when no EOI yet, client-yachts-tab when none linked).
- Sketch: Audit every list/tab consumer, wire <EmptyState> with a primary CTA (e.g. "Generate EOI", "Link yacht").
- Effort: S
B3. Copy-to-clipboard with smarter format
Mooring numbers (A1), client phones, IBANs all benefit from "Copy" affordance. Today users select-and-copy from inline-editable fields which produces inconsistent whitespace.
- Sketch: Add tiny "copy" icon-button next to inline-phone-field.tsx, mooring number display in berth-detail-header.tsx, and bank details in invoice detail. Use the standard navigator.clipboard.writeText with a 1s "Copied" tooltip.
- Effort: S
B · MEDIUM
B4. Visual indicator for system-managed folders
CLAUDE.md says folder-tree-sidebar shows lock markers on system folders. Add the same visual rule to move-to-folder-dialog.tsx — today the dialog lets you select system folders (and gets rejected later by assertNotSystemManaged).
- Effort: S
B5. "Recently viewed" rail in command-search
recently-viewed.service.ts exists; cmd-K opens to all-purpose search. Show last-5-viewed entities at top of palette when no query typed.
- Effort: S
B6. Inline phone-to-call / phone-to-WhatsApp links
inline-phone-field.tsx renders text. Wrap in tel: and append a WhatsApp icon linking to https://wa.me/<E.164>. For a port-side sales team WhatsApp is the primary channel.
- Effort: S
- Risk: phone numbers without a + country code break wa.me; only render when E.164-valid.
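A small sketch of the E.164 gate mentioned in the risk note. The function names are hypothetical, and the wa.me URL form (digits without the leading plus sign) is an assumption that should be checked against WhatsApp's click-to-chat documentation before shipping.

```typescript
// E.164: a leading "+", a non-zero first digit, at most 15 digits in total.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(phone: string): boolean {
  return E164_PATTERN.test(phone);
}

// Returns tel: and wa.me hrefs, or null when the stored number is not E.164.
// Assumption: wa.me expects the digits without the leading "+".
function phoneLinks(phone: string): { tel: string; whatsapp: string } | null {
  const trimmed = phone.trim();
  if (!isE164(trimmed)) return null;
  return {
    tel: `tel:${trimmed}`,
    whatsapp: `https://wa.me/${trimmed.slice(1)}`,
  };
}
```

When phoneLinks returns null the field can keep rendering plain text, so legacy numbers without a country code are never turned into broken links.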
B7. Toast deduplication for realtime invalidation
realtime-toasts.tsx (touched in current branch). Multi-edit sessions where one rep edits 8 fields generate 8 toasts on the watching rep's screen. Coalesce within 2s.
- Effort: S
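One way to coalesce is a sliding window per entity, sketched below as a pure function. The event and toast shapes are hypothetical, and "within 2s" is read here as "each event arrives within 2 s of the previous one for the same entity", which is an assumption about the intended behaviour.

```typescript
interface ChangeEvent { entityId: string; field: string; at: number } // at = epoch ms
interface CoalescedToast { entityId: string; fields: string[]; firstAt: number; lastAt: number }

// Groups events for the same entity when each arrives within `windowMs`
// of the previous one. Events are assumed to be in arrival order.
function coalesceToasts(events: ChangeEvent[], windowMs = 2000): CoalescedToast[] {
  const toasts: CoalescedToast[] = [];
  const open = new Map<string, CoalescedToast>();
  for (const ev of events) {
    const current = open.get(ev.entityId);
    if (current && ev.at - current.lastAt <= windowMs) {
      // Same burst: extend the existing toast instead of creating a new one.
      current.fields.push(ev.field);
      current.lastAt = ev.at;
    } else {
      const toast: CoalescedToast = {
        entityId: ev.entityId,
        fields: [ev.field],
        firstAt: ev.at,
        lastAt: ev.at,
      };
      open.set(ev.entityId, toast);
      toasts.push(toast);
    }
  }
  return toasts;
}
```

In the live component the same rule would be applied with a timer that flushes the open toast once the window has passed without a new event.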
B8. Filter chip "save as view" shortcut
filter-chips.tsx + saved-views-dropdown.tsx exist. Add a small "Save current filters as view" inline button when there's an unsaved filter delta.
- Effort: S
B · EXPLORE
B9. Command-palette macros
"send EOI to last-viewed client", "create reminder in 3 days for current client", etc. Recorded by holding a key while performing actions, then invokable via cmd-K → "Run macro".
- Effort: L
- Risk: niche; design-heavy for low payoff. Push to backlog.
B10. Inline timezone awareness on dates
timezone-drift-banner.tsx warns of drift. Extend: every formatDate in detail headers shows Mon 14 May · 14:32 (your time) · 15:32 (client time) on hover when client timezone is known.
- Effort: S
B11. "Pin" comment/note
notes.service.ts is polymorphic; add a pinned: boolean column and surface pinned notes at the top of every tab.
- Effort: S
Section C — Genuine AI Integration Opportunities
Existing AI surfaces grounded in this repo: admin/ai and admin/ocr admin pages; email-draft.service.ts (compose suggestion via /api/v1/ai/email-draft); interest-scoring.service.ts (pure SQL — not AI today, candidate for AI uplift); berth-pdf-parser.ts (AI is the 3rd parser tier); expense-ocr.service.ts + receipt-scanner.ts (OCR + structuring); ai-budget.service.ts (cost-budget gate). The OpenAI SDK is wired but optional. All proposals below assume model calls go through a service that respects ai-budget and an explicit per-port enable flag.
C · HIGH-VALUE
C1. Auto-summarize a client / interest on detail open
When a rep opens a client/interest, summarize: "5 EOIs over 18 months, 2 archived, last touched 12 days ago by María, current stage is contract-out; last note suggests cash-flow concern; berth A4 is the primary." Plays directly into A3 (what-changed digest).
- Sketch: New /api/v1/ai/entity-summary endpoint accepting entityType + entityId, gathering activity log + notes + linked entities (already available via entity-activity.service.ts), prompting GPT for a 3-sentence summary. Cache by (entityId, last_activity_id) in Redis. Surface as a collapsible card above detail-header-strip.tsx. Always show "View source" → activity feed; never hide raw data.
- Effort: M
- Risk: confabulation (the model invents a number). Mitigate: structured prompt that returns JSON with claims: [{text, sourceActivityIds: []}], render only claims with non-empty source IDs. Hard 200-token cap.
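The render-side filter from the mitigation can be sketched as below. The interfaces mirror the claims shape named above; the extra check against the set of activity IDs that were actually sent to the model is an assumption added here, not something the proposal states.

```typescript
interface Claim { text: string; sourceActivityIds: string[] }
interface SummaryResponse { claims: Claim[] }

// Keeps only claims that cite at least one activity the server actually sent
// to the model, so an invented ID cannot make a claim look sourced.
function renderableClaims(
  response: SummaryResponse,
  knownActivityIds: ReadonlySet<string>,
): Claim[] {
  return response.claims
    .map((claim) => ({
      text: claim.text,
      sourceActivityIds: claim.sourceActivityIds.filter((id) => knownActivityIds.has(id)),
    }))
    .filter((claim) => claim.sourceActivityIds.length > 0);
}
```

Claims that survive the filter can link each source ID to the activity feed, which keeps the "View source" path intact.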
C2. Semantic search across notes, email bodies & document content
search-nav-catalog.ts is keyword-based. Reps searching "the client who was worried about wave exposure" can't find anything. The biggest practical AI win in a CRM.
- Sketch: Add an embeddings table (pgvector, already supported by Postgres). Embed notes.body, email_messages.text, signed-document OCR text, on insert via a new BullMQ embeddings worker (sibling to workers/ai.ts). Add /api/v1/search/semantic returning ranked entityIds. Toggle in cmd-K palette between "Exact match" and "Semantic". Cite source row per hit.
- Effort: L
- Risk: PII flowing to OpenAI embeddings. Use a local embedding model (gte-small via fastembed/onnx) per lib/ai design; never ship raw notes to OpenAI for embedding. Document this clearly in CLAUDE.md.
C3. Interest scoring uplift — hybrid SQL + lightweight learned model
interest-scoring.service.ts is pure rule-based (pipelineAge, stageSpeed, etc.). It works but reps disagree on signal weights. Train a per-port logistic regression on historical outcome (won/lost) using current factors + a few new ones (days since last note, last email response time, deposit pattern). Output a calibrated probability.
- Sketch: New nightly job train-interest-model in workers/ai.ts using a tiny library (no GPT, pure numerical). Persist coefficients in system_settings.interest_model. Service applies them at scoring time. Expose model AUC on admin/ai.
- Effort: L
- Risk: per-port data thin (cold start). Default to SQL weights until ≥30 closed interests exist. Document drift detection — refuse to serve a model with AUC ≤ 0.6.
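Applying persisted coefficients at scoring time, with the cold-start and AUC guards from the risk note, might look like the sketch below. The InterestModel shape and function names are hypothetical; treating a missing factor as 0 is an assumption, and the output is a plain logistic probability with no separate calibration step.

```typescript
interface InterestModel {
  intercept: number;
  coefficients: Record<string, number>; // keyed by factor name
  auc: number;
  trainedOnClosedInterests: number;
}

const MIN_CLOSED_INTERESTS = 30;
const MIN_AUC = 0.6;

// A model is served only with enough training data and an AUC above the floor.
function isModelUsable(model: InterestModel | null): model is InterestModel {
  return (
    model !== null &&
    model.trainedOnClosedInterests >= MIN_CLOSED_INTERESTS &&
    model.auc > MIN_AUC
  );
}

// Logistic regression: p = 1 / (1 + e^-(intercept + sum(coef * factor))).
function winProbability(model: InterestModel, factors: Record<string, number>): number {
  let z = model.intercept;
  for (const [name, coef] of Object.entries(model.coefficients)) {
    z += coef * (factors[name] ?? 0); // assumption: missing factor counts as 0
  }
  return 1 / (1 + Math.exp(-z));
}

// Falls back to the existing rule-based score when the model is unusable.
function scoreInterest(
  model: InterestModel | null,
  factors: Record<string, number>,
  ruleBasedScore: number,
): number {
  return isModelUsable(model) ? winProbability(model, factors) : ruleBasedScore;
}
```

Note that the rule-based score and the probability would need a shared scale before being shown side by side; the sketch does not address that.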
C4. Smart reminder suggestions from email content
Inbox (email-threads-list.tsx) already exists. When a client email contains "Let's chat next Tuesday" or "I'll get back to you in two weeks", surface a one-click "Create reminder for 21 May".
- Sketch: On new email_messages insert, the existing worker calls a new extractActionableDates(body) GPT prompt returning JSON {candidates: [{date, summary, confidence}]}. Surface as a banner in email-threads-list.tsx and in the matching interest's reminder rail. Never auto-create; always suggest.
- Effort: M
- Risk: dates in client signatures / disclaimers ("This email was generated on …") fool the model. Filter low-confidence; cap one suggestion per message.
C · MEDIUM
C5. "Why this berth?" + "Why not?" explanation for the recommender
berth-recommender.service.ts outputs a tier (A/B/C/D) + heat score. Reps can't always articulate to the client why a specific berth made the shortlist.
- Sketch: Add an LLM rephrasing step over the structured tier-reasoning JSON (already produced by the service). Returns plain-English: "Tier A: matches your yacht's 22m LOA + 5m beam, on the protected pontoon, currently available, no historical pushback." Render inside berth-recommender-panel.tsx. Source data is fully structured → low confabulation risk.
- Effort: S
- Risk: explanation must never contradict the structured tier. Add an automated unit assertion that the explanation contains the tier label and the dimensions field.
C6. Auto-draft post-meeting note from a voice memo
Reps walk back from a viewing with a 90s phone recording. Today they re-type. Drop the audio into the client's notes tab, Whisper transcribes + GPT summarizes into note-friendly bullet points.
- Sketch: Add audio-note-upload action to notes-list.tsx. Worker pipeline: upload via storage backend → Whisper → GPT bullets → insert as a draft note flagged ai_generated=true. Rep reviews + saves.
- Effort: M
- Risk: Whisper accent accuracy on Polish / Italian names. Always preserve the raw audio + transcript alongside the bullets; never delete the source.
C7. Translation for portal/client comms
Polish reps writing English. English reps writing Polish. Currently they paste into Google Translate.
- Sketch: Add a translate-icon button to compose-dialog.tsx and notes-list.tsx. One-click translates a draft into the client's preferred language (already tracked on clients.preferredLanguage). Show both versions side-by-side before send.
- Effort: S
- Risk: never auto-translate without rep confirmation, especially for any contractual phrasing.
C8. Document-template merge-field auto-population from client context
merge-fields.ts catalog + eoi-context.ts already do structured population. Where merge fields lack a structured source (admin templates with {{custom_intro}} blanks), an LLM could draft from notes + client profile. Rep then reviews.
- Sketch: New "Suggest draft" button on each blank merge field at template-fill time. Returns 2–3 phrasings; rep picks one.
- Effort: M
- Risk: see "what NOT to AI-ify" below — this is borderline. Allowable only for non-legal merge fields (greeting, intro paragraph), explicitly blocked for legal/financial blanks.
C9. Photo categorisation for berth/yacht uploads
Berth PDFs are parsed; raw photos uploaded to yacht/berth detail aren't tagged. AI auto-tagging would speed search for "yachts with a bowsprit" or "berths with a fixed davit".
- Sketch: On image upload via image-cropper-dialog.tsx's completion, queue a vision job that returns 3–5 tags (drawn from a controlled vocabulary). Store as photo metadata. Search filters use vocabulary terms.
- Effort: M
- Risk: vision-model bias / hallucinated features. Constrain output to a port-defined vocabulary list; reject anything outside it.
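Rejecting anything outside the vocabulary is a small post-processing step on the model output, sketched here. The function name is hypothetical, and the case-insensitive matching and cap of five tags are assumptions based on the "3–5 tags" wording above.

```typescript
// Normalises model output and keeps only tags in the port-defined vocabulary.
function constrainTags(modelTags: string[], vocabulary: string[], maxTags = 5): string[] {
  const allowed = new Set(vocabulary.map((t) => t.trim().toLowerCase()));
  const kept: string[] = [];
  for (const raw of modelTags) {
    const tag = raw.trim().toLowerCase();
    // Drop out-of-vocabulary tags and duplicates.
    if (allowed.has(tag) && !kept.includes(tag)) kept.push(tag);
    if (kept.length === maxTags) break;
  }
  return kept;
}
```

Tags the model invents ("helipad" on a port whose vocabulary has no such term) never reach photo metadata, so search filters only ever see vocabulary terms.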
C · EXPLORE
C10. Conflict / clause-mismatch detection across templates and signed copies
When admins edit a template, did the new clause contradict something they wrote in another template? When a counterparty returns a "with edits" PDF (currently uploaded via external-eoi-upload-dialog.tsx), did they alter a non-trivial clause?
- Sketch: Embed each clause; on template save, surface "this clause is 0.92 similar to but materially differs from a clause in Template X". On external-EOI upload, diff against the canonical template's text and flag deltas in a yellow strip with "Reviewed by [rep]" before the rep can finalize.
- Effort: L
- Risk: false confidence — see "what NOT to AI-ify". Acceptable only as an assistive flag, never as a green-light. UI copy must say "Possible material difference detected — review required" not "No material difference".
C11. Expense anomaly detection beyond expense-dedup.service.ts
expense-dedup.service.ts handles exact duplicates. Layered AI: detect amounts outside the rolling p95 for the same vendor, or trip-labels that look mismatched against expense date.
- Sketch: Nightly job computes per-vendor p95 and flags outliers as expense_anomaly reminders for the admin.
- Effort: M
- Risk: low — it's a soft flag, not an auto-action. No money movement is gated.
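The per-vendor p95 check can be sketched with a nearest-rank percentile. The Expense shape, the minHistory threshold and the function names are hypothetical; the caller is assumed to pass only the expenses inside whatever rolling window the job uses.

```typescript
// Nearest-rank percentile: smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1]!;
}

interface Expense { id: string; vendor: string; amount: number }

// Flags new expenses whose amount exceeds the vendor's historical p95.
// Vendors with fewer than `minHistory` past expenses are skipped (assumption).
function findAnomalies(history: Expense[], incoming: Expense[], minHistory = 20): Expense[] {
  const byVendor = new Map<string, number[]>();
  for (const e of history) {
    const list = byVendor.get(e.vendor) ?? [];
    list.push(e.amount);
    byVendor.set(e.vendor, list);
  }
  return incoming.filter((e) => {
    const amounts = byVendor.get(e.vendor);
    return (
      amounts !== undefined &&
      amounts.length >= minHistory &&
      e.amount > percentile(amounts, 95)
    );
  });
}
```

Each flagged expense would then become one expense_anomaly reminder; nothing is blocked or reversed automatically.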
C12. Smart vocabulary maintenance
vocabularies table holds lead-sources etc. Over time, reps spawn synonyms ("Inst.", "Instagram", "IG"). Cluster + suggest merges to the admin.
- Effort: S–M
Section C+ — What NOT to AI-ify (critical pass)
These places either carry liability if the model confabulates, or have a tighter ground-truth than AI can match. Refuse the AI proposal even if it sounds appealing.
- Legal text in EOIs, contracts, reservation agreements. eoi-context.ts, document-templates.service.ts, reservation-agreement-context.ts. The merge-field allow-list (VALID_MERGE_TOKENS in merge-fields.ts) exists precisely to keep AI out of legal copy. Never AI-generate a clause; never AI-paraphrase a clause "for readability"; never AI-translate a clause and present the translation as binding. Keep all legal text rep-authored or counsel-authored, period.
- Money flow. Invoice amounts, deposit allocation, currency conversions, FX rate selection (currency.ts, invoices.ts). The audit-26 multi-currency audit is in flight precisely because money math has to be deterministic and reconcilable. AI here = unrecoverable customer trust damage on a single mistake.
- Regulatory / GDPR responses. gdpr-export.service.ts, gdpr-bundle-builder.ts. Subject-access requests must return exactly what's in the database, with no LLM summarization layer that could omit a record.
- Signing decisions. The Documenso webhook (handleDocumentCompleted idempotency, audit-tier 1) is the source of truth that a contract was signed. AI must never infer signing state from email content. If the contract isn't in the webhook stream as DOCUMENT_COMPLETED, it isn't signed.
- Berth assignment auto-commit. berth-recommender.service.ts is intentionally pure SQL; the rules engine is intentionally suggest by default. Don't change that: auto-binding a berth to a client based on an LLM "judgment" is exactly the kind of mistake that ends in a refund and an apology. Recommend, never auto-assign.
- Mooring-number / dimensions parsing. The 3-tier PDF parser (AcroForm → OCR → AI) escalates to AI only when OCR confidence is low and a rep clicks "AI parse" and a mooring-mismatch confirmation is required at apply time (berth-pdf.service.ts). Don't lower any of those guards.
- Pipeline outcome ("won" / "lost"). This drives revenue reporting (reports.service.ts). Setting an outcome must remain a human decision. AI may suggest "this looks won based on the signed contract", but the human clicks the button.
- Email send-side text in template-driven send-outs. document-sends.service.ts rate-limits and audits. AI-generated wording is fine for free-form composes (compose-dialog.tsx) where the rep reviews. AI-generated wording is not fine on bulk template sends where one bad phrasing reaches 50 clients before anyone notices.
- Audit log entries. Audit logs (audit.service.ts) must remain raw structured events. Never let AI rewrite or compress them.
- Permission overrides. user_permission_overrides (new in this branch). AI must never suggest or auto-apply grant/revoke; that's a security primitive.
Implementation sequencing recommendation
If the team wants a 2-sprint shipping bundle aligned with the existing branch's themes:
- Sprint 1 (UX, low risk): A1, A4, A5, A6, A11, B1, B2, B3, B5; everything tagged S or low-M, no new infra.
- Sprint 2 (AI runway): Build the lib/ai skeleton (budget gate is in place; need a local-embedding pipeline + a worker) → land C1 (entity summary) and C5 (recommender explanation), both low-risk because they wrap structured data. Defer C2 (semantic search) until the embedding worker is proven.
- Backlog: A2 (smart undo; needs rules-engine reverse design), A7 (drag-to-stage on board), C3 (learned scoring; needs sufficient closed-deal volume per port), C10 (clause conflict; handle with extreme care).
Every C-section proposal should ship behind a per-port admin toggle (system_settings.ai_features.<name>) and respect ai-budget.service.ts. Every AI surface must cite its source rows or be flagged as "AI assistance".
— End of report —
22. Date/time + DST + scheduled jobs audit (datetime-auditor)
Date/time + DST + scheduled jobs audit — 2026-05-12
Scope: BullMQ cron schedules, reminder dueAt round-trip, TZ drift banner, server-side date formatting, ISO-8601, jobs that fire around midnight in user TZ vs server UTC, DST transitions, leap years, end-of-month.
CRITICAL
C1 — Reminder dueAt round-trip shifts by user-TZ offset on every edit
src/components/reminders/reminder-form.tsx:86,99,119
setDueAt(reminder.dueAt.slice(0, 16)); // line 86 — load
tomorrow.toISOString().slice(0, 16); // line 99 — default
new Date(dueAt).toISOString(); // line 119 — submit
reminder.dueAt is an ISO-8601 UTC string (...Z). slice(0, 16) keeps only the
first 16 characters, so 2026-05-15T13:30:00.000Z becomes 2026-05-15T13:30,
which is fed into a <input type="datetime-local"> that interprets the value
as local time. On submit, new Date('2026-05-15T13:30')
parses as local time and .toISOString() converts back to UTC, subtracting
the user's UTC offset. So in Warsaw (CEST, UTC+2) every save of an
existing reminder shifts the time backward by 2 h. Open + save again, it
shifts another 2 h. End result: a reminder created at "10:00 local" drifts
to 08:00 local, then 06:00, until it eventually crosses midnight onto the
previous day.
The "default tomorrow 9 AM" path has the same bug:
tomorrow.setHours(9,0,0,0) gives 09:00 local, then
.toISOString().slice(0,16) drops the Z so the input shows 07:00 (UTC)
to the user, who reads it as 07:00 local. On submit it stores 05:00 UTC.
The contact-log dialog at src/components/interests/interest-contact-log-tab.tsx:459-469
already implements the correct pattern (localIsoString building the local
HH:MM from getHours()/getMinutes()). Port it to reminder-form.tsx and
snooze-dialog.tsx. Same applies to any other future datetime-local
binding.
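For reference, the local-parts pattern can be sketched as below. This is an illustration in the same spirit as the contact-log helper, not a copy of it; pickerValueToIso is a hypothetical name for the submit-side conversion.

```typescript
// Formats an instant as "YYYY-MM-DDTHH:mm" in the browser's local zone,
// which is the value format <input type="datetime-local"> expects.
function localIsoString(iso: string): string {
  const d = new Date(iso);
  const pad = (n: number) => String(n).padStart(2, "0");
  return (
    `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}` +
    `T${pad(d.getHours())}:${pad(d.getMinutes())}`
  );
}

// Converts the picker value back to a UTC ISO string for the API.
// A zone-less "YYYY-MM-DDTHH:mm" string is parsed as local time by Date.
function pickerValueToIso(value: string): string {
  return new Date(value).toISOString();
}
```

With this pair, load followed by save returns the stored instant (at minute precision) in any browser timezone, so repeated open-and-save cycles no longer drift.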
C2 — BullMQ recurring jobs run in UTC, not in port-local time
src/lib/queue/scheduler.ts:66-72
await queue.upsertJobScheduler(
  job.name,
  { pattern: job.pattern }, // no `tz` option
  { data: {}, name: job.name },
);
BullMQ's RepeatOptions defaults tz to UTC when unset. Concrete
fallout for the Warsaw port (CET/CEST, UTC+1/+2):
| Pattern | Intent | Actual fire (CET / CEST) |
|---|---|---|
| 0 8 * * * (invoice-overdue, tenure-expiry) | "8 AM local" | 09:00 winter / 10:00 summer |
| 0 2 * * * (database-backup) | "2 AM local" | 03:00 winter / 04:00 summer |
| 0 4 * * * (session-cleanup, gdpr cleanup) | "4 AM local" | 05:00 winter / 06:00 summer |
| 0 3 * * 0 (backup-cleanup) | "Sunday 3 AM" | Sun 04:00 winter / 05:00 summer |
Twice a year (last Sun of March, last Sun of October) the local firing
time visibly shifts by an hour and admin docs ("daily check at 8 AM")
silently break. Fix: pass tz: process.env.SCHEDULER_TZ ?? 'Europe/Warsaw'
(or read per-port — see also C3) to every upsertJobScheduler. The
hourly/sub-hourly patterns (* * * * *, */N * * * *, 0 * * * *) are
TZ-invariant and don't need a tz.
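A possible helper for deciding which patterns get a tz is sketched below. The names are hypothetical, and "TZ-invariant" is interpreted narrowly as "hour, day-of-month, month and day-of-week are all wildcards", which assumes the port zone has a whole-hour UTC offset.

```typescript
interface SchedulerRepeatOptions { pattern: string; tz?: string }

// True when only the minute field is constrained, so the job fires every hour
// (or more often) regardless of which zone the pattern is evaluated in.
function isTzInvariant(pattern: string): boolean {
  const fields = pattern.trim().split(/\s+/);
  // 5-field cron: minute hour day-of-month month day-of-week
  return fields.length === 5 && fields.slice(1).every((f) => f === "*");
}

// Builds the repeat options to hand to upsertJobScheduler.
function repeatOptionsFor(pattern: string, schedulerTz: string): SchedulerRepeatOptions {
  return isTzInvariant(pattern) ? { pattern } : { pattern, tz: schedulerTz };
}

// Intended call site (not executed here):
//   await queue.upsertJobScheduler(
//     job.name,
//     repeatOptionsFor(job.pattern, process.env.SCHEDULER_TZ ?? 'Europe/Warsaw'),
//     { data: {}, name: job.name },
//   );
```

Passing tz unconditionally would also be correct; the helper only keeps the stored scheduler options minimal for jobs where the zone cannot matter.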
C3 — report-scheduler never advances next_run_at
src/lib/queue/workers/reports.ts:22-50, src/lib/services/reports.service.ts
The minutely scheduler selects WHERE next_run_at <= now(), enqueues a
generate-report job, and inserts a generated_reports row — but does
not bump scheduled_reports.next_run_at. There is no other write of
that column anywhere in the service layer or API. Effect: once a
scheduled report comes due, the worker re-queues it every minute,
forever, until a human zeros the row out. For weekly/monthly reports
this means an instant flood of duplicate emails to recipients.
After enqueueing, write a new next_run_at derived from the cron
expression (use cron-parser or equivalent; project already vendors
croner-style logic via BullMQ's repeat machinery). Wrap the SELECT +
UPDATE in a transaction with FOR UPDATE SKIP LOCKED so two scheduler
ticks racing on the same row can't double-fire.
HIGH
H1 — detectOverdue compares against UTC "today"
src/lib/services/invoices.ts:763
const today = new Date().toISOString().split('T')[0]!;
// ... lt(invoices.dueDate, today)
invoices.due_date is a DATE. Building "today" from toISOString()
returns the UTC calendar date. The cron fires at 08:00 UTC (= 09:00 / 10:00
local) so today-in-UTC and today-in-Warsaw agree at that moment, but if a
human ever calls detectOverdue between 00:00 and 02:00 local (still
yesterday in UTC), invoices that became overdue at local midnight are not
flagged until UTC rolls over. For a port west of UTC the same code would
flag invoices a day early.
Compute the comparison date in port-local time (Intl + formatToParts).
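A minimal sketch of the Intl + formatToParts approach follows. The function name is hypothetical and the zone would come from the port's configured timezone; full ICU data is assumed to be available in the runtime.

```typescript
// Returns the calendar date ("YYYY-MM-DD") of `instant` in the given IANA zone.
function localDateInZone(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  // Pick parts by type so the result does not depend on locale ordering.
  const get = (type: string) => parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}
```

detectOverdue could then compare invoices.dueDate against localDateInZone(new Date(), portTimezone) instead of the UTC date string.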
H2 — Server-side PDF/email date formatting has no timeZone
src/lib/pdf/templates/reports/*.ts, src/lib/pdf/templates/*.ts,
src/lib/email/templates/document-signing.ts:141
Many calls of the form new Date().toLocaleString('en-GB') or
new Date(...).toLocaleDateString('en-GB') with no { timeZone }
option. On a UTC-deployed Docker container the output is UTC even when
the PDF context is per-port-local. "Generated: 11/05/2026, 22:30:00" on
a report a Warsaw rep opens at 00:30 the next morning is confusing.
Pass { timeZone: portTimezone } (resolve from ports.timezone or
port_settings) into every server-side formatter.
H3 — Notification-digest TZ gate skips a day on DST spring-forward
src/lib/services/notification-digest.service.ts:79-83
The local-hour gate works correctly in steady state, but on the
spring-forward boundary (e.g. Warsaw 31 Mar, 02:00 → 03:00 CEST), if the
configured digest time is 02:00 it is skipped entirely — local hour
goes from 01 to 03. Conversely on fall-back (CEST → CET) at 03:00 → 02:00
a 02:00 digest fires twice in the same calendar day. Document the gap
or, better, gate on (port_id, local-date) last-sent rather than the
hour alone.
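The last-sent gate could be sketched as follows. Names are hypothetical; the sketch assumes the worker ticks at least hourly and that the caller persists the returned local date per port after a successful send.

```typescript
// Local hour (0-23) and calendar date of `instant` in an IANA zone.
function localHourAndDate(instant: Date, timeZone: string): { hour: number; date: string } {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour12: false,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
  }).formatToParts(instant);
  const get = (type: string) => parts.find((p) => p.type === type)?.value ?? "";
  return {
    hour: Number(get("hour")) % 24, // some runtimes report midnight as "24"
    date: `${get("year")}-${get("month")}-${get("day")}`,
  };
}

// Send when the local hour has reached the configured hour and nothing has
// been sent yet for this local calendar date.
function shouldSendDigest(
  now: Date,
  timeZone: string,
  digestHour: number,
  lastSentLocalDate: string | null,
): boolean {
  const { hour, date } = localHourAndDate(now, timeZone);
  return hour >= digestHour && lastSentLocalDate !== date;
}
```

On spring-forward the skipped 02:00 hour is caught by the first tick at 03:00; on fall-back the repeated 02:00 hour is suppressed because the local date has already been recorded.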
H4 — Reminders fire/list use new Date() against UTC-stored timestamps but UI shows port-local
src/lib/services/reminders.service.ts:87, 105, 515
lte(reminders.dueAt, new Date()) is correct (dueAt is timestamptz),
but processOverdueReminders runs every 15 minutes and emails users
the second the UTC instant matches. If a rep sets a reminder for "Friday
17:00" in Warsaw, the email lands 17:00 CEST → fine. But the email
template (notifications insert) renders the server time — same H2
issue. Verify the user-facing email body renders dueAt in the recipient's
preferred timezone (userProfile.preferences.timezone), not server UTC.
MEDIUM
M1 — TZ-drift banner endpoint asymmetry
src/components/dashboard/timezone-drift-banner.tsx:62-75
Reads from GET /api/v1/me (returns profile.preferences.timezone),
writes to PATCH /api/v1/users/me/preferences (a different preferences
JSONB row). Both endpoints exist and both ultimately update
user_profiles.preferences, so functionally fine — but having two
endpoints write the same blob with different validators (/me
allow-lists {dark_mode, locale, timezone, tablePreferences},
/users/me/preferences uses updateUserPreferencesSchema) means a key
accepted on one endpoint may be silently dropped on the other. Either
merge into a single endpoint or document which is canonical.
M2 — Alpine small-ICU risk for per-port Intl.DateTimeFormat({ timeZone })
notification-digest.service.ts localHourFor and any future per-TZ
formatter need full-ICU. If the Docker base is Alpine without full-icu,
named zones silently fall back to UTC and the catch swallows it. Add a
startup self-test confirming Intl.DateTimeFormat('en',{ timeZone: 'Europe/Warsaw'}).format(new Date()) differs from UTC.
M3 — Contact-log followUpAt validator is looser than reminders
src/lib/validators/interest-contact-log.ts:14,23
z.coerce.date() accepts unzoned strings. Tighten to z.string().datetime()
to match the direct reminders endpoint.
M4 — BR-060 follow-up uses raw ms-arithmetic for "days since"
src/lib/services/reminders.service.ts:438
(now - lastActivity) / 86_400_000 under/over-counts by 1 h across DST
boundaries. Cosmetic for 14-day windows; document the rounding bias.
M5 — Greeting hourly tick uses setInterval(3_600_000)
src/components/dashboard/dashboard-shell.tsx:113 — drifts across DST.
Use a recursive setTimeout keyed to next local hour boundary.
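A sketch of the recursive-timeout approach is shown below. Both function names are hypothetical; in a React component the returned cancel function would be called from the effect cleanup.

```typescript
// Milliseconds from `now` until the next local-time hour boundary.
function msUntilNextLocalHour(now: Date): number {
  const next = new Date(now.getTime());
  next.setMinutes(60, 0, 0); // minute 60 rolls over to the start of the next hour
  return next.getTime() - now.getTime();
}

// Re-arms itself after every tick instead of relying on a fixed interval.
function scheduleHourlyTick(onTick: () => void): () => void {
  let timer: ReturnType<typeof setTimeout>;
  const arm = () => {
    timer = setTimeout(() => {
      onTick();
      arm(); // recompute the delay from the current clock each time
    }, msUntilNextLocalHour(new Date()));
  };
  arm();
  return () => clearTimeout(timer); // cancel function for effect cleanup
}
```

Because the delay is recomputed from the wall clock on every tick, timer drift and DST shifts are corrected at the next boundary rather than accumulating.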
ISO-8601 conformance summary
- Reminder writes/emit: z.string().datetime() + .toISOString() ✓
- Contact-log writes: z.coerce.date(); loose, see M3.
- type="date" fields serialize as YYYY-MM-DD matching DB DATE. ✓
- PDF/email render: mixed; H2 covers the missing timeZone.
Round-trip recap (picker → DB → email)
- datetime-local value is local time, no TZ marker.
- new Date(v).toISOString() → UTC Z form to API.
- DB timestamptz stores the instant.
- Re-render to picker via localIsoString(iso) (build local YMD/HM from getHours() etc.), never iso.slice(0,16).
- Email/PDF render with { timeZone: portOrUserTz }.
C1 is the only place this breaks today. Once fixed plus C2/C3, the chain is consistent.
Out of scope
- No node-cron / croner jobs outside BullMQ.
- No Date.UTC construction; everything via new Date(...) / Date.now().
- No Temporal adoption; defer until Node 22 LTS unflags it.
24. File lifecycle + storage drift audit (file-lifecycle-auditor)
Audit — File lifecycle + storage drift
Scope: orphan blobs, stale folder rows, avatar cleanup, EOI signed-PDF orphans, brochure / berth_pdf version retention, storage-swap migration completeness, demoteSystemFolderOnEntityDelete, file_id orphans after document delete, GDPR-export ZIP retention.
Branch: feat/documents-folders @ 660553c. Read-only.
CRITICAL
C1. Avatar replacement leaks files rows + S3 blobs forever
src/app/api/v1/me/avatar/route.ts POST uploads a NEW file via uploadFile() and overwrites user_profiles.avatar_file_id — but never reads or deletes the previous id. Every "Replace photo" leaks one DB row + one blob, untethered (no client_id/yacht_id/company_id), so invisible to every existing UI sweep.
// no read of old avatarFileId, no cleanup
await db
.update(userProfiles)
.set({ avatarFileId: record.id, updatedAt: new Date() })
.where(eq(userProfiles.userId, ctx.userId));
Fix: SELECT the prior avatar_file_id, call deleteFile() (already handles ref-check + blob + audit), wrapped in try/catch so a stale-blob failure doesn't block the new avatar.
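The fix can be sketched with injected dependencies so the ordering is visible without the real DB layer. The AvatarDeps interface and replaceAvatar name are hypothetical; deleteFile here stands in for the existing helper of that name.

```typescript
// Hypothetical dependency seams so the flow can be shown without the real DB layer.
interface AvatarDeps {
  getAvatarFileId(userId: string): Promise<string | null>;
  setAvatarFileId(userId: string, fileId: string): Promise<void>;
  deleteFile(fileId: string): Promise<void>; // stands in for the existing deleteFile()
  logWarn(message: string, err: unknown): void;
}

async function replaceAvatar(deps: AvatarDeps, userId: string, newFileId: string): Promise<void> {
  // Read the previous pointer before it is overwritten.
  const previousFileId = await deps.getAvatarFileId(userId);
  await deps.setAvatarFileId(userId, newFileId);
  if (previousFileId === null || previousFileId === newFileId) return;
  try {
    // Delete only after the profile points at the new file.
    await deps.deleteFile(previousFileId);
  } catch (err) {
    // A stale-blob failure must not fail the upload; log it for a later sweep.
    deps.logWarn(`avatar cleanup failed for file ${previousFileId}`, err);
  }
}
```

Deleting after the pointer update means a failure at any step leaves the user with a valid avatar; the worst case is one orphan that the warning makes discoverable.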
HIGH
H1. handleDocumentCompleted put-before-insert leaks signed-PDF blobs on retry storms
src/lib/services/documents.service.ts:1131-1188. Sequence: storage.put → db.insert(files) → db.update(documents).set(signedFileId). The idempotency gate at line 1110 stops a second webhook from minting a second blob — but only if doc.status === 'completed' AND signedFileId is set, which requires step 3 to have run. If step 2 OR step 3 throws on attempt N, the blob from step 1 survives with no DB pointer; Documenso retries; the gate doesn't trip (status still not 'completed'); step 1 runs again with a fresh UUID storage path. Each retry compounds an orphan.
Fix: either insert the files row in a pending state BEFORE storage.put (so failure rolls back via FK / explicit cleanup), or reuse a stable storage key derived from documents.id so retries overwrite the same blob.
H2. deleteDocument strands fileId + signedFileId rows + blobs
src/lib/services/documents.service.ts:596-616 does db.delete(documents) only. Both file FKs are plain references() (no cascade, no SET NULL) — the document row vanishes but the files rows + blobs survive with no link back. For a cancelled/expired doc with signedFileId (the sent/partially_signed block at line 599 doesn't cover these), the signed contract PDF — containing PII — is permanently orphaned in storage.
Fix: in deleteDocument, also delete dependent files rows via deleteFile(), or refuse the delete if files attached (mirroring deleteFile's ref-check).
H3. Brochure versions: zero cleanup, ever
src/lib/services/brochures.service.ts:191 archiveBrochure only flips archivedAt + clears isDefault. No version-row delete, no blob delete. No "delete prior version" admin endpoint, no retention cron, no rolling cap. CLAUDE.md says "Archived brochures retain version history" — that's by design, but there's also zero path to ever drop one. With ~10 MB PDFs iterated monthly, linear unbounded growth.
Fix: admin deleteBrochureVersion(brochureId, versionId) endpoint (blob delete via getStorageBackend().delete() + row delete in tx); refuse to delete the only remaining non-archived version. Optionally brochure_version_retention_count system setting.
H4. berth_pdf_versions has no cleanup mechanism
Symmetric problem. src/lib/services/berth-pdf.service.ts inserts a fresh row + UUID-keyed blob per upload (line 213); old versions accumulate forever. current_pdf_version_id advances; history-by-design is unbounded-by-default. For a port with hundreds of berths reuploaded under parser iterations, this is the largest storage footprint in the system.
Fix: admin "Delete this version" action on the version-history list, gated so the current_pdf_version_id cannot be deleted. Storage delete + row delete in a tx.
MEDIUM
M1. files.client_id lacks an explicit onDelete — fragile
src/lib/db/schema/documents.ts:30: clientId: text('client_id').references(() => clients.id) (no onDelete). Migration 0000 records ON DELETE no action. The only existing client-delete path (client-hard-delete.service.ts:193) explicitly nullifies files.client_id first, so it works — but any future bulk-delete / port-teardown / dev script bypassing hardDeleteClient will FK-violate. Compare files.yacht_id + files.company_id, both set null (added in 0042).
Fix: new migration to ON DELETE SET NULL files.client_id. Removes the implicit invariant that hard-delete is the only legal path.
M2. demoteSystemFolderOnEntityDelete is wired for clients only
One caller (client-hard-delete.service.ts:236). No hardDeleteYacht / hardDeleteCompany exists today, so not currently broken — but it's a landmine when those flows ship. Both must call demoteSystemFolderOnEntityDelete(portId, 'yacht'|'company', id).
M3. Hard-deleted-client files become un-swept root orphans
client-hard-delete.service.ts:193 nullifies files.clientId and demotes the system folder to "{name} (deleted)". The file rows now have clientId=null + folder_id pointing at the demoted folder — discoverable in the demoted folder but never automatically dropped. The HARD delete of the client doesn't actually hard-delete their files. Inconsistent with the "hard" naming AND with GDPR Article 17.
Fix: mid-transaction (before the nullify), capture the affected file IDs; post-transaction call deleteFile() on each (handles blob + audit). Alternatively: nightly worker that drops file rows where every entity FK is null + no doc/expense/maint reference + created_at < N days.
M4. GDPR export cleanup retries forever on storage failure
src/lib/queue/workers/maintenance.ts:97-108. If storage.delete(row.storageKey) throws, the catch increments failed but does NOT delete the DB row. Next 4 AM run, same row reappears; same failure; same warn. No max-retry, no dead-letter, no admin escalation. A permanently broken storage path silently piles up infinite warns AND the GDPR-erasure obligation never completes.
Fix: track delete_attempts per row; after N failures either force-delete the DB row + log the orphan-blob to an admin-visible orphans table, or escalate at pino error + Sentry.
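One way to bound the retries, sketched under the assumption that a delete_attempts counter is added per row as proposed above. CleanupAction, nextCleanupAction and the maxAttempts threshold are hypothetical names; the caller would run this only after storage.delete has thrown.

```typescript
// Hypothetical outcome type for a failed storage delete.
type CleanupAction =
  | { kind: 'retry'; nextAttempts: number }
  | { kind: 'dead-letter'; nextAttempts: number };

/**
 * Decides what to do after one more failed delete.
 * `deleteAttempts` is the count stored on the row before this failure.
 */
function nextCleanupAction(deleteAttempts: number, maxAttempts: number): CleanupAction {
  const nextAttempts = deleteAttempts + 1;
  return nextAttempts >= maxAttempts
    ? { kind: 'dead-letter', nextAttempts } // stop retrying; record the orphan and escalate
    : { kind: 'retry', nextAttempts }; // leave the row for the next nightly run
}
```

On 'dead-letter' the worker would take one of the two paths named in the fix (force-delete the row and log the orphan blob, or escalate); the sketch deliberately leaves that choice to the caller.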
M5. migrate.ts table list has no drift guard
src/lib/storage/migrate.ts:52 explicitly admits: "The report_snapshots table called out in the audit does not exist yet. Add it here when it lands." This is a manual checklist with no enforcement — any future table that adds a storage_key/storage_path and forgets to extend TABLES_WITH_STORAGE_KEYS will silently leave its blobs behind on every backend swap.
Fix: integration test that diffs information_schema.columns WHERE column_name IN ('storage_key','storage_path') against TABLES_WITH_STORAGE_KEYS. Failing test forces an update before the new table can ship.
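The comparison at the heart of that test is small. A minimal sketch, assuming the information_schema query has already been run and mapped into plain objects; StorageColumn and findUnregisteredStorageColumns are hypothetical names, and RegisteredTable mirrors the { table, keyColumn, pkColumn } entry form quoted in the storage-pathing audit.

```typescript
// One row from the information_schema.columns query (hypothetical mapping).
interface StorageColumn {
  tableName: string;
  columnName: string;
}

// Entry shape of TABLES_WITH_STORAGE_KEYS as quoted elsewhere in this audit.
interface RegisteredTable {
  table: string;
  keyColumn: string;
  pkColumn: string;
}

/** Returns every discovered storage column that the migrator does not know about. */
function findUnregisteredStorageColumns(
  discovered: StorageColumn[],
  registered: RegisteredTable[],
): StorageColumn[] {
  const known = new Set(registered.map((r) => `${r.table}.${r.keyColumn}`));
  return discovered.filter((c) => !known.has(`${c.tableName}.${c.columnName}`));
}
```

The integration test would assert that the returned array is empty and print the offending table.column pairs when it is not.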
M6. deleteFolderSoftRescue: no per-row audit + opaque sibling-name collision
src/lib/services/document-folders.service.ts:283-326:
- Only the folder delete itself is audit-logged; the bulk re-parent of N documents + N files leaves no per-row trail. An auditor cannot reconstruct "which folder did this signed contract land in?"
- If a re-parented child folder's name collides with an existing sibling at the destination, the UPDATE throws on uniq_document_folders_sibling_name and the tx rolls back. Error propagates as a raw "duplicate key"; compare moveFolder, which catches via isSiblingNameConflict and returns a useful 409.
Fix: (a) emit one bulk audit row with metadata: { docsMoved, filesMoved, rescuedTo }; (b) wrap the UPDATE in the same conflict catch.
M7. listTree silently drops orphan folder rows
document-folders.service.ts:95 logs "listTree: orphan folder row … dropped from tree". Defensive — but the orphans aren't auto-healed and aren't surfaced anywhere. Post-soft-rescue this shouldn't happen, but if it does (race, manual SQL, future bug), the row hides forever.
Fix: daily maintenance worker counts documentFolders WHERE parent_id IS NOT NULL AND parent_id NOT IN (SELECT id FROM documentFolders) and emits a metric / log.
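The same check can be expressed in application code, for example to reuse inside listTree or a unit test. A minimal sketch, assuming folder rows reduced to id and parentId; FolderRow and findOrphanFolders are hypothetical names.

```typescript
// Hypothetical projection of a documentFolders row.
interface FolderRow {
  id: string;
  parentId: string | null;
}

/** Returns rows whose parentId points at a folder that is not in the set. */
function findOrphanFolders(rows: FolderRow[]): FolderRow[] {
  const ids = new Set(rows.map((r) => r.id));
  // Root folders (parentId === null) are not orphans.
  return rows.filter((r) => r.parentId !== null && !ids.has(r.parentId));
}
```

The worker would emit the length of the result as the metric and include the orphan ids in the log line so an admin can re-parent them.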
Summary
| Sev | Finding | File | Effort |
|---|---|---|---|
| CRIT | C1: Avatar replace leaks rows + blobs | api/v1/me/avatar/route.ts | XS |
| HIGH | H1: completed-webhook put-before-insert orphan | services/documents.service.ts:1131 | S |
| HIGH | H2: deleteDocument strands signed PDF | services/documents.service.ts:596 | S |
| HIGH | H3: Brochure versions: no cleanup ever | services/brochures.service.ts | M |
| HIGH | H4: Berth PDF versions: no cleanup ever | services/berth-pdf.service.ts | M |
| MED | M1: files.client_id lacks onDelete | schema/documents.ts:30 | XS migration |
| MED | M2: demoteSystemFolderOnEntityDelete client-only | services/document-folders.service.ts:733 | XS (future) |
| MED | M3: Hard-delete client leaves orphan files | services/client-hard-delete.service.ts:193 | S |
| MED | M4: GDPR cleanup loops on storage failure | queue/workers/maintenance.ts:97 | S |
| MED | M5: Migrate table list has no drift guard | lib/storage/migrate.ts:55 | S test |
| MED | M6: Soft-rescue: no per-row audit + opaque collision | services/document-folders.service.ts:283 | S |
| MED | M7: Orphan folder rows logged, never healed | services/document-folders.service.ts:95 | XS |
Biggest cumulative storage waste: H3 + H4 (uncapped version retention) and C1 (per-user avatar churn). Most dangerous correctness/GDPR findings: H1 (silent signed-PDF orphan under Documenso retry) and H2 (signed PII PDFs surviving document deletion).
28. Code quality + maintainability hotspots audit (maintainability-auditor)
Audit — Code Quality & Maintainability Hotspots (task #28)
Scope: cyclomatic complexity hotspots, files >500 lines, services violating SRP, monster components, cross-domain duplication, abandoned scaffolding. Read-only.
Top-line numbers: 9 source files >700 lines; 22 files >500 lines. TODO/FIXME/HACK markers: only 3 files (3 markers total) — drift is not the problem here; sheer file size and per-entity duplication are.
CRITICAL
C1. src/lib/services/documents.service.ts — 1982 lines, 33 exports, 30 imports, ~7 distinct concerns
One file owns: document CRUD, hub listing, signing send-flow (sendForSigning,
~200 lines, 10+ branches), manual upload (uploadSignedManually), 6 Documenso
webhook handlers (handleDocumentCompleted 224 lines / 11 branches,
handleRecipientSigned, …Expired, …Rejected, …Cancelled, …Opened),
template-driven wizard (createFromWizard), and aggregated-by-entity projection
(listInflightWorkflowsAggregatedByEntity + fetchWorkflowGroupRows). Single
strongest SRP violation in the codebase. Recommend split:
documents.service.ts (CRUD+detail), documents-signing.ts (send/cancel/manual-upload),
documents-webhook-handlers.ts (the 6 handlers), documents-aggregation.ts (the hub projection). Webhook handlers in particular are
inbound-event logic, not service CRUD, and dynamic-import circular deps with
interests.service.advanceStageIfBehind cross the boundary today.
C2. src/lib/services/search.service.ts — 2163 lines, single file
26 exports, 14 per-entity searchX helpers (clients, residential clients,
yachts, companies, interests, residential interests, berths, invoices, expenses,
documents, files, reminders, brochures, tags, notes, otherPorts), plus
expandGraph (~420 lines, 14+ branches), search orchestrator, and recent-search
storage. Cohesive in purpose, but no single dev can hold it all in their head.
Recommend: search/buckets/*.ts (one per entity), search/expand-graph.ts,
search/orchestrator.ts. Touching one bucket today forces reading 2000+ lines
of unrelated context.
C3. src/lib/services/notes.service.ts — 1121 lines, near-pure duplication
6 entity-type branches per operation (clients / interests / yachts / companies
/ residential_clients / residential_interests). The create function alone
(lines 689–846) is 158 lines of 6 copy-pasted insert-then-profile-lookup
blocks; same for update (847–1019) and deleteNote (1020+). A
tableForEntity() dispatcher is defined at line 82 then immediately silenced
(void tableForEntity; line 98), i.e. the abstraction was started, abandoned,
and the dead helper left in place. Aggregated listers
(listForClient/Yacht/Company/ResidentialClientAggregated) are 4 near-identical 100-line bodies.
Recommend: dispatch table { table, fk, link } keyed by entityType +
single generic insert/update/delete; collapses ~600 lines.
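A dispatch table of that shape could look roughly like the sketch below. Every table name, column name, and link path here is a placeholder invented for illustration, as are NOTE_TARGETS and buildNoteInsert; only the { table, fk, link } structure and the six entity types come from the finding above.

```typescript
type NoteEntityType =
  | 'clients'
  | 'interests'
  | 'yachts'
  | 'companies'
  | 'residential_clients'
  | 'residential_interests';

interface NoteTarget {
  table: string; // placeholder notes-table name
  fk: string; // placeholder foreign-key column on that table
  link: (entityId: string) => string; // placeholder detail-page path
}

// One entry per entity type replaces six copy-pasted branches per operation.
const NOTE_TARGETS: Record<NoteEntityType, NoteTarget> = {
  clients: { table: 'client_notes', fk: 'client_id', link: (id) => `/clients/${id}` },
  interests: { table: 'interest_notes', fk: 'interest_id', link: (id) => `/interests/${id}` },
  yachts: { table: 'yacht_notes', fk: 'yacht_id', link: (id) => `/yachts/${id}` },
  companies: { table: 'company_notes', fk: 'company_id', link: (id) => `/companies/${id}` },
  residential_clients: {
    table: 'residential_client_notes',
    fk: 'residential_client_id',
    link: (id) => `/residential/clients/${id}`,
  },
  residential_interests: {
    table: 'residential_interest_notes',
    fk: 'residential_interest_id',
    link: (id) => `/residential/interests/${id}`,
  },
};

/** Builds a backend-agnostic description of the insert for any entity type. */
function buildNoteInsert(entityType: NoteEntityType, entityId: string, body: string) {
  const target = NOTE_TARGETS[entityType];
  const values: Record<string, string> = { [target.fk]: entityId, body };
  return { table: target.table, values, link: target.link(entityId) };
}
```

Because NOTE_TARGETS is typed as Record<NoteEntityType, NoteTarget>, adding a seventh entity type to the union fails compilation until the table gains an entry, which is the guard the current ladders lack.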
C4. src/components/interests/interest-tabs.tsx — 959 lines, single file
OverviewTab is 415 lines of inline JSX (456–870). Inline helpers
MilestoneSection, MilestoneAdvanceButton, FutureMilestones,
EditableRow, InfoRow, useInterestPatch, useStageMutation,
humanizeStatus all share this file. Single file owns the entire detail-page
overview, milestone widget, mutation hooks, and tab definition. Recommend
split: interest-overview-tab.tsx, interest-milestones.tsx,
hooks/use-interest-patch.ts.
HIGH
H1. Two near-named template services live side-by-side
src/lib/services/document-templates.ts (955 lines — CRM template flow:
listTemplates, generateAndSign, EOI generation) and
src/lib/services/document-templates.service.ts (262 lines — Admin TipTap
template flow with audit-log versioning). Both export listTemplates,
getTemplateById, createTemplate, updateTemplate against different
schemas. Consumers import one or the other via near-identical, accident-prone paths. Strongly
recommend renaming the admin one to admin-document-templates.service.ts (it
already prefixes its functions with …AdminTemplate…).
H2. Per-entity component duplication is system-wide (4× scaffolding)
For each of clients / yachts / companies / interests there exist near-parallel:
<entity>-list.tsx, <entity>-columns.tsx, <entity>-filters.tsx,
<entity>-form.tsx, <entity>-detail-header.tsx, <entity>-card.tsx,
<entity>-tabs.tsx, <entity>-files-tab.tsx. Confirmed near-identical
pairs:
- client-files-tab.tsx vs company-files-tab.tsx: 88 lines each, only difference is the entity-key parameter (clientId vs companyId) in 6 spots. ~95% byte-identical. Should be <EntityFilesTab entityType=…>.
- client-list.tsx (350) / yacht-list.tsx (295) / company-list.tsx (308) / interest-list.tsx (469): same imports, same TanStack-table wiring, same bulk-action shape, parameterised only by columns + filters + form components.
A generic <EntityListShell columns={…} filters={…} form={…} /> would collapse
~1400 lines into ~400. Similarly forms: interest-form (756) + company-form
(706) share the same react-hook-form skeleton.
H3. src/lib/services/expense-pdf.service.ts — 987 lines, SRP-spanning
Mixes: query/fetch (fetchExpenseRows, resolveReceiptFiles), grouping
(groupRows, groupKey, computeTotals), image processing
(maybeResizeImage, streamToBuffer), and PDFKit layout primitives
(addHeader, addSummaryBox, addExpenseTable, addReceiptPages,
renderReceiptHeader, addReceiptErrorPage, addFooter). 17 functions, 3
unrelated concerns. Recommend: expense-pdf/data.ts, expense-pdf/layout.ts, expense-pdf/index.ts.
H4. src/components/search/command-search.tsx — 1177 lines, 10 inline subcomponents
CommandSearch (268 lines) + FilterChipRow, ChipButton, EmptyStateBeforeSearch,
ResultsRegion, ZeroState, QuickCreateButton, ResultRow, Badge,
SectionHeading, BucketSection, plus buildFlatRows (327 lines, branch-heavy).
The inline subcomponents are reusable in principle but private to this file by
virtue of co-location. Recommend: search/internal/{filter-chips,result-row,bucket-section,build-flat-rows,empty-states}.tsx. buildFlatRows deserves
its own file with its own test.
H5. src/lib/services/interests.service.ts — 1273 lines, 17 exports
Owns 6 state-transition mutations (changeInterestStage, advanceStageIfBehind,
setInterestOutcome, clearInterestOutcome, archiveInterest,
restoreInterest), berth-linking (linkBerth/unlinkBerth), tag setter, board
projection (listInterestsForBoard, ~75 lines), list+detail. State-transition
logic could move to interests-lifecycle.ts; board projection to
interests-board.ts. Two interest CRUD helpers
(getInterestById 112 lines, listInterests 184 lines) both build elaborate
shaped reads — they're load-bearing but should probably both run through a
single projection helper.
MEDIUM
M1. Cyclomatic-density hotspots (informal — branch-count per body)
- documents.service.handleDocumentCompleted: 224 lines, 11 branches.
- documents.service.sendForSigning: 200 lines, 10 branches.
- search.service.expandGraph: 420 lines, 14+ branches across entity types.
- documents.service.uploadSignedManually: ~110 lines.
- interests.service.changeInterestStage: ~140 lines.
- notes.service.create/update/deleteNote: 6 inline entity branches each.
M2. Abandoned scaffolding — void <identifier> silencing
The codebase has 7+ deliberate void <symbol> statements added to keep
imports/symbols around for future use:
- src/lib/services/notes.service.ts:98: void tableForEntity; (full helper abandoned)
- src/lib/services/alert-rules.ts:331: const _unused = { gt, desc, alertsTable }; void _unused; (3 stale imports)
- src/app/api/v1/clients/bulk/route.ts:227-228: void HIGH_STAKES_STAGES; void ({} as PipelineStage);
- src/app/api/v1/admin/email-templates/route.ts:91: void eq;
- src/app/api/v1/admin/website-submissions/route.ts:76: void lt;
- src/app/api/v1/interests/bulk/route.ts:134-135: void inArray; void withPermission;
Either the future-PR landed without removing the placeholder, or the abstraction was never built. Each is a small lint-clean-up; collectively they signal unfinished refactors. Decide per case: implement the dispatcher (notes), or delete the dead imports.
M3. Real TODO/FIXME — only 3 in the entire src tree
- src/lib/queue/workers/import.ts:13: // TODO(L2): implement import job handlers (worker is a stub).
- src/lib/queue/scheduler.ts:44: // TODO(L2): make per-user schedule configurable.
- src/components/interests/interest-detail.tsx:26: JSDoc remark, not a todo.
The import worker stub is the only real loose end — confirm whether import jobs are needed before shipping, otherwise delete the worker registration to avoid an empty queue.
M4. Cross-service implicit coupling via dynamic-import circles
documents.service imports advanceStageIfBehind from interests.service
statically; interests.service imports evaluateRule from
berth-rules-engine; berth-rules-engine calls services via await import(...) to dodge the cycle. The dynamic-import workaround masks circular
ownership: the rules engine is effectively the orchestrator of state changes
across documents + interests + invoices. Worth either (a) hoisting the rules
engine to a top-level coordinator that the services don't import back, or (b)
documenting the cycle explicitly in CLAUDE.md so the next dev doesn't break
it.
M5. Largest leaf components without inline subcomponents
- interest-form.tsx (756) and company-form.tsx (706) are single components. Both define schema + form + nested pickers in one file. Could benefit from interest-form-fields/{dimensions,category,picker}.tsx.
- interests/linked-berths-list.tsx (530) and documents/documents-hub.tsx (537) sit just above the threshold; readable but on the edge.
M6. Re-export shims (legacy import boundary)
src/components/clients/pipeline-constants.ts — "Re-export from the canonical
source so legacy imports keep working." Audit the consumer list and migrate
imports to the canonical path; remove the shim.
Notes / non-issues
- TODO/FIXME hygiene is excellent (3 markers across 148k LOC).
- The 18 services with audit.service.ts-style pattern are short and cohesive; no monster spread.
- Drizzle schema split (one file per domain in src/lib/db/schema/) is clean; relations.ts (953 lines) is large but central by design.
- dashboard-shell.tsx (243 lines) is not a monster: single composition surface, leaves widgets in their own files. Healthy pattern.
Suggested order of operations
- Rename document-templates.service.ts → admin-document-templates.service.ts (H1; one-day safety win).
- Build <EntityFilesTab entityType="…"> and delete the two copies (H2; warm-up).
- Replace notes.service entity-switch ladders with a dispatch table (C3).
- Split documents.service along the natural seams: CRUD / signing / webhooks / aggregation (C1).
- Split search.service into per-bucket files (C2).
- Split interest-tabs.tsx and command-search.tsx (C4, H4).
- Sweep void <symbol> placeholders (M2).
Total estimated reduction: ~3500 lines of code via deduplication + better split points, no functional change.
23. Multi-port super-admin flow audit (multi-port-auditor)
Audit: Multi-Port Super-Admin Flow (Task #23)
Scope: super-admin "otherPorts" search extension, port-switcher UX, cross-port report queries, every isSuperAdmin bypass path, accidental data bleed, X-Port-Id header handling, port_id default resolution from preferences, the super-admin-only /admin/ports listing.
Read-only audit. No edits made. Roughly ranked by blast-radius.
CRITICAL
C1. Port-switcher race — first request after navigation can hit the WRONG port
src/providers/port-provider.tsx:38-48, src/components/layout/user-menu.tsx:65-73, src/lib/api/client.ts:50-63.
PortProvider reads the URL slug at render and reconciles Zustand inside a useEffect. apiFetch reads useUIStore.getState().currentPortId synchronously. For a super-admin who is on /port-A/clients and clicks /port-B/clients (or hits a deep link from search/external nav), the first round of queries fires before the reconcile effect commits — sending X-Port-Id = port-A while the page chrome renders port-B. Listings come back from port-A and render inside port-B's shell ⇒ silent cross-port data bleed in the UI.
handlePortChange does invalidate React Query AND push the route, but setPort (Zustand setter) is sync — and the router.push is async. Any queries kicked off by the new route's components before the next tick can still read stale state on the initial mount. The reconcile happens on the second render.
Fix sketch: Have apiFetch derive portId from window.location.pathname FIRST and fall back to Zustand, not the reverse. The slug is authoritative; Zustand is a cache. (The current code only consults the URL when Zustand is empty.)
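The URL-first resolution order can be sketched as a pure function. This assumes the port slug is always the first path segment (as in /port-A/clients) and that a slug-to-id lookup is already available on the client; slugFromPathname, resolvePortId and the slugToId map are hypothetical names, not the current apiFetch internals.

```typescript
/** Extracts the first path segment, e.g. "/port-b/clients" gives "port-b". */
function slugFromPathname(pathname: string): string | null {
  const segments = pathname.split('/').filter((s) => s.length > 0);
  return segments.length > 0 ? segments[0] : null;
}

/**
 * Resolves the port id for an outgoing request.
 * The URL slug is authoritative; the cached store value is only a fallback
 * for routes whose first segment is not a known port slug.
 */
function resolvePortId(
  pathname: string,
  slugToId: ReadonlyMap<string, string>,
  cachedPortId: string | null,
): string | null {
  const slug = slugFromPathname(pathname);
  const fromUrl = slug !== null ? slugToId.get(slug) : undefined;
  return fromUrl ?? cachedPortId;
}
```

In the browser the caller would pass window.location.pathname and the Zustand value, so a request fired before the reconcile effect commits still carries the port shown in the address bar.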
C2. apiFetch slug-to-id fallback is dead for non-super-admins
src/lib/api/client.ts:18-40.
The fallback for "Zustand not hydrated yet" calls /api/v1/admin/ports. That endpoint has requireSuperAdmin(ctx, 'admin.ports.list') (src/app/api/v1/admin/ports/route.ts:16). For a port director on a hard refresh, the request 403s, resolvePortIdFromSlug returns null, apiFetch ships the request with no X-Port-Id header — and withAuth then falls back to preferences.defaultPortId, which (per next finding) is also unwritable. End state for the user: a 400 "Port context required" on every initial request after a cold reload, until Zustand re-hydrates from localStorage. Suggest a public/authed /api/v1/me/ports lookup that is permission-free.
C3. defaultPortId preference is read by withAuth but the /me PATCH allow-list refuses to write it
src/lib/api/helpers.ts:160-164 reads (profile.preferences as { defaultPortId?: string })?.defaultPortId as the X-Port-Id fallback.
src/app/api/v1/me/route.ts:45-66 defines preferences with z.object({...}).strict() and the allow-list ALLOWED_PREF_KEYS = new Set(['dark_mode', 'locale', 'timezone', 'tablePreferences']) at line 154. defaultPortId is silently stripped at every write. The fallback in withAuth is therefore dead — preferences.defaultPortId can only ever be set by a hand-rolled db.update. For super-admins this means: no header ⇒ no portId ⇒ ctx.portId = '' ⇒ every WHERE port_id = '' returns empty. Mild UX bug for super-admins but silent. Either remove the dead fallback or add defaultPortId to the strict schema + allow-list.
HIGH
H1. searchOtherPorts ignores per-port ACL for super-admin extension (theoretical, currently fine)
src/lib/services/search.service.ts:1232-1314. The docstring at line 31 promises "ports the user can access other than portId". The implementation just excludes excludePortId and joins every other row in ports. Today super-admins can access every port, so the behavior matches. Risk: if a future "regional super-admin" role lands and reuses this code path (opts.includeOtherPorts && opts.isSuperAdmin) the leak is total — no ACL filter. Recommend passing in the set of accessible portIds as a parameter and using it in the port_lookup CTE WHERE, even though the current gate is binary.
H2. /api/v1/admin/users/[id]/permission-overrides PUT — port directors can promote anyone in their port to "owns everything"
src/app/api/v1/admin/users/[id]/permission-overrides/route.ts:153-244.
The route gates on admin.manage_users (port-scoped), and rejects self-target (line 163) + targets not assigned to the same port (line 173). But there is no guard preventing a port director from writing admin.permanently_delete_clients: true, system_backup: true, manage_users: true, etc. onto a different user in the same port — and then logging in as that user (or asking that user) to act with elevated permissions. Self-target is blocked but co-conspirator escalation is not. Mitigation idea: cap the overrides a non-super-admin can set to the leaves they themselves hold (effectively ctx.permissions ∩ overrides). The audit log is recorded, so this is detectable post-hoc, but not prevented.
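The proposed cap could be sketched as below. The flat PermissionMap keyed by dotted leaf name is an assumption about the override payload, and capOverrides is a hypothetical helper. The sketch also assumes that revoking a leaf (false) is always allowed while granting (true) requires the caller to hold that leaf; that policy is an illustration, not something the audit specifies.

```typescript
// Assumed shape: flat map of permission leaf name to granted/revoked.
type PermissionMap = Record<string, boolean>;

interface CappedOverrides {
  allowed: PermissionMap; // overrides that may be written
  rejected: string[]; // leaves the caller tried to grant without holding them
}

function capOverrides(
  requested: PermissionMap,
  callerPerms: PermissionMap,
  callerIsSuperAdmin: boolean,
): CappedOverrides {
  const allowed: PermissionMap = {};
  const rejected: string[] = [];
  for (const [leaf, value] of Object.entries(requested)) {
    const callerHoldsLeaf = callerPerms[leaf] === true;
    // Assumption: revocations pass through; grants need the caller's own leaf.
    if (!value || callerIsSuperAdmin || callerHoldsLeaf) {
      allowed[leaf] = value;
    } else {
      rejected.push(leaf);
    }
  }
  return { allowed, rejected };
}
```

Returning the rejected leaves lets the route answer with a 403 that names them, instead of silently dropping part of the request.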
H3. AdminLayout vs admin-API permission asymmetry
src/app/(dashboard)/[portSlug]/admin/layout.tsx:31-33 redirects every non-super-admin away from /[portSlug]/admin/.... Meanwhile /api/v1/admin/** endpoints are mostly gated on admin.manage_settings / admin.manage_users / admin.view_audit_log — leaves that the port-director role holds. So a port director can hit the APIs (via curl, scripts, or non-/admin UI surfaces such as settings/) but the matching UI is hidden behind a super-admin redirect. Pick a side: either gate the API endpoints on requireSuperAdmin, or let port directors into the corresponding sub-pages of /admin/ (alerting on the ones that should remain super-admin only — backup, queues, storage, ports, invitations).
H4. Super-admin with empty ctx.portId silently filters to zero rows
src/lib/api/helpers.ts:166-168 — only non-super-admins are blocked when portId is null. A super-admin without an X-Port-Id header AND without a preferences.defaultPortId (which is currently every super-admin per C3) gets ctx.portId = ''. Downstream services that do WHERE port_id = ${portId} silently return empty data, which is harmless. But endpoints that BRANCH on isSuperAdmin ? undefined : ctx.portId (e.g. error-events route.ts:32) will hand undefined to the service and return EVERY tenant's rows. Currently only the error-events listing does this — but the pattern is risky. A scoped super-admin with the wrong header today sees one port; without the header they see ALL ports — surprising to admins debugging "why am I seeing port-X data on port-Y?". Recommend an explicit ?allPorts=true opt-in on those endpoints rather than coupling cross-port reads to a missing header.
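A sketch of the explicit opt-in, so that a missing header can never widen the scope on its own. PortScope and resolvePortScope are hypothetical names; the context object is reduced to the two fields the finding discusses, and the raw allPorts query value is passed in as a string or null.

```typescript
type PortScope = { kind: 'single'; portId: string } | { kind: 'all' };

interface ScopeContext {
  isSuperAdmin: boolean;
  portId: string; // '' when no port could be resolved
}

/**
 * Cross-port reads require ?allPorts=true AND super-admin.
 * Anything else must carry a concrete port id.
 */
function resolvePortScope(ctx: ScopeContext, allPortsParam: string | null): PortScope {
  if (allPortsParam === 'true') {
    if (!ctx.isSuperAdmin) throw new Error('allPorts requires super-admin');
    return { kind: 'all' };
  }
  if (ctx.portId === '') throw new Error('Port context required');
  return { kind: 'single', portId: ctx.portId };
}
```

With this in place a super-admin who forgets the header gets the same "Port context required" error as everyone else rather than every tenant's rows.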
MEDIUM
M1. Port switcher only invalidates queries, doesn't abort in-flight ones
src/components/layout/user-menu.tsx:65-73. queryClient.invalidateQueries() marks queries stale but lets in-flight ones finish and write into the cache. If a long-running fetch (e.g. PDF generation, expensive report) was started under port-A and resolves after the user switches to port-B, the cache entry is now port-A data keyed on a query that the new page treats as port-B. Worth pairing with cancelQueries() and a re-key on portId (most query keys appear to not embed portId).
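Re-keying on portId can be as small as a key factory. portQueryKey is a hypothetical helper and the ['port', portId, ...] prefix is an assumed convention, not the codebase's current key layout.

```typescript
/**
 * Builds a query key that always embeds the port id, so a response that
 * started under port A can only ever be cached under a port-A key.
 */
function portQueryKey(portId: string, ...parts: unknown[]): unknown[] {
  return ['port', portId, ...parts];
}
```

A late port-A response then lands under ['port', 'port-a', …], which the port-B page never reads. The shared prefix would also give the switcher a single key to target when it cancels or invalidates the previous port's queries, assuming the query client matches keys by prefix.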
M2. /api/v1/expenses/export/parent-company lost its isSuperAdmin guard
src/app/api/v1/expenses/export/parent-company/route.ts:9-12. The comment says "Hard isSuperAdmin check used to lock out port admins who held expenses.export = true" — but the check is no longer in the route body, it now relies on the perm gate alone. The service exportParentCompany is single-port (filters expenses.port_id), so this is not a cross-port leak today. But the doc-vs-code drift should be reconciled either by adding requireSuperAdmin back or by deleting the stale comment.
M3. Search "otherPorts" cross-port hits expose port-level metadata to ALL super-admin queries
src/lib/services/search.service.ts:1862-1864, src/app/api/v1/search/route.ts:20. Toggle includeOtherPorts defaults to false — but any super-admin can flip the query param. The merge into SearchResults.otherPorts returns portId/portSlug/portName/type/id/label/sub from up to 5 other ports per request without rate-limiting the cross-port enumeration. Pairs with the existing search rate-limit (if any) — confirm and add a tighter ceiling on searchOtherPorts(query, limit). Currently limit defaults to whatever the searchQuery schema permits.
M4. Super-admin dashboard redirect always picks first port alphabetically
src/app/dashboard/page.tsx:24-27 — db.query.ports.findFirst({ orderBy: portsTable.name }). Predictable and stable, but ignores any "last-used port" signal. Combined with C3, a super-admin who manually picks port-B then closes the tab returns to port-A on next login. Cosmetic but disorienting. Easiest fix: persist last_used_port_id in userProfiles.preferences and read it here.
M5. Webhook + document workers fan out to ALL super-admins for in-app notifications
src/lib/queue/workers/webhooks.ts:264, src/lib/queue/workers/documents.ts:73. Both fetch every isSuperAdmin=true AND isActive=true user to send notifications. Not a security issue; flagging because a future "regional super-admin" rollout will make the broadcast list quietly cross-tenant. Wrap the queries in a notifySuperAdmins(portId) helper now so the porting work is one diff later.
M6. /admin/ports/[id] PATCH lets super-admin mutate any port without the rate-limit gate
src/app/api/v1/admin/ports/[id]/route.ts:34-50: no withRateLimit on a PATCH that touches every port-wide setting (timezone, currency, branding…). Lower priority because callers are few and trusted, but pairs naturally with the audit log.
M7. AuthContext has no accessiblePortIds set
Every cross-port-aware code path re-derives "which ports can this user touch?" from userPortRoles or isSuperAdmin. Hoist into AuthContext (computed once in withAuth) so future endpoints don't have to re-implement the resolution and so cross-port filters can apply inArray(table.portId, ctx.accessiblePortIds) uniformly.
Findings that audit clean
- /api/v1/admin/ports GET/POST correctly require requireSuperAdmin (route.ts:16,28).
- /api/v1/admin/ports/[id] correctly enforces port-in-scope for non-super-admins (assertPortInScope, line 15-20).
- /api/v1/admin/invitations correctly rejects port-director-minted super-admin invites (line 40-42).
- /api/v1/admin/audit is strictly port-scoped (line 40); no cross-tenant peek even for super-admins.
- withAuth correctly refuses requests where the body tries to set portId (header-only); body-based portId is documented as forbidden (line 156-159).
- Reports service consistently uses ctx.portId in WHERE clauses (reports.service.ts:103-163); no super-admin cross-port aggregation paths.
- Public berth/inquiry endpoints take their portId from a query param / dedicated header, never from auth context; correctly decoupled from session port.
Recommended next steps (in order)
- Fix C1 by making the URL slug authoritative inside apiFetch.
- Fix C2 with a small /api/v1/me/accessible-ports endpoint usable by every authed user.
- Add defaultPortId to the /me PATCH allow-list (C3), or strip the fallback from withAuth.
- Add the "overrides ∩ caller's own perms" cap to permission-overrides PUT (H2).
- Reconcile AdminLayout vs admin-API gating (H3): write one document of which leaves are super-admin only.
- Hoist accessiblePortIds into AuthContext (M7) ahead of the next cross-port feature.
33. S3 vs internal DB pathing + storage routing audit (storage-pathing-auditor)
Audit — S3 vs Internal DB Pathing + Storage Routing
Scope: src/lib/storage/*, every getStorageBackend() consumer, migration script, magic-byte enforcement, encryption-at-rest boundary.
Date: 2026-05-12
Boundary summary (what lives where)
- In DB (Postgres): file metadata only, i.e. files.storage_path, berth_pdf_versions.storage_key, brochure_versions.storage_key, gdpr_exports.storage_key, backup_jobs.storage_path, user-avatar FK (user_profiles.avatar_file_id → files), document signing state (documents.signed_file_id). AES-256-GCM-encrypted secrets: system_settings.storage_s3_secret_key_encrypted, storage_proxy_hmac_secret_encrypted, email_accounts.credentials_enc, webhooks.secret, ocr_config.api_key_encrypted. No BYTEA / JSONB blobs found (grep BYTEA → 0).
- In backend (S3/filesystem): every uploaded blob, i.e. signed PDFs (buildStoragePath(slug,'eoi-signed',…)), per-berth PDFs (berths/{id}/…), brochures, avatars, GDPR exports, pg_dump backups, expense receipts, generated reports, template source PDFs, send-out attachment fallbacks.
- Routing: getStorageBackend() reads global system_settings.storage_backend ('s3'|'filesystem'), caches by config fingerprint, invalidated via resetStorageBackendCache() on settings write or migration flip. Code never imports minio/Client outside s3.ts (verified; only the legacy buildStoragePath helper survives in src/lib/minio/index.ts). Interface methods: put/get/head/delete/listByPrefix/presignUpload/presignDownload; both backends implement all 7.
CRITICAL
C1. backup_jobs.storage_path missing from TABLES_WITH_STORAGE_KEYS — silent backup loss on backend swap
src/lib/storage/migrate.ts:55-60 lists only files, berth_pdf_versions, brochure_versions, gdpr_exports. backup_jobs.storage_path (pg_dump artefacts written by src/lib/services/backup.service.ts:54+72) is not in the list. Flipping S3 → filesystem (or vice-versa) leaves every historical database backup pointing at the old backend — getBackupDownloadUrl(id) will 404 / NoSuchKey, and the admin won't know until they try to restore. This is the worst category of data loss because backups are the recovery path of last resort. The comment in migrate.ts:51 calls out report_snapshots as a future addition but mentions nothing about backup_jobs. Add { table: 'backup_jobs', keyColumn: 'storage_path', pkColumn: 'id' } and ship the line with a smoke test.
C2. Orphan-blob risk: every backend.put runs outside the db.insert(files) transaction
Pattern repeated across 9+ services (files.ts:68-92, documents.service.ts:833-854 and 1134-1183, external-eoi.service.ts:71-96 — comment at L67-70 explicitly acknowledges "orphan reaper handles those" but no reaper exists, invoices.ts:603, document-templates.ts:537,674, reports.service.ts:231, gdpr-export.service.ts:169, backup.service.ts:62, berth-pdf.service.ts:229). Sequence is: PUT bytes → DB INSERT. If insert fails or the process dies in between, the blob is permanent and unreferenced. Only handleDocumentCompleted (documents.service.ts:1110) has an early-return idempotency gate; the rest leak. Over months of operation an S3 prefix accumulates dozens-to-hundreds of orphans that pay storage cost forever and survive every backup-restore. Add an orphan-reaper maintenance job that walks listByPrefix() against the union of all storage_* columns and deletes blobs older than 24 h without a DB pointer. Also wrap the put + insert pairs in a try/catch that explicitly deletes on insert failure (cheap defense in depth).
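The cheap defense-in-depth half of that recommendation (delete the blob when the insert fails) might look like the sketch below. BlobStore is a hypothetical two-method slice of the storage interface and putThenInsert is an invented helper; the DB insert is passed in as a callback so the sketch stays independent of the ORM.

```typescript
// Hypothetical narrow view of the storage backend: only what this helper needs.
interface BlobStore {
  put(key: string, body: Uint8Array): Promise<void>;
  delete(key: string): Promise<void>;
}

/**
 * Writes the blob, then runs the DB insert. If the insert throws, the blob is
 * removed again (best effort) and the original error is rethrown.
 */
async function putThenInsert<T>(
  store: BlobStore,
  key: string,
  body: Uint8Array,
  insertRow: () => Promise<T>,
): Promise<T> {
  await store.put(key, body);
  try {
    return await insertRow();
  } catch (err) {
    // Compensation can itself fail; swallow that error and surface the insert
    // failure. A periodic orphan reaper remains the backstop for this case.
    await store.delete(key).catch(() => undefined);
    throw err;
  }
}
```

This does not cover a process crash between put and insert, which is why the reaper job is still needed alongside it.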
C3. S3 backend stores blobs without server-side encryption (SSE-S3 / SSE-KMS)
S3Backend.put() (src/lib/storage/s3.ts:191) passes only Content-Type to client.putObject. No x-amz-server-side-encryption header. Bytes-at-rest encryption depends entirely on the bucket's default-encryption policy, which is invisible to the application — a customer who provisions a MinIO/B2/R2 bucket without server-side encryption gets cleartext signed contracts, GDPR exports, and pg_dump archives sitting on disk. Same audit posture as SMTP/IMAP creds (which are AES-GCM in the DB) demands the same guarantee for the blob plane. Either add ServerSideEncryption: 'AES256' to every putObject call, or surface a boot-time check that asserts the bucket has default-encryption enabled and refuses to start otherwise (similar to the MULTI_NODE_DEPLOYMENT guard on FilesystemBackend).
HIGH
H1. Berth-PDF presigned-upload keys are not port-scoped
src/app/api/v1/berths/[id]/pdf-upload-url/handlers.ts:58 builds berths/{berthId}/uploads/{uuid}_{name} — no leading ${portSlug}/. Result: the optional port-binding (p field on the HMAC token, enforced in filesystem.ts:184-188 and documented in index.ts:43-49) cannot be wired here, and the storage-key namespacing convention diverges from buildStoragePath (which always prefixes the port slug). Tenant isolation today relies on the up-front berths.portId === ctx.portId check before mint, but the defense-in-depth port-binding is unwired. Normalise the key to ${portSlug}/berths/... and pass portSlug into backend.presignUpload.
H2. presignDownload callers never pass portSlug — port-binding token guard is dead code
presignDownloadUrl(...) (storage/index.ts:233) accepts portSlug and only 1 of ~12 callers uses it. files.ts:117,128, backup.service.ts:115, portal.service.ts:351, reports.service.ts:170, gdpr-export.service.ts:224,282 all pass undefined. The filesystem-proxy GET will therefore accept any valid HMAC token regardless of the storage-key's port prefix. The check is genuinely defensible (see filesystem.ts:179) but never engaged. Plumb the active port slug through every call site, or remove the optional p field and the verifier code so the contract isn't misleading.
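To make the dead-code point concrete, here is a minimal sketch of an HMAC token with an optional port binding. The payload shape, the sign and verify helpers, and the dotted token format are assumptions for illustration only; the actual format in filesystem.ts may differ.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape, loosely following the `p` (port) field described above.
interface ProxyTokenPayload {
  k: string; // storage key
  op: "get" | "put";
  p?: string; // optional port slug binding
  exp: number; // expiry, ms since epoch
}

function sign(payload: ProxyTokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const mac = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${mac}`;
}

function verify(
  token: string,
  secret: string,
  op: "get" | "put",
  now: number,
): ProxyTokenPayload | null {
  const [body, mac] = token.split(".");
  if (!body || !mac) return null;
  const expected = createHmac("sha256", secret).update(body).digest();
  const given = Buffer.from(mac, "base64url");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as ProxyTokenPayload;
  if (payload.op !== op || payload.exp <= now) return null;
  // Port-binding guard: it can only fire when the caller minted the token with `p`.
  if (payload.p !== undefined && !payload.k.startsWith(`${payload.p}/`)) return null;
  return payload;
}
```

A token minted without p passes verification for any key prefix, which is the situation the finding describes: the guard exists but most call sites never arm it.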
H3. S3Backend.put and backup.service buffer entire blobs into memory
s3.ts:187 (Buffer.isBuffer(body) ? body : await streamToBuffer(body)) and backup.service.ts:60-62 (concatenates the entire pg_dump dump into memory before put). For a multi-GB database dump the worker OOMs. Comment at s3.ts:184-187 explicitly says "typical files are under 50MB" but runPgDump writes a dump file whose size scales with the tenant. Use client.fPutObject (file-path streaming) for backups; for streamable callers expose a putStream(key, stream, sizeBytes, opts) overload that pipes without streamToBuffer.
H4. Migrator's copyAndVerify double-buffers every blob and has no streaming hash
storage/migrate.ts:170-204 reads source → Buffer, sha256, put, then re-reads target → Buffer, sha256 again. For a 5 GB pg_dump (see C1 — once added) this allocates ~10 GB of heap. The sha256-verify round-trip is the right idea; pipe through crypto.createHash on both legs, never buffer.
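A minimal sketch of the streaming hash, using only Node's built-in crypto and stream modules. The helper names sha256OfStream and sameContent are hypothetical; how the source and target streams are obtained from the storage backends is left out.

```typescript
import { createHash } from "node:crypto";
import { Readable } from "node:stream";

// Hash a stream chunk by chunk; memory use is bounded by the chunk size, not the blob size.
async function sha256OfStream(stream: Readable): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of stream) {
    hash.update(chunk as Buffer);
  }
  return hash.digest("hex");
}

// Hypothetical verify step: compare digests of the source and target legs without buffering either.
async function sameContent(source: Readable, target: Readable): Promise<boolean> {
  const [a, b] = await Promise.all([sha256OfStream(source), sha256OfStream(target)]);
  return a === b;
}
```

In the migrator the source-side hash could also be computed while the bytes are being piped into the target put, so the source is read only once; that wiring depends on the backend API and is not shown here.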
H5. S3Backend.presignUpload lacks content-type / content-length binding
s3.ts:249-256 only calls presignedPutObject(bucket, key, expiry). The signed URL does not bind Content-Type or Content-Length — a browser can PUT 1 GB of arbitrary bytes against an EOI-signed key. Caps and magic-byte checks fire only on the register call afterwards (registerBrochureVersion and uploadBerthPdf HEAD-then-stream-first-5-bytes path). That's sufficient for the two consumers today, but the gate is one-deep — a future caller that forgets to wire a register endpoint exposes raw S3 directly. Switch to MinIO presignedPostPolicy with content-length-range + Content-Type conditions so the binding is on the signature itself.
MEDIUM
M1. CLAUDE.md drift on "TABLES_WITH_STORAGE_KEYS populated in 9a5ba87"
CLAUDE.md says the migrator covers "every blob in files, berth_pdf_versions, brochure_versions, gdpr_exports". Verified true — but backup_jobs is the missing 5th (see C1). Update the doc + add a unit test that asserts the array matches the set of tables with a storage_* column.
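The suggested unit test boils down to a set comparison. The sketch below assumes a hand-written schema summary and illustrative column names; a real test would derive the table list from the Drizzle schema, and the constant's contents here are taken from the CLAUDE.md wording quoted above rather than from the code.

```typescript
// Hypothetical schema summary: table name -> column names (column names are illustrative).
const schemaColumns: Record<string, string[]> = {
  files: ["id", "storage_path"],
  berth_pdf_versions: ["id", "storage_key"],
  brochure_versions: ["id", "storage_key"],
  gdpr_exports: ["id", "storage_key"],
  backup_jobs: ["id", "storage_path"],
  clients: ["id", "name"],
};

// Assumed current contents of the migrator's list.
const TABLES_WITH_STORAGE_KEYS = ["files", "berth_pdf_versions", "brochure_versions", "gdpr_exports"];

// Every table that owns at least one storage_* column.
function tablesWithStorageColumns(schema: Record<string, string[]>): string[] {
  return Object.entries(schema)
    .filter(([, columns]) => columns.some((c) => c.startsWith("storage_")))
    .map(([table]) => table)
    .sort();
}

// Tables the migrator would silently skip.
function missingFromMigrator(schema: Record<string, string[]>, listed: string[]): string[] {
  const covered = new Set(listed);
  return tablesWithStorageColumns(schema).filter((t) => !covered.has(t));
}
```

A test that asserts missingFromMigrator(...) is empty would fail today and name the uncovered table, which is the drift this finding describes.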
M2. email-compose.service.ts:124 reads attachment bytes into a Buffer
Each attachment under the email_attach_threshold_mb cap is fetched via storage.get(...) and concatenated. With multiple recipients × multiple attachments the worker holds N × size MB simultaneously. Stream into nodemailer's content: <Readable> API directly.
M3. UUID storage keys never check existence before put (no If-None-Match: "*")
crypto.randomUUID() collision is astronomical, but a buggy caller passing a duplicate key (or a re-run of a worker after a partial DB rollback) silently overwrites. Cheap belt: pass If-None-Match: '*' (S3) or O_EXCL (filesystem) — surfaces double-writes loudly.
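For the filesystem side, Node's "wx" open flag gives exclusive-create semantics (O_CREAT | O_EXCL). The following is a small sketch with a hypothetical putExclusive helper and a flat key; the real backend would also need to create parent directories and validate the key.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write a blob only if the key does not exist yet.
function putExclusive(root: string, key: string, body: Buffer): void {
  try {
    writeFileSync(join(root, key), body, { flag: "wx" });
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") {
      throw new Error(`refusing to overwrite existing storage key: ${key}`);
    }
    throw err;
  }
}

// Demo: the second write to the same key is rejected and the first payload survives.
const demoRoot = mkdtempSync(join(tmpdir(), "storage-excl-"));
putExclusive(demoRoot, "blob-1", Buffer.from("first"));
let secondWriteRejected = false;
try {
  putExclusive(demoRoot, "blob-1", Buffer.from("second"));
} catch {
  secondWriteRejected = true;
}
const surviving = readFileSync(join(demoRoot, "blob-1"), "utf8");
```

The S3 equivalent (a conditional write with If-None-Match) depends on what the target backend supports, so it is worth confirming per provider before relying on it.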
M4. Per-port S3 routing not possible / listByPrefix unbounded
Storage config rows are global (portId IS NULL). Multi-tenant can't direct port-A vs port-B to separate buckets / KMS keys. listByPrefix returns every key in one array — script-only today but a footgun if called with empty prefix in production. Document the global-config assumption; add a cursor variant before any per-port-bucket customer lands.
M5. storage_filesystem_root change invalidates outstanding HMAC tokens silently
Cache swaps, but tokens minted under the old root still verify HMAC; resolveKeyForProxy then 404s under the new root. Customer download links emailed an hour earlier break with no warning. Either refuse runtime root changes, or warn in admin UI.
M6. Avatar URLs re-presign every 15 min — browser cache broken
No CDN / s-maxage fronts hot reads. Per-page avatar GET burns a presign + S3 round-trip. Issue 24 h URLs for category='avatar', or front with the Next.js Image route.
M7. Verified clean
- withTimeout(...) wraps every minio call (s3.ts L143/150/190/203/219/237/285/292/300); system-monitoring.service.ts:153 adds its own 5 s deadline. No bare minio calls escape. ✓
- MULTI_NODE_DEPLOYMENT guard reads env.MULTI_NODE_DEPLOYMENT (zod-coerced, env.ts:80), test at filesystem-backend.test.ts:139. ✓
M8. Magic-byte enforcement
- In-server uploads: files.ts:58 (bufferMatchesMime), berth-pdf.service.ts:218 (isPdfMagic). ✓
- Presigned-PUT post-upload register: brochures.service.ts:258 (first-5-byte stream + %PDF-), berth-pdf.service.ts:259 (readFirstBytes + isPdfMagic). ✓
- Filesystem proxy PUT: inline check at route.ts:220 when the token's c=application/pdf. ✓
- S3 direct PUT: no inline check (relies on the register endpoint). Acceptable per CLAUDE.md, but document the divergence: a future S3 consumer that forgets to call register leaks the gate.
Verified-clean (informational)
- No BYTEA / binary-JSONB blob columns. ✓
- Single canonical key format mismatch (storage_path vs storage_key) is documented + handled by per-table column mapping. ✓
- validateStorageKey rejects traversal, absolute paths, dotfiles, and >1024 chars. ✓
- Proxy token op-binding (get vs put) is enforced, so replay across ops is blocked. ✓
- Proxy single-use replay protection via Redis SET NX with TTL pinned to token expiry. ✓
- Filesystem HMAC secret falls back to a derived dev value but throws in production when unset. ✓
- All blob keys are UUID-namespaced: collision-safe, not deterministic-audit-style. ✓
Recommended ordering
- C1 (one-line fix + smoke test) before any backend migration ships.
- C2 orphan reaper: cron job behind the maintenance worker.
- C3 SSE-S3: single-line putObject change + bucket-policy assertion at boot.
- H1 + H2 port-binding plumbing (small refactor, big invariant).
- H3 + H4 + M2 streaming pass over backup + migrator + email attachments.
- Remainder during next storage-config UI sweep.
34. Dependency upgrade analysis — Context7-assisted (follow-up after deps-auditor)
Post-session follow-up. Where the original deps-auditor covered abandonment + vulnerabilities, this section queries upstream changelogs via Context7 to weigh the pros/cons of pulling every available major. Use this as your bump roadmap.
Dependency upgrade analysis (Context7-assisted)
Companion to the deps-auditor report from the original 33-agent run. That auditor checked vulnerabilities + abandonment + license risk; this follow-up adds per-dep pros/cons of bumping to the latest stable, informed by upstream changelogs/docs queried via Context7.
Top-line baseline: pnpm audit reports 0 known vulnerabilities.
No GPL/AGPL contamination. Lockfile reproducible. We are safe TODAY
without any upgrade; everything below is "should we pull the next
major in?" prioritization.
At a glance — what's outdated
| Package | Current | Latest | Bump size |
|---|---|---|---|
| next | 15.5.18 | 16.2.6 | major |
| eslint-config-next | 15.5.18 | 16.2.6 | major (matches next) |
| zod | 3.25.76 | 4.4.3 | major |
| tailwindcss | 3.4.19 | 4.3.0 | major |
| @hookform/resolvers | 3.10.0 | 5.2.2 | TWO majors |
| archiver | 7.0.1 | 8.0.0 | major |
| react-day-picker | 9.14.0 | 10.0.0 | major |
| eslint | 9.39.4 | 10.3.0 | major |
| esbuild | 0.27.7 | 0.28.0 | pre-1.0 minor (effectively major) |
| @playwright/test | 1.59.1 | 1.60.0 | minor |
| libphonenumber-js | 1.12.43 | 1.13.1 | minor |
| tailwind-merge | 3.5.0 | 3.6.0 | minor |
| bullmq | 5.76.6 | 5.76.8 | patch |
| @tanstack/react-query | 5.100.9 | 5.100.10 | patch |
| better-auth | 1.6.9 | 1.6.10 | patch |
| vitest | 4.1.5 | 4.1.6 | patch |
| lint-staged | 17.0.3 | 17.0.4 | patch |
| @vitest/coverage-v8 | 4.1.5 | 4.1.6 | patch |
@types/node deliberately pinned to ^20.19 to match Node 20 runtime
(audit findings — was previously ^25 against a Node 20 runtime, which
greenlit non-existent APIs).
Tier A — Pull the patches in now (zero-risk wins)
bullmq, @tanstack/react-query + react-query-devtools,
better-auth, vitest + @vitest/coverage-v8, lint-staged,
@playwright/test, libphonenumber-js, tailwind-merge.
Pros: patch / minor bumps, bug fixes only, no API changes documented.
Cons: none material — pin-bumps after a 30-second pnpm install
verify and full vitest run.
Recommended: DO as one batch commit. ~5 minutes.
Tier B — Per-major analysis
B-1 — Next.js 15.5 → 16.2 (touches every API route + middleware)
Upstream summary (via Context7):
- middleware.ts is renamed to proxy.ts in Next 16. The named export middleware → proxy. Config flags rename (skipMiddlewareUrlNormalize → skipProxyUrlNormalize). Edge runtime is NOT supported in proxy; if you need edge runtime you must keep middleware.ts (we already use the Node runtime, so this is just a rename for us).
- Async cookies() / headers() / params / searchParams was the Next-15 change; Next 16 hardens the warning into an error. We're already async-safe (CLAUDE.md confirms the upgrade landed).
- Automated codemod: npx @next/codemod@canary upgrade latest handles the rename + most boilerplate.
Risk for us:
- src/middleware.ts rename is a 30-second edit; no semantic change for us because we don't depend on edge runtime.
- The Documenso webhook + websocket server custom-server path (src/server.ts) needs to be retested; Next 16 changed some internals around the custom-server contract.
- eslint-config-next must bump in lockstep (currently 15.5.18, target 16.2.6).
- Turbopack defaults shifted; our dev script (next dev --turbopack -H 0.0.0.0) needs a quick smoke run.
Recommended: WAIT 2-4 weeks. Next 16 dropped recently; let the field's bug reports settle. Then run the codemod + a full playwright smoke. Effort: 1-2h.
B-2 — Zod 3 → 4 (touches every validator file)
Upstream summary (via Context7):
- Top-level format helpers: z.email() / z.uuid() / z.url() etc. replace z.string().email() / .uuid() / .url(). Old form is deprecated but still works.
- Error customization unified: { message: '...' } → { error: '...' }. Old form deprecated.
- z.function() API completely redesigned: it now takes input / output schemas upfront and returns a function factory (not a schema).
- Up to ~14× faster parsing in upstream benchmarks (the headline figure is for string parsing; other paths gain less).
- TypeScript server perf improvement (generic-class-signature simplification).
Risk for us:
- We have ~30 validator files using z.string().email() / .uuid() style and { message: '...' } style throughout. Both still work in 4.x but are marked deprecated. As far as we know the deprecation surfaces through editor and lint tooling (JSDoc @deprecated) rather than as runtime log output, so verify before assuming log noise.
- @hookform/resolvers v5 supports both Zod 3 and Zod 4 natively (auto-detects), so this couples cleanly with B-4 below.
- We don't use z.function() anywhere, so the biggest breaking change is a non-issue for us.
Recommended: GO once Tier A is in. Codemod-friendly: a single Find/Replace pass on z.string().email() → z.email() etc. covers ~95% of the churn. Effort: 2-3h including running full vitest + writing replacement codemods.
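The find/replace pass could look roughly like the sketch below. It is a plain string rewrite, not an AST codemod, and migrateZodSource is a hypothetical helper: it only handles the direct z.string().<format>( chain and simple ({ message: ... }) option objects, and it is meant to be run over validator files only, with the resulting diff reviewed by hand.

```typescript
// Formats that moved to top-level helpers in Zod 4 (partial list; extend as needed).
const FORMATS = ["email", "uuid", "url"] as const;

function migrateZodSource(source: string): string {
  let out = source;
  for (const fmt of FORMATS) {
    // z.string().email(...) -> z.email(...); longer chains such as
    // z.string().min(1).email() are deliberately left for manual review.
    out = out.split(`z.string().${fmt}(`).join(`z.${fmt}(`);
  }
  // ({ message: '...' }) -> ({ error: '...' }) for option objects passed as a first argument.
  out = out.replace(/\(\s*\{\s*message:\s*(['"])/g, "({ error: $1");
  return out;
}
```

One thing to check during review: to our understanding z.uuid() in Zod 4 validates more strictly than the Zod 3 method did, with z.guid() as the permissive variant, so confirm against the upstream migration guide before mass-replacing UUID validators.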
B-3 — Tailwind CSS 3 → 4 (touches tailwind.config.ts, globals.css, every dynamic-class site)
Upstream summary (via Context7):
- All-new Oxide engine: upstream claims up to 5× faster full builds and over 100× faster incremental builds.
- CSS-first config: tailwind.config.ts is no longer the default. Theme is defined in globals.css via @theme + CSS custom properties (--color-brand: …). A JS/TS config can still be loaded explicitly with the @config directive.
- PostCSS plugin consolidation: postcss.config.mjs switches from tailwindcss + autoprefixer + postcss-import plugins to single @tailwindcss/postcss.
- Built on native cascade layers, OKLCH colors, container queries, @starting-style, popovers.
- Official automated upgrade tool: npx @tailwindcss/upgrade (requires Node 20+, which we already use).
Risk for us:
- We have a custom tailwind.config.ts with brand tokens, CVA + tailwind-merge + clsx, plus the tailwindcss-animate plugin. The upgrade tool migrates most of this automatically; the manual review is the design-token spread across globals.css.
- shadcn/ui components (components/ui/*) use cn() + arbitrary values heavily. Some [--variable] syntax has changed in v4.
- tailwindcss-animate may not yet support v4; confirm, or swap for tw-animate-css (the v4-native successor named in section 35.C).
Recommended: HIGH-RISK / HIGH-REWARD. Park until you have a clear afternoon. The build-time speedup is genuinely meaningful for dev experience. Run the official upgrade tool on a throwaway branch first; visually diff a handful of critical pages before merging. Effort: 3-4h on a focused day; visual regressions are the variable.
B-4 — @hookform/resolvers 3 → 5 (touches every form file)
Upstream summary (via Context7):
- v5 supports both Zod 3 and Zod 4 simultaneously via auto-detection; it pulls zod/v4 if you opt into it explicitly.
- Resolver options shape is the same as v3 ({ mode: 'async' | 'sync', raw?: boolean }).
- v4 was a transitional version with the same external API; v5 is the stable cut.
Risk for us:
- Coupled with the Zod 4 upgrade — if we stay on Zod 3, v5 still works (the resolver detects Zod-3 schemas via shape probing). Bumping resolvers without bumping Zod is safe.
Recommended: GO IN LOCKSTEP with B-2 (Zod 4). Effort: 5 min once Zod 4 is in.
B-5 — archiver 7 → 8 (touches GDPR-export bundle + backup-restore)
Upstream summary: Library "/gajus/archiver" not found in Context7, so this falls back to the npm changelog. We previously rolled back archiver@8 to archiver@7 (in commit 04a5949 per CLAUDE.md history) because v8's default-export changes broke our TS types. v8 has stabilised since then.
Risk for us:
- Last time we tried this it broke. Read the v8 changelog before retrying.
- Used only for GDPR export + backup-restore — narrow blast radius. A failed upgrade is non-customer-facing.
Recommended: DEFER. Stay on 7 until either v8 demonstrably fixes a CVE / bug we care about, or until we have a green test suite to verify nothing regressed. Re-attempt only when there's a forcing function.
B-6 — react-day-picker 9 → 10 (touches every date-picker site)
Upstream summary: v10 is a recent cut. Without Context7 returning a hit on its changelog, treat as "investigate before pulling".
Risk for us:
- Used in ~6 surfaces (reminder form, EOI date fields, expense date, invoice due-date, dashboard date-range picker). A breaking change to the calendar render path would affect every form.
Recommended: DEFER 2-3 weeks to let bug reports surface. Effort to actually do it: ~1h once the spec is reviewed.
B-7 — eslint 9 → 10 + eslint-config-next (touches CI)
Risk for us:
- ESLint 10 likely drops support for some legacy rule configs.
- eslint-config-next should bump in lockstep with next (B-1).
Recommended: PAIR WITH B-1. No standalone value to bumping eslint without bumping Next.
B-8 — esbuild 0.27 → 0.28 (touches build pipeline)
Risk for us:
- We use esbuild via pnpm.overrides plus directly in build:server and build:worker scripts.
- Pre-1.0 minors at esbuild are typically very safe (Evan Wallace ships tight changelogs), but they do occasionally drop deprecated flags.
Recommended: GO. Bundle the bump with the Tier A patches. Effort: 1 min + a pnpm build smoke.
Tier C — Things to leave alone
- drizzle-orm 0.45.2: current major. No upgrade needed.
- react 19.2.6 / react-dom 19.2.6: current React 19. Stable.
- @radix-ui/*: all current. These ship patch updates frequently; consider a quarterly sweep but not blocking.
- @dnd-kit/*, @pdfme/*, socket.io, bullmq, pino, postgres, minio, ioredis, pdf-lib, pdfkit, sharp, tesseract.js, recharts, cmdk, vaul, sonner, zustand, next-themes, date-fns, clsx, class-variance-authority, jose, nodemailer, mailparser, imapflow, openai, lucide-react, react-easy-crop, react-hook-form: all current within their major lines and either no risk-worthy bump available or already bumped.
Recommended sequencing
- Now: pull Tier A patches as one commit (~5 min).
- Now: esbuild 0.27 → 0.28 in same commit.
- Next focused half-day: Zod 4 + @hookform/resolvers v5 together. Coupled because resolvers v5 supports both. Codemod-able.
- 2–4 weeks: Next 15 → 16 + eslint-config-next 16 + eslint 10. Lockstep. Run @next/codemod first.
- When a tester-friendly afternoon opens up: Tailwind 4 via the official upgrade tool, with visual review across critical pages.
- Defer indefinitely: archiver 8, react-day-picker 10 (neither is delivering us anything we need).
Non-goal: chasing the bleeding edge on every dep. The audit's baseline finding stands — we are secure today. These are mostly developer-experience and perf wins, not security blockers.
35. Package adoption + PDF stack overhaul (Context7-assisted follow-up)
Companion to section 34. The deps-upgrade analysis answered "should we bump what we already have?" — this section answers two follow-on questions:
- PDF stack — are pdfme + pdfkit + pdf-lib the right tools? (No.)
- What aren't we using that we should be? — comprehensive sweep of the modern ecosystem against our actual pain points and codebase patterns.
User-directed exclusions:
- react-hotkeys-hook (no keyboard-shortcut UX target).
35.A — PDF stack overhaul
Current state (5 packages, 4 distinct use cases)
| Package | Where it lives in our code | Use case |
|---|---|---|
| @pdfme/common + generator + schemas v6.1.2 | src/lib/pdf/generate.ts + 8 template files | Declarative report/invoice/EOI templates |
| pdf-lib v1.17.1 | src/lib/pdf/fill-eoi-form.ts, src/lib/services/berth-pdf-parser.ts | AcroForm fill (EOI) + uploaded-PDF parsing (berth specs) |
| pdfkit v0.18.0 + @types/pdfkit | src/lib/services/expense-pdf.service.ts (only site) | Streaming receipt-attached expense reports |
| tesseract.js v7.0.0 | src/lib/ocr/tesseract-client.ts + scan-shell | Berth PDF OCR fallback |
| Bridge layer: 571-line src/lib/pdf/tiptap-to-pdfme.ts | Admin template builder | Tiptap JSON → pdfme schema converter |
Pain points
- The 571-line tiptap-to-pdfme.ts bridge is fragile glue between a rich-text format (Tiptap JSON) and a declarative PDF schema (pdfme). Every supported formatting subset (bold, italic, headings, lists, tables, images) is hand-coded. Adding blockquote / codeBlock / horizontalRule / taskList is currently rejected at save time because the bridge doesn't support them.
- pdfme templates are JSON blobs with positional { x, y } coordinates. Reading/editing them is painful (compare invoice-template.ts vs a declarative React component).
- @pdfme/generator ships a heavy runtime including the schema engine and font loaders, which is irrelevant for our use case because we're code-driven, not visual-editor-driven.
- 3 different generation libraries (pdfme + pdfkit + pdf-lib) means three different mental models, three different test patterns, three different failure modes.
Recommendation per use case
Use case 1 — Template-driven PDFs (8 templates): invoice, client-summary, interest-summary, berth-spec, revenue-report, occupancy-report, pipeline-report, eoi-standard-inapp.
→ Replace with @react-pdf/renderer (/diegomura/react-pdf, 161 snippets,
benchmark 87.75).
Why it wins for us:
- Declarative React components: uses the same skills we already have. No more positional { x, y } JSON.
- Server-side rendering modes: renderToBuffer (HTTP responses), renderToStream (large reports), renderToFile (background jobs). All three usage patterns are documented and idiomatic, and replace pdfme's generate() call cleanly.
- First-class page-break controls: break, wrap={false}, minPresenceAhead, orphans, widows. pdfme has none of these; we'd be hand-implementing them today if we needed them.
- Fixed headers/footers via the fixed prop with auto page-number rendering (render={({ pageNumber, totalPages }) => …}). We currently re-render header content per page in pdfme.
- The Tiptap bridge problem dissolves: a rich-text component renders Tiptap JSON directly via a recursive component (~80 LOC, replaces 571 LOC). No more constrained-subset rejections at save time.
- Tree-shakes: only the components we import ship; pdfme's generator pulls the full schema engine regardless.
Concrete migration cost: rewrite 8 templates as JSX. The shape is 1:1 with our current pdfme schemas (header section, repeating items, footer totals), so it's a mechanical translation. ~4-6 hours total. Bridge layer (571 LOC) goes to zero.
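The recursive walk that would replace the bridge can be sketched without any PDF library. The example below converts Tiptap-style JSON into a neutral intermediate tree; the PdfNode type and toPdfTree function are hypothetical, and in an actual migration each node would map to a react-pdf element rather than a plain object.

```typescript
// Shape of a Tiptap/ProseMirror JSON node, reduced to the fields used here.
interface TiptapNode {
  type: string;
  text?: string;
  marks?: { type: string }[];
  content?: TiptapNode[];
}

// Hypothetical neutral output tree, standing in for rendered PDF elements.
type PdfNode =
  | { kind: "text"; value: string; bold: boolean; italic: boolean }
  | { kind: "block"; role: string; children: PdfNode[] };

function toPdfTree(node: TiptapNode): PdfNode {
  if (node.type === "text") {
    const marks = new Set((node.marks ?? []).map((m) => m.type));
    return {
      kind: "text",
      value: node.text ?? "",
      bold: marks.has("bold"),
      italic: marks.has("italic"),
    };
  }
  // Unknown node types become generic blocks instead of being rejected at save time.
  return {
    kind: "block",
    role: node.type,
    children: (node.content ?? []).map(toPdfTree),
  };
}
```

The point of the sketch is the shape of the recursion: one case for text leaves, one for containers, with a per-role style lookup added where the renderer needs it.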
Caveats from Context7:
- Font registration is explicit (Font.register({ family, src })); our current fonts move from pdfme's font loader to a one-time call at boot.
- No Tailwind class support: it uses StyleSheet.create({ ... }) with a flexbox-style subset. Familiar to React Native devs.
Use case 2 — AcroForm fill (EOI):
→ Keep pdf-lib. Best-in-class for editing existing PDFs. No replacement
candidate is better. Already used correctly in fill-eoi-form.ts.
Use case 3 — Uploaded PDF parsing (berth specs):
→ Add unpdf (/unjs/unpdf, 66 snippets) for text extraction; keep
pdf-lib for AcroForm field extraction.
Why:
- unpdf is the unjs ecosystem's serverless-friendly PDF parser built on pdf.js. Returns { totalPages, text } per page in one call.
- Better than pdf-lib for text extraction because pdf-lib's text APIs are read-positional, not read-flow.
- getDocumentProxy() lets us share one parse across extractText, extractLinks, getMeta. This is useful for the 3-tier berth parser (AcroForm first, OCR fallback, AI fallback) because we can grab all metadata in one pass.
Our current parser uses pdf-lib's low-level text extraction which has known
issues with positionally-rendered text (the OCR fallback fires more often
than necessary). unpdf.extractText would reduce that fallback rate.
Use case 4 — Streaming receipt-attached expense reports:
→ Keep pdfkit short-term, migrate to @react-pdf/renderer.renderToStream
medium-term.
Why keep:
- expense-pdf.service.ts is the only pdfkit consumer. Its streaming pattern (500 receipts at <100MB RSS) is the load-bearing reason for pdfkit's existence in our deps.
- @react-pdf/renderer.renderToStream, as documented in Context7, supports the same use case, but verification needs an actual perf test against a 500-receipt fixture before committing.
Migration plan:
- Phase 1 (now): replace pdfme templates with @react-pdf/renderer.
- Phase 2 (after we have @react-pdf/renderer in the codebase): re-test expense-pdf with renderToStream against the 500-receipt fixture. If memory stays under 200MB, swap pdfkit out. If not, keep pdfkit and document the constraint.
Net result after Phase 1
Remove: @pdfme/common, @pdfme/generator, @pdfme/schemas, 571-line bridge file.
Keep: pdf-lib (AcroForm), pdfkit (streaming expenses, pending Phase 2), tesseract.js (OCR).
Add: @react-pdf/renderer, unpdf.
Deps net: −1 (three packages removed, two added), −571 LOC of bridge code, +standard declarative API for all templates.
35.B — High-value package additions (prioritized)
Each row below has been validated via Context7 unless marked otherwise.
Tier 1 — Adopt alongside the planned Zod 4 / Tailwind 4 work
| Package | Replaces / unlocks | Where it lands in our code | Effort |
|---|---|---|---|
| drizzle-zod (companion to drizzle-orm; confirm whether our 0.45 line needs it installed as a separate package) | ~30 hand-maintained validators in src/lib/validators/ | createInsertSchema(clients).omit({ id: true, portId: true }) etc. | 2-3h |
| @react-pdf/renderer | 8 pdfme templates + 571-line tiptap bridge | src/lib/pdf/templates/* | 4-6h |
| react-email + @react-email/components | 8 hand-strung HTML templates in src/lib/email/templates/ | Each becomes a .tsx component, rendered via await render(<…/>) then handed to nodemailer unchanged | 2-3h (one template at a time) |
| @tanstack/react-virtual | Pagination on client-list, yacht-list, berth-list, audit-log-list, inbox | useVirtualizer({ count, getScrollElement, estimateSize }) inside the list shells | 1h per list × 5 lists |
| ts-pattern | 19-case dispatch in search.service.ts, 13-case Documenso webhook, 12-case client-restore.service.ts, 10-case recently-viewed/route.ts, 10-case custom-fields/[entityId]/route.ts | match(input).with(...).exhaustive() | 30 min per site; start with the Documenso webhook |
| unpdf | Hand-rolled text extraction in berth-pdf-parser.ts | extractText(await getDocumentProxy(buf)) | 1h |
Tier 2 — Independent adopts (polish + perf)
| Package | What it does for us | Effort |
|---|---|---|
| @formkit/auto-animate | One-liner useAutoAnimate() ref on any list. Drops into: deal pipeline kanban (pipeline-board.tsx), reminders rail, alerts rail, files list, notes list. Zero CSS. ~2kb. | 5 min per site |
| motion (formerly framer-motion) | Layout animations for kanban reorder (currently snaps), Vaul drawer enter/exit polish, sheet/drawer slides, <AnimatePresence> for inline edits. ~15kb gzip but tree-shakes well. | 1-2h to wire the kanban first |
| use-debounce | Replaces ad-hoc setTimeout debounce in yacht-picker, client-picker, audit-log-list, send-document-dialog, custom-fields-section, berth-picker, interest-picker, dedup-suggestion-panel (8 sites). Typed useDebouncedCallback. ~3kb. | 30 min total |
| fast-deep-equal | Memo comparator for DataTable and React Query select functions. Drops re-renders when structurally equal data arrives with a new identity. ~1kb. | 20 min |
| @upstash/ratelimit | Replaces hand-rolled rate limiters in src/lib/rate-limit.ts, api/helpers.ts, route-helpers.ts, document-sends.service.ts. Sliding-window / fixed-window / token-bucket algorithms tested at scale. Intended to run against our existing Redis, but the library is written for the @upstash/redis REST client, so confirm it can talk to our ioredis-backed instance before adopting. | 1-2h |
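For reference, the fixed-window algorithm mentioned in the rate-limiting row reduces to a counter per identifier and window. The sketch below is in-memory and single-process with an injected clock; FixedWindowLimiter is a hypothetical class for illustration, not the @upstash/ratelimit API and not a drop-in for the Redis-backed limiters.

```typescript
// Fixed-window limiter: at most `limit` calls per `windowMs` per identifier.
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is passed in (ms since epoch) so the behaviour is deterministic in tests.
  allow(id: string, now: number): boolean {
    const entry = this.counts.get(id);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First call, or the previous window has elapsed: start a fresh window.
      this.counts.set(id, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count < this.limit) {
      entry.count += 1;
      return true;
    }
    return false;
  }
}
```

Fixed windows allow a burst of up to twice the limit across a window boundary, which is the usual reason to prefer the sliding-window variant for login or send endpoints.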
Tier 3 — Strategic adopts (bigger commitments)
| Package | What it unlocks | Notes |
|---|---|---|
| next-safe-action | Type-safe server actions with built-in Zod validation, ownership middleware, useHookFormAction hook. Each form drops ~30 LOC of apiFetch + toastError + mutation-hook plumbing to ~5. Pairs with useHookFormAction which already speaks Zod/RHF. | Migrate gradually: use for new forms first, keep API routes for external callers. Couples with Zod 4 (since safe-action v8+ targets Zod 4 best). |
| @axe-core/playwright | Accessibility audit during smoke tests. The 33-agent audit flagged WCAG gaps; this catches regressions automatically. | ~30 LOC of test setup. Fails CI on new violations. |
| @tiptap/core + @tiptap/react + extension packs | Real rich-text editor for notes (clients/interests/yachts/companies all have polymorphic notes). Currently plain text. Sales reps note things like "call after 4pm UTC, prefers WhatsApp"; bold/italic/links/lists/mentions would help. Tiptap's JSON output format is already in our codebase (the bridge layer), so we'd be storing the same shape we already render. | Decision: keep notes plain or upgrade to rich? If we upgrade, ~3h to wire one entity's notes; the others copy the pattern. |
| @next/bundle-analyzer | Wraps next.config.ts. Generates client + server bundle treemaps after every build. Catches when a tiny PR pulls in recharts on a route that shouldn't have it. The 33-agent audit flagged recharts + pdfme as bundle bloat; this is the tooling to keep that honest. | 15 min setup. Run with ANALYZE=true pnpm build. |
| @sentry/nextjs | Error tracking with frontend + backend correlation, release tracking, source maps, performance traces, replay (optional). We have pino logs but no aggregation/alerting/correlation. Important once we have customer-facing users. | Decision: do we want a SaaS dependency? Self-hosted GlitchTip is also an option (Sentry-protocol-compatible). |
| @vercel/og (or satori) | Generate Open Graph images for shared docs/portal links. Currently the portal has no social previews; if a client shares their EOI link in WhatsApp/Email, the preview is blank. ~10 LOC per route. | 1h for portal share routes. |
| papaparse | CSV import/export. Sales reps frequently ask for "export to Excel." Plays well with our existing TanStack Table data. ~17kb. | 30 min for client/interest list export. |
| @formkit/tempo OR date-fns helpers | We have 44 files with hand-rolled new Date().toLocaleString() / .toLocaleDateString(). Centralize via a formatDate(date, format, timezone) helper using date-fns (already installed); no new package is needed if we use date-fns's format, formatDistance, formatRelative, which we already have. This is a refactor, not an adoption. | 2-3h sweep |
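A centralized date helper could start as small as the sketch below. It uses the built-in Intl.DateTimeFormat to stay dependency-free, whereas the text proposes date-fns for the real helper; the formatDateTime name, the fixed output pattern, and the missing format parameter are all simplifications made for this example.

```typescript
// Format a Date as "YYYY-MM-DD HH:mm" in the given IANA time zone.
function formatDateTime(date: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23", // 00-23, avoids locale-dependent 12h/24h ambiguity
  }).formatToParts(date);

  // Pick individual fields so the result does not depend on locale ordering or separators.
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";

  return `${get("year")}-${get("month")}-${get("day")} ${get("hour")}:${get("minute")}`;
}
```

Routing every call site through one function like this is what makes the time zone explicit; the 44 hand-rolled toLocaleString() calls currently inherit whatever zone and locale the server or browser happens to have.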
Tier 4 — Defer or skip
| Package | Reason |
|---|---|
| next-pwa / @serwist/next | PWA assets pending (per MEMORY.md). When that lands, @serwist/next is the modern choice (next-pwa is unmaintained). For now, skip. |
| next-intl / i18next / @lingui/core | No i18n target today. When we localize, next-intl is the strongest Next.js App Router integration. For now, skip. |
| @knocklabs/node + @knocklabs/react | Notification center + channel routing + preferences UI. Likely overkill: we have a simple in-app + email notification system that works. Revisit if we add SMS or push. |
| inngest / trigger.dev | Background jobs with observability. We use BullMQ; revisit only if we need step functions / cross-service workflows. |
| posthog-js | Product analytics + feature flags + session recording. We have Umami for web analytics; PostHog adds product-level tracking. Decision pending. |
| @growthbook/growthbook | Feature flags only. We don't have any flagged features today. |
| fuse.js / minisearch | Client-side fuzzy search. Useful for already-loaded list filtering, but TanStack Table's built-in filter is usually enough. |
| @uppy/core + @uppy/dashboard | Rich file upload UI with resume, chunking. We have basic file inputs (0 patterns found in audit grep); not currently a pain point. |
| @tanstack/react-form | Newer alternative to react-hook-form from the TanStack team (not the RHF maintainers). RHF is mature, well-known, and we have 8 forms on it. No compelling migration. |
| valibot / arktype | Faster zod alternatives. We're committed to Zod 4. |
| react-hotkeys-hook | Excluded per user direction. |
35.C — Deprecation / cleanup candidates
| Package | Reason | Action |
|---|---|---|
| @radix-ui/react-icons | We use lucide-react everywhere. Audit grep shows no imports from @radix-ui/react-icons. | Drop after grep-confirm. ~30s. |
| @pdfme/common + @pdfme/generator + @pdfme/schemas | Replaced by @react-pdf/renderer in Phase 1. | After PDF migration. |
| tailwindcss-animate v1.0.7 | Last published 2024, no v4 support. Replace with tw-animate-css (the v4-native successor shadcn now recommends). | Required if we move to Tailwind 4. |
| @types/pdfkit | Tops at v7.0.0. We're on pdfkit v0.18; types are loose but functional. Keep until we migrate expense-pdf to @react-pdf/renderer. | Defer. |
| pino-pretty in dependencies | Should be devDependencies only; it ships ~500kb to prod worker images if it leaks into the runtime path. Audit-verify the build doesn't include it; move if it does. | 5 min check. |
35.D — Surfaced refactor opportunities (no new package required)
These came up while sweeping for package gaps. They're refactor wins, not package adoptions.
| Opportunity | Concrete sites | Tool |
|---|---|---|
| Centralize date formatting | 44 files with hand-rolled .toLocaleString() / .toLocaleDateString() | formatDate(date, format, timezone) helper using existing date-fns |
| Centralize debounce | 8 picker/list components | use-debounce (or hand-rolled hook) |
| Centralize rate-limiting | 4 hand-rolled limiters | @upstash/ratelimit |
| Replace 5-9 large switch statements with exhaustive matchers | search.service.ts (19 cases), Documenso webhook (13), client-restore.service.ts (12), recently-viewed/route.ts (10), custom-fields/[entityId]/route.ts (10) | ts-pattern |
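What an exhaustive matcher buys can be shown with plain TypeScript. The sketch below uses a never-typed default branch rather than ts-pattern's own API, and the WebhookEvent union is a hypothetical three-case subset, not the real 13-case Documenso payload.

```typescript
// Hypothetical subset of webhook events, modelled as a discriminated union.
type WebhookEvent =
  | { type: "DOCUMENT_SIGNED"; documentId: string }
  | { type: "DOCUMENT_COMPLETED"; documentId: string }
  | { type: "DOCUMENT_REJECTED"; documentId: string; reason: string };

// If a new union member is added and not handled, passing it here is a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${JSON.stringify(value)}`);
}

function describeEvent(event: WebhookEvent): string {
  switch (event.type) {
    case "DOCUMENT_SIGNED":
      return `${event.documentId} signed`;
    case "DOCUMENT_COMPLETED":
      return `${event.documentId} completed`;
    case "DOCUMENT_REJECTED":
      return `${event.documentId} rejected: ${event.reason}`;
    default:
      // Unreachable while every case is handled; also guards against unexpected runtime payloads.
      return assertNever(event);
  }
}
```

ts-pattern's match(...).with(...).exhaustive() gives the same compile-time guarantee in expression form and adds nested pattern matching, which is where it earns its place over this hand-rolled check.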
35.E — Final adoption order (revised, incorporating section 35)
This supersedes section 34's sequencing where they overlap.
- Now (one focused day): Zod 4 + @hookform/resolvers 5 + drizzle-zod. One PR. Codemod-friendly. Highest correctness payoff.
- Independent (any time): react-email migration of one template (portal-auth.ts recommended first), then expand. Independent of any version upgrade.
- Independent (any time): @react-pdf/renderer + unpdf. Replace 8 pdfme templates, delete 571-LOC bridge, add unpdf to berth parser.
- Independent (any time): ts-pattern in the Documenso webhook switch first (the audit's bug-class poster child), then sweep the other 4 sites.
- Independent (any time): @tanstack/react-virtual on client-list first, copy pattern to 4 other lists.
- Independent (any time): @formkit/auto-animate sprinkle. 5-minute wins per site.
- Independent (any time): @next/bundle-analyzer install. 15-min setup; ongoing bundle hygiene.
- Next focused half-day: motion wire to the kanban for smooth reorder.
- 2-4 weeks: Next 15 → 16 + eslint-config-next 16 + eslint 10 (lockstep, codemod).
- Focused afternoon: Tailwind 4 via official upgrade tool + swap tailwindcss-animate for tw-animate-css.
- When we have a new form to build: pilot next-safe-action there; backfill existing forms gradually.
- Decision required first: @sentry/nextjs (SaaS dep), @tiptap/* (rich notes Y/N?), posthog-js (analytics scope), papaparse (CSV export priority).
35.F — Skipped per user direction
- react-hotkeys-hook: no keyboard-shortcut UX target across the platform.
36. Second-pass package sweep — mobile, fluidity, data speed, DX
Section 35 covered the headline adoption candidates. This section is the deliberate second sweep the user requested — looking specifically for libraries we may have missed across four dimensions: current functionality gaps, optimization (mobile included), UI fluidity, and data retrieval/writing speed.
Findings are grouped by dimension. Each entry says (a) what we have now, (b) what the library adds, (c) where in our codebase it'd land, (d) effort.
36.A — Data speed & concurrency
36.A.1 p-queue + p-limit + p-retry (Sindre Sorhus suite)
Concrete pain: 74 Promise.all(...) sites in services/routes. 8 mass-operation services (expense-pdf, berth-pdf, brochures, backup, document-templates, email-compose, documents, email-threads).
Naive Promise.all([...mapped]) will:
- Fire all 500 expense receipts to S3 simultaneously → MinIO connection pool exhaustion + memory spike (expense-pdf.service.ts docs explicitly call this out as a past problem).
- Fire all bulk-send-document calls at Documenso simultaneously → hit Documenso's per-second rate limit, cause cascade failures.
- Fire all email-compose attachments at SMTP simultaneously → SMTP connection limit on Mailgun/SES drops requests silently.

p-limit caps concurrency: pLimit(5) runs at most 5 at a time.
p-queue is p-limit + interval rate limiting + pause/resume.
p-retry handles exponential backoff retries for transient failures.
Land sites:
- expense-pdf.service.ts: already has streaming logic, but the per-receipt S3 get calls are unbounded.
- email-compose.service.ts: bulk send-out is the obvious one.
- backup.service.ts: GDPR export streaming.
- documents.service.ts: multi-file folder operations.

Effort: 30 min per service. ~1.5kb each.
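A minimal sketch of the concurrency cap described above, written without the package so it stands alone. createLimiter and mapWithLimit are hypothetical names for illustration; with the real library the equivalent is pLimit(5), which returns a limit(fn) wrapper used the same way.

```typescript
// Dependency-free stand-in for the p-limit pattern:
// run at most `concurrency` tasks at once, queue the rest in FIFO order.
function createLimiter(concurrency: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // hand the freed slot straight to the next waiter
    } else {
      active--; // nobody waiting, so the slot becomes free
    }
  };

  return async function limit<T>(task: () => Promise<T>): Promise<T> {
    if (active >= concurrency) {
      // Wait for a slot; release() transfers it, so `active` already counts us.
      await new Promise<void>((resolve) => waiting.push(() => resolve()));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Usage shape: a mapped Promise.all capped at `concurrency` in-flight calls.
async function mapWithLimit<T, R>(
  items: T[],
  concurrency: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const limit = createLimiter(concurrency);
  return Promise.all(items.map((item) => limit(() => worker(item))));
}
```

Results come back in input order because Promise.all preserves position; only the start times are throttled. As with plain Promise.all, the first rejection rejects the whole batch, which is where p-retry would come in for transient failures.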
36.A.2 @tanstack/query-broadcast-client-experimental
Concrete pain: A rep has the CRM open in two tabs. They update a client in tab A — tab B's stale cache continues showing old values until the next refetch.
What it adds: BroadcastChannel sync between tabs. Free cross-tab cache coherence with no server roundtrips.
Land site: One line in src/providers/query-provider.tsx:
broadcastQueryClient({ queryClient, broadcastChannel: 'pn-crm' });
Effort: 5 minutes. ~2kb.
36.A.3 Underused Drizzle ORM features (no new package)
We have drizzle-orm 0.45.2 and use ~60% of its capabilities.
- db.batch(...) for multi-statement round trips. Currently we use explicit db.transaction(async (tx) => {...}) blocks everywhere; batch is shorter and lets the driver pipeline. Caveat: Drizzle documents the batch API for the LibSQL, Neon HTTP and D1 drivers, so confirm it is available on our postgres.js driver before planning around it.
- Prepared statements via .prepare(): repeated queries (e.g., getClient(id) called per-request) can be prepared once at boot and reused. Postgres saves the parse+plan cost.
- with (CTE) clauses: we have 30+ places where we'd benefit from WITH active_interests AS (...) SELECT ... instead of joining the same subquery twice. Audit found N+1 patterns; CTEs flatten them.

Land sites: the recommender SQL aggregate (already uses CTEs),
dashboard.service.ts analytics queries, search.service.ts graph
expansion. These are all "we already wrote raw SQL strings; rewriting as
typed Drizzle CTEs" wins.
Effort: opportunistic. No package change.
36.A.4 postgres.js cursor for large reads
We have postgres ^3.4.9. Its await sql`...`.cursor(rows => ...) streams large result sets in batches without buffering all rows. Currently the GDPR-export bundling and the backup dump-tables paths buffer everything in memory.
Land sites: backup.service.ts, gdpr-export.service.ts (when we
build it — currently parked).
Effort: opportunistic refactor when we touch those services.
36.B — UI fluidity & animation
36.B.1 @use-gesture/react (mobile gestures)
Concrete pain: mobile users can't swipe sideways between kanban columns or pinch-zoom berth photos, and the Vaul drawer's built-in swipe-to-dismiss has no tuned velocity thresholds. The audit's mobile pass flagged these.
What it adds: declarative gesture handlers (useDrag, usePinch,
useScroll). Composes with motion for spring-physics responses.
Land sites:
- Pipeline kanban: swipe between stage columns on mobile.
- Vaul drawer: swipe-down to dismiss (Vaul already does this, but adding custom velocity thresholds via @use-gesture polishes the feel).
- Berth/yacht photo galleries: pinch-zoom.

Effort: 1h to wire one site as the template. ~5kb.
36.B.2 embla-carousel-react
Concrete pain: berth photos and yacht photos render as static grids (per the audit). On mobile, users want to swipe through them.
What it adds: lightweight, touch-native, accessibility-compliant
carousel. Plays with framer-motion if we want fancy transitions.
shadcn/ui has a Carousel component built on this — drop-in via the
shadcn CLI.
Effort: npx shadcn@latest add carousel, then 10 lines to render the
photo array. ~10kb gzip.
36.B.3 yet-another-react-lightbox
Concrete pain: clicking a berth photo currently navigates to a fullscreen image route or doesn't expand at all. Sales reps want lightbox-style preview.
What it adds: fullscreen lightbox with keyboard nav, zoom, swipe, slideshow, captions. Plugin system for video/PDF embed if we extend.
Land sites: berth/yacht detail pages, client docs preview.
Effort: 1h. ~15kb gzip with plugins.
36.B.4 react-resizable-panels
Concrete pain: the docs hub has a fixed-width folder sidebar (per CLAUDE.md's documents-hub rewrite). Power users on wide monitors want to drag-resize it.
What it adds: keyboard-accessible resizable split panes with
persistent sizing (localStorage). shadcn/ui has a Resizable component
built on this.
Land sites: docs hub (sidebar | content), email inbox (folder | thread), admin settings (nav | section).
Effort: npx shadcn@latest add resizable, drop in. ~5kb.
36.C — Mobile optimization
36.C.1 browser-image-compression
Concrete pain: the expense-scanner (scan-shell.tsx) and receipt upload paths accept full-resolution phone photos (typically 4-12 MB each). Mobile users on cellular pay bandwidth + battery for sending roughly 8-24× more data than necessary (a ~500KB resized JPG would do). The server then re-runs sharp to resize anyway.
What it adds: client-side image compression in a Web Worker before upload. Configured via maxSizeMB, maxWidthOrHeight, useWebWorker. The server still validates magic-bytes + sharp-resizes, but receives a ~500KB resized JPG instead of a 12MB original.
Concrete win: a rep on 3G uploading a receipt: ~30s wait → ~5s wait. Server CPU spent on the sharp resize drops sharply, since the input is already near the target size.
Effort: 30 min to wire scan-shell.tsx. ~25kb gzip (worker-bundled so
zero main-thread cost).
36.C.2 partysocket
Concrete pain: mobile users on flaky networks frequently lose the Socket.IO connection. Our current client uses Socket.IO's built-in reconnect, which is good but not great for mobile.
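What it adds: drop-in WebSocket wrapper with:
- Exponential backoff with jitter (Socket.IO's built-in reconnect also backs off exponentially with a randomization factor, so the gain here is tuning rather than a new capability).
- Message queue while disconnected (Socket.IO also buffers emits while disconnected by default; its volatile flag opts a message out of that buffer).
- Auto-reconnect on online event + visibilitychange (page wake).
- Optional auto-detect connection quality (slow vs fast).
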
Land site: src/providers/socket-provider.tsx.
Effort: depends — partysocket works with raw WS, not Socket.IO's
protocol. For Socket.IO we'd need socket.io-client + manual reconnect
tuning, or migrate the realtime layer to plain WebSockets (significant).
Park as a "mobile flake" investigation, not an immediate adoption.
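Whichever client we end up tuning, the backoff-with-jitter behaviour is easy to reason about as a pure function. The sketch below uses the "full jitter" variant; backoffDelay and its option names are hypothetical, and the numbers in the usage note are illustrative rather than partysocket's or Socket.IO's actual defaults.

```typescript
interface BackoffOptions {
  baseMs: number; // ceiling for the first retry
  maxMs: number;  // upper bound on any single delay
  factor: number; // growth multiplier per attempt
}

// "Full jitter": pick a random delay in [0, min(maxMs, baseMs * factor^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxMs, opts.baseMs * Math.pow(opts.factor, attempt));
  return Math.floor(random() * ceiling);
}
```

With baseMs 1000, factor 2 and maxMs 30000, the ceiling grows 1s, 2s, 4s, 8s and then flattens at 30s; the random draw spreads reconnecting clients out so they do not all hit the server at the same instant after an outage.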
36.C.3 react-virtuoso (alternative to TanStack Virtual)
Concrete pain: the inbox (src/components/layout/inbox.tsx) uses a
plain <ScrollArea className="max-h-[400px]"> with no virtualization.
For users with hundreds of unread items, mobile scrolling chugs.
What it adds: specialized virtualization for chat-like / inbox-like UIs with variable-height items and "scroll to bottom on new message" semantics. TanStack Virtual is more headless / generic; Virtuoso is opinionated and better for inbox-shaped UIs.
Land site: inbox.tsx specifically. For the regular lists
(client/yacht/berth), TanStack Virtual is still the right call (section
35.B.4).
Effort: 45 min. ~10kb.
36.C.4 @formkit/auto-animate (revisit for mobile)
Already in section 35.B but worth re-emphasising: on mobile, list items appearing/disappearing without animation feels janky. Free polish.
36.D — Input quality & forms
36.D.1 react-imask or react-number-format
Concrete pain: we have currency inputs, phone inputs, date inputs spread across berth-form, expense-form, invoice-form, client-form. The audit flagged inconsistent formatting (decimals, thousand-separators, phone-prefix handling).
What it adds: declarative input masks, e.g. <IMaskInput mask={Number} scale={2} thousandsSeparator="," /> for a numeric field. Plays cleanly with react-hook-form.
react-number-format is the lighter-weight, currency-specific option.
react-imask covers more patterns (phone, date, custom).
Land sites: ~6 form components.
Effort: 30 min per form × 6 = 3h. OR keep our hand-rolled formatters and don't add the dep. Decision pending.
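For the "keep our hand-rolled formatters" branch of that decision, the consistency problem (decimals, thousand-separators) can be covered by one shared pair of helpers built on the platform's Intl.NumberFormat. This is a rough sketch only: parseAmount and formatAmount are hypothetical names, and the en-US locale with two decimals is an assumption, not a statement about what the forms use today.

```typescript
// Assumed display convention: en-US grouping, always two decimals.
const amountFormatter = new Intl.NumberFormat("en-US", {
  minimumFractionDigits: 2,
  maximumFractionDigits: 2,
});

// Strip currency symbols, separators and whitespace, then parse.
// Returns null for empty or malformed input instead of NaN.
function parseAmount(input: string): number | null {
  const cleaned = input.replace(/[^0-9.-]/g, "");
  if (cleaned === "" || cleaned === "-" || cleaned === ".") return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

function formatAmount(value: number): string {
  return amountFormatter.format(value);
}
```

A form would call parseAmount on change and formatAmount on blur. What this does not give us is caret management while typing, which is the part the mask libraries actually solve.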
36.D.2 @hookform/devtools (dev-only)
What it adds: a floating panel in the browser showing react-hook-form state in real time (values, errors, isDirty, isValid, touched fields). Massive debug-time win.
Land site: wrap forms in <DevTool control={form.control} /> in dev
builds only.
Effort: 15 min. dev-only, ships zero to prod.
36.E — Security & sanitization
36.E.1 isomorphic-dompurify
Concrete pain: src/lib/utils/markdown-email.ts hand-rolls HTML
escape + safe-link rendering for email bodies. The audit raised XSS
concerns (CRIT-2 in section 4) about admin-supplied content in templates
and email bodies. Our hand-rolled escapeHtml is correct for the basic
cases, but DOMPurify handles edge cases the audit listed (data URLs,
nested encoding, javascript: in href attrs).
What it adds: battle-tested HTML sanitizer used by Google, Microsoft,
GitHub. Works in Node + browser (the isomorphic- prefix is the
SSR-compatible wrapper around the regular dompurify).
Land sites:
- renderEmailBody() in markdown-email.ts.
- Anywhere we render user-supplied HTML (template preview, document body display).

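Effort: 1h migration + audit. ~25kb (Node) / ~50kb (browser), acceptable. Note that on Node the isomorphic wrapper depends on jsdom, so measure the server-side install footprint separately.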
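To make the edge cases concrete, here is a sketch of the kind of hand-rolled escaping and link check the section is talking about. It is not the code in markdown-email.ts; escapeHtml, safeHref and renderLink here are illustrative, and the protocol allow-list is an assumption. It shows why scheme checks have to go through a real URL parser rather than string matching.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Assumed allow-list; anything else (javascript:, data:, vbscript:) is rejected.
const SAFE_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

// Returns the normalized URL if its scheme is allow-listed, otherwise null.
// The URL parser lower-cases the scheme, so "JaVaScRiPt:" is caught as well.
function safeHref(raw: string): string | null {
  try {
    const url = new URL(raw.trim());
    return SAFE_PROTOCOLS.has(url.protocol) ? url.href : null;
  } catch {
    return null; // relative or malformed URLs are rejected in this sketch
  }
}

// Renders a link only when the href is safe; otherwise falls back to plain text.
function renderLink(label: string, raw: string): string {
  const href = safeHref(raw);
  const safeLabel = escapeHtml(label);
  return href ? `<a href="${escapeHtml(href)}">${safeLabel}</a>` : safeLabel;
}
```

This covers plain text and links only. Once admin-supplied HTML itself has to be rendered (nested markup, attributes, encodings), an allow-list sanitizer such as DOMPurify is the safer tool, which is the argument of this section.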
36.E.2 @noble/hashes (already covered by better-auth)
We use better-auth for password hashing. No need to add.
36.E.3 WebAuthn / Passkeys (@simplewebauthn/server + /browser)
What it adds: passwordless authentication via device passkeys (Touch ID, Windows Hello, YubiKey). Better Auth has a WebAuthn plugin that wraps these.
Decision required: is passwordless a 2026 roadmap item?
36.F — Observability & perf measurement
36.F.1 web-vitals
Concrete pain: we have no real-user perf data. We don't know our P75 LCP, P75 INP, or P75 CLS across our user base. Any future perf optimization (Cache Components, Tailwind 4, dynamic imports) is shooting in the dark without baseline measurement.
What it adds: Google's official Core Web Vitals library. Ships
onLCP, onINP, onCLS, onFCP, onTTFB callbacks. Reports values
once per page lifecycle.
Land site: src/app/(dashboard)/layout.tsx — wire a listener that
POSTs vitals to /api/v1/internal/vitals (new endpoint, append to
existing client_metrics table or similar). 30 LOC end-to-end.
Effort: 1h including backend logging. ~2kb. High value because without this we're guessing about perf wins.
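A sketch of the client half of that wiring, kept testable by separating payload construction from the browser APIs. The payload shape, the VitalMetric subset and the function names are assumptions for illustration; the endpoint path is the one named above, and the commented wiring uses the onLCP/onINP/onCLS/onFCP/onTTFB callbacks that web-vitals exports.

```typescript
// Minimal structural subset of the metric object web-vitals passes to its callbacks.
interface VitalMetric {
  name: "LCP" | "INP" | "CLS" | "FCP" | "TTFB";
  value: number;
  id: string;
  rating: "good" | "needs-improvement" | "poor";
}

// Hypothetical wire format; the real endpoint schema is not specified in this audit.
function buildVitalsPayload(metric: VitalMetric, path: string): string {
  // CLS is a unitless score, the others are milliseconds; round to keep payloads small.
  const value =
    metric.name === "CLS"
      ? Math.round(metric.value * 1000) / 1000
      : Math.round(metric.value);
  return JSON.stringify({ name: metric.name, value, id: metric.id, rating: metric.rating, path });
}

// The transport is injected so the function runs outside a browser too.
type Send = (url: string, body: string) => boolean;

function reportVital(metric: VitalMetric, path: string, send: Send): boolean {
  return send("/api/v1/internal/vitals", buildVitalsPayload(metric, path));
}

// In a client component this would be wired roughly as:
//   import { onLCP, onINP, onCLS, onFCP, onTTFB } from "web-vitals";
//   const send: Send = (url, body) => navigator.sendBeacon(url, body);
//   onLCP((m) => reportVital(m, location.pathname, send)); // likewise for the others
```

sendBeacon is the usual transport choice here because it survives page unload, which is when the final LCP/CLS/INP values tend to be reported.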
36.F.2 pino-http
Concrete pain: we have request logging via custom middleware. pino-http
is the canonical pino HTTP request logger with automatic request-id
propagation, response time, status code, and integration with our pino
logger. Likely already partially implemented via our hand-rolled
middleware.
Effort: check existing middleware first — may already cover this.
36.F.3 @sentry/nextjs (revisit from section 35)
Covered in 35.B Tier 3. Adoption gated on the SaaS-dep decision.
36.G — TypeScript ergonomics
36.G.1 @total-typescript/ts-reset
Concrete pain: TypeScript's stdlib types have well-known foot-guns:
- Array.isArray(x) narrows to any[] (drops the actual type).
- JSON.parse(s) returns any (defeats type safety entirely).
- fetch().json() returns Promise<any>.
- .filter(Boolean) doesn't remove null | undefined from the type.
- Array.prototype.includes is too strict on its argument.

ts-reset is a single .d.ts import (import '@total-typescript/ts-reset')
that fixes all of these globally. Used by Anthropic, Stripe, Vercel internally.
Concrete impact: likely catches 10-20 latent bugs across our 1000+
TS files where someone called JSON.parse(body) and continued treating
the result as a typed object without parsing through Zod.
Effort: 1 line in src/types/globals.d.ts. dev-time only, ships
zero runtime.
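To show what the JSON.parse change means at a call site, the sketch below emulates the post-reset behaviour with a small wrapper so it runs without the package installed. parseJson, ClientPayload and the guard are hypothetical; in our codebase a Zod schema would normally do the narrowing.

```typescript
// Without ts-reset, JSON.parse returns `any`, so this compiles and fails only at runtime:
//   const body = JSON.parse(raw); body.client.name.toUpperCase();

// Stand-in for the post-reset behaviour: parsed JSON is `unknown` until narrowed.
function parseJson(raw: string): unknown {
  return JSON.parse(raw);
}

// Hypothetical payload shape for illustration.
interface ClientPayload {
  name: string;
}

function isClientPayload(value: unknown): value is ClientPayload {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

function readClientName(raw: string): string | null {
  const parsed = parseJson(raw);
  // `parsed.name` would be a compile error here; the guard is mandatory.
  return isClientPayload(parsed) ? parsed.name : null;
}

// Without ts-reset, .filter(Boolean) leaves `T | null | undefined` in the type;
// an explicit type-guard callback is the manual equivalent for null/undefined.
function compact<T>(items: Array<T | null | undefined>): T[] {
  return items.filter((item): item is T => item != null);
}
```

Every site that currently trusts the parsed value will surface as a type error after the reset, which is the mechanism behind the latent-bug estimate above.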
36.G.2 type-fest
What it adds: ~150 utility types (SetRequired, SetOptional,
PartialDeep, MergeDeep, Promisable, Jsonifiable, etc.) that
extend TypeScript's built-ins.
Land sites: anywhere we're hand-rolling Omit<X, Y> & Pick<Z, W>
gymnastics — type-fest usually has a named util that's clearer.
Effort: opportunistic. ~0kb runtime (types only).
36.G.3 tsc-files
Concrete pain: pre-commit hook runs ESLint on staged files (fast) but no type-check. Type errors slip through to CI.
What it adds: typecheck only the staged TS files and their dependencies, not the full repo. Drops a pre-commit hook from "skip because too slow" to "always on, sub-2-second."
Land site: .husky/pre-commit + lint-staged.config.mjs —
"*.ts": ["tsc-files --noEmit"].
Effort: 15 min.
36.H — In-browser PDF viewing
36.H.1 pdfjs-dist + a viewer wrapper
Concrete pain: the docs hub (per CLAUDE.md) lets users upload and file PDFs. There's currently no in-app preview — clicking a file likely downloads it or opens in a new tab. A real CRM should preview the PDF inline.
What it adds:
- pdfjs-dist is Mozilla's pdf.js, the engine.
- @react-pdf-viewer/core is the most feature-rich React wrapper (zoom, search, annotations).
- Alternatively, react-pdf (Wojtek Maj's, not @react-pdf/renderer) is a lighter wrapper.

Land site: docs hub file detail / preview pane. EOI signing preview in admin.
Effort: 2-3h for a polished viewer with zoom + page nav. ~150kb gzip (pdf.js is unavoidable; lazy-load only when preview opens).
Note vs section 35.A: @react-pdf/renderer (generator) and pdfjs-dist
(viewer) are complementary. We need both: one to make PDFs, one to
show them.
36.I — Testing & development data
36.I.1 @faker-js/faker
Concrete pain: seed data is currently hand-maintained (mostly). Faker would replace hand-rolled fake names, emails, addresses, phone numbers, vehicle/yacht names, dates, marina locations with reproducible, locale-aware fakes.
Land site: src/lib/db/seed.ts, src/lib/db/seed-synthetic.ts.
Effort: 1-2h. ~3MB gzip — dev-only, not shipped.
36.I.2 msw (Mock Service Worker)
Concrete pain: integration tests that hit external services
(Documenso, SMTP, IMAP) either skip in CI or fail intermittently.
msw intercepts fetch/HTTP at the network layer in tests so we can
mock external responses deterministically.
Land site: tests/integration/ setup — wrap Documenso + SMTP
clients with MSW handlers.
Effort: 2-3h. dev-only.
36.J — Workflow & state machines
36.J.1 @xstate/react
Audit found only one multi-step flow (send-document-dialog.tsx).
EOI signing has steps but they're sequential, not state-machine-y. The
GDPR export job is a backend state machine but bullmq handles it.
Verdict: not warranted right now. Revisit if we build the client-onboarding flow or the multi-step EOI-with-multi-berth-and-payment-and-signing wizard the roadmap mentions.
36.K — Search & filtering
36.K.1 Postgres-native FTS (no new package — schema migration)
Concrete pain: search.service.ts uses LIKE '%term%' on client/yacht/
company tables. Slow at scale; doesn't rank.
What we could add: Postgres tsvector columns + GIN indexes + a single to_tsquery() call per search. This is all native Postgres, with no new npm dep. Drizzle supports it via sql`...` template literals.
Effort: migration (30 min) + service refactor (2h) + e2e re-run.
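One detail worth budgeting for in the service refactor: to_tsquery() raises a syntax error on malformed input, so raw search-box text cannot be passed to it directly. Either use websearch_to_tsquery(), which tolerates arbitrary input, or build the query string ourselves. The helper below is a hypothetical sketch of the second option with prefix matching for search-as-you-type; the search_vector column and the 'simple' text search configuration in the usage comment are assumptions.

```typescript
// Turn free-text input into a prefix-matching tsquery string,
// e.g. "sea bre" -> "sea:* & bre:*". Returns null when nothing searchable remains.
function toPrefixTsQuery(input: string): string | null {
  const terms = input
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u) // split on anything that is not a letter or digit
    .filter((term) => term.length > 0);
  if (terms.length === 0) return null;
  return terms.map((term) => `${term}:*`).join(" & ");
}

// Usage shape with Drizzle's sql template (column and config names are assumed):
//   const query = toPrefixTsQuery(userInput);
//   if (query) {
//     sql`SELECT id FROM clients WHERE search_vector @@ to_tsquery('simple', ${query})`;
//   }
```

Because operators such as & | ! ( ) are stripped before the string is assembled, user input cannot change the query structure, and the value still travels as a bound parameter rather than being concatenated into SQL.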
36.K.2 External search engines (meilisearch, typesense)
Verdict: overkill until we're past 100k clients per port. Postgres FTS will hold for years. Defer indefinitely.
36.L — Final updated adoption order (incorporating section 36)
Layered on section 35.E:
Same-day adopts (low-risk, high-leverage):
- @total-typescript/ts-reset: 1-line type-safety upgrade. Do this before any Zod 4 work; it'll catch latent bugs along the way.
- web-vitals: establish perf baseline before any optimization.
- @hookform/devtools: dev-only DX win.

Adopt alongside section 35.B Tier 1:
- p-limit: pair with the section 35 mass-operation refactors. The Documenso bulk-send path is the highest-priority site.
- @tanstack/query-broadcast-client-experimental: 1-liner in the query provider.

Adopt with mobile/UX work:
- browser-image-compression: wire into scan-shell first.
- embla-carousel-react + yet-another-react-lightbox: pair with berth/yacht photo gallery work.
- react-resizable-panels: pair with docs hub UX work.
- @use-gesture/react: pair with kanban-on-mobile polish.

Adopt with security pass:
- isomorphic-dompurify: replaces hand-rolled escapeHtml. Pair with the audit's XSS hardening pass.

Adopt with the docs hub Phase 2:
- pdfjs-dist + viewer wrapper: when in-app PDF preview becomes a user request.

Park / defer:
- partysocket (requires Socket.IO investigation first).
- @xstate/react (no current target).
- External search engines.
- WebAuthn / passkeys (roadmap decision).

36.M — Final summary
The first sweep (section 35) found the headline replacements: Zod 4 + drizzle-zod + react-email + @react-pdf/renderer is the single highest-leverage week of work.
This second sweep (section 36) found the operational hardening layer:
- p-limit family for the 74 unbounded Promise.all sites.
- @total-typescript/ts-reset for free type safety across 1000+ files.
- web-vitals to establish a perf baseline before we optimize.
- isomorphic-dompurify to harden the email/template rendering.
- browser-image-compression for mobile bandwidth / battery.
- @tanstack/query-broadcast-client-experimental for free cross-tab cache sync.
- react-resizable-panels + embla-carousel-react + yet-another-react-lightbox for the photo/preview surfaces.

Together with section 35, this gives us a concrete shopping list of ~20 packages with explicit land-sites in our code and effort estimates, plus 5-6 cleanup-candidate removals. Adopting all of them would shed ~600 LOC of hand-rolled code, eliminate ~5 categories of latent bugs (timezone, XSS, race conditions, type stdlib quirks, missing exhaustiveness), and meaningfully improve mobile UX + perf measurability.
Bottom line: the deps audit (section 34) showed we're secure today. Sections 35 and 36 show where we can make the codebase meaningfully better (smaller, cleaner, more declarative) by leveraging packages we don't yet use. The single highest-leverage move is Zod 4 + drizzle-zod + react-email in the same focused day: it kills the validator-drift problem, lands the 14× parse-perf win, and starts paying down the hand-strung-email-templates debt all at once. The PDF stack overhaul (35.A) is the second-highest-leverage move: removing pdfme + the 571-line Tiptap bridge in favor of declarative React components is a category-of-bug eliminator, not just a refactor.