Files

Matt ad74e4a174 audit: Tier 1/3/6/7 batch — PII redaction, mobile safe-area, perf, build hardening

Tier 1.4: error_events.request_body_excerpt sanitizer now redacts
GDPR-relevant fields (email, phone, dob, address, fullName, firstName,
lastName, postcode, nationalId, etc.) on top of the existing
credential list. A 5xx in /api/v1/clients no longer lands full client
PII in the super-admin inspector.

Tier 3.10: ScanShell <main> now adds pb-[max(1.5rem, env(safe-area-
inset-bottom))]. Mobile-pwa audit caught the Save expense button sitting
flush against the iPhone 14/15 home indicator in standalone PWA mode.

Tier 6.2: dashboard widget-registry now dynamic-imports every
recharts-backed chart widget (berth status, lead source, occupancy
timeline, pipeline funnel, revenue breakdown, source conversion).
~80-150KB initial-bundle savings when reps have charts disabled.
ssr:false because recharts needs window.

Tier 6.3: DataTable wraps the assembled columns in useMemo keyed on
(columns, hasBulkActions). TanStack docs explicitly warn that
rebuilding columns every render resets the table's internal state.

Tier 7.1: Added .dockerignore (was missing — 7.6 GB context with
.env reachable via COPY . .). Excludes git, env files, node_modules,
build artefacts, IDE config, test artefacts, audit docs.

Tier 7.4: Dockerfile.dev now runs as the node user (uid 1000) — was
root. Working dir moves to /home/node/app.

Tier 7.5: docker-compose.prod.yml adds memory limits (2g postgres,
512m redis, 1g crm-app, 1g crm-worker) and json-file log rotation
(max-size, max-file) to every service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-12 17:18:35 +02:00

13 KiB

Raw Blame History

Parked questions — needs product / business / design decision

Items from the 33-agent audit that I deliberately did NOT fix automatically, because they need a call from you (or someone in product / legal / design) before code can be written. Each entry: the finding, why it's parked, and the proposed options.

Numbered to match the tiers in AUDIT-TRIAGE.md.

P-0.1 — Migration runner: which approach?

Finding. pnpm db:push silently skips CREATE INDEX CONCURRENTLY and NULLS NOT DISTINCT constraints, plus the berths.current_pdf_version_id circular FK. Production is running without 6 composite indexes from migration 0052.

Why parked. Three viable approaches:

Drizzle's built-in migrate() — simplest, but doesn't support CREATE INDEX CONCURRENTLY (the kit wraps every migration in a transaction, and CONCURRENTLY can't run inside one).
A custom tsx script that reads 0001*.sql … 0056*.sql in order, splits on --> statement-breakpoint, runs each statement, special-cases CONCURRENTLY by running it outside a tx, tracks state in a __drizzle_migrations table.
Adopt a third-party migrator (graphile-migrate, dbmate, pg-migrate). Best ergonomics, biggest dependency to take on.

Question. Which one do you want? If you don't know, my recommendation is custom tsx script — keeps the dependency surface tight and matches the rest of the platform's "write a script for it" pattern.

P-0.4 — Resolve-identifier hit-path still echoes real email

Finding. Rate-limit + synthetic-miss are in, but on a hit the endpoint still returns the user's canonical email. A guessable-username window still leaks.

Why parked. The real fix is to delete the endpoint entirely and have the login form POST {identifier, password} to a server-side proxy that resolves + calls Better Auth in one round-trip, never returning the email. That's a noticeable refactor to the login page and possibly the portal-login page too.

Question. Do I do the proxy refactor (~30 min) or keep the current rate-limited shape and accept the residual leak?

P-0.5 — Orphan-blob windows in 9+ services

Finding. Every storage.put runs outside the db.insert(files) tx in documents, brochures, invoices, gdpr-export, backup, berth-pdf, external-eoi, document-templates, reports. A comment in one site claims a "reaper handles it" — no reaper exists.

Why parked. Two valid patterns, both meaningful work:

Compensating delete — wrap each storage.put in a try/catch and storage.delete() on tx failure.
Saga / 2-phase — write to a pending_blobs table inside the tx, async-confirm after the tx commits, async-reaper for orphans.

Compensating-delete is faster to ship but doesn't catch process-crash gaps. Saga is more robust but is a bigger change.

Question. Which pattern? Recommendation: compensating-delete for now + a simple cron reaper that lists all blobs not referenced by any files/berth_pdf_versions/etc. row and deletes them after a grace period.

Finding. gdpr-bundle-builder.ts is missing ~10 PII-bearing tables — portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions.

Why parked. Each table needs (a) FK verification that "row belongs to this client" is unambiguous, (b) whether port-isolation must be enforced, (c) whether to include verbatim PII (email bodies, message contents) or redacted versions. This is a careful per-table audit that benefits from someone who knows the data model intimately.

Question. Want me to do a per-table table-by-table follow-up (estimated ~45 min) once you confirm the redaction policy? Or have legal review the scope first?

P-1.2 — Right-to-be-forgotten doesn't actually erase

Finding. client-hard-delete.service.ts nullifies FKs but verbatim PII survives in email_messages.body_html, files, document_sends.recipient_email.

Why parked. This is a legal decision, not a coding one. Some jurisdictions (notably France) require true erasure even of email-body content; others accept anonymization. The fix is mechanical once you decide the policy: a wipeClientPii(clientId) helper that overwrites every PII column with a tombstone string. But the scope (which fields, which timeline, which audit trail) is yours / legal's.

Question. What's the erasure policy? Anonymize (preserve audit trail) or truly delete (loses business records)?

P-1.3 — Activation / reset tokens travel in `?token=` query strings

Finding. Browser history, proxy logs, Referer header all see the token.

Why parked. Fix is a redesign of the URL scheme — switch to #token=… (fragment) or POST-on-load. Both work but require coordinated changes to email templates + the landing pages + Better Auth integration. Estimated 30-45 min.

Question. Want me to do the fragment-based redesign?

P-2.1 — `pipelineValueUsd` sums mixed currencies as USD

Finding. The dashboard tile labelled "Pipeline Value" sums berth prices in their native currencies but renders the total as USD.

Why parked. Three valid UX options:

Convert at display time — fetch each price, convert to port-default-currency via currency.service, sum the converted values. Today's rates introduce drift relative to historical reports.
Show as port-default-currency totalled — the dashboard tile labels it as the port's own currency; honest about ambiguity.
Show "mixed (X USD, Y EUR, Z GBP)" — explicit, prevents misreading, but uglier.

Question. Which display do you want? My recommendation is option 2 (show port-default-currency, convert at display) — it's the least visually noisy and lines up with what most CRMs do.

P-2.5 — "Active interest" means 4 different things

Finding. Dashboard tiles use outcome IS NULL OR 'won', kanban uses archivedAt NULL only (lost cards visible), hot deals uses outcome IS NULL (excludes won), PDF reports use archivedAt NULL only.

Why parked. Need a canonical definition. Recommendation: active = archivedAt IS NULL AND outcome IS NULL (not yet won, not yet lost, not yet cancelled, not yet archived). But that demotes won deals out of "active" everywhere — affects the kanban "won" column and the dashboard "active deals" tile.

Question. Confirm the canonical definition, then I extract an activeInterestsWhere(portId) helper and route every site through it.

P-2.6 — Occupancy rate: berths.status vs berth_reservations

Finding. KPI tile + PDF use berths.status ("occupied"/"available"/etc). Analytics timeline uses berth_reservations. Same dashboard, two different numbers.

Why parked. Need to know which is the source of truth. Probably berth_reservations (richer; supports timeline), but switching the KPI tile changes the displayed number for every port.

Question. Which is canonical? I'll switch the other to match.

P-2.7 — Revenue PDF unweighted vs dashboard weighted

Finding. Revenue PDF shows gross berth prices per stage. Dashboard revenue-forecast tile multiplies by pipeline_weights. They will never reconcile.

Why parked. Need PM call on what "Revenue" means in each context. The PDF is probably a board / investor doc and should match dashboard, but maybe they want both.

Question. Make the PDF match the dashboard (weighted)? Or leave divergent and label them differently?

P-3.1 — "Interest" / "lead" / "prospect" / "deal" used interchangeably

Finding. All four nouns appear in client-facing UI. berth-detail-header.tsx literally parenthesises one as a synonym ("the prospect (interest)"). berth-tabs.tsx has a "Deal Documents" tab + /deal-documents URL path.

Why parked. Need a canonical noun. Without one I'd be guessing; with one I can do a codemod across the platform.

Question. Which one is canonical? Recommendation: interest (matches schema + URL + most code). Then everything else becomes a deprecated alias.

P-3.3 — 16 `window.confirm()` sites for destructive flows

Finding. Cancel signing envelope, delete files, archive interest/company/yacht, etc. all use the native browser dialog.

Why parked. Mechanical fix once you confirm: each site swaps window.confirm() for <AlertDialog> from @/components/ui/alert-dialog. But there are 16 of them; ~5 min each.

Question. OK to do the sweep automatically with the same dialog copy + visual treatment? Or do you want bespoke copy per surface?

P-3.4 — Signing-status labels diverge across 5 surfaces

Finding. Hub list, interest-tab, SigningProgress, notification-digest, realtime-toast all use different strings for the same document state.

Why parked. Need one canonical mapping. I drafted PORTAL_SIGNING_LABELS for the portal but the CRM side has different needs (more granular for reps).

Question. Want me to extract a shared signingStatusLabel() and route every site through it? If yes, I need a confirmed label map.

P-3.5 — 6× "Save" button variants

Finding. "Save", "Save Changes", "Save changes", "Update", "Apply" — plus "Saving..." vs "Saving…".

Why parked. Mechanical sweep once you confirm the canonical text. Recommendation: "Save changes" for edits, "Create X" for new entities, "Saving…" (Unicode ellipsis) for the loading state. Trivial codemod but it touches 30+ files.

Question. OK to do the sweep with that policy?

P-3.6 — Live Documenso template missing `Berth Range` field

Finding. The CRM sends a Berth Range form value through buildDocumensoPayload, but the live template at Documenso doesn't have that field — Documenso silently drops unknown formValues. Every multi-berth EOI ships with only the primary mooring.

Why parked. Not code — Documenso admin action. Someone needs to log into the Documenso instance and add a Berth Range text field to template id 8. The CRM is ready.

Question. Who has Documenso admin access? Can they add the field?

P-4.5 — "Convert to client" prefill qs params unused

Finding. The inquiry-inbox triage flow writes prefill_name/email/phone/inquiry_id/source query-string params. No consumer reads them. The flow eagerly flips the inquiry to "converted" then drops the operator on a blank form, losing the inquiry_id linkage forever.

Why parked. Fix is a wire-up: the create-client form's useEffect reads searchParams and hydrates initial values. But it also has to push the inquiry_id into the resulting client's metadata so the linkage survives. Not difficult; needs ~30 min and design review on what the linkage looks like.

Question. Want me to wire it up with the inquiry_id stored on clients.metadata.source_inquiry_id?

P-5.1 — `handleDocumentCompleted` TOCTTOU

Finding. Two concurrent retries can both pass the idempotency gate, both write the signed PDF blob, both insert duplicate files rows. Webhook + poll-worker race specifically.

Why parked. Fix is a SELECT … FOR UPDATE on the documents row inside the handler. Mechanical but invasive — touches the hottest path in the signing flow. I want to test before shipping, and that needs a real Documenso webhook replay.

Question. OK to ship the FOR UPDATE without a replay test, relying on existing vitest? Or hold until you can replay?

P-5.2 — Zero BullMQ `jobId` usage repo-wide

Finding. Every queue.add is unkeyed; any double-fire creates a duplicate job. The audit found this is the most pervasive concurrency hazard in the codebase.

Why parked. Fix is mechanical: pass a deterministic jobId to every queue.add call. But "deterministic" varies by surface (webhook deliveries should use the delivery row id, notifications should use a hash of the dedupeKey, etc.). ~20 sites to touch.

Question. Want me to do the sweep with per-surface jobId conventions, or batch by surface (webhooks first, then notifications, etc.)?

P-6.2 — Recharts in initial bundle (~80-150KB)

Finding. Every dashboard chart imports recharts statically via widget-registry.tsx. Initial-page-load bundle includes recharts even if the user has all chart widgets disabled.

Why parked. Fix is straightforward (dynamic import each chart widget), but the widget-registry is hot-pathed by the dashboard renderer and by the widget picker UI. Touching it has surface area.

Question. OK to ship a next/dynamic lazy-import for each chart widget? Adds a loading skeleton flash but kills the bundle bloat.

Everything in AUDIT-TRIAGE.md Tier 8 is already shipped. Everything not listed in this file has been fixed without parking — see the commit log on feat/documents-folders.

13 KiB Raw Blame History Unescape Escape