Files
pn-new-crm/docs/AUDIT-PARKED-QUESTIONS.md
Matt ad74e4a174 audit: Tier 1/3/6/7 batch — PII redaction, mobile safe-area, perf, build hardening
Tier 1.4: error_events.request_body_excerpt sanitizer now redacts
GDPR-relevant fields (email, phone, dob, address, fullName, firstName,
lastName, postcode, nationalId, etc.) on top of the existing
credential list. A 5xx in /api/v1/clients no longer lands full client
PII in the super-admin inspector.

Tier 3.10: ScanShell <main> now adds pb-[max(1.5rem, env(safe-area-
inset-bottom))]. Mobile-pwa audit caught the Save expense button sitting
flush against the iPhone 14/15 home indicator in standalone PWA mode.

Tier 6.2: dashboard widget-registry now dynamic-imports every
recharts-backed chart widget (berth status, lead source, occupancy
timeline, pipeline funnel, revenue breakdown, source conversion).
~80-150KB initial-bundle savings when reps have charts disabled.
ssr:false because recharts needs window.

Tier 6.3: DataTable wraps the assembled columns in useMemo keyed on
(columns, hasBulkActions). TanStack docs explicitly warn that
rebuilding columns every render resets the table's internal state.

Tier 7.1: Added .dockerignore (was missing — 7.6 GB context with
.env reachable via COPY . .). Excludes git, env files, node_modules,
build artefacts, IDE config, test artefacts, audit docs.

Tier 7.4: Dockerfile.dev now runs as the node user (uid 1000) — was
root. Working dir moves to /home/node/app.

Tier 7.5: docker-compose.prod.yml adds memory limits (2g postgres,
512m redis, 1g crm-app, 1g crm-worker) and json-file log rotation
(max-size, max-file) to every service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:18:35 +02:00

213 lines
13 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Parked questions — needs product / business / design decision
Items from the 33-agent audit that I deliberately did NOT fix automatically, because they need a call from you (or someone in product / legal / design) before code can be written. Each entry: the finding, why it's parked, and the proposed options.
Numbered to match the tiers in `AUDIT-TRIAGE.md`.
---
## P-0.1 — Migration runner: which approach?
**Finding.** `pnpm db:push` silently skips `CREATE INDEX CONCURRENTLY` and `NULLS NOT DISTINCT` constraints, plus the `berths.current_pdf_version_id` circular FK. Production is running without 6 composite indexes from migration 0052.
**Why parked.** Three viable approaches:
- **Drizzle's built-in `migrate()`** — simplest, but doesn't support `CREATE INDEX CONCURRENTLY` (the kit wraps every migration in a transaction, and CONCURRENTLY can't run inside one).
- **A custom tsx script** that reads `0001*.sql``0056*.sql` in order, splits on `--> statement-breakpoint`, runs each statement, special-cases CONCURRENTLY by running it outside a tx, tracks state in a `__drizzle_migrations` table.
- **Adopt a third-party migrator** (graphile-migrate, dbmate, pg-migrate). Best ergonomics, biggest dependency to take on.
**Question.** Which one do you want? If you don't know, my recommendation is **custom tsx script** — keeps the dependency surface tight and matches the rest of the platform's "write a script for it" pattern.
---
## P-0.4 — Resolve-identifier hit-path still echoes real email
**Finding.** Rate-limit + synthetic-miss are in, but on a hit the endpoint still returns the user's canonical email. A guessable-username window still leaks.
**Why parked.** The real fix is to delete the endpoint entirely and have the login form POST `{identifier, password}` to a server-side proxy that resolves + calls Better Auth in one round-trip, never returning the email. That's a noticeable refactor to the login page and possibly the portal-login page too.
**Question.** Do I do the proxy refactor (~30 min) or keep the current rate-limited shape and accept the residual leak?
---
## P-0.5 — Orphan-blob windows in 9+ services
**Finding.** Every `storage.put` runs outside the `db.insert(files)` tx in `documents`, `brochures`, `invoices`, `gdpr-export`, `backup`, `berth-pdf`, `external-eoi`, `document-templates`, `reports`. A comment in one site claims a "reaper handles it" — no reaper exists.
**Why parked.** Two valid patterns, both meaningful work:
- **Compensating delete** — wrap each `storage.put` in a try/catch and `storage.delete()` on tx failure.
- **Saga / 2-phase** — write to a `pending_blobs` table inside the tx, async-confirm after the tx commits, async-reaper for orphans.
Compensating-delete is faster to ship but doesn't catch process-crash gaps. Saga is more robust but is a bigger change.
**Question.** Which pattern? Recommendation: compensating-delete for now + a simple `cron` reaper that lists all blobs not referenced by any `files`/`berth_pdf_versions`/etc. row and deletes them after a grace period.
---
## P-1.1 — GDPR Article-15 export completeness
**Finding.** `gdpr-bundle-builder.ts` is missing ~10 PII-bearing tables — portal_users, email_threads/messages, document_sends, reminders, files, scratchpadNotes, client_merge_log, contact_log, website_submissions, form_submissions.
**Why parked.** Each table needs (a) FK verification that "row belongs to this client" is unambiguous, (b) whether port-isolation must be enforced, (c) whether to include verbatim PII (email bodies, message contents) or redacted versions. This is a careful per-table audit that benefits from someone who knows the data model intimately.
**Question.** Want me to do a per-table table-by-table follow-up (estimated ~45 min) once you confirm the redaction policy? Or have legal review the scope first?
---
## P-1.2 — Right-to-be-forgotten doesn't actually erase
**Finding.** `client-hard-delete.service.ts` nullifies FKs but verbatim PII survives in `email_messages.body_html`, `files`, `document_sends.recipient_email`.
**Why parked.** **This is a legal decision, not a coding one.** Some jurisdictions (notably France) require true erasure even of email-body content; others accept anonymization. The fix is mechanical once you decide the policy: a `wipeClientPii(clientId)` helper that overwrites every PII column with a tombstone string. But the scope (which fields, which timeline, which audit trail) is yours / legal's.
**Question.** What's the erasure policy? Anonymize (preserve audit trail) or truly delete (loses business records)?
---
## P-1.3 — Activation / reset tokens travel in `?token=` query strings
**Finding.** Browser history, proxy logs, Referer header all see the token.
**Why parked.** Fix is a redesign of the URL scheme — switch to `#token=…` (fragment) or POST-on-load. Both work but require coordinated changes to email templates + the landing pages + Better Auth integration. Estimated 30-45 min.
**Question.** Want me to do the fragment-based redesign?
---
## P-2.1 — `pipelineValueUsd` sums mixed currencies as USD
**Finding.** The dashboard tile labelled "Pipeline Value" sums berth prices in their native currencies but renders the total as USD.
**Why parked.** Three valid UX options:
- **Convert at display time** — fetch each price, convert to port-default-currency via `currency.service`, sum the converted values. Today's rates introduce drift relative to historical reports.
- **Show as port-default-currency totalled** — the dashboard tile labels it as the port's own currency; honest about ambiguity.
- **Show "mixed (X USD, Y EUR, Z GBP)"** — explicit, prevents misreading, but uglier.
**Question.** Which display do you want? My recommendation is **option 2** (show port-default-currency, convert at display) — it's the least visually noisy and lines up with what most CRMs do.
---
## P-2.5 — "Active interest" means 4 different things
**Finding.** Dashboard tiles use `outcome IS NULL OR 'won'`, kanban uses `archivedAt NULL` only (lost cards visible), hot deals uses `outcome IS NULL` (excludes won), PDF reports use `archivedAt NULL` only.
**Why parked.** Need a canonical definition. Recommendation: **active = `archivedAt IS NULL AND outcome IS NULL`** (not yet won, not yet lost, not yet cancelled, not yet archived). But that demotes won deals out of "active" everywhere — affects the kanban "won" column and the dashboard "active deals" tile.
**Question.** Confirm the canonical definition, then I extract an `activeInterestsWhere(portId)` helper and route every site through it.
---
## P-2.6 — Occupancy rate: berths.status vs berth_reservations
**Finding.** KPI tile + PDF use `berths.status` ("occupied"/"available"/etc). Analytics timeline uses `berth_reservations`. Same dashboard, two different numbers.
**Why parked.** Need to know which is the source of truth. Probably `berth_reservations` (richer; supports timeline), but switching the KPI tile changes the displayed number for every port.
**Question.** Which is canonical? I'll switch the other to match.
---
## P-2.7 — Revenue PDF unweighted vs dashboard weighted
**Finding.** Revenue PDF shows gross berth prices per stage. Dashboard revenue-forecast tile multiplies by `pipeline_weights`. They will never reconcile.
**Why parked.** Need PM call on what "Revenue" means in each context. The PDF is probably a board / investor doc and should match dashboard, but maybe they want both.
**Question.** Make the PDF match the dashboard (weighted)? Or leave divergent and label them differently?
---
## P-3.1 — "Interest" / "lead" / "prospect" / "deal" used interchangeably
**Finding.** All four nouns appear in client-facing UI. `berth-detail-header.tsx` literally parenthesises one as a synonym ("the prospect (interest)"). `berth-tabs.tsx` has a "Deal Documents" tab + `/deal-documents` URL path.
**Why parked.** Need a canonical noun. Without one I'd be guessing; with one I can do a codemod across the platform.
**Question.** Which one is canonical? Recommendation: **interest** (matches schema + URL + most code). Then everything else becomes a deprecated alias.
---
## P-3.3 — 16 `window.confirm()` sites for destructive flows
**Finding.** Cancel signing envelope, delete files, archive interest/company/yacht, etc. all use the native browser dialog.
**Why parked.** Mechanical fix once you confirm: each site swaps `window.confirm()` for `<AlertDialog>` from `@/components/ui/alert-dialog`. But there are 16 of them; ~5 min each.
**Question.** OK to do the sweep automatically with the same dialog copy + visual treatment? Or do you want bespoke copy per surface?
---
## P-3.4 — Signing-status labels diverge across 5 surfaces
**Finding.** Hub list, interest-tab, SigningProgress, notification-digest, realtime-toast all use different strings for the same document state.
**Why parked.** Need one canonical mapping. I drafted `PORTAL_SIGNING_LABELS` for the portal but the CRM side has different needs (more granular for reps).
**Question.** Want me to extract a shared `signingStatusLabel()` and route every site through it? If yes, I need a confirmed label map.
---
## P-3.5 — 6× "Save" button variants
**Finding.** "Save", "Save Changes", "Save changes", "Update", "Apply" — plus "Saving..." vs "Saving…".
**Why parked.** Mechanical sweep once you confirm the canonical text. Recommendation: **"Save changes"** for edits, **"Create X"** for new entities, **"Saving…"** (Unicode ellipsis) for the loading state. Trivial codemod but it touches 30+ files.
**Question.** OK to do the sweep with that policy?
---
## P-3.6 — Live Documenso template missing `Berth Range` field
**Finding.** The CRM sends a `Berth Range` form value through `buildDocumensoPayload`, but the live template at Documenso doesn't have that field — Documenso silently drops unknown formValues. Every multi-berth EOI ships with only the primary mooring.
**Why parked.** **Not code — Documenso admin action.** Someone needs to log into the Documenso instance and add a `Berth Range` text field to template id 8. The CRM is ready.
**Question.** Who has Documenso admin access? Can they add the field?
---
## P-4.5 — "Convert to client" prefill qs params unused
**Finding.** The inquiry-inbox triage flow writes `prefill_name/email/phone/inquiry_id/source` query-string params. No consumer reads them. The flow eagerly flips the inquiry to "converted" then drops the operator on a blank form, losing the inquiry_id linkage forever.
**Why parked.** Fix is a wire-up: the create-client form's `useEffect` reads searchParams and hydrates initial values. But it also has to push the `inquiry_id` into the resulting client's `metadata` so the linkage survives. Not difficult; needs ~30 min and design review on what the linkage looks like.
**Question.** Want me to wire it up with the inquiry_id stored on `clients.metadata.source_inquiry_id`?
---
## P-5.1 — `handleDocumentCompleted` TOCTTOU
**Finding.** Two concurrent retries can both pass the idempotency gate, both write the signed PDF blob, both insert duplicate files rows. Webhook + poll-worker race specifically.
**Why parked.** Fix is a `SELECT … FOR UPDATE` on the documents row inside the handler. Mechanical but invasive — touches the hottest path in the signing flow. I want to test before shipping, and that needs a real Documenso webhook replay.
**Question.** OK to ship the FOR UPDATE without a replay test, relying on existing vitest? Or hold until you can replay?
---
## P-5.2 — Zero BullMQ `jobId` usage repo-wide
**Finding.** Every `queue.add` is unkeyed; any double-fire creates a duplicate job. The audit found this is the most pervasive concurrency hazard in the codebase.
**Why parked.** Fix is mechanical: pass a deterministic `jobId` to every `queue.add` call. But "deterministic" varies by surface (webhook deliveries should use the delivery row id, notifications should use a hash of the dedupeKey, etc.). ~20 sites to touch.
**Question.** Want me to do the sweep with per-surface jobId conventions, or batch by surface (webhooks first, then notifications, etc.)?
---
## P-6.2 — Recharts in initial bundle (~80-150KB)
**Finding.** Every dashboard chart imports recharts statically via `widget-registry.tsx`. Initial-page-load bundle includes recharts even if the user has all chart widgets disabled.
**Why parked.** Fix is straightforward (dynamic import each chart widget), but the widget-registry is hot-pathed by the dashboard renderer and by the widget picker UI. Touching it has surface area.
**Question.** OK to ship a `next/dynamic` lazy-import for each chart widget? Adds a loading skeleton flash but kills the bundle bloat.
---
*Everything in `AUDIT-TRIAGE.md` Tier 8 is already shipped. Everything not listed in this file has been fixed without parking — see the commit log on `feat/documents-folders`.*