pn-new-crm

Author	SHA1	Message	Date
Matt	221ae5784e	chore(autonomous-session): consolidate uncommitted work from prior session Bundles the prior autonomous-session output that was sitting unstaged: - Em-dash sweep across src/ + tests/ (en-dash/em-dash to hyphen, ~2280 instances) - country-flag-icons rollout (CountryFlag component, replaces emoji glyphs that never rendered on Windows; lazy-loads the 3x2 SVG index as a single chunk after the per-subpath dynamic-import approach silently failed in webpack) - Admin IA Phase 1+2: 7-domain regroup, 41 to 38 pages, /admin/berths index, redirects (ocr to ai, reports to dashboard, invitations to users), docs/admin-ia-proposal.md - Per-template email tester (registry + endpoint + UI on Email admin page) - Cancel-document mode picker (delete-from-Documenso vs keep-for-audit) - Dashboard PDF report: 25 widgets, SVG charts, date-range picker, 11 resolvers - Customize-widgets per-region sortables at xl+ (charts/rails/feed); single flat sortable below xl when the layout stacks; per-viewport saved orders - Audit doc updates capturing each shipped item - Lint fixes: react-compiler immutability in DonutChart (reduce instead of let-reassign), set-state-in-effect disables in CountryFlag and UploadForSigning preview-bytes effect, unused 'confirm' destructures in interest contract + reservation tabs, unescaped apostrophe in test-template card copy	2026-05-23 00:52:59 +02:00
Matt	e8a852856e	feat(berth-parser): unpdf for tier-2 PDF text extraction Phase 1 / commit 13 of 14 — replaces a quietly-broken tesseract.js pathway with unpdf for tier-2 of the berth-PDF parser. The previous code did: const tesseract = await import('tesseract.js'); await tesseract.recognize(buffer, 'eng'); // ← buffer is a PDF tesseract.recognize() expects an image, not a PDF. The PDFs we get from the AcroForm-stripped berth-spec sheets would have failed at runtime (either an "unsupported format" error or silently empty text). Tier-2 was dark code. unpdf (serverless-friendly pdfjs wrapper) extracts text directly from the PDF stream. Works on text-PDFs (real text streams), returns empty on scanned/raster PDFs — those legitimately fall through to the AI tier where they belong. The OcrAdapter interface shape is preserved so: - Existing unit tests that stub the adapter still work - parseAnyBerthPdf(buffer, { adapter }) override still works - The 30-second timeout race + warning collection still works tesseract.js stays as a dep — scan-shell.tsx (receipt scanner) still uses it for on-device image OCR, which is its intended use case. 1298/1298 vitest green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 21:13:10 +02:00
Matt Ciaccio	fc7595faf8	fix(audit-tier-2): error-surface hygiene — toastError + CodedError sweep Two mechanical sweeps closing the audit's HIGH §16 + MED §11 findings: * 38 client components / 56 toast.error sites converted to toastError(err) so the new admin error inspector becomes usable from user-reported issues — every failed inline-edit, save, send, archive, upload, etc. now carries the request-id + error-code (Copy ID action). * 26 service files / 62 bare-Error throws converted to CodedError or the existing AppError subclasses. Adds new error codes: DOCUMENSO_UPSTREAM_ERROR (502), DOCUMENSO_AUTH_FAILURE (502), DOCUMENSO_TIMEOUT (504), OCR_UPSTREAM_ERROR (502), IMAP_UPSTREAM_ERROR (502), UMAMI_UPSTREAM_ERROR (502), UMAMI_NOT_CONFIGURED (409), and INSERT_RETURNING_EMPTY (500) for post-insert returning-empty guards. * Five vitest assertions updated to match the new user-facing wording (client-merge "already been merged", expense/interest "couldn't find that …", documenso "signing service didn't respond"). Test status: 1168/1168 vitest, tsc clean. Refs: docs/audit-comprehensive-2026-05-05.md HIGH §16 (auditor-H Issue 1) + MED §11 (auditor-G Issue 1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:18:05 +02:00
Matt Ciaccio	014bbe1923	feat(expenses): streaming expense-PDF export + receipt-less expense flag + audit-3 fixes Replaces the legacy text-only expense PDF (was just dumping rows into a single pdfme text field — no images, no pagination) with a proper streaming export modelled on the legacy Nuxt client-portal but re-architected for memory safety. The legacy implementation OOM'd on hundreds of receipts because it: - buffered every receipt image into memory simultaneously - accumulated PDF chunks into an array, concat'd at end - base64-encoded the whole PDF into a JSON response (3x peak memory) - had no image downscaling The new design: - `streamExpensePdf()` (src/lib/services/expense-pdf.service.ts): pdfkit pipes bytes directly to the HTTP response (no Buffer accumulation). Receipts are processed serially so peak heap is one image at a time. Sharp downscales any receipt > 500 KB or > 1500 px to JPEG q80 — typical 8 MB phone photo collapses to ~250 KB. For a 500-receipt export, peak RSS stays under ~100 MB; legacy needed >2 GB for the same input. - Pages: cover summary box (count, totals, currency equiv, optional processing fee), grouped expense table (groupBy=none\|payer\|category\| date), one-page-per-receipt with header (establishment, amount, date, payer, category, file name) and full-bleed image. - Storage backend abstraction — receipts stream from `getStorageBackend().get(storageKey)`, works on MinIO/S3/filesystem. - Route: POST /api/v1/expenses/export/pdf streams binary application/pdf with cache-control:no-store. Validator caps expenseIds at 1000 to prevent runaway loops. Receipt-less expense flow (per user request): - Schema: 0033 migration adds `expenses.no_receipt_acknowledged` boolean (default false). - Validator: createExpenseSchema requires either receiptFileIds OR noReceiptAcknowledged=true; the .refine() error message tells the rep exactly what to do. updateExpenseSchema is partial and skips the rule (existing rows can be edited without re-acknowledging). - PDF: receiptless expenses get an inline red "(no receipt)" tag in the establishment cell + a red footer warning in the summary box showing the count and at-risk amount. - The legacy parent-company reimbursement queue may refuse to pay receiptless expenses, so the warning is load-bearing for ops. Audit-3 fixes piggy-backed: - 🔴 Tesseract OCR runtime now races a 30s timeout (CPU-bomb DoS protection — a crafted PDF rasterizing to high-res noise could pin the worker indefinitely). - 🟠 brochures.service.ts:listBrochures dropped a wasted query (the legacy single-brochure fast-path was discarding its result on the multi-brochure branch). - 🟠 berth-pdf.service.ts:listBerthPdfVersions now Promise.all's the presignDownload calls instead of awaiting each in a for-loop — 20-version berths went from 20× round-trip to 1×. - 🟡 public berths route no longer logs the full `row` object on enum drift (was dumping price + amenity columns into ops logs). - 🟡 dropped the dead `void sql` import from public berths route. Tests still 1163/1163. tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:38:32 +02:00
Matt Ciaccio	a3e002852b	fix(audit-2): integration regressions + data-integrity from second-pass review Two reviewer agents did a second-pass deep audit of the 21-commit refactor. Eight findings; four fixed here (one was deferred with a schema comment, three were 🟡 nice-to-haves left for follow-up). Integration regressions (🟠 high): - Outbound webhook `interest.berth_linked` now fires from the new junction-add handler. Was emitting a socket-only event, leaving external integrations silent post-refactor. - Two new webhook events `interest.berth_unlinked` and `interest.berth_link_updated` added to WEBHOOK_EVENTS + INTERNAL_TO_WEBHOOK_MAP. PATCH and DELETE handlers now dispatch them alongside the existing socket emits — lifecycle parity restored. - BerthInterestPulse adds useRealtimeInvalidation for berth-link events. The query key was berth-scoped while the linked-berths dialog invalidates interest-scoped keys (no prefix match), so the pulse went stale. Bridges via the realtime hook now. Recommender semantic fix (🟠 medium-high): - aggregates CTE: active_interest_count now filters on `ib.is_specific_interest = true`, matching the public-map "Under Offer" derivation. EOI-bundle-only links no longer demote a berth to Tier C for other reps. Smoke test confirms previously-all-Tier-C results now correctly classify as Tier A. - Same CTE: `total_interest_count` uses COUNT(ib.berth_id) instead of COUNT(*) so a berth with no junction rows reports 0 (not 1 from the LEFT JOIN's NULL-right-side row). Prevents heat over-counting. Data integrity (🟠): - AcroForm tier rejects negative numerics in coerceFieldValue (was letting through `length_ft="-50"` which would poison the recommender feasibility filter on apply). - FilesystemBackend.resolveHmacSecret throws in production when storage_proxy_hmac_secret_encrypted is null. Dev still derives from BETTER_AUTH_SECRET for ergonomics; prod must explicitly configure. - Documented the circular FK between berths.current_pdf_version_id and berth_pdf_versions.id. Drizzle's `.references()` can't express the cycle so the schema column is plain text + a comment; the FK is authoritatively maintained by migration 0030. Tests still 1163/1163. tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 04:20:38 +02:00
Matt Ciaccio	249ffe3e4a	feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit `83693dd`) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 03:34:24 +02:00

6 Commits