Files
pn-new-crm/src/app/api/v1/berths/[id]/pdf-upload-url/handlers.ts

85 lines
3.3 KiB
TypeScript
Raw Normal View History

feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
/**
* Returns a presigned URL the browser can use to PUT a PDF directly to the
* active storage backend. The URL is constrained by content-length-range up
* to `system_settings.berth_pdf_max_upload_mb` (default 15 MB) per §11.1.
*
* For S3 backends this is a true signed URL; for filesystem backends it's a
* CRM-internal proxy URL with an HMAC token (see `FilesystemBackend`).
*/
import { NextResponse } from 'next/server';
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs Wave through the remaining audit-final-deferred items that aren't blocked on the back-burnered Documenso work. Multi-tenant isolation: - Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim; verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a buggy issuer in some future code path that mixes port scopes — every storage key generated by generateStorageKey() already prefixes the slug. document-sends opts in for 24h emailed download links; other callers continue working unchanged via the optional field. DB schema reconciliation: - Migration 0047 rebuilds system_settings unique index with NULLS NOT DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate (storage_backend, NULL) rows that had accumulated from race-prone delete-then-insert patterns in ocr-config / settings / residential- stages / ai-budget services. All four services converted to true onConflictDoUpdate upserts so the race window is closed. API uniformity: - Response shape standardization: 16 routes converted from `{ success: true }` to 204 No Content. CLAUDE.md documents the convention (`{ data: <T> }` for content, 204 for empty mutations, portal-auth retains `{ success: true }` for the frontend's auth chain). - req.json() → parseBody() migration across 9 admin/CRM routes (custom-fields, expenses/export ×3, currency convert, search/recently-viewed, admin/duplicates, berths/pdf-{upload-url, versions, parse-results}). Uniform 400 error shapes for ZodError-flagged bodies. Custom-fields merge tokens (shipped end-to-end): - merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the `{{custom.<fieldName>}}` shape. - document-templates validator accepts the dynamic shape alongside the static catalog tokens. - document-sends.service mergeCustomFieldValues resolver fetches per-port custom_field_definitions for client/interest/berth contexts and substitutes stored values keyed by `{{custom.fieldName}}`. - custom-fields-manager amber banner updated to reflect that merge tokens now expand (search index + entity-diff remain documented design limitations). /api/v1/files cross-entity filtering: - Validator + listFiles + uploadFile accept companyId AND yachtId alongside clientId. file-upload-zone propagates both. - New CompanyFilesTab component mirrors ClientFilesTab; restored as a visible Documents tab in company-tabs.tsx (was a hidden stub). Inline TODOs: - Reviewed remaining two TODOs (per-user reminder schedule, import worker handlers). Both are placeholders for future feature surfaces, not bugs — per-port digest works for every customer; nothing currently enqueues import jobs (verified). Annotated in BACKLOG. BACKLOG.md updated to reflect what landed and what's still pending (Documenso-related items still bundled with the back-burnered phases). Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
import { z } from 'zod';
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
import { type RouteHandler } from '@/lib/api/helpers';
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs Wave through the remaining audit-final-deferred items that aren't blocked on the back-burnered Documenso work. Multi-tenant isolation: - Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim; verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a buggy issuer in some future code path that mixes port scopes — every storage key generated by generateStorageKey() already prefixes the slug. document-sends opts in for 24h emailed download links; other callers continue working unchanged via the optional field. DB schema reconciliation: - Migration 0047 rebuilds system_settings unique index with NULLS NOT DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate (storage_backend, NULL) rows that had accumulated from race-prone delete-then-insert patterns in ocr-config / settings / residential- stages / ai-budget services. All four services converted to true onConflictDoUpdate upserts so the race window is closed. API uniformity: - Response shape standardization: 16 routes converted from `{ success: true }` to 204 No Content. CLAUDE.md documents the convention (`{ data: <T> }` for content, 204 for empty mutations, portal-auth retains `{ success: true }` for the frontend's auth chain). - req.json() → parseBody() migration across 9 admin/CRM routes (custom-fields, expenses/export ×3, currency convert, search/recently-viewed, admin/duplicates, berths/pdf-{upload-url, versions, parse-results}). Uniform 400 error shapes for ZodError-flagged bodies. Custom-fields merge tokens (shipped end-to-end): - merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the `{{custom.<fieldName>}}` shape. - document-templates validator accepts the dynamic shape alongside the static catalog tokens. - document-sends.service mergeCustomFieldValues resolver fetches per-port custom_field_definitions for client/interest/berth contexts and substitutes stored values keyed by `{{custom.fieldName}}`. - custom-fields-manager amber banner updated to reflect that merge tokens now expand (search index + entity-diff remain documented design limitations). /api/v1/files cross-entity filtering: - Validator + listFiles + uploadFile accept companyId AND yachtId alongside clientId. file-upload-zone propagates both. - New CompanyFilesTab component mirrors ClientFilesTab; restored as a visible Documents tab in company-tabs.tsx (was a hidden stub). Inline TODOs: - Reviewed remaining two TODOs (per-user reminder schedule, import worker handlers). Both are placeholders for future feature surfaces, not bugs — per-port digest works for every customer; nothing currently enqueues import jobs (verified). Annotated in BACKLOG. BACKLOG.md updated to reflect what landed and what's still pending (Documenso-related items still bundled with the back-burnered phases). Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
import { parseBody } from '@/lib/api/route-helpers';
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
import { db } from '@/lib/db';
import { berths } from '@/lib/db/schema/berths';
fix(security): scope berth-pdf service entrypoints by portId Post-merge security review caught a cross-tenant authorization bypass in the per-berth PDF endpoints (HIGH severity, confidence 10): GET /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-upload-url POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback POST /api/v1/berths/[id]/pdf-versions/parse-results/apply Each handler looked up the target berth by id only — `eq(berths.id, ...)`. withAuth resolves ctx.portId from the user-controlled X-Port-Id header (only verifying the user has SOME role on that port), and withPermission('berths', 'view'|'edit', ...) is a coarse capability check, not a row-level grant. A rep with berths:edit on Port A could supply a Port B berth UUID and: - list + receive 15-min presigned download URLs to every PDF version - mint an upload URL targeting `berths/<port-B-id>/uploads/...` - POST a new version (overwriting current_pdf_version_id on foreign berth) - rollback to any prior version on a foreign berth - apply rep-confirmed parse-result fields onto a foreign berth's columns Sibling routes (waiting-list etc.) already pair the id filter with `eq(berths.portId, ctx.portId)`, so this was an omission, not design. Fix: - Push `portId: string` into uploadBerthPdf, listBerthPdfVersions, rollbackToVersion, applyParseResults, reconcilePdfWithBerth. - Each function now filters the berth lookup with `and(eq(berths.id, ...), eq(berths.portId, portId))` and throws NotFoundError on mismatch (no foreign-port disclosure). - Inline the same `and(...)` filter in the pdf-upload-url handler. - Every handler passes ctx.portId through. Coverage: - New `cross-port tenant guard` test exercises every entrypoint with a foreign-port id and asserts NotFoundError. - 1164/1164 vitest passing. Typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:31:33 +02:00
import { and, eq } from 'drizzle-orm';
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
import { errorResponse, NotFoundError, ValidationError } from '@/lib/errors';
import { getMaxUploadMb } from '@/lib/services/berth-pdf.service';
import { getStorageBackend } from '@/lib/storage';
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs Wave through the remaining audit-final-deferred items that aren't blocked on the back-burnered Documenso work. Multi-tenant isolation: - Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim; verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a buggy issuer in some future code path that mixes port scopes — every storage key generated by generateStorageKey() already prefixes the slug. document-sends opts in for 24h emailed download links; other callers continue working unchanged via the optional field. DB schema reconciliation: - Migration 0047 rebuilds system_settings unique index with NULLS NOT DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate (storage_backend, NULL) rows that had accumulated from race-prone delete-then-insert patterns in ocr-config / settings / residential- stages / ai-budget services. All four services converted to true onConflictDoUpdate upserts so the race window is closed. API uniformity: - Response shape standardization: 16 routes converted from `{ success: true }` to 204 No Content. CLAUDE.md documents the convention (`{ data: <T> }` for content, 204 for empty mutations, portal-auth retains `{ success: true }` for the frontend's auth chain). - req.json() → parseBody() migration across 9 admin/CRM routes (custom-fields, expenses/export ×3, currency convert, search/recently-viewed, admin/duplicates, berths/pdf-{upload-url, versions, parse-results}). Uniform 400 error shapes for ZodError-flagged bodies. Custom-fields merge tokens (shipped end-to-end): - merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the `{{custom.<fieldName>}}` shape. - document-templates validator accepts the dynamic shape alongside the static catalog tokens. - document-sends.service mergeCustomFieldValues resolver fetches per-port custom_field_definitions for client/interest/berth contexts and substitutes stored values keyed by `{{custom.fieldName}}`. - custom-fields-manager amber banner updated to reflect that merge tokens now expand (search index + entity-diff remain documented design limitations). /api/v1/files cross-entity filtering: - Validator + listFiles + uploadFile accept companyId AND yachtId alongside clientId. file-upload-zone propagates both. - New CompanyFilesTab component mirrors ClientFilesTab; restored as a visible Documents tab in company-tabs.tsx (was a hidden stub). Inline TODOs: - Reviewed remaining two TODOs (per-user reminder schedule, import worker handlers). Both are placeholders for future feature surfaces, not bugs — per-port digest works for every customer; nothing currently enqueues import jobs (verified). Annotated in BACKLOG. BACKLOG.md updated to reflect what landed and what's still pending (Documenso-related items still bundled with the back-burnered phases). Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
const postBodySchema = z.object({
fileName: z.string().min(1).max(255),
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
/** Size hint in bytes used to early-reject oversized uploads before we
* burn a presigned URL. */
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs Wave through the remaining audit-final-deferred items that aren't blocked on the back-burnered Documenso work. Multi-tenant isolation: - Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim; verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a buggy issuer in some future code path that mixes port scopes — every storage key generated by generateStorageKey() already prefixes the slug. document-sends opts in for 24h emailed download links; other callers continue working unchanged via the optional field. DB schema reconciliation: - Migration 0047 rebuilds system_settings unique index with NULLS NOT DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate (storage_backend, NULL) rows that had accumulated from race-prone delete-then-insert patterns in ocr-config / settings / residential- stages / ai-budget services. All four services converted to true onConflictDoUpdate upserts so the race window is closed. API uniformity: - Response shape standardization: 16 routes converted from `{ success: true }` to 204 No Content. CLAUDE.md documents the convention (`{ data: <T> }` for content, 204 for empty mutations, portal-auth retains `{ success: true }` for the frontend's auth chain). - req.json() → parseBody() migration across 9 admin/CRM routes (custom-fields, expenses/export ×3, currency convert, search/recently-viewed, admin/duplicates, berths/pdf-{upload-url, versions, parse-results}). Uniform 400 error shapes for ZodError-flagged bodies. Custom-fields merge tokens (shipped end-to-end): - merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the `{{custom.<fieldName>}}` shape. - document-templates validator accepts the dynamic shape alongside the static catalog tokens. - document-sends.service mergeCustomFieldValues resolver fetches per-port custom_field_definitions for client/interest/berth contexts and substitutes stored values keyed by `{{custom.fieldName}}`. - custom-fields-manager amber banner updated to reflect that merge tokens now expand (search index + entity-diff remain documented design limitations). /api/v1/files cross-entity filtering: - Validator + listFiles + uploadFile accept companyId AND yachtId alongside clientId. file-upload-zone propagates both. - New CompanyFilesTab component mirrors ClientFilesTab; restored as a visible Documents tab in company-tabs.tsx (was a hidden stub). Inline TODOs: - Reviewed remaining two TODOs (per-user reminder schedule, import worker handlers). Both are placeholders for future feature surfaces, not bugs — per-port digest works for every customer; nothing currently enqueues import jobs (verified). Annotated in BACKLOG. BACKLOG.md updated to reflect what landed and what's still pending (Documenso-related items still bundled with the back-burnered phases). Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
sizeBytes: z.number().int().nonnegative().optional(),
});
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
fix(security): scope berth-pdf service entrypoints by portId Post-merge security review caught a cross-tenant authorization bypass in the per-berth PDF endpoints (HIGH severity, confidence 10): GET /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-upload-url POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback POST /api/v1/berths/[id]/pdf-versions/parse-results/apply Each handler looked up the target berth by id only — `eq(berths.id, ...)`. withAuth resolves ctx.portId from the user-controlled X-Port-Id header (only verifying the user has SOME role on that port), and withPermission('berths', 'view'|'edit', ...) is a coarse capability check, not a row-level grant. A rep with berths:edit on Port A could supply a Port B berth UUID and: - list + receive 15-min presigned download URLs to every PDF version - mint an upload URL targeting `berths/<port-B-id>/uploads/...` - POST a new version (overwriting current_pdf_version_id on foreign berth) - rollback to any prior version on a foreign berth - apply rep-confirmed parse-result fields onto a foreign berth's columns Sibling routes (waiting-list etc.) already pair the id filter with `eq(berths.portId, ctx.portId)`, so this was an omission, not design. Fix: - Push `portId: string` into uploadBerthPdf, listBerthPdfVersions, rollbackToVersion, applyParseResults, reconcilePdfWithBerth. - Each function now filters the berth lookup with `and(eq(berths.id, ...), eq(berths.portId, portId))` and throws NotFoundError on mismatch (no foreign-port disclosure). - Inline the same `and(...)` filter in the pdf-upload-url handler. - Every handler passes ctx.portId through. Coverage: - New `cross-port tenant guard` test exercises every entrypoint with a foreign-port id and asserts NotFoundError. - 1164/1164 vitest passing. Typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:31:33 +02:00
export const postHandler: RouteHandler = async (req, ctx, params) => {
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
try {
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs Wave through the remaining audit-final-deferred items that aren't blocked on the back-burnered Documenso work. Multi-tenant isolation: - Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim; verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a buggy issuer in some future code path that mixes port scopes — every storage key generated by generateStorageKey() already prefixes the slug. document-sends opts in for 24h emailed download links; other callers continue working unchanged via the optional field. DB schema reconciliation: - Migration 0047 rebuilds system_settings unique index with NULLS NOT DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate (storage_backend, NULL) rows that had accumulated from race-prone delete-then-insert patterns in ocr-config / settings / residential- stages / ai-budget services. All four services converted to true onConflictDoUpdate upserts so the race window is closed. API uniformity: - Response shape standardization: 16 routes converted from `{ success: true }` to 204 No Content. CLAUDE.md documents the convention (`{ data: <T> }` for content, 204 for empty mutations, portal-auth retains `{ success: true }` for the frontend's auth chain). - req.json() → parseBody() migration across 9 admin/CRM routes (custom-fields, expenses/export ×3, currency convert, search/recently-viewed, admin/duplicates, berths/pdf-{upload-url, versions, parse-results}). Uniform 400 error shapes for ZodError-flagged bodies. Custom-fields merge tokens (shipped end-to-end): - merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the `{{custom.<fieldName>}}` shape. - document-templates validator accepts the dynamic shape alongside the static catalog tokens. - document-sends.service mergeCustomFieldValues resolver fetches per-port custom_field_definitions for client/interest/berth contexts and substitutes stored values keyed by `{{custom.fieldName}}`. - custom-fields-manager amber banner updated to reflect that merge tokens now expand (search index + entity-diff remain documented design limitations). /api/v1/files cross-entity filtering: - Validator + listFiles + uploadFile accept companyId AND yachtId alongside clientId. file-upload-zone propagates both. - New CompanyFilesTab component mirrors ClientFilesTab; restored as a visible Documents tab in company-tabs.tsx (was a hidden stub). Inline TODOs: - Reviewed remaining two TODOs (per-user reminder schedule, import worker handlers). Both are placeholders for future feature surfaces, not bugs — per-port digest works for every customer; nothing currently enqueues import jobs (verified). Annotated in BACKLOG. BACKLOG.md updated to reflect what landed and what's still pending (Documenso-related items still bundled with the back-burnered phases). Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
const body = await parseBody(req, postBodySchema);
const fileName = body.fileName.trim();
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
if (!fileName) throw new ValidationError('fileName is required');
fix(security): scope berth-pdf service entrypoints by portId Post-merge security review caught a cross-tenant authorization bypass in the per-berth PDF endpoints (HIGH severity, confidence 10): GET /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-versions POST /api/v1/berths/[id]/pdf-upload-url POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback POST /api/v1/berths/[id]/pdf-versions/parse-results/apply Each handler looked up the target berth by id only — `eq(berths.id, ...)`. withAuth resolves ctx.portId from the user-controlled X-Port-Id header (only verifying the user has SOME role on that port), and withPermission('berths', 'view'|'edit', ...) is a coarse capability check, not a row-level grant. A rep with berths:edit on Port A could supply a Port B berth UUID and: - list + receive 15-min presigned download URLs to every PDF version - mint an upload URL targeting `berths/<port-B-id>/uploads/...` - POST a new version (overwriting current_pdf_version_id on foreign berth) - rollback to any prior version on a foreign berth - apply rep-confirmed parse-result fields onto a foreign berth's columns Sibling routes (waiting-list etc.) already pair the id filter with `eq(berths.portId, ctx.portId)`, so this was an omission, not design. Fix: - Push `portId: string` into uploadBerthPdf, listBerthPdfVersions, rollbackToVersion, applyParseResults, reconcilePdfWithBerth. - Each function now filters the berth lookup with `and(eq(berths.id, ...), eq(berths.portId, portId))` and throws NotFoundError on mismatch (no foreign-port disclosure). - Inline the same `and(...)` filter in the pdf-upload-url handler. - Every handler passes ctx.portId through. Coverage: - New `cross-port tenant guard` test exercises every entrypoint with a foreign-port id and asserts NotFoundError. - 1164/1164 vitest passing. Typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:31:33 +02:00
// Tenant-scoped berth lookup. Without `eq(berths.portId, ctx.portId)` a
// rep with berths:edit on port A could mint an upload URL targeting a
// port-B berth (the storage key namespace would land under that berth's
// id, leaking access).
const berthRow = await db.query.berths.findFirst({
where: and(eq(berths.id, params.id!), eq(berths.portId, ctx.portId)),
});
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
if (!berthRow) throw new NotFoundError('Berth');
const maxMb = await getMaxUploadMb(berthRow.portId);
const maxBytes = maxMb * 1024 * 1024;
if (typeof body.sizeBytes === 'number' && body.sizeBytes > maxBytes) {
throw new ValidationError(
`File exceeds ${maxMb} MB upload cap (got ${(body.sizeBytes / 1024 / 1024).toFixed(1)} MB).`,
);
}
// Provisional version number: the actual row insert happens in POST
// /pdf-versions and re-computes via SELECT max+1 inside a transaction,
// so a race between two reps just shifts which one wins the version
// slot. The storage key is gen_random_uuid()-namespaced so collisions
// in the storage layer are impossible.
fix(audit-wave-11): dossier sweep — error-ux + webhook + storage + search + maintainability Final pass over the unaddressed AUDIT-2026-05-12 dossiers, taking the tractable Critical/High items from each: error-ux-auditor (5 items) - C2: 17 toast.error(err.message) sites swept to toastError(err, …) so every user-visible failure carries a copy-paste Reference ID - C3: apiFetch synthesizes a client-side correlation id when a 5xx comes back with a non-JSON body (reverse-proxy HTML pages); message becomes "The server is unreachable. Please try again." with code UPSTREAM_UNREACHABLE - C4: checkRateLimit fails OPEN when Redis is unavailable so an outage no longer 500s login + portal sign-in; logged at warn so monitoring catches it - H2: StorageTimeoutError (name='TimeoutError') replaces the plain Error throw in s3.ts withTimeout — error-classifier hints fire now - H5: errorResponse() adopted across /api/storage/[token], /api/public/website-inquiries, and the Documenso webhook body (drops the "Invalid secret" reconnaissance string) outbound-webhook-auditor (5 items) - C1: signature is now HMAC(secret, `${ts}.${body}`) with the timestamp surfaced as X-Webhook-Timestamp so receivers can reject replays outside a freshness window - C3: dead-letter with reason missing_signing_secret when secret is null (defence-in-depth against DB tampering / future migration mistakes) - H2: webhooks queue bumped to maxAttempts=8 with 30 s base exponential backoff so a 30 s receiver blip during a deploy no longer dead-letters every in-flight event; per-queue backoffDelayMs added to QUEUE_CONFIGS - M1: SSRF denylist gains Oracle Cloud metadata 192.0.0.192 - M2: dispatch-time https:// assertion before fetch, so a bad DB edit can't slip plaintext through storage-pathing-auditor (2 items) - H1: berth-PDF presigned-upload keys now `${portSlug}/berths/…/…` with portSlug threaded into backend.presignUpload — engages the filesystem-proxy port-binding `p` token verifier - H2: presignDownloadUrl auto-derives portSlug from the key's first segment when callers don't pass it, so all 8 download sites engage the `p`-token guard without per-site plumbing search-auditor (1 item) - H3: removed dead void wantEmail; void wantPhone; pair plus the unused looksLikeEmail helper — the bucket-reorder it was scaffolded for was never wired maintainability-auditor (1 item) - M2: swept seven abandoned `void <symbol>` markers and their dead imports across clients/bulk, interests/bulk, admin/email-templates, admin/website-submissions, alert-rules, and notes.service Deferred to future work (substantial refactors, schema migrations, or multi-file UI work): - error-ux M3-M8 (global-error.tsx, per-route loading.tsx coverage, ErrorBanner component, /api/ready route, worker DLQ admin surface) - maintainability C1-C4 (documents/search/notes service splits, interest-tabs split — multi-hour refactors) - currency C1-H5 (mixed-currency dashboard aggregation, FX history table, rounding policy) — wait for second non-USD port - outbound-webhook C2 (deliveries reaper job), H1 (DNS-rebind TOCTOU with undici Agent), H3 (circuit-breaker), H5 (presigned-post-policy) - storage-pathing C2 (orphan reaper), H3-H5 (streaming + content-type binding) Tests: 1315/1315 vitest ✅ ; tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:32 +02:00
//
// storage-pathing-auditor H1: prefix the port slug so the
// filesystem-proxy port-binding token (`p` field) can be wired and
// the namespace matches `buildStoragePath` (which always leads with
// the port slug).
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
const sanitized = fileName.replace(/[^a-zA-Z0-9._-]/g, '_').slice(0, 200) || 'berth.pdf';
fix(audit-wave-11): dossier sweep — error-ux + webhook + storage + search + maintainability Final pass over the unaddressed AUDIT-2026-05-12 dossiers, taking the tractable Critical/High items from each: error-ux-auditor (5 items) - C2: 17 toast.error(err.message) sites swept to toastError(err, …) so every user-visible failure carries a copy-paste Reference ID - C3: apiFetch synthesizes a client-side correlation id when a 5xx comes back with a non-JSON body (reverse-proxy HTML pages); message becomes "The server is unreachable. Please try again." with code UPSTREAM_UNREACHABLE - C4: checkRateLimit fails OPEN when Redis is unavailable so an outage no longer 500s login + portal sign-in; logged at warn so monitoring catches it - H2: StorageTimeoutError (name='TimeoutError') replaces the plain Error throw in s3.ts withTimeout — error-classifier hints fire now - H5: errorResponse() adopted across /api/storage/[token], /api/public/website-inquiries, and the Documenso webhook body (drops the "Invalid secret" reconnaissance string) outbound-webhook-auditor (5 items) - C1: signature is now HMAC(secret, `${ts}.${body}`) with the timestamp surfaced as X-Webhook-Timestamp so receivers can reject replays outside a freshness window - C3: dead-letter with reason missing_signing_secret when secret is null (defence-in-depth against DB tampering / future migration mistakes) - H2: webhooks queue bumped to maxAttempts=8 with 30 s base exponential backoff so a 30 s receiver blip during a deploy no longer dead-letters every in-flight event; per-queue backoffDelayMs added to QUEUE_CONFIGS - M1: SSRF denylist gains Oracle Cloud metadata 192.0.0.192 - M2: dispatch-time https:// assertion before fetch, so a bad DB edit can't slip plaintext through storage-pathing-auditor (2 items) - H1: berth-PDF presigned-upload keys now `${portSlug}/berths/…/…` with portSlug threaded into backend.presignUpload — engages the filesystem-proxy port-binding `p` token verifier - H2: presignDownloadUrl auto-derives portSlug from the key's first segment when callers don't pass it, so all 8 download sites engage the `p`-token guard without per-site plumbing search-auditor (1 item) - H3: removed dead void wantEmail; void wantPhone; pair plus the unused looksLikeEmail helper — the bucket-reorder it was scaffolded for was never wired maintainability-auditor (1 item) - M2: swept seven abandoned `void <symbol>` markers and their dead imports across clients/bulk, interests/bulk, admin/email-templates, admin/website-submissions, alert-rules, and notes.service Deferred to future work (substantial refactors, schema migrations, or multi-file UI work): - error-ux M3-M8 (global-error.tsx, per-route loading.tsx coverage, ErrorBanner component, /api/ready route, worker DLQ admin surface) - maintainability C1-C4 (documents/search/notes service splits, interest-tabs split — multi-hour refactors) - currency C1-H5 (mixed-currency dashboard aggregation, FX history table, rounding policy) — wait for second non-USD port - outbound-webhook C2 (deliveries reaper job), H1 (DNS-rebind TOCTOU with undici Agent), H3 (circuit-breaker), H5 (presigned-post-policy) - storage-pathing C2 (orphan reaper), H3-H5 (streaming + content-type binding) Tests: 1315/1315 vitest ✅ ; tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:32 +02:00
const storageKey = `${ctx.portSlug}/berths/${params.id!}/uploads/${crypto.randomUUID()}_${sanitized}`;
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
const backend = await getStorageBackend();
const presigned = await backend.presignUpload(storageKey, {
contentType: 'application/pdf',
expirySeconds: 900,
fix(audit-wave-11): dossier sweep — error-ux + webhook + storage + search + maintainability Final pass over the unaddressed AUDIT-2026-05-12 dossiers, taking the tractable Critical/High items from each: error-ux-auditor (5 items) - C2: 17 toast.error(err.message) sites swept to toastError(err, …) so every user-visible failure carries a copy-paste Reference ID - C3: apiFetch synthesizes a client-side correlation id when a 5xx comes back with a non-JSON body (reverse-proxy HTML pages); message becomes "The server is unreachable. Please try again." with code UPSTREAM_UNREACHABLE - C4: checkRateLimit fails OPEN when Redis is unavailable so an outage no longer 500s login + portal sign-in; logged at warn so monitoring catches it - H2: StorageTimeoutError (name='TimeoutError') replaces the plain Error throw in s3.ts withTimeout — error-classifier hints fire now - H5: errorResponse() adopted across /api/storage/[token], /api/public/website-inquiries, and the Documenso webhook body (drops the "Invalid secret" reconnaissance string) outbound-webhook-auditor (5 items) - C1: signature is now HMAC(secret, `${ts}.${body}`) with the timestamp surfaced as X-Webhook-Timestamp so receivers can reject replays outside a freshness window - C3: dead-letter with reason missing_signing_secret when secret is null (defence-in-depth against DB tampering / future migration mistakes) - H2: webhooks queue bumped to maxAttempts=8 with 30 s base exponential backoff so a 30 s receiver blip during a deploy no longer dead-letters every in-flight event; per-queue backoffDelayMs added to QUEUE_CONFIGS - M1: SSRF denylist gains Oracle Cloud metadata 192.0.0.192 - M2: dispatch-time https:// assertion before fetch, so a bad DB edit can't slip plaintext through storage-pathing-auditor (2 items) - H1: berth-PDF presigned-upload keys now `${portSlug}/berths/…/…` with portSlug threaded into backend.presignUpload — engages the filesystem-proxy port-binding `p` token verifier - H2: presignDownloadUrl auto-derives portSlug from the key's first segment when callers don't pass it, so all 8 download sites engage the `p`-token guard without per-site plumbing search-auditor (1 item) - H3: removed dead void wantEmail; void wantPhone; pair plus the unused looksLikeEmail helper — the bucket-reorder it was scaffolded for was never wired maintainability-auditor (1 item) - M2: swept seven abandoned `void <symbol>` markers and their dead imports across clients/bulk, interests/bulk, admin/email-templates, admin/website-submissions, alert-rules, and notes.service Deferred to future work (substantial refactors, schema migrations, or multi-file UI work): - error-ux M3-M8 (global-error.tsx, per-route loading.tsx coverage, ErrorBanner component, /api/ready route, worker DLQ admin surface) - maintainability C1-C4 (documents/search/notes service splits, interest-tabs split — multi-hour refactors) - currency C1-H5 (mixed-currency dashboard aggregation, FX history table, rounding policy) — wait for second non-USD port - outbound-webhook C2 (deliveries reaper job), H1 (DNS-rebind TOCTOU with undici Agent), H3 (circuit-breaker), H5 (presigned-post-policy) - storage-pathing C2 (orphan reaper), H3-H5 (streaming + content-type binding) Tests: 1315/1315 vitest ✅ ; tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:32 +02:00
portSlug: ctx.portSlug,
feat(berths): per-berth PDF storage (versioned) + reverse parser Phase 6b of the berth-recommender refactor (see docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6). Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every file write goes through `getStorageBackend()`; no direct minio imports. Schema (migration 0030_berth_pdf_versions): - new table `berth_pdf_versions` with monotonic `version_number` per berth, `storage_key` (renamed convention from §4.7a), sha256, size, `download_url_expires_at` cache slot for §11.1 signed-URL throttling, and `parse_results` jsonb for the audit trail. - new column `berths.current_pdf_version_id` (deferred from Phase 0) with FK to `berth_pdf_versions(id)` ON DELETE SET NULL. - relations + types exported from `schema/berths.ts`. 3-tier reverse parser (`lib/services/berth-pdf-parser.ts`): 1. AcroForm via pdf-lib — pulls named fields (`length_ft`, `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such fields, so this is defensive coverage for future templates. 2. OCR via Tesseract.js — positional/regex heuristics keyed off the §9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`, `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns per-field confidence + global mean; flags imperial-vs-metric drift >1% in `warnings`. 3. AI fallback — gated via `getResolvedOcrConfig()` (existing openai/claude provider). Surfaced from the diff dialog only when `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55 threshold), so OPENAI_API_KEY isn't burned on every upload. Service layer (`lib/services/berth-pdf.service.ts`): - `uploadBerthPdf()` — magic-byte check, size cap, version-number bump + current pointer in one transaction. - `reconcilePdfWithBerth()` — auto-applies fields where CRM is null; flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric columns; warns on mooring-number-in-PDF mismatch (§14.6). - `applyParseResults()` — hard allowlist of writable columns; stamps `appliedFields` onto `parse_results` for audit. - `rollbackToVersion()` — pointer flip only, never re-parses (§14.6). - `listBerthPdfVersions()` — version list with 15-min signed URLs. - `getMaxUploadMb()` — port-override → global → default 15 lookup on `system_settings.berth_pdf_max_upload_mb`. §14.6 critical mitigations: - Magic-byte check (`%PDF-`) on every upload; mismatch deletes the storage object and rejects the request. - Size cap from `system_settings.berth_pdf_max_upload_mb` (default 15 MB); enforced in the upload-url presign AND server-side. - 0-byte uploads rejected. - Mooring-number mismatch surfaces as a `warnings[]` entry on the reconcile result so the rep sees it in the diff dialog. - Imperial vs metric ±1% tolerance in both the parser warnings and the reconcile equality check. - Path traversal already blocked at the storage layer (Phase 6a). API + UI: - `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or HMAC-signed proxy URL (filesystem) sized to the per-port cap. - `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via `backend.head()`, writes the row, bumps `current_pdf_version_id`. - `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs. - `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`. - `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` — rep-confirmed diff payload. - New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with current-PDF panel, version history, Replace PDF button, and `<PdfReconcileDialog>` for the auto-applied + conflicts UX. System settings: - `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size + server-side validation. Resolved port-override → global → default. Tests: - `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes, feet-inches, human dates, full §9.2-shaped OCR text → 18 fields, drift warning, AI-tier gate. - `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic pdf-lib AcroForm round-trip. - `tests/integration/berth-pdf-versions.test.ts` — upload, version- number bump, magic-byte rejection, reconcile auto-applied vs conflicts vs ±1% tolerance, mooring-number warning, applyParseResults allowlist enforcement, rollback semantics. Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green at 1103/1103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
});
return NextResponse.json({
data: {
url: presigned.url,
method: presigned.method,
storageKey,
maxBytes,
backend: backend.name,
},
});
} catch (error) {
return errorResponse(error);
}
};