feat(berths): per-berth PDF storage (versioned) + reverse parser
Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every
file write goes through `getStorageBackend()`; no direct minio imports.
Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
berth, `storage_key` (renamed convention from §4.7a), sha256, size,
`download_url_expires_at` cache slot for §11.1 signed-URL throttling,
and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.
3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib — pulls named fields (`length_ft`,
`mooring_number`, etc.) at confidence 1. Sample PDF has 0 such
fields, so this is defensive coverage for future templates.
2. OCR via Tesseract.js — positional/regex heuristics keyed off the
§9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`,
`WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
per-field confidence + global mean; flags imperial-vs-metric drift
>1% in `warnings`.
3. AI fallback — gated via `getResolvedOcrConfig()` (existing
openai/claude provider). Surfaced from the diff dialog only when
`shouldOfferAiTier()` returns true (mean OCR confidence below
0.55 threshold), so OPENAI_API_KEY isn't burned on every upload.
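A minimal sketch of the tier-3 gate and the drift warning described above. The field shape, the helper names ending in `Sketch`, and the relative-drift formula are assumptions for illustration; only the 0.55 threshold and the 1% drift limit come from this commit message.

```typescript
// Hypothetical parsed-field shape; the real parser types are not shown here.
interface ParsedField {
  value: string | number;
  confidence: number; // 0..1
}

const AI_TIER_THRESHOLD = 0.55; // threshold named in the commit message
const FEET_TO_METRES = 0.3048;

// Mean of the per-field confidences; 0 when nothing was extracted.
function meanConfidenceSketch(fields: Record<string, ParsedField>): number {
  const keys = Object.keys(fields);
  if (keys.length === 0) return 0;
  let sum = 0;
  for (const key of keys) {
    const field = fields[key];
    if (field) sum += field.confidence;
  }
  return sum / keys.length;
}

// Offer the AI tier only when the OCR pass looks unreliable overall.
function shouldOfferAiTierSketch(fields: Record<string, ParsedField>): boolean {
  return meanConfidenceSketch(fields) < AI_TIER_THRESHOLD;
}

// Returns a warning string when the printed metric value differs from the
// converted imperial value by more than 1%, otherwise null.
function driftWarningSketch(label: string, feet: number, metres: number): string | null {
  const expected = feet * FEET_TO_METRES;
  if (expected === 0) return null;
  const drift = Math.abs(metres - expected) / expected;
  return drift > 0.01
    ? `${label}: imperial/metric drift of ${(drift * 100).toFixed(1)}%`
    : null;
}
```

Under these assumptions an empty OCR result has a mean of 0, so it also qualifies for the AI tier.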
Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()` — magic-byte check, size cap, version-number
bump + current pointer in one transaction.
- `reconcilePdfWithBerth()` — auto-applies fields where CRM is null;
flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()` — hard allowlist of writable columns;
stamps `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()` — pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()` — version list with 15-min signed URLs.
- `getMaxUploadMb()` — port-override → global → default 15 lookup
on `system_settings.berth_pdf_max_upload_mb`.
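The reconcile rules listed above (auto-apply where the CRM value is null, flag disagreements, tolerate ±1% on numeric columns) can be sketched as a pure function. The types and the name `reconcileSketch` are hypothetical; the real `reconcilePdfWithBerth()` works against the database and its signature is not shown in this commit.

```typescript
type FieldValue = string | number | null;

interface ConflictSketch {
  field: string;
  crm: FieldValue;
  pdf: FieldValue;
}

interface ReconcileResultSketch {
  autoApplied: Record<string, FieldValue>;
  conflicts: ConflictSketch[];
}

// Two numbers are treated as equal when they differ by at most `tol` (1%)
// relative to the larger magnitude. The exact formula is an assumption.
function withinTolerance(a: number, b: number, tol = 0.01): boolean {
  if (a === b) return true;
  const scale = Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) / scale <= tol;
}

function reconcileSketch(
  crm: Record<string, FieldValue>,
  pdf: Record<string, FieldValue>,
): ReconcileResultSketch {
  const result: ReconcileResultSketch = { autoApplied: {}, conflicts: [] };
  for (const field of Object.keys(pdf)) {
    const pdfValue = pdf[field] ?? null;
    if (pdfValue === null) continue; // nothing parsed for this field
    const crmValue = crm[field] ?? null;
    if (crmValue === null) {
      result.autoApplied[field] = pdfValue; // CRM is empty: safe to fill
      continue;
    }
    const same =
      typeof crmValue === 'number' && typeof pdfValue === 'number'
        ? withinTolerance(crmValue, pdfValue)
        : crmValue === pdfValue;
    if (!same) result.conflicts.push({ field, crm: crmValue, pdf: pdfValue });
  }
  return result;
}
```

Keeping the comparison pure makes the tolerance rule easy to unit-test separately from the database write.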
§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and
the reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).
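As a rough illustration of the upload checks above, the following sketch validates the leading bytes and the size of an uploaded object. The function name, the result shape, and the choice of 1 MB = 1024 × 1024 bytes are assumptions; the actual service also deletes the stored object on a mismatch, which is omitted here.

```typescript
// ASCII bytes for "%PDF-"
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

type UploadCheckSketch = { ok: true } | { ok: false; reason: string };

function checkPdfUploadSketch(
  head: Uint8Array, // first bytes of the uploaded object
  sizeBytes: number,
  maxUploadMb: number,
): UploadCheckSketch {
  if (sizeBytes === 0) return { ok: false, reason: 'empty file' };
  if (sizeBytes > maxUploadMb * 1024 * 1024) {
    return { ok: false, reason: 'file exceeds size cap' };
  }
  const isPdf =
    head.length >= PDF_MAGIC.length &&
    PDF_MAGIC.every((byte, i) => head[i] === byte);
  return isPdf ? { ok: true } : { ok: false, reason: 'not a PDF' };
}
```

Checking the bytes rather than the file extension or the client-supplied content type is what makes the check hard to bypass with a renamed file.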
API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or
HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via
`backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` —
rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`)
with current-PDF panel, version history, Replace PDF button, and
`<PdfReconcileDialog>` for the auto-applied + conflicts UX.
System settings:
- `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size
+ server-side validation. Resolved port-override → global → default.
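The port-override → global → default resolution can be sketched over in-memory rows. The row shape and the function name are hypothetical, and rows with a null `portId` are assumed to represent global settings; the real `getMaxUploadMb()` reads from `system_settings`.

```typescript
interface SettingRowSketch {
  key: string;
  portId: string | null; // null is assumed to mean "global"
  value: string;
}

const MAX_UPLOAD_KEY = 'berth_pdf_max_upload_mb';
const DEFAULT_MAX_UPLOAD_MB = 15;

function resolveMaxUploadMbSketch(rows: SettingRowSketch[], portId: string): number {
  const forKey = rows.filter((row) => row.key === MAX_UPLOAD_KEY);
  // Try the port-specific row first, then the global row.
  const candidates = [
    forKey.find((row) => row.portId === portId),
    forKey.find((row) => row.portId === null),
  ];
  for (const row of candidates) {
    if (!row) continue;
    const parsed = Number(row.value);
    if (Number.isFinite(parsed) && parsed > 0) return parsed;
  }
  return DEFAULT_MAX_UPLOAD_MB;
}
```

In this sketch an unparseable or non-positive value falls through to the next level instead of disabling the cap.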
Tests:
- `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes,
feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic
pdf-lib AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts` — upload, version-
number bump, magic-byte rejection, reconcile auto-applied vs
conflicts vs ±1% tolerance, mooring-number warning,
applyParseResults allowlist enforcement, rollback semantics.
Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run`
green at 1103/1103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
/**
 * Returns a presigned URL the browser can use to PUT a PDF directly to the
 * active storage backend. The URL is constrained by content-length-range up
 * to `system_settings.berth_pdf_max_upload_mb` (default 15 MB) per §11.1.
 *
 * For S3 backends this is a true signed URL; for filesystem backends it's a
 * CRM-internal proxy URL with an HMAC token (see `FilesystemBackend`).
 */

import { NextResponse } from 'next/server';
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs
Wave through the remaining audit-final-deferred items that aren't blocked
on the back-burnered Documenso work.
Multi-tenant isolation:
- Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim;
verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a
buggy issuer in some future code path that mixes port scopes — every
storage key generated by generateStorageKey() already prefixes the
slug. document-sends opts in for 24h emailed download links; other
callers continue working unchanged via the optional field.
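A small sketch of the port-binding assertion, assuming the verifier has the requested storage key and the decoded payload in hand. The payload type and the function name are illustrative only; signature verification and expiry checks are out of scope here.

```typescript
// Hypothetical subset of the token payload; only the optional `p` claim
// is taken from the commit message.
interface ProxyTokenPayloadSketch {
  p?: string; // port slug
}

function isKeyInTokenScopeSketch(key: string, payload: ProxyTokenPayloadSketch): boolean {
  // Tokens minted without the claim keep working unchanged.
  if (payload.p === undefined) return true;
  // The trailing slash stops "port-a" from matching "port-ab/...".
  return key.startsWith(`${payload.p}/`);
}
```

The trailing slash in the prefix is the important detail: a bare `startsWith(p)` would accept keys from any port whose slug merely begins with the same characters.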
DB schema reconciliation:
- Migration 0047 rebuilds system_settings unique index with NULLS NOT
DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are
uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate
(storage_backend, NULL) rows that had accumulated from race-prone
delete-then-insert patterns in ocr-config / settings / residential-
stages / ai-budget services. All four services converted to true
onConflictDoUpdate upserts so the race window is closed.
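To illustrate why an upsert keyed on `(key, port_id)` with nulls treated as equal cannot accumulate duplicates, here is an in-memory analogue. It is not the Drizzle code from this commit; the row shape and function name are assumptions, and JavaScript's `null === null` stands in for `NULLS NOT DISTINCT`.

```typescript
interface GlobalSettingRowSketch {
  key: string;
  portId: string | null;
  value: string;
}

// Insert the row, or replace the existing row with the same (key, portId).
function upsertSettingSketch(
  rows: GlobalSettingRowSketch[],
  next: GlobalSettingRowSketch,
): GlobalSettingRowSketch[] {
  const index = rows.findIndex(
    (row) => row.key === next.key && row.portId === next.portId,
  );
  if (index === -1) return [...rows, next];
  return rows.map((row, i) => (i === index ? next : row));
}
```

With the default Postgres behaviour, two rows with a null `port_id` never collide on a unique index, which is how concurrent delete-then-insert writers could each leave a row behind.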
API uniformity:
- Response shape standardization: 16 routes converted from
`{ success: true }` to 204 No Content. CLAUDE.md documents the
convention (`{ data: <T> }` for content, 204 for empty mutations,
portal-auth retains `{ success: true }` for the frontend's auth chain).
- req.json() → parseBody() migration across 9 admin/CRM routes
(custom-fields, expenses/export ×3, currency convert,
search/recently-viewed, admin/duplicates, berths/pdf-{upload-url,
versions, parse-results}). Uniform 400 error shapes for
ZodError-flagged bodies.
Custom-fields merge tokens (shipped end-to-end):
- merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the
`{{custom.<fieldName>}}` shape.
- document-templates validator accepts the dynamic shape alongside
the static catalog tokens.
- document-sends.service mergeCustomFieldValues resolver fetches
per-port custom_field_definitions for client/interest/berth contexts
and substitutes stored values keyed by `{{custom.fieldName}}`.
- custom-fields-manager amber banner updated to reflect that merge
tokens now expand (search index + entity-diff remain documented
design limitations).
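A hedged sketch of expanding the `{{custom.<fieldName>}}` shape. The regular expression below is not the real `CUSTOM_MERGE_TOKEN_RE`; the allowed field-name characters and the choice to leave unknown tokens untouched are assumptions.

```typescript
// Illustrative pattern: letters, digits, and underscores in the field name.
const CUSTOM_TOKEN_SKETCH_RE = /\{\{custom\.([A-Za-z0-9_]+)\}\}/g;

function expandCustomTokensSketch(
  template: string,
  values: Record<string, string>,
): string {
  return template.replace(CUSTOM_TOKEN_SKETCH_RE, (match: string, fieldName: string) => {
    const value = values[fieldName];
    // Unknown fields are left as-is so a missing value stays visible.
    return value === undefined ? match : value;
  });
}
```

For example, `expandCustomTokensSketch('Hull: {{custom.hullColor}}', { hullColor: 'navy' })` yields `'Hull: navy'`.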
/api/v1/files cross-entity filtering:
- Validator + listFiles + uploadFile accept companyId AND yachtId
alongside clientId. file-upload-zone propagates both.
- New CompanyFilesTab component mirrors ClientFilesTab; restored as a
visible Documents tab in company-tabs.tsx (was a hidden stub).
Inline TODOs:
- Reviewed remaining two TODOs (per-user reminder schedule, import
worker handlers). Both are placeholders for future feature surfaces,
not bugs — per-port digest works for every customer; nothing
currently enqueues import jobs (verified). Annotated in BACKLOG.
BACKLOG.md updated to reflect what landed and what's still pending
(Documenso-related items still bundled with the back-burnered phases).
Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
import { z } from 'zod';
import { type RouteHandler } from '@/lib/api/helpers';
import { parseBody } from '@/lib/api/route-helpers';
import { db } from '@/lib/db';
import { berths } from '@/lib/db/schema/berths';
fix(security): scope berth-pdf service entrypoints by portId
Post-merge security review caught a cross-tenant authorization bypass
in the per-berth PDF endpoints (HIGH severity, confidence 10):
GET /api/v1/berths/[id]/pdf-versions
POST /api/v1/berths/[id]/pdf-versions
POST /api/v1/berths/[id]/pdf-upload-url
POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback
POST /api/v1/berths/[id]/pdf-versions/parse-results/apply
Each handler looked up the target berth by id only — `eq(berths.id, ...)`.
withAuth resolves ctx.portId from the user-controlled X-Port-Id header
(only verifying the user has SOME role on that port), and
withPermission('berths', 'view'|'edit', ...) is a coarse capability
check, not a row-level grant. A rep with berths:edit on Port A could
supply a Port B berth UUID and:
- list + receive 15-min presigned download URLs to every PDF version
- mint an upload URL targeting `berths/<port-B-id>/uploads/...`
- POST a new version (overwriting current_pdf_version_id on foreign berth)
- rollback to any prior version on a foreign berth
- apply rep-confirmed parse-result fields onto a foreign berth's columns
Sibling routes (waiting-list etc.) already pair the id filter with
`eq(berths.portId, ctx.portId)`, so this was an omission, not design.
Fix:
- Push `portId: string` into uploadBerthPdf, listBerthPdfVersions,
rollbackToVersion, applyParseResults, reconcilePdfWithBerth.
- Each function now filters the berth lookup with
`and(eq(berths.id, ...), eq(berths.portId, portId))` and throws
NotFoundError on mismatch (no foreign-port disclosure).
- Inline the same `and(...)` filter in the pdf-upload-url handler.
- Every handler passes ctx.portId through.
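The shape of the fix can be illustrated without a database: the lookup must match both the berth id and the caller's port, and a miss raises the same not-found error whether the berth is absent or belongs to another port. The row type, error class, and function name below are stand-ins, not the service's real code.

```typescript
class NotFoundErrorSketch extends Error {}

interface BerthRowSketch {
  id: string;
  portId: string;
}

// In-memory analogue of and(eq(berths.id, ...), eq(berths.portId, portId)).
function findBerthInPortSketch(
  rows: BerthRowSketch[],
  berthId: string,
  portId: string,
): BerthRowSketch {
  const berth = rows.find((row) => row.id === berthId && row.portId === portId);
  // Same error for "missing" and "foreign port", so nothing is disclosed.
  if (!berth) throw new NotFoundErrorSketch('Berth not found');
  return berth;
}
```

Returning not-found rather than forbidden avoids confirming to the caller that a berth with that id exists in another port.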
Coverage:
- New `cross-port tenant guard` test exercises every entrypoint with a
foreign-port id and asserts NotFoundError.
- 1164/1164 vitest passing. Typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:31:33 +02:00
import { and, eq } from 'drizzle-orm';
import { errorResponse, NotFoundError, ValidationError } from '@/lib/errors';
import { getMaxUploadMb } from '@/lib/services/berth-pdf.service';
import { getStorageBackend } from '@/lib/storage';
const postBodySchema = z.object({
  fileName: z.string().min(1).max(255),
chore(autonomous-session): consolidate uncommitted work from prior session
Bundles the prior autonomous-session output that was sitting unstaged:
- Em-dash sweep across src/ + tests/ (en-dash/em-dash to hyphen, ~2280 instances)
- country-flag-icons rollout (CountryFlag component, replaces emoji glyphs that
never rendered on Windows; lazy-loads the 3x2 SVG index as a single chunk
after the per-subpath dynamic-import approach silently failed in webpack)
- Admin IA Phase 1+2: 7-domain regroup, 41 to 38 pages, /admin/berths index,
redirects (ocr to ai, reports to dashboard, invitations to users),
docs/admin-ia-proposal.md
- Per-template email tester (registry + endpoint + UI on Email admin page)
- Cancel-document mode picker (delete-from-Documenso vs keep-for-audit)
- Dashboard PDF report: 25 widgets, SVG charts, date-range picker, 11 resolvers
- Customize-widgets per-region sortables at xl+ (charts/rails/feed); single
flat sortable below xl when the layout stacks; per-viewport saved orders
- Audit doc updates capturing each shipped item
- Lint fixes: react-compiler immutability in DonutChart (reduce instead of
let-reassign), set-state-in-effect disables in CountryFlag and
UploadForSigning preview-bytes effect, unused 'confirm' destructures in
interest contract + reservation tabs, unescaped apostrophe in test-template
card copy
2026-05-23 00:52:59 +02:00
  /** Size hint in bytes - used to early-reject oversized uploads before we
|
feat(berths): per-berth PDF storage (versioned) + reverse parser
Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every
file write goes through `getStorageBackend()`; no direct minio imports.
Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
berth, `storage_key` (renamed convention from §4.7a), sha256, size,
`download_url_expires_at` cache slot for §11.1 signed-URL throttling,
and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.
3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib — pulls named fields (`length_ft`,
`mooring_number`, etc.) at confidence 1. Sample PDF has 0 such
fields, so this is defensive coverage for future templates.
2. OCR via Tesseract.js — positional/regex heuristics keyed off the
§9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`,
`WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
per-field confidence + global mean; flags imperial-vs-metric drift
>1% in `warnings`.
3. AI fallback — gated via `getResolvedOcrConfig()` (existing
openai/claude provider). Surfaced from the diff dialog only when
`shouldOfferAiTier()` returns true (mean OCR confidence below
0.55 threshold), so OPENAI_API_KEY isn't burned on every upload.
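The tier-3 gate described above can be sketched as follows. This is a minimal illustration, not the shipped `shouldOfferAiTier()`: the `OcrField` shape, the `Sketch` function name, and the choice to treat an empty result as zero confidence are assumptions. Only the 0.55 threshold comes from the text above.

```typescript
// Hypothetical per-field OCR result; the real parser's types may differ.
interface OcrField {
  name: string;
  confidence: number; // assumed to be in the range [0, 1]
}

// Threshold quoted in the commit message.
const AI_TIER_THRESHOLD = 0.55;

// Mean confidence across all extracted fields.
function meanConfidence(fields: OcrField[]): number {
  // Assumption: an OCR pass that extracted nothing counts as zero confidence.
  if (fields.length === 0) return 0;
  const total = fields.reduce((sum, f) => sum + f.confidence, 0);
  return total / fields.length;
}

// Offer the AI tier only when the OCR tier looks unreliable.
function shouldOfferAiTierSketch(fields: OcrField[]): boolean {
  return meanConfidence(fields) < AI_TIER_THRESHOLD;
}
```

Gating on the mean rather than on any single field keeps one noisy field from triggering a paid AI call when the rest of the page parsed cleanly.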
Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()` — magic-byte check, size cap, version-number
bump + current pointer in one transaction.
- `reconcilePdfWithBerth()` — auto-applies fields where CRM is null;
flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()` — hard allowlist of writable columns;
stamps `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()` — pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()` — version list with 15-min signed URLs.
- `getMaxUploadMb()` — port-override → global → default 15 lookup
on `system_settings.berth_pdf_max_upload_mb`.
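The port-override → global → default lookup order for `getMaxUploadMb()` can be illustrated with an in-memory stand-in. The `SettingRow` shape and `resolveMaxUploadMb` name are hypothetical; the real service reads `system_settings` from the database and may differ in detail.

```typescript
// Hypothetical in-memory stand-in for rows of `system_settings`.
// A null portId is assumed to mean "global setting".
interface SettingRow {
  key: string;
  portId: string | null;
  value: number;
}

const SETTING_KEY = 'berth_pdf_max_upload_mb';
const DEFAULT_MAX_UPLOAD_MB = 15; // default quoted in the commit message

function resolveMaxUploadMb(rows: SettingRow[], portId: string): number {
  const candidates = rows.filter((r) => r.key === SETTING_KEY);

  // 1. A port-specific override wins.
  const portOverride = candidates.find((r) => r.portId === portId);
  if (portOverride) return portOverride.value;

  // 2. Otherwise fall back to the global row.
  const globalRow = candidates.find((r) => r.portId === null);
  if (globalRow) return globalRow.value;

  // 3. Otherwise use the hard-coded default.
  return DEFAULT_MAX_UPLOAD_MB;
}
```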
§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and
the reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).
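Two of the mitigations above, the `%PDF-` magic-byte check and the ±1% tolerance, are sketched below. The function names are hypothetical, and measuring the tolerance relative to the larger of the two magnitudes is an assumption; the shipped code may define the 1% differently.

```typescript
// ASCII bytes for "%PDF-", the header every PDF file starts with.
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

function hasPdfMagicBytes(bytes: Uint8Array): boolean {
  // Also rejects 0-byte and truncated uploads.
  if (bytes.length < PDF_MAGIC.length) return false;
  return PDF_MAGIC.every((expected, i) => bytes[i] === expected);
}

// Assumption: tolerance is relative to the larger magnitude of the two values.
function withinTolerance(a: number, b: number, tolerance = 0.01): boolean {
  if (a === b) return true; // also covers the 0 vs 0 case
  const scale = Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) / scale <= tolerance;
}

const FEET_TO_METRES = 0.3048; // exact by definition

// True when the imperial and metric readings of one dimension disagree by more than 1%.
function hasImperialMetricDrift(feet: number, metres: number): boolean {
  return !withinTolerance(feet * FEET_TO_METRES, metres);
}
```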
API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or
HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via
`backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` —
rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`)
with current-PDF panel, version history, Replace PDF button, and
`<PdfReconcileDialog>` for the auto-applied + conflicts UX.
System settings:
- `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size
+ server-side validation. Resolved port-override → global → default.
Tests:
- `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes,
feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic
pdf-lib AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts` — upload, version-
number bump, magic-byte rejection, reconcile auto-applied vs
conflicts vs ±1% tolerance, mooring-number warning,
applyParseResults allowlist enforcement, rollback semantics.
Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run`
green at 1103/1103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
* burn a presigned URL. */
fix(audit): non-Documenso backlog sweep — port-binding, NULLS NOT DISTINCT, custom merge tokens, company docs
Wave through the remaining audit-final-deferred items that aren't blocked
on the back-burnered Documenso work.
Multi-tenant isolation:
- Storage proxy ProxyTokenPayload gains optional `p` (port slug) claim;
verifier asserts `key.startsWith(${p}/)`. Defense-in-depth against a
buggy issuer in some future code path that mixes port scopes — every
storage key generated by generateStorageKey() already prefixes the
slug. document-sends opts in for 24h emailed download links; other
callers continue working unchanged via the optional field.
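A minimal sketch of the port-scope assertion described above. Only the optional `p` claim and the `startsWith` prefix check come from the text; the `key` field name, the payload interface, and the function name are assumptions for illustration.

```typescript
// Hypothetical subset of the proxy token payload; the real ProxyTokenPayload
// carries more claims and may name the storage-key field differently.
interface ProxyTokenPayloadSketch {
  key: string; // storage key the token grants access to (assumed field name)
  p?: string; // optional port slug claim
}

function isKeyInPortScope(payload: ProxyTokenPayloadSketch): boolean {
  // Tokens issued without the claim keep working unchanged.
  if (payload.p === undefined) return true;
  // The trailing slash stops "port-a" from matching "port-ab/...".
  return payload.key.startsWith(`${payload.p}/`);
}
```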
DB schema reconciliation:
- Migration 0047 rebuilds system_settings unique index with NULLS NOT
DISTINCT (Postgres 15+) so global settings (port_id IS NULL) are
uniquely keyed by `key` alone. Surfaced + dedupe'd 65 duplicate
(storage_backend, NULL) rows that had accumulated from race-prone
delete-then-insert patterns in ocr-config / settings / residential-
stages / ai-budget services. All four services converted to true
onConflictDoUpdate upserts so the race window is closed.
API uniformity:
- Response shape standardization: 16 routes converted from
`{ success: true }` to 204 No Content. CLAUDE.md documents the
convention (`{ data: <T> }` for content, 204 for empty mutations,
portal-auth retains `{ success: true }` for the frontend's auth chain).
- req.json() → parseBody() migration across 9 admin/CRM routes
(custom-fields, expenses/export ×3, currency convert,
search/recently-viewed, admin/duplicates, berths/pdf-{upload-url,
versions, parse-results}). Uniform 400 error shapes for
ZodError-flagged bodies.
Custom-fields merge tokens (shipped end-to-end):
- merge-fields.ts gains CUSTOM_MERGE_TOKEN_RE + helpers for the
`{{custom.<fieldName>}}` shape.
- document-templates validator accepts the dynamic shape alongside
the static catalog tokens.
- document-sends.service mergeCustomFieldValues resolver fetches
per-port custom_field_definitions for client/interest/berth contexts
and substitutes stored values keyed by `{{custom.fieldName}}`.
- custom-fields-manager amber banner updated to reflect that merge
tokens now expand (search index + entity-diff remain documented
design limitations).
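The `{{custom.<fieldName>}}` expansion can be sketched as below. The pattern shown is a guess at what `CUSTOM_MERGE_TOKEN_RE` might look like, not its actual definition; the allowed field-name characters and the choice to leave unknown tokens untouched are assumptions.

```typescript
// Hypothetical pattern: field names limited to letters, digits and underscores.
const CUSTOM_TOKEN_SKETCH_RE = /\{\{custom\.([A-Za-z0-9_]+)\}\}/g;

function expandCustomTokens(template: string, values: Record<string, string>): string {
  return template.replace(CUSTOM_TOKEN_SKETCH_RE, (match: string, fieldName: string) => {
    const hasValue = Object.prototype.hasOwnProperty.call(values, fieldName);
    // Assumption: tokens with no stored value are left as-is rather than blanked.
    return hasValue ? values[fieldName] : match;
  });
}
```

Static catalog tokens such as `{{client.name}}` do not match the pattern and pass through untouched, so this step can run alongside the existing static substitution.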
/api/v1/files cross-entity filtering:
- Validator + listFiles + uploadFile accept companyId AND yachtId
alongside clientId. file-upload-zone propagates both.
- New CompanyFilesTab component mirrors ClientFilesTab; restored as a
visible Documents tab in company-tabs.tsx (was a hidden stub).
Inline TODOs:
- Reviewed remaining two TODOs (per-user reminder schedule, import
worker handlers). Both are placeholders for future feature surfaces,
not bugs — per-port digest works for every customer; nothing
currently enqueues import jobs (verified). Annotated in BACKLOG.
BACKLOG.md updated to reflect what landed and what's still pending
(Documenso-related items still bundled with the back-burnered phases).
Tests: 1185/1185 vitest, tsc clean.
2026-05-08 02:20:27 +02:00
sizeBytes: z.number().int().nonnegative().optional(),
});
fix(security): scope berth-pdf service entrypoints by portId
Post-merge security review caught a cross-tenant authorization bypass
in the per-berth PDF endpoints (HIGH severity, confidence 10):
GET /api/v1/berths/[id]/pdf-versions
POST /api/v1/berths/[id]/pdf-versions
POST /api/v1/berths/[id]/pdf-upload-url
POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback
POST /api/v1/berths/[id]/pdf-versions/parse-results/apply
Each handler looked up the target berth by id only — `eq(berths.id, ...)`.
withAuth resolves ctx.portId from the user-controlled X-Port-Id header
(only verifying the user has SOME role on that port), and
withPermission('berths', 'view'|'edit', ...) is a coarse capability
check, not a row-level grant. A rep with berths:edit on Port A could
supply a Port B berth UUID and:
- list + receive 15-min presigned download URLs to every PDF version
- mint an upload URL targeting `berths/<port-B-id>/uploads/...`
- POST a new version (overwriting current_pdf_version_id on foreign berth)
- rollback to any prior version on a foreign berth
- apply rep-confirmed parse-result fields onto a foreign berth's columns
Sibling routes (waiting-list etc.) already pair the id filter with
`eq(berths.portId, ctx.portId)`, so this was an omission, not design.
Fix:
- Push `portId: string` into uploadBerthPdf, listBerthPdfVersions,
rollbackToVersion, applyParseResults, reconcilePdfWithBerth.
- Each function now filters the berth lookup with
`and(eq(berths.id, ...), eq(berths.portId, portId))` and throws
NotFoundError on mismatch (no foreign-port disclosure).
- Inline the same `and(...)` filter in the pdf-upload-url handler.
- Every handler passes ctx.portId through.
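The effect of the fix can be shown with an in-memory stand-in for the berth lookup. The `BerthRow` shape, the error class, and the function name are hypothetical; the real services apply the `and(eq(berths.id, ...), eq(berths.portId, portId))` filter in the database query.

```typescript
// Stand-in for the project's NotFoundError (hypothetical).
class NotFoundErrorSketch extends Error {
  constructor(entity: string) {
    super(`${entity} not found`);
    this.name = 'NotFoundError';
  }
}

interface BerthRow {
  id: string;
  portId: string;
}

// Both id and portId must match; a foreign-port berth looks exactly like a
// missing one, so the caller learns nothing about other tenants' data.
function findBerthForPort(rows: BerthRow[], berthId: string, portId: string): BerthRow {
  const row = rows.find((r) => r.id === berthId && r.portId === portId);
  if (!row) throw new NotFoundErrorSketch('Berth');
  return row;
}
```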
Coverage:
- New `cross-port tenant guard` test exercises every entrypoint with a
foreign-port id and asserts NotFoundError.
- 1164/1164 vitest passing. Typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 05:31:33 +02:00
export const postHandler: RouteHandler = async (req, ctx, params) => {
try {
const body = await parseBody(req, postBodySchema);
const fileName = body.fileName.trim();
if (!fileName) throw new ValidationError('fileName is required');
// Tenant-scoped berth lookup. Without `eq(berths.portId, ctx.portId)` a
// rep with berths:edit on port A could mint an upload URL targeting a
// port-B berth (the storage key namespace would land under that berth's
// id, leaking access).
const berthRow = await db.query.berths.findFirst({
where: and(eq(berths.id, params.id!), eq(berths.portId, ctx.portId)),
});
|
feat(berths): per-berth PDF storage (versioned) + reverse parser
Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every
file write goes through `getStorageBackend()`; no direct minio imports.
Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
berth, `storage_key` (renamed convention from §4.7a), sha256, size,
`download_url_expires_at` cache slot for §11.1 signed-URL throttling,
and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.
3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib — pulls named fields (`length_ft`,
`mooring_number`, etc.) at confidence 1. Sample PDF has 0 such
fields, so this is defensive coverage for future templates.
2. OCR via Tesseract.js — positional/regex heuristics keyed off the
§9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`,
`WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
per-field confidence + global mean; flags imperial-vs-metric drift
>1% in `warnings`.
3. AI fallback — gated via `getResolvedOcrConfig()` (existing
openai/claude provider). Surfaced from the diff dialog only when
`shouldOfferAiTier()` returns true (mean OCR confidence below
0.55 threshold), so OPENAI_API_KEY isn't burned on every upload.
Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()` — magic-byte check, size cap, version-number
bump + current pointer in one transaction.
- `reconcilePdfWithBerth()` — auto-applies fields where CRM is null;
flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()` — hard allowlist of writable columns;
stamps `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()` — pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()` — version list with 15-min signed URLs.
- `getMaxUploadMb()` — port-override → global → default 15 lookup
on `system_settings.berth_pdf_max_upload_mb`.
§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and
the reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).
API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or
HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via
`backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` —
rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`)
with current-PDF panel, version history, Replace PDF button, and
`<PdfReconcileDialog>` for the auto-applied + conflicts UX.
System settings:
- `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size
+ server-side validation. Resolved port-override → global → default.
Tests:
- `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes,
feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic
pdf-lib AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts` — upload, version-
number bump, magic-byte rejection, reconcile auto-applied vs
conflicts vs ±1% tolerance, mooring-number warning,
applyParseResults allowlist enforcement, rollback semantics.
Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run`
green at 1103/1103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
    if (!berthRow) throw new NotFoundError('Berth');

    const maxMb = await getMaxUploadMb(berthRow.portId);
    const maxBytes = maxMb * 1024 * 1024;
    if (typeof body.sizeBytes === 'number' && body.sizeBytes > maxBytes) {
      throw new ValidationError(
        `File exceeds ${maxMb} MB upload cap (got ${(body.sizeBytes / 1024 / 1024).toFixed(1)} MB).`,
      );
    }

    // Provisional version number: the actual row insert happens in POST
    // /pdf-versions and re-computes via SELECT max+1 inside a transaction,
    // so a race between two reps just shifts which one wins the version
    // slot. The storage key is gen_random_uuid()-namespaced so collisions
    // in the storage layer are impossible.
fix(audit-wave-11): dossier sweep — error-ux + webhook + storage + search + maintainability
Final pass over the unaddressed AUDIT-2026-05-12 dossiers, taking the
tractable Critical/High items from each:
error-ux-auditor (5 items)
- C2: 17 toast.error(err.message) sites swept to toastError(err, …) so
every user-visible failure carries a copy-paste Reference ID
- C3: apiFetch synthesizes a client-side correlation id when a 5xx
comes back with a non-JSON body (reverse-proxy HTML pages); message
becomes "The server is unreachable. Please try again." with code
UPSTREAM_UNREACHABLE
- C4: checkRateLimit fails OPEN when Redis is unavailable so an outage
no longer 500s login + portal sign-in; logged at warn so monitoring
catches it
- H2: StorageTimeoutError (name='TimeoutError') replaces the plain
Error throw in s3.ts withTimeout — error-classifier hints fire now
- H5: errorResponse() adopted across /api/storage/[token],
/api/public/website-inquiries, and the Documenso webhook body (drops
the "Invalid secret" reconnaissance string)
outbound-webhook-auditor (5 items)
- C1: signature is now HMAC(secret, `${ts}.${body}`) with the
timestamp surfaced as X-Webhook-Timestamp so receivers can reject
replays outside a freshness window
- C3: dead-letter with reason missing_signing_secret when secret is
null (defence-in-depth against DB tampering / future migration
mistakes)
- H2: webhooks queue bumped to maxAttempts=8 with 30 s base
exponential backoff so a 30 s receiver blip during a deploy no
longer dead-letters every in-flight event; per-queue
backoffDelayMs added to QUEUE_CONFIGS
- M1: SSRF denylist gains Oracle Cloud metadata 192.0.0.192
- M2: dispatch-time https:// assertion before fetch, so a bad DB edit
can't slip plaintext through
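The timestamped signature scheme from C1 can be sketched from the receiver's point of view. Only the `${ts}.${body}` payload layout and the `X-Webhook-Timestamp` header come from the commit message; the SHA-256 digest, hex encoding, seconds-resolution timestamps, and the 5-minute freshness window below are assumptions, as are the function names.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumed: HMAC-SHA256 with hex output. The commit only states HMAC(secret, `${ts}.${body}`).
function signWebhook(secret: string, ts: number, body: string): string {
  return createHmac('sha256', secret).update(`${ts}.${body}`).digest('hex');
}

function verifyWebhook(
  secret: string,
  ts: number, // value of X-Webhook-Timestamp, assumed to be Unix seconds
  body: string,
  signature: string,
  nowSeconds: number,
  maxSkewSeconds = 300, // assumed freshness window
): boolean {
  // Reject replays (and future-dated requests) outside the freshness window.
  if (Math.abs(nowSeconds - ts) > maxSkewSeconds) return false;
  const expected = Buffer.from(signWebhook(secret, ts, body), 'hex');
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

Binding the timestamp into the signed payload means an attacker cannot replay an old body with a fresh timestamp header without invalidating the signature.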
storage-pathing-auditor (2 items)
- H1: berth-PDF presigned-upload keys now `${portSlug}/berths/…/…`
with portSlug threaded into backend.presignUpload — engages the
filesystem-proxy port-binding `p` token verifier
- H2: presignDownloadUrl auto-derives portSlug from the key's first
segment when callers don't pass it, so all 8 download sites engage
the `p`-token guard without per-site plumbing
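The H2 fallback, where the port slug is taken from the key's first path segment when the caller does not supply one, might look roughly like this. `derivePortSlug` is a hypothetical helper name; the actual `presignDownloadUrl` signature is not shown in the commit.

```typescript
// Hypothetical helper: prefer an explicit slug, else use the key's first segment.
function derivePortSlug(storageKey: string, explicit?: string): string | undefined {
  if (explicit) return explicit;
  const first = storageKey.split('/')[0];
  // An empty first segment (empty key or leading slash) yields no slug.
  return first ? first : undefined;
}
```

This only works because keys always lead with the port slug, which is the convention H1 extends to berth-PDF uploads.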
search-auditor (1 item)
- H3: removed dead void wantEmail; void wantPhone; pair plus the
unused looksLikeEmail helper — the bucket-reorder it was scaffolded
for was never wired
maintainability-auditor (1 item)
- M2: swept seven abandoned `void <symbol>` markers and their dead
imports across clients/bulk, interests/bulk, admin/email-templates,
admin/website-submissions, alert-rules, and notes.service
Deferred to future work (substantial refactors, schema migrations, or
multi-file UI work):
- error-ux M3-M8 (global-error.tsx, per-route loading.tsx coverage,
ErrorBanner component, /api/ready route, worker DLQ admin surface)
- maintainability C1-C4 (documents/search/notes service splits,
interest-tabs split — multi-hour refactors)
- currency C1-H5 (mixed-currency dashboard aggregation, FX history
table, rounding policy) — wait for second non-USD port
- outbound-webhook C2 (deliveries reaper job), H1 (DNS-rebind TOCTOU
with undici Agent), H3 (circuit-breaker), H5 (presigned-post-policy)
- storage-pathing C2 (orphan reaper), H3-H5 (streaming + content-type
binding)
Tests: 1315/1315 vitest ✅ ; tsc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:27:32 +02:00
    //
    // storage-pathing-auditor H1: prefix the port slug so the
    // filesystem-proxy port-binding token (`p` field) can be wired and
    // the namespace matches `buildStoragePath` (which always leads with
    // the port slug).
    const sanitized = fileName.replace(/[^a-zA-Z0-9._-]/g, '_').slice(0, 200) || 'berth.pdf';
    const storageKey = `${ctx.portSlug}/berths/${params.id!}/uploads/${crypto.randomUUID()}_${sanitized}`;

    const backend = await getStorageBackend();
    const presigned = await backend.presignUpload(storageKey, {
      contentType: 'application/pdf',
      expirySeconds: 900,
      portSlug: ctx.portSlug,
    });

    return NextResponse.json({
      data: {
        url: presigned.url,
        method: presigned.method,
        storageKey,
        maxBytes,
        backend: backend.name,
      },
    });
  } catch (error) {
    return errorResponse(error);
  }
};
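The filename sanitization and key construction in the handler above can be exercised in isolation with a sketch like the one below. The regex, the 200-character cap, the `berth.pdf` fallback, and the key layout are copied from the handler; the function names `sanitizeFileName` and `buildUploadKey` are hypothetical wrappers added for illustration.

```typescript
import { randomUUID } from 'node:crypto';

// Replace anything outside [a-zA-Z0-9._-] with "_", cap at 200 chars,
// and fall back to a fixed name when nothing is left.
function sanitizeFileName(fileName: string): string {
  return fileName.replace(/[^a-zA-Z0-9._-]/g, '_').slice(0, 200) || 'berth.pdf';
}

// Port slug first, then berth id, then a UUID-prefixed sanitized filename.
function buildUploadKey(portSlug: string, berthId: string, fileName: string): string {
  return `${portSlug}/berths/${berthId}/uploads/${randomUUID()}_${sanitizeFileName(fileName)}`;
}
```

Because `/` is not in the allowed set, a name such as `../../etc/passwd` collapses to a single path segment (`.._.._etc_passwd`), and the random UUID prefix keeps two uploads of the same filename from sharing a key.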