Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every
file write goes through `getStorageBackend()`; no direct minio imports.
Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
berth, `storage_key` (renamed convention from §4.7a), sha256, size,
`download_url_expires_at` cache slot for §11.1 signed-URL throttling,
and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.
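A hypothetical TypeScript mirror of the new row shape, to make the "monotonic `version_number` per berth" invariant concrete. Column names come from this message; the exact types and the helper name are assumptions, not the real Drizzle schema:

```typescript
// Assumed shape of a berth_pdf_versions row (migration 0030).
interface BerthPdfVersionRow {
  id: string;
  berthId: string;
  versionNumber: number;             // monotonic per berth, starts at 1
  storageKey: string;                // §4.7a naming convention
  sha256: string;
  fileSizeBytes: number;
  downloadUrlExpiresAt: Date | null; // §11.1 signed-URL cache slot
  parseResults: Record<string, unknown> | null; // audit trail
}

// The monotonicity invariant reduces to: next version = max(existing) + 1.
function nextVersionNumber(existing: BerthPdfVersionRow[]): number {
  return existing.reduce((max, row) => Math.max(max, row.versionNumber), 0) + 1;
}
```

The real bump happens inside the `uploadBerthPdf()` transaction (see below), so concurrent uploads can't race to the same number.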
3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib — pulls named fields (`length_ft`,
`mooring_number`, etc.) at confidence 1.0. The sample PDF has no such
fields, so this tier is defensive coverage for future templates.
2. OCR via Tesseract.js — positional/regex heuristics keyed off the
§9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`,
`WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
per-field confidence + global mean; flags imperial-vs-metric drift
>1% in `warnings`.
3. AI fallback — gated via `getResolvedOcrConfig()` (existing
openai/claude provider). Surfaced from the diff dialog only when
`shouldOfferAiTier()` returns true (mean OCR confidence below the
0.55 threshold), so OPENAI_API_KEY isn't burned on every upload.
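Two of the tier-2/tier-3 decisions above sketched as standalone functions. The 0.55 threshold and the >1% drift rule are from this message; `shouldOfferAiTier()` is named in the text, while the drift helper's name and signature are assumptions:

```typescript
const AI_TIER_THRESHOLD = 0.55;

// Only surface the AI fallback when OCR came back weak, so the API key
// isn't spent on every upload.
function shouldOfferAiTier(meanOcrConfidence: number): boolean {
  return meanOcrConfidence < AI_TIER_THRESHOLD;
}

// The §9.2 layout prints `<imperial> / <metric>` pairs; flag the pair when
// the converted values disagree by more than 1%.
function imperialMetricDriftWarning(feet: number, meters: number): string | null {
  const expectedMeters = feet * 0.3048;
  const drift = Math.abs(expectedMeters - meters) / meters;
  return drift > 0.01
    ? `imperial/metric drift ${(drift * 100).toFixed(1)}% (${feet} ft vs ${meters} m)`
    : null;
}
```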
Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()` — magic-byte check, size cap, version-number
bump + current pointer in one transaction.
- `reconcilePdfWithBerth()` — auto-applies fields where CRM is null;
flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()` — hard allowlist of writable columns;
stamps `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()` — pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()` — version list with 15-min signed URLs.
- `getMaxUploadMb()` — port-override → global → default 15 lookup
on `system_settings.berth_pdf_max_upload_mb`.
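Minimal sketches of two service-layer rules above: the port-override → global → default resolution behind `getMaxUploadMb()`, and the ±1% numeric tolerance `reconcilePdfWithBerth()` applies before calling a CRM/PDF pair a conflict. Signatures are assumptions; the real functions read `system_settings`:

```typescript
const DEFAULT_MAX_UPLOAD_MB = 15;

// Resolution chain: per-port override wins, then the global setting,
// then the hard-coded default of 15 MB.
function resolveMaxUploadMb(
  portOverride: number | null,
  globalSetting: number | null,
): number {
  return portOverride ?? globalSetting ?? DEFAULT_MAX_UPLOAD_MB;
}

// CRM value and PDF value count as agreeing when within ±1% of each other,
// so rounding between unit systems doesn't produce spurious conflicts.
function numericallyEqual(crmValue: number, pdfValue: number): boolean {
  if (crmValue === pdfValue) return true;
  const base = Math.max(Math.abs(crmValue), Math.abs(pdfValue));
  return base > 0 && Math.abs(crmValue - pdfValue) / base <= 0.01;
}
```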
§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and
the reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).
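The magic-byte and zero-byte mitigations combine into a single cheap check. A sketch, assuming the real check in `uploadBerthPdf()` looks roughly like this (the helper name is hypothetical):

```typescript
// '%PDF-' as raw ASCII bytes; every valid PDF starts with these five bytes.
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length === 0) return false;            // 0-byte upload
  if (bytes.length < PDF_MAGIC.length) return false;
  return PDF_MAGIC.every((b, i) => bytes[i] === b); // magic-byte check
}
```

On a `false` result the service deletes the already-written storage object and rejects the request, so nothing non-PDF survives in storage.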
API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or
HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via
`backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` —
rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`)
with current-PDF panel, version history, Replace PDF button, and
`<PdfReconcileDialog>` for the auto-applied + conflicts UX.
System settings:
- `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size
+ server-side validation. Resolved port-override → global → default.
Tests:
- `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes,
feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic
pdf-lib AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts` — upload, version-
number bump, magic-byte rejection, reconcile auto-applied vs
conflicts vs ±1% tolerance, mooring-number warning,
applyParseResults allowlist enforcement, rollback semantics.
Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run`
green at 1103/1103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
/**
 * Route handlers for `/api/v1/berths/[id]/pdf-versions` (Phase 6b).
 *
 * Lives in handlers.ts (not route.ts) so integration tests can call them
 * directly, bypassing the auth/permission middleware (per CLAUDE.md
 * "Route handler exports" convention).
 */

import { NextResponse } from 'next/server';

import { type RouteHandler } from '@/lib/api/helpers';
import { errorResponse, ValidationError } from '@/lib/errors';
import { listBerthPdfVersions, uploadBerthPdf } from '@/lib/services/berth-pdf.service';

interface PostBody {
  storageKey: string;
  fileName: string;
  fileSizeBytes: number;
  sha256: string;
  parseResults?: {
    engine: 'acroform' | 'ocr' | 'ai';
    extracted?: Record<string, unknown>;
    meanConfidence?: number;
    warnings?: string[];
  };
}

export const getHandler: RouteHandler = async (_req, _ctx, params) => {
  try {
    const versions = await listBerthPdfVersions(params.id!);
    return NextResponse.json({ data: versions });
  } catch (error) {
    return errorResponse(error);
  }
};

export const postHandler: RouteHandler = async (req, ctx, params) => {
  try {
    const body = (await req.json()) as Partial<PostBody>;
    if (!body.storageKey || !body.fileName) {
      throw new ValidationError('storageKey and fileName are required');
    }
    if (typeof body.fileSizeBytes !== 'number' || body.fileSizeBytes <= 0) {
      throw new ValidationError('fileSizeBytes must be a positive integer');
    }
    if (!body.sha256 || typeof body.sha256 !== 'string') {
      throw new ValidationError('sha256 is required');
    }
    const result = await uploadBerthPdf({
      berthId: params.id!,
      storageKey: body.storageKey,
      fileName: body.fileName,
      fileSizeBytes: body.fileSizeBytes,
      sha256: body.sha256,
      uploadedBy: ctx.userId,
      parseResult: body.parseResults
        ? {
            engine: body.parseResults.engine,
            // Reconstruct just enough of the ParseResult shape to round-trip
            // through serialization; the rep already saw the conflicts in the
            // diff dialog, so storing the engine + extracted is what we need
            // for audit.
            fields: Object.fromEntries(
              Object.entries(body.parseResults.extracted ?? {}).map(([k, v]) => {
                if (v && typeof v === 'object' && 'value' in v) {
                  const obj = v as { value: unknown; confidence?: number };
                  return [
                    k,
                    {
                      value: obj.value as never,
                      confidence: typeof obj.confidence === 'number' ? obj.confidence : 1,
                      engine: body.parseResults!.engine,
                    },
                  ];
                }
                return [k, undefined];
              }),
            ) as never,
            meanConfidence: body.parseResults.meanConfidence ?? 1,
            warnings: body.parseResults.warnings ?? [],
          }
        : undefined,
    });
    return NextResponse.json({ data: result }, { status: 201 });
  } catch (error) {
    return errorResponse(error);
  }
};