feat(berths): per-berth PDF storage (versioned) + reverse parser
Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd) — every
file write goes through `getStorageBackend()`; no direct minio imports.
Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
berth, `storage_key` (renamed convention from §4.7a), sha256, size,
`download_url_expires_at` cache slot for §11.1 signed-URL throttling,
and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.
3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib — pulls named fields (`length_ft`,
`mooring_number`, etc.) at confidence 1. Sample PDF has 0 such
fields, so this is defensive coverage for future templates.
2. OCR via Tesseract.js — positional/regex heuristics keyed off the
§9.2 layout (Length/Width/Water Depth as `<imperial> / <metric>`,
`WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
per-field confidence + global mean; flags imperial-vs-metric drift
>1% in `warnings`.
3. AI fallback — gated via `getResolvedOcrConfig()` (existing
openai/claude provider). Surfaced from the diff dialog only when
`shouldOfferAiTier()` returns true (mean OCR confidence below
0.55 threshold), so OPENAI_API_KEY isn't burned on every upload.
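A minimal sketch of the tier-3 gate described above, assuming per-field results carry a `confidence` in the range 0..1. The `FieldResult` shape, the `meanConfidence` helper, and the signature given to `shouldOfferAiTier` here are illustrative assumptions; only the function name and the 0.55 threshold come from this message.

```typescript
// Hypothetical shape: the real parser types are not shown in this commit.
interface FieldResult {
  value: string | number | null;
  confidence: number; // 0..1
}

// Threshold quoted in the commit message.
const AI_TIER_THRESHOLD = 0.55;

/** Mean of the per-field confidences; 0 when nothing was extracted. */
function meanConfidence(fields: Record<string, FieldResult>): number {
  const values = Object.values(fields);
  if (values.length === 0) return 0;
  return values.reduce((sum, f) => sum + f.confidence, 0) / values.length;
}

/** Offer the AI tier only when the OCR result looks unreliable. */
function shouldOfferAiTier(fields: Record<string, FieldResult>): boolean {
  return meanConfidence(fields) < AI_TIER_THRESHOLD;
}
```

Treating an empty OCR result as confidence 0 means a PDF that yields no fields still gets the AI option; that choice is part of the sketch, not something this commit states.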
Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()` — magic-byte check, size cap, version-number
bump + current pointer in one transaction.
- `reconcilePdfWithBerth()` — auto-applies fields where CRM is null;
flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()` — hard allowlist of writable columns;
stamps `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()` — pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()` — version list with 15-min signed URLs.
- `getMaxUploadMb()` — port-override → global → default 15 lookup
on `system_settings.berth_pdf_max_upload_mb`.
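The reconcile rules above (auto-apply where the CRM value is null, flag a conflict on disagreement, tolerate ±1% on numeric columns) could be sketched roughly as follows. The `reconcile` and `withinTolerance` names, the record shapes, and the choice to measure the 1% against the larger magnitude are assumptions for illustration; the actual `reconcilePdfWithBerth()` is not shown here.

```typescript
type Scalar = string | number | null;

interface ReconcileResult {
  autoApplied: Array<{ field: string; value: string | number }>;
  conflicts: Array<{ field: string; crmValue: Scalar; pdfValue: Scalar }>;
}

/** True when a and b differ by at most tol, relative to the larger magnitude. */
function withinTolerance(a: number, b: number, tol = 0.01): boolean {
  if (a === b) return true;
  const scale = Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) <= tol * scale;
}

function reconcile(
  crm: Record<string, Scalar>,
  pdf: Record<string, Scalar>,
): ReconcileResult {
  const result: ReconcileResult = { autoApplied: [], conflicts: [] };
  for (const [field, pdfValue] of Object.entries(pdf)) {
    if (pdfValue === null) continue; // nothing parsed for this field
    const crmValue = crm[field] ?? null;
    if (crmValue === null) {
      // CRM has no value yet: safe to fill in from the PDF.
      result.autoApplied.push({ field, value: pdfValue });
    } else if (typeof crmValue === 'number' && typeof pdfValue === 'number') {
      // Numeric columns only conflict when they drift by more than 1%.
      if (!withinTolerance(crmValue, pdfValue)) {
        result.conflicts.push({ field, crmValue, pdfValue });
      }
    } else if (crmValue !== pdfValue) {
      result.conflicts.push({ field, crmValue, pdfValue });
    }
  }
  return result;
}
```

With a CRM length of 100 and a PDF length of 100.5, the difference is inside the tolerance, so the field appears in neither list.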
§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and
the reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).
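As a rough illustration of the first three mitigations, the check below rejects empty buffers, buffers over the cap, and buffers that do not start with the `%PDF-` signature. The `checkUpload` name and the `UploadCheck` result type are hypothetical; the real service also deletes the storage object on a mismatch, which this sketch leaves out.

```typescript
// ASCII bytes for "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

type UploadCheck =
  | { ok: true }
  | { ok: false; reason: 'empty' | 'too-large' | 'not-pdf' };

/** Validate an uploaded buffer against the size cap and the PDF signature. */
function checkUpload(bytes: Uint8Array, maxUploadMb: number): UploadCheck {
  if (bytes.length === 0) return { ok: false, reason: 'empty' };
  if (bytes.length > maxUploadMb * 1024 * 1024) {
    return { ok: false, reason: 'too-large' };
  }
  // A buffer shorter than the signature fails here because bytes[i] is undefined.
  const hasMagic = PDF_MAGIC.every((b, i) => bytes[i] === b);
  return hasMagic ? { ok: true } : { ok: false, reason: 'not-pdf' };
}
```

Checking the leading bytes rather than the file extension or the declared content type is what makes the check hard to bypass from the client.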
API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url` — presigned URL (S3) or
HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions` — verifies the upload via
`backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions` — version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply` —
rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`)
with current-PDF panel, version history, Replace PDF button, and
`<PdfReconcileDialog>` for the auto-applied + conflicts UX.
System settings:
- `berth_pdf_max_upload_mb` (default 15) — caps presigned-upload size
+ server-side validation. Resolved port-override → global → default.
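The port-override → global → default resolution might look like the sketch below. The `SettingRow` shape (a `portId` of `null` meaning the global row) and the `resolveMaxUploadMb` name are assumptions made for the example; the real `system_settings` schema is not part of this diff.

```typescript
const DEFAULT_MAX_UPLOAD_MB = 15;

// Hypothetical row shape: portId === null marks the global setting.
interface SettingRow {
  key: string;
  portId: string | null;
  value: string;
}

/** Resolve the cap: port override first, then global, then the default of 15. */
function resolveMaxUploadMb(rows: SettingRow[], portId: string): number {
  const candidates = rows.filter((r) => r.key === 'berth_pdf_max_upload_mb');
  const pick =
    candidates.find((r) => r.portId === portId) ??
    candidates.find((r) => r.portId === null);
  const parsed = pick ? Number(pick.value) : NaN;
  // Fall back to the default when the row is missing or not a positive number.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_UPLOAD_MB;
}
```

Falling back to the default on a malformed value, rather than throwing, keeps uploads working if a setting is mistyped; whether the real lookup does the same is not stated here.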
Tests:
- `tests/unit/services/berth-pdf-parser.test.ts` — magic bytes,
feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts` — synthetic
pdf-lib AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts` — upload, version-
number bump, magic-byte rejection, reconcile auto-applied vs
conflicts vs ±1% tolerance, mooring-number warning,
applyParseResults allowlist enforcement, rollback semantics.
Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run`
green at 1103/1103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
269
src/components/berths/berth-documents-tab.tsx
Normal file
@@ -0,0 +1,269 @@
/**
 * Documents tab on the berth detail page (Phase 6b; see plan §5.6).
 *
 * Sections:
 *  - Current PDF panel (download link, "Replace PDF" button, parse-engine chip).
 *  - Version history list, newest first, with rollback affordance on every
 *    non-current row.
 *  - Reconcile-diff dialog (PdfReconcileDialog), opened after a successful
 *    upload + parse. Shows auto-applied vs conflicted fields and lets the
 *    rep accept the conflict resolution.
 *
 * The actual upload is split into three steps:
 *  1. POST /pdf-upload-url -> upload URL + storageKey
 *  2. Send the file to that URL with the method the server returned
 *     (HMAC-signed proxy URL in filesystem mode, presigned URL in S3 mode)
 *  3. POST /pdf-versions with the storage key + file metadata; the server
 *     parses the stored PDF
 */

'use client';

import { useRef, useState } from 'react';
import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query';
import { toast } from 'sonner';

import { apiFetch } from '@/lib/api/client';
import { Button } from '@/components/ui/button';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Badge } from '@/components/ui/badge';
import { PdfReconcileDialog } from './pdf-reconcile-dialog';

interface PdfVersionRow {
  id: string;
  versionNumber: number;
  fileName: string;
  fileSizeBytes: number;
  uploadedBy: string;
  uploadedAt: string;
  isCurrent: boolean;
  downloadUrl: string;
  downloadUrlExpiresAt: string;
  parseEngine: 'acroform' | 'ocr' | 'ai' | null;
}

interface UploadUrlResponse {
  url: string;
  method: 'PUT' | 'POST';
  storageKey: string;
  maxBytes: number;
  backend: 's3' | 'filesystem';
}

export function BerthDocumentsTab({ berthId }: { berthId: string }) {
  const qc = useQueryClient();
  const fileInputRef = useRef<HTMLInputElement | null>(null);
  const [pendingDiff, setPendingDiff] = useState<{
    versionId: string;
    autoApplied: Array<{ field: string; value: string | number }>;
    conflicts: Array<{
      field: string;
      crmValue: string | number | null;
      pdfValue: string | number | null;
      pdfConfidence: number;
    }>;
    warnings: string[];
  } | null>(null);

  const { data: versions, isLoading } = useQuery<PdfVersionRow[]>({
    queryKey: ['berth-pdf-versions', berthId],
    queryFn: () =>
      apiFetch<{ data: PdfVersionRow[] }>(`/api/v1/berths/${berthId}/pdf-versions`).then(
        (r) => r.data,
      ),
  });

  const rollback = useMutation({
    mutationFn: (versionId: string) =>
      apiFetch(`/api/v1/berths/${berthId}/pdf-versions/${versionId}/rollback`, {
        method: 'POST',
      }),
    onSuccess: () => {
      void qc.invalidateQueries({ queryKey: ['berth-pdf-versions', berthId] });
      void qc.invalidateQueries({ queryKey: ['berth', berthId] });
      toast.success('Rolled back to selected version.');
    },
    onError: (err: Error) => {
      toast.error('Rollback failed', { description: err.message });
    },
  });

  const upload = useMutation({
    mutationFn: async (file: File) => {
      // 1. ask the server for a presigned upload URL
      const upRes = await apiFetch<{ data: UploadUrlResponse }>(
        `/api/v1/berths/${berthId}/pdf-upload-url`,
        {
          method: 'POST',
          body: { fileName: file.name, sizeBytes: file.size },
        },
      );
      const { url, method, storageKey, maxBytes } = upRes.data;
      if (file.size > maxBytes) {
        throw new Error(
          `File ${(file.size / 1024 / 1024).toFixed(1)} MB exceeds ${(maxBytes / 1024 / 1024).toFixed(0)} MB limit`,
        );
      }

      // 2. upload directly to storage (filesystem-proxy or S3)
      const putRes = await fetch(url, {
        method,
        body: file,
        headers: { 'content-type': 'application/pdf' },
        // Relative URLs are the same-origin proxy and need the session cookie;
        // absolute (S3) URLs carry their own signature.
        credentials: url.startsWith('/') ? 'include' : 'omit',
      });
      if (!putRes.ok) {
        throw new Error(`Storage PUT failed (${putRes.status})`);
      }

      // 3. compute sha256 in the browser for the metadata row
      const sha256 = await sha256Hex(file);

      // 4. register the version metadata + parse server-side. The server
      //    runs parseBerthPdf via the buffer from storage; the client
      //    doesn't ship the raw PDF a second time.
      const verRes = await apiFetch<{ data: { versionId: string } }>(
        `/api/v1/berths/${berthId}/pdf-versions`,
        {
          method: 'POST',
          body: {
            storageKey,
            fileName: file.name,
            fileSizeBytes: file.size,
            sha256,
          },
        },
      );
      return { versionId: verRes.data.versionId };
    },
    onSuccess: () => {
      // Only versionId comes back from the mutation, so pendingDiff is not
      // populated here and the reconcile dialog is not opened by this handler.
      void qc.invalidateQueries({ queryKey: ['berth-pdf-versions', berthId] });
      void qc.invalidateQueries({ queryKey: ['berth', berthId] });
      toast.success('PDF uploaded.');
    },
    onError: (err: Error) => {
      toast.error('Upload failed', { description: err.message });
    },
  });

  const onFileChange = (e: React.ChangeEvent<HTMLInputElement>) => {
    const file = e.target.files?.[0];
    if (!file) return;
    if (!file.name.toLowerCase().endsWith('.pdf')) {
      toast.error('Only PDFs are accepted.');
      return;
    }
    upload.mutate(file);
    if (fileInputRef.current) fileInputRef.current.value = '';
  };

  const current = versions?.find((v) => v.isCurrent);
  const others = versions?.filter((v) => !v.isCurrent) ?? [];

  return (
    <div className="space-y-6">
      <Card>
        <CardHeader className="flex flex-row items-center justify-between pb-3">
          <CardTitle className="text-sm font-medium">Current PDF</CardTitle>
          <div>
            <input
              ref={fileInputRef}
              type="file"
              accept="application/pdf"
              className="hidden"
              onChange={onFileChange}
            />
            <Button
              size="sm"
              onClick={() => fileInputRef.current?.click()}
              disabled={upload.isPending}
            >
              {upload.isPending ? 'Uploading…' : current ? 'Replace PDF' : 'Upload PDF'}
            </Button>
          </div>
        </CardHeader>
        <CardContent className="pt-0 text-sm">
          {isLoading ? (
            <p className="text-muted-foreground">Loading…</p>
          ) : current ? (
            <div className="flex flex-wrap items-center gap-2">
              <a
                href={current.downloadUrl}
                target="_blank"
                rel="noreferrer"
                className="font-medium underline underline-offset-2"
              >
                {current.fileName}
              </a>
              <span className="text-muted-foreground">
                v{current.versionNumber} · {(current.fileSizeBytes / 1024 / 1024).toFixed(2)} MB
              </span>
              {current.parseEngine ? <ParseEngineBadge engine={current.parseEngine} /> : null}
            </div>
          ) : (
            <p className="text-muted-foreground">No PDF uploaded yet.</p>
          )}
        </CardContent>
      </Card>

      <Card>
        <CardHeader className="pb-3">
          <CardTitle className="text-sm font-medium">Version history</CardTitle>
        </CardHeader>
        <CardContent className="pt-0">
          {others.length === 0 ? (
            <p className="text-sm text-muted-foreground">No prior versions.</p>
          ) : (
            <ul className="divide-y">
              {others.map((v) => (
                <li key={v.id} className="flex items-center justify-between py-2 text-sm">
                  <div>
                    <a href={v.downloadUrl} target="_blank" rel="noreferrer" className="underline">
                      {v.fileName}
                    </a>{' '}
                    <span className="text-muted-foreground">
                      v{v.versionNumber} · {(v.fileSizeBytes / 1024 / 1024).toFixed(2)} MB ·{' '}
                      {new Date(v.uploadedAt).toLocaleDateString()}
                    </span>
                  </div>
                  <Button
                    size="sm"
                    variant="outline"
                    onClick={() => rollback.mutate(v.id)}
                    disabled={rollback.isPending}
                  >
                    Rollback
                  </Button>
                </li>
              ))}
            </ul>
          )}
        </CardContent>
      </Card>

      {pendingDiff ? (
        <PdfReconcileDialog
          berthId={berthId}
          versionId={pendingDiff.versionId}
          autoApplied={pendingDiff.autoApplied}
          conflicts={pendingDiff.conflicts}
          warnings={pendingDiff.warnings}
          onClose={() => setPendingDiff(null)}
        />
      ) : null}
    </div>
  );
}

function ParseEngineBadge({ engine }: { engine: 'acroform' | 'ocr' | 'ai' }) {
  const tone = engine === 'acroform' ? 'default' : engine === 'ocr' ? 'secondary' : 'outline';
  const label = engine === 'acroform' ? 'AcroForm' : engine === 'ocr' ? 'OCR' : 'AI';
  return <Badge variant={tone}>{label}</Badge>;
}

async function sha256Hex(file: File): Promise<string> {
  const buf = await file.arrayBuffer();
  const hash = await crypto.subtle.digest('SHA-256', buf);
  return Array.from(new Uint8Array(hash))
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}