pn-new-crm/src/components/berths/pdf-reconcile-dialog.tsx

196 lines
6.8 KiB
TypeScript

feat(berths): per-berth PDF storage (versioned) + reverse parser

Phase 6b of the berth-recommender refactor (see
docs/berth-recommender-and-pdf-plan.md §3.2, §3.3, §4.7b, §11.1, §14.6).
Builds on the Phase 6a pluggable storage backend (commit 83693dd): every
file write goes through `getStorageBackend()`; no direct minio imports.

Schema (migration 0030_berth_pdf_versions):
- new table `berth_pdf_versions` with monotonic `version_number` per
  berth, `storage_key` (renamed convention from §4.7a), sha256, size,
  `download_url_expires_at` cache slot for §11.1 signed-URL throttling,
  and `parse_results` jsonb for the audit trail.
- new column `berths.current_pdf_version_id` (deferred from Phase 0)
  with FK to `berth_pdf_versions(id)` ON DELETE SET NULL.
- relations + types exported from `schema/berths.ts`.

3-tier reverse parser (`lib/services/berth-pdf-parser.ts`):
1. AcroForm via pdf-lib: pulls named fields (`length_ft`,
   `mooring_number`, etc.) at confidence 1. Sample PDF has 0 such
   fields, so this is defensive coverage for future templates.
2. OCR via Tesseract.js: positional/regex heuristics keyed off the §9.2
   layout (Length/Width/Water Depth as `<imperial> / <metric>`,
   `WEEK HIGH / LOW`, `CONFIRMED THROUGH UNTIL <date>`, etc.). Returns
   per-field confidence + global mean; flags imperial-vs-metric drift
   >1% in `warnings`.
3. AI fallback: gated via `getResolvedOcrConfig()` (existing
   openai/claude provider). Surfaced from the diff dialog only when
   `shouldOfferAiTier()` returns true (mean OCR confidence below 0.55
   threshold), so OPENAI_API_KEY isn't burned on every upload.

Service layer (`lib/services/berth-pdf.service.ts`):
- `uploadBerthPdf()`: magic-byte check, size cap, version-number bump +
  current pointer in one transaction.
- `reconcilePdfWithBerth()`: auto-applies fields where CRM is null;
  flags conflicts when CRM and PDF disagree; tolerates ±1% on numeric
  columns; warns on mooring-number-in-PDF mismatch (§14.6).
- `applyParseResults()`: hard allowlist of writable columns; stamps
  `appliedFields` onto `parse_results` for audit.
- `rollbackToVersion()`: pointer flip only, never re-parses (§14.6).
- `listBerthPdfVersions()`: version list with 15-min signed URLs.
- `getMaxUploadMb()`: port-override → global → default 15 lookup on
  `system_settings.berth_pdf_max_upload_mb`.

§14.6 critical mitigations:
- Magic-byte check (`%PDF-`) on every upload; mismatch deletes the
  storage object and rejects the request.
- Size cap from `system_settings.berth_pdf_max_upload_mb` (default
  15 MB); enforced in the upload-url presign AND server-side.
- 0-byte uploads rejected.
- Mooring-number mismatch surfaces as a `warnings[]` entry on the
  reconcile result so the rep sees it in the diff dialog.
- Imperial vs metric ±1% tolerance in both the parser warnings and the
  reconcile equality check.
- Path traversal already blocked at the storage layer (Phase 6a).

API + UI:
- `POST /api/v1/berths/[id]/pdf-upload-url`: presigned URL (S3) or
  HMAC-signed proxy URL (filesystem) sized to the per-port cap.
- `POST /api/v1/berths/[id]/pdf-versions`: verifies the upload via
  `backend.head()`, writes the row, bumps `current_pdf_version_id`.
- `GET /api/v1/berths/[id]/pdf-versions`: version list + signed URLs.
- `POST /api/v1/berths/[id]/pdf-versions/[versionId]/rollback`.
- `POST /api/v1/berths/[id]/pdf-versions/parse-results/apply`:
  rep-confirmed diff payload.
- New "Documents" tab on the berth detail page (`berth-tabs.tsx`) with
  current-PDF panel, version history, Replace PDF button, and
  `<PdfReconcileDialog>` for the auto-applied + conflicts UX.

System settings:
- `berth_pdf_max_upload_mb` (default 15): caps presigned-upload size +
  server-side validation. Resolved port-override → global → default.

Tests:
- `tests/unit/services/berth-pdf-parser.test.ts`: magic bytes,
  feet-inches, human dates, full §9.2-shaped OCR text → 18 fields,
  drift warning, AI-tier gate.
- `tests/unit/services/berth-pdf-acroform.test.ts`: synthetic pdf-lib
  AcroForm round-trip.
- `tests/integration/berth-pdf-versions.test.ts`: upload,
  version-number bump, magic-byte rejection, reconcile auto-applied vs
  conflicts vs ±1% tolerance, mooring-number warning, applyParseResults
  allowlist enforcement, rollback semantics.

Acceptance: `pnpm exec tsc --noEmit` clean, `pnpm exec vitest run` green
at 1103/1103.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 03:34:24 +02:00
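The reconcile rules summarized in the commit message (auto-apply where the CRM value is null, flag a conflict where CRM and PDF disagree, tolerate ±1% on numeric columns) can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the actual `reconcilePdfWithBerth()` implementation: the names `reconcileSketch`, `withinTolerance`, and `ParsedField`, as well as the exact tolerance formula (relative to the larger magnitude), are hypothetical.

```typescript
// Hypothetical sketch: names, shapes, and the tolerance formula are
// assumptions for illustration, not the service's real API.
type Scalar = string | number | null;

interface ParsedField {
  field: string;
  value: Scalar;
  confidence: number; // 0..1, as reported by the parser tier
}

interface ReconcileSketchResult {
  autoApplied: { field: string; value: string | number }[];
  conflicts: { field: string; crmValue: Scalar; pdfValue: Scalar; pdfConfidence: number }[];
}

// Two numbers count as equal when they differ by at most `tolerance`
// (default 1%) of the larger magnitude.
function withinTolerance(a: number, b: number, tolerance = 0.01): boolean {
  if (a === b) return true;
  const scale = Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) <= tolerance * scale;
}

function reconcileSketch(
  crm: Record<string, Scalar>,
  parsed: ParsedField[],
): ReconcileSketchResult {
  const result: ReconcileSketchResult = { autoApplied: [], conflicts: [] };
  for (const p of parsed) {
    if (p.value === null) continue; // parser found nothing for this field
    const crmValue = crm[p.field] ?? null;
    if (crmValue === null) {
      // CRM has no value: candidate for auto-apply.
      result.autoApplied.push({ field: p.field, value: p.value });
      continue;
    }
    const equal =
      typeof crmValue === 'number' && typeof p.value === 'number'
        ? withinTolerance(crmValue, p.value)
        : String(crmValue) === String(p.value);
    if (!equal) {
      result.conflicts.push({
        field: p.field,
        crmValue,
        pdfValue: p.value,
        pdfConfidence: p.confidence,
      });
    }
  }
  return result;
}
```

The two output arrays have the same shape as the `autoApplied` and `conflicts` props that the dialog below accepts, so a result like this could be passed straight through to it.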
/**
 * Reconcile-diff dialog (Phase 6b; see plan §4.7b, §14.6).
 *
 * Shown after a successful per-berth PDF upload + parse. Surfaces three
 * sections:
 *   - Warnings (mooring-number mismatch, imperial-vs-metric drift, etc.)
 *     so the rep can abort before applying.
 *   - Auto-applied fields: fields the parser found that the CRM had as null;
 *     these are pre-checked and applied on confirm.
 *   - Conflicts: fields where CRM and PDF disagree on a non-null value.
 *     The rep picks "Keep CRM" or "Use PDF" per row before confirming.
 *
 * On confirm, the dialog POSTs to /pdf-versions/parse-results/apply with the
 * rep-curated `fieldsToApply` map.
 */
'use client';
import { useState } from 'react';
import { useMutation, useQueryClient } from '@tanstack/react-query';
import { toast } from 'sonner';
import { apiFetch } from '@/lib/api/client';
import { Button } from '@/components/ui/button';
import { Checkbox } from '@/components/ui/checkbox';
import {
  Dialog,
  DialogContent,
  DialogDescription,
  DialogFooter,
  DialogHeader,
  DialogTitle,
} from '@/components/ui/dialog';

interface AutoAppliedField {
  field: string;
  value: string | number;
}

interface ConflictField {
  field: string;
  crmValue: string | number | null;
  pdfValue: string | number | null;
  pdfConfidence: number;
}

export interface PdfReconcileDialogProps {
  berthId: string;
  versionId: string;
  autoApplied: AutoAppliedField[];
  conflicts: ConflictField[];
  warnings: string[];
  onClose: () => void;
}
export function PdfReconcileDialog({
  berthId,
  versionId,
  autoApplied,
  conflicts,
  warnings,
  onClose,
}: PdfReconcileDialogProps) {
  const qc = useQueryClient();

  // For each auto-applied field: rep can opt out by unchecking.
  const [autoChecked, setAutoChecked] = useState<Record<string, boolean>>(
    Object.fromEntries(autoApplied.map((f) => [f.field, true])),
  );

  // For each conflict: 'pdf' applies the PDF value; 'crm' keeps the CRM value
  // (the field is omitted from the payload). Every conflict defaults to 'crm'.
  const [conflictChoice, setConflictChoice] = useState<Record<string, 'pdf' | 'crm'>>(
    Object.fromEntries(conflicts.map((c) => [c.field, 'crm'])),
  );

  const apply = useMutation({
    mutationFn: async () => {
      // Build the rep-curated payload: checked auto-applied fields plus
      // conflicts resolved in favor of the PDF.
      const fieldsToApply: Record<string, string | number> = {};
      for (const f of autoApplied) if (autoChecked[f.field]) fieldsToApply[f.field] = f.value;
      for (const c of conflicts) {
        // A null PDF value is never sent, even if 'pdf' was chosen.
        if (conflictChoice[c.field] === 'pdf' && c.pdfValue != null) {
          fieldsToApply[c.field] = c.pdfValue;
        }
      }
      return apiFetch(`/api/v1/berths/${berthId}/pdf-versions/parse-results/apply`, {
        method: 'POST',
        body: { versionId, fieldsToApply },
      });
    },
    onSuccess: () => {
      // Refresh the berth detail and the version list, then close.
      void qc.invalidateQueries({ queryKey: ['berth', berthId] });
      void qc.invalidateQueries({ queryKey: ['berth-pdf-versions', berthId] });
      toast.success('Berth fields updated from PDF.');
      onClose();
    },
    onError: (err: Error) => {
      toast.error('Apply failed', { description: err.message });
    },
  });

  return (
    <Dialog open onOpenChange={(open) => (!open ? onClose() : undefined)}>
      <DialogContent className="max-w-2xl">
        <DialogHeader>
          <DialogTitle>Review parsed fields</DialogTitle>
          <DialogDescription>
            The PDF parser extracted these values. Review and apply the ones you trust.
          </DialogDescription>
        </DialogHeader>
        {warnings.length > 0 ? (
          <div className="rounded-md border border-yellow-300 bg-yellow-50 p-3 text-sm">
            <p className="font-medium">Warnings</p>
            <ul className="mt-1 list-disc pl-5">
              {warnings.map((w, i) => (
                <li key={i}>{w}</li>
              ))}
            </ul>
          </div>
        ) : null}
        {autoApplied.length > 0 ? (
          <section>
            <h3 className="text-sm font-medium">
              Auto-applied <span className="text-muted-foreground">({autoApplied.length})</span>
            </h3>
            <p className="text-xs text-muted-foreground">
              CRM had no value; the PDF supplied one. Uncheck to skip.
            </p>
            <ul className="mt-2 space-y-1">
              {autoApplied.map((f) => (
                <li key={f.field} className="flex items-center gap-2 text-sm">
                  <Checkbox
                    id={`auto-${f.field}`}
                    checked={autoChecked[f.field]}
                    onCheckedChange={(checked) =>
                      setAutoChecked((prev) => ({ ...prev, [f.field]: checked === true }))
                    }
                  />
                  <label htmlFor={`auto-${f.field}`} className="flex-1">
                    <span className="font-medium">{f.field}</span>:{' '}
                    <span className="text-muted-foreground">{String(f.value)}</span>
                  </label>
                </li>
              ))}
            </ul>
          </section>
        ) : null}
        {conflicts.length > 0 ? (
          <section>
            <h3 className="text-sm font-medium">
              Conflicts <span className="text-muted-foreground">({conflicts.length})</span>
            </h3>
            <p className="text-xs text-muted-foreground">
              Pick which value to keep for each field.
            </p>
            <ul className="mt-2 space-y-2">
              {conflicts.map((c) => (
                <li
                  key={c.field}
                  className="grid grid-cols-[1fr_auto_auto] items-center gap-2 rounded border p-2 text-sm"
                >
                  <span className="font-medium">{c.field}</span>
                  <Button
                    size="sm"
                    variant={conflictChoice[c.field] === 'crm' ? 'default' : 'outline'}
                    onClick={() => setConflictChoice((prev) => ({ ...prev, [c.field]: 'crm' }))}
                  >
                    Keep: {String(c.crmValue)}
                  </Button>
                  <Button
                    size="sm"
                    variant={conflictChoice[c.field] === 'pdf' ? 'default' : 'outline'}
                    onClick={() => setConflictChoice((prev) => ({ ...prev, [c.field]: 'pdf' }))}
                  >
                    Use PDF: {String(c.pdfValue)} ({Math.round(c.pdfConfidence * 100)}%)
                  </Button>
                </li>
              ))}
            </ul>
          </section>
        ) : null}
        <DialogFooter>
          <Button variant="outline" onClick={onClose}>
            Cancel
          </Button>
          <Button onClick={() => apply.mutate()} disabled={apply.isPending}>
            {apply.isPending ? 'Applying…' : 'Apply'}
          </Button>
        </DialogFooter>
      </DialogContent>
    </Dialog>
  );
}
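
The payload-building step inside `mutationFn` could also be pulled out into a pure helper, which would make the opt-out and conflict-choice rules unit-testable without rendering the dialog. A minimal sketch, assuming a hypothetical `buildFieldsToApply` helper that is not part of the file above; the two interfaces are repeated here only so the block stands alone.

```typescript
// Hypothetical helper, not in the original component. It mirrors the logic
// of the mutationFn above so it can be tested in isolation.
interface AutoAppliedField {
  field: string;
  value: string | number;
}

interface ConflictField {
  field: string;
  crmValue: string | number | null;
  pdfValue: string | number | null;
  pdfConfidence: number;
}

type ConflictChoice = 'pdf' | 'crm';

function buildFieldsToApply(
  autoApplied: AutoAppliedField[],
  autoChecked: Record<string, boolean>,
  conflicts: ConflictField[],
  conflictChoice: Record<string, ConflictChoice>,
): Record<string, string | number> {
  const fieldsToApply: Record<string, string | number> = {};

  // Auto-applied fields are sent only while their checkbox is still checked.
  for (const f of autoApplied) {
    if (autoChecked[f.field]) fieldsToApply[f.field] = f.value;
  }

  // Conflicts are sent only when the rep chose the PDF value and that
  // value is non-null; 'crm' leaves the field out of the payload.
  for (const c of conflicts) {
    if (conflictChoice[c.field] === 'pdf' && c.pdfValue != null) {
      fieldsToApply[c.field] = c.pdfValue;
    }
  }

  return fieldsToApply;
}

// Usage with made-up field names and values:
const payload = buildFieldsToApply(
  [
    { field: 'length_ft', value: 120 },
    { field: 'width_ft', value: 30 },
  ],
  { length_ft: true, width_ft: false },
  [
    { field: 'mooring_number', crmValue: 'A1', pdfValue: 'A2', pdfConfidence: 0.9 },
    { field: 'water_depth_ft', crmValue: 12, pdfValue: 14, pdfConfidence: 0.6 },
  ],
  { mooring_number: 'pdf', water_depth_ft: 'crm' },
);
console.log(payload);
```

With a helper like this, `mutationFn` would reduce to building the payload and calling `apiFetch`, and the selection rules could be covered by plain unit tests.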