feat(ocr): Tesseract.js as default scanner, AI as opt-in per port

The mobile receipt scanner now runs Tesseract.js in-browser by default:
on-device and free, with image bytes never leaving the device. AI providers
(OpenAI / Claude) become a per-port opt-in for higher accuracy on
hard-to-read receipts.

- Lazy-load Tesseract WASM in src/lib/ocr/tesseract-client.ts (the 5 MB
  bundle is dynamically imported on first scan, not shipped in the main
  chunk)
- Heuristic parser src/lib/ocr/parse-receipt-text.ts extracts vendor,
  date, amount, currency, and line items from raw OCR text
- New port-scoped aiEnabled flag on OcrConfig (defaults false). Resolved
  flag never inherits from the global row — each port admin opts in
  independently
- Scan endpoint short-circuits to manual-mode when aiEnabled=false so
  the AI provider is never invoked unless the admin has flipped the
  switch
- Scan UI runs Tesseract first, then asks the server whether AI is
  enabled; it uses the AI result only when its confidence beats the
  Tesseract result, and network failures degrade gracefully to the
  local parse
- Admin OCR-settings form gains the per-port aiEnabled checkbox
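
A minimal sketch of the flag resolution and result selection described in the list above, for reviewers who want the intended behaviour in one place. The type shapes and the names resolveAiEnabled, pickResult, scanReceipt, and requestAiParse are hypothetical, chosen for illustration; they are not taken from the changed files, and the real code may be structured differently.

```typescript
// Hypothetical shapes: the real OcrConfig and parse-result types are not shown here.
interface ParsedReceipt {
  vendor: string | null;
  amount: number | null;
  confidence: number; // assumed scale 0..1, higher is better
}

interface OcrConfigRow {
  aiEnabled?: boolean;
}

// Port-scoped flag: only the port row counts. The global row is accepted
// but deliberately ignored, so a port never inherits aiEnabled from it.
function resolveAiEnabled(
  portRow: OcrConfigRow | null,
  _globalRow: OcrConfigRow | null,
): boolean {
  return portRow?.aiEnabled === true;
}

// Prefer the AI parse only when it exists and is strictly more confident.
function pickResult(local: ParsedReceipt, ai: ParsedReceipt | null): ParsedReceipt {
  return ai !== null && ai.confidence > local.confidence ? ai : local;
}

// Local-first scan. `requestAiParse` stands in for the server call and is
// assumed to resolve to null when AI is disabled for the port (manual mode).
async function scanReceipt(
  local: ParsedReceipt,
  requestAiParse: () => Promise<ParsedReceipt | null>,
): Promise<ParsedReceipt> {
  try {
    return pickResult(local, await requestAiParse());
  } catch {
    // Network or server failure: keep the on-device parse.
    return local;
  }
}
```

Keeping pickResult pure means the "AI only when it beats Tesseract" rule can be unit-tested without mocking the network.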

Tests: 756/756 vitest (was 747) — +7 parser unit tests, +2 aiEnabled
config tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matt Ciaccio
2026-04-28 19:46:29 +02:00
parent 46937bbcb9
commit 2cf1bd9754
11 changed files with 693 additions and 38 deletions

@@ -28,6 +28,7 @@ interface ConfigResp {
model: string;
hasApiKey: boolean;
useGlobal: boolean;
aiEnabled: boolean;
};
models: Record<Provider, string[]>;
}
@@ -56,6 +57,7 @@ function SettingsBlock({ scope, title, description, showUseGlobal }: SettingsBlo
const [apiKey, setApiKey] = useState('');
const [showKey, setShowKey] = useState(false);
const [useGlobal, setUseGlobal] = useState(false);
const [aiEnabled, setAiEnabled] = useState(false);
const [testStatus, setTestStatus] = useState<null | { ok: true } | { ok: false; reason: string }>(
null,
);
@@ -65,6 +67,7 @@ function SettingsBlock({ scope, title, description, showUseGlobal }: SettingsBlo
setProvider(data.data.provider);
setModel(data.data.model);
setUseGlobal(data.data.useGlobal);
setAiEnabled(data.data.aiEnabled);
}, [data?.data]);
const save = useMutation({
@@ -78,6 +81,7 @@ function SettingsBlock({ scope, title, description, showUseGlobal }: SettingsBlo
apiKey: apiKey.length > 0 ? apiKey : undefined,
clearApiKey: Boolean(clearApiKey),
useGlobal: scope === 'global' ? false : useGlobal,
aiEnabled: scope === 'global' ? false : aiEnabled,
},
}),
onSuccess: () => {
@@ -143,6 +147,26 @@ function SettingsBlock({ scope, title, description, showUseGlobal }: SettingsBlo
</div>
) : null}
{scope === 'port' ? (
<div className="flex items-start gap-2 rounded-lg border border-border bg-muted/30 p-3">
<Checkbox
id={`aiEnabled-${scope}`}
checked={aiEnabled}
onCheckedChange={(v) => setAiEnabled(v === true)}
/>
<div className="space-y-0.5">
<Label htmlFor={`aiEnabled-${scope}`} className="text-sm font-medium">
Enable AI receipt parsing for this port
</Label>
<p className="text-xs text-muted-foreground">
Off by default. Receipts are read on-device using Tesseract.js, which is accurate
enough for most receipts and incurs no AI cost. Turning this on lets the configured
provider re-parse receipts server-side for higher accuracy on hard-to-read images.
</p>
</div>
</div>
) : null}
<div className="grid grid-cols-1 gap-4 sm:grid-cols-2">
<div className="space-y-1.5">
<Label htmlFor={`provider-${scope}`}>Provider</Label>
@@ -267,14 +291,14 @@ export function OcrSettingsForm() {
<PageHeader
title="Receipt OCR"
eyebrow="Admin"
description="Configure the AI provider used to read receipts captured via the mobile scanner."
description="Receipts are scanned on-device by default. Optionally configure an AI provider for higher-accuracy parsing on tricky receipts."
variant="gradient"
/>
<SettingsBlock
scope="port"
title="This port"
description="Provider and key used when staff at this port scan a receipt."
description="Optional AI provider for staff at this port. Tesseract.js handles all scans on-device until AI is enabled."
showUseGlobal
/>