feat(ocr): Tesseract.js as default scanner, AI as opt-in per port

The mobile receipt scanner now runs Tesseract.js in-browser by default:
it is on-device and free, and image bytes never leave the device. AI
providers (OpenAI / Claude) become a per-port opt-in for higher accuracy
on hard-to-read receipts.

- Lazy-load Tesseract WASM in src/lib/ocr/tesseract-client.ts (the 5 MB
  bundle is dynamically imported on first scan and stays out of the
  main chunk)
- Heuristic parser src/lib/ocr/parse-receipt-text.ts extracts vendor,
  date, amount, currency, and line items from raw OCR text
- New port-scoped aiEnabled flag on OcrConfig (defaults false). Resolved
  flag never inherits from the global row — each port admin opts in
  independently
- Scan endpoint short-circuits to manual-mode when aiEnabled=false so
  the AI provider is never invoked unless the admin has flipped the
  switch
- Scan UI runs Tesseract first, then asks the server whether AI is
  enabled. It uses the AI result only when its confidence beats
  Tesseract's; network failures degrade gracefully to the local parse
- Admin OCR-settings form gains the per-port aiEnabled checkbox
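
The scan-UI decision described in the list above can be sketched roughly as follows. This is an illustration only, not the code in this commit: `ParsedReceipt`, `AiScan`, and `pickScanResult` are hypothetical names, the confidence scale is assumed to be 0..1, and "beats" is assumed to mean strictly greater.

```typescript
// Hypothetical result shape; the real parser output type is not shown in this commit.
interface ParsedReceipt {
  vendor: string | null;
  amount: number | null;
  confidence: number; // assumed 0..1, higher is better
}

// Hypothetical server call: resolves to null when AI is disabled for the port,
// and rejects on network or provider failure.
type AiScan = () => Promise<ParsedReceipt | null>;

async function pickScanResult(
  local: ParsedReceipt,
  scanWithAi: AiScan,
): Promise<ParsedReceipt> {
  try {
    const ai = await scanWithAi();
    // Use the AI result only when it strictly beats the local parse.
    if (ai !== null && ai.confidence > local.confidence) {
      return ai;
    }
  } catch {
    // Network or provider failure: fall through to the local parse.
  }
  return local;
}
```

The local Tesseract parse is always computed first, so every failure path still has a result to show the user.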

Tests: 756/756 vitest (was 747) — +7 parser unit tests, +2 aiEnabled
config tests.
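
The behaviour the two aiEnabled config tests pin down might look something like the sketch below. It is a minimal illustration under stated assumptions: `OcrConfigRow`, `ResolvedOcrConfig`, and `resolveOcrConfig` are hypothetical names, and the fallback to the global row for provider, model, and key is assumed rather than confirmed by this commit.

```typescript
// Hypothetical row shape; only some OcrConfig fields are visible in this commit.
interface OcrConfigRow {
  provider: string;
  model: string;
  apiKey: string | null;
  aiEnabled: boolean;
}

interface ResolvedOcrConfig {
  provider: string;
  model: string;
  apiKey: string | null;
  aiEnabled: boolean;
}

function resolveOcrConfig(
  portRow: OcrConfigRow | null,
  globalRow: OcrConfigRow | null,
): ResolvedOcrConfig | null {
  // Assumption: provider settings fall back to the global row when the port has none.
  const base = portRow ?? globalRow;
  if (base === null) {
    return null;
  }
  return {
    provider: base.provider,
    model: base.model,
    apiKey: base.apiKey,
    // The flag is read from the port row only; the global row never contributes.
    aiEnabled: portRow !== null && portRow.aiEnabled,
  };
}
```

Reading the flag only from the port row means a global `aiEnabled: true` cannot switch AI on for a port whose admin has not opted in.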

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matt Ciaccio
2026-04-28 19:46:29 +02:00
parent 46937bbcb9
commit 2cf1bd9754
11 changed files with 693 additions and 38 deletions

@@ -115,6 +115,38 @@ describe('OCR config', () => {
    expect(resolved.model).toBe('gpt-4o');
  });

  it('aiEnabled defaults to false and round-trips when toggled', async () => {
    const port = await makePort();
    await saveOcrConfig(
      port.id,
      { provider: 'openai', model: 'gpt-4o-mini', apiKey: 'sk-x' },
      'user-1',
    );
    let resolved = await getResolvedOcrConfig(port.id);
    expect(resolved.aiEnabled).toBe(false);
    await saveOcrConfig(
      port.id,
      { provider: 'openai', model: 'gpt-4o-mini', aiEnabled: true },
      'user-1',
    );
    resolved = await getResolvedOcrConfig(port.id);
    expect(resolved.aiEnabled).toBe(true);
    expect(resolved.apiKey).toBe('sk-x'); // not wiped by the toggle
  });

  it('aiEnabled is forced false at global scope', async () => {
    await saveOcrConfig(
      null,
      { provider: 'openai', model: 'gpt-4o-mini', apiKey: 'g', aiEnabled: true },
      'user-1',
    );
    const port = await makePort();
    const resolved = await getResolvedOcrConfig(port.id);
    // Resolved AI flag is per-port, not inherited from global.
    expect(resolved.aiEnabled).toBe(false);
  });

  it('global rows force useGlobal=false on save (not meaningful at global scope)', async () => {
    await saveOcrConfig(
      null,