feat(scan): compress phone-photo receipts before upload (browser-image-compression)

Phase 3 — wires `browser-image-compression` into the scan-shell so 4-12 MB
phone photos get crushed to ~500 KB in a WebWorker before any other work
happens. Receipts come back from tesseract + the AI parse much faster on
mobile bandwidth, and the server's sharp pipeline has less to chew on.

compressReceiptIfHeavy(file):
  - Pass-through for SVGs / PDFs / non-images
  - Pass-through for files already under 1 MB
  - Otherwise: imageCompression with maxSizeMB: 0.5, maxWidthOrHeight:
    2000, useWebWorker: true, preserveExif: false (auto-rotate to EXIF
    orientation then strip metadata so the receipt isn't sideways)
  - PNG → JPEG transcode (smaller for natural photo content)
  - Initial quality 0.85 — Tesseract's sweet spot for receipt text
  - Lazy-loaded import: the WebWorker bundle isn't on the critical path
  - try/catch fallback: if compression itself throws, fall through to
    the original file so a corner-case bug never blocks a save
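
The pass-through rules above reduce to a small predicate. A minimal sketch, assuming hypothetical helper names (`shouldCompress`, `outputMimeType`) that are not part of this commit; only the 1 MB threshold, the SVG / non-image exclusions, and the PNG-to-JPEG mapping come from the list above:

```typescript
// Hypothetical helpers for illustration; not the code in this commit.
// Only `type` and `size` are needed, so a plain object stands in for File.
interface FileMeta {
  type: string; // browser-reported MIME type
  size: number; // bytes
}

const HEAVY_THRESHOLD_BYTES = 1 * 1024 * 1024; // 1 MB pass-through limit

// True only for raster images at or above the threshold.
function shouldCompress(meta: FileMeta): boolean {
  if (!meta.type.startsWith('image/')) return false; // PDFs and other non-images
  if (meta.type === 'image/svg+xml') return false; // vector, nothing to resample
  return meta.size >= HEAVY_THRESHOLD_BYTES;
}

// PNG photos are transcoded to JPEG; every other type keeps its MIME type.
function outputMimeType(inputType: string): string {
  return inputType === 'image/png' ? 'image/jpeg' : inputType;
}
```

Keeping the decision separate from the compression call makes the thresholds easy to unit-test without a browser or a worker.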

Wired into handleFile(rawFile) before tesseract runs and before the
receipt is sent to /api/v1/expenses/scan-receipt. Downstream upload
through handleSubmit() also benefits because the same compressed File
flows through.
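
The ordering described above (compress once, fall back to the original on failure, then hand the same value to every consumer) can be sketched generically. All names here (`compressOrOriginal`, `prepareThenFanOut`, the `Step` and `Sink` types) are hypothetical stand-ins; the real scan-shell runs OCR and upload at different points rather than in one loop:

```typescript
// Hypothetical stand-ins for illustration; not the scan-shell code.
type Step<T> = (input: T) => Promise<T>;
type Sink<T> = (input: T) => Promise<void>;

// Run the compressor, but never let its failure reach the caller.
async function compressOrOriginal<T>(raw: T, compress: Step<T>): Promise<T> {
  try {
    return await compress(raw);
  } catch {
    return raw; // fall back to the untouched input
  }
}

// Prepare the input once, then pass that same value to each consumer in order.
async function prepareThenFanOut<T>(
  raw: T,
  compress: Step<T>,
  sinks: Sink<T>[],
): Promise<T> {
  const prepared = await compressOrOriginal(raw, compress);
  for (const sink of sinks) {
    await sink(prepared);
  }
  return prepared;
}
```

Because the fallback lives in one place, neither the OCR step nor the upload step needs to know whether compression succeeded.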

Concrete impact for a 12 MP iPhone receipt (~8 MB):
  Before: 8 MB upload, 8 MB tesseract input
  After:  ~500 KB upload, 2000px max edge tesseract input

Bandwidth + battery + perceived latency win on the mobile expense
scanner path. No behaviour change for desktop file uploads under 1 MB.
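
For a rough sense of scale, the before/after sizes above can be turned into a reduction factor and an upload-time estimate. A back-of-the-envelope sketch: the 5 Mbit/s uplink is an assumed figure, not a measurement from this commit, and MB is read as 1024 * 1024 bytes to match the threshold in the code:

```typescript
// Illustrative arithmetic only; the uplink speed is an assumption.
const KIB = 1024;
const MIB = 1024 * KIB;

// Seconds to push `bytes` over a link of `uplinkMbps` megabits per second,
// ignoring latency, protocol overhead, and retries.
function uploadSeconds(bytes: number, uplinkMbps: number): number {
  return (bytes * 8) / (uplinkMbps * 1e6);
}

const beforeBytes = 8 * MIB; // ~8 MB phone photo
const afterBytes = 500 * KIB; // ~500 KB compression target
const reduction = beforeBytes / afterBytes; // roughly 16x fewer bytes

const ASSUMED_UPLINK_MBPS = 5; // hypothetical mobile uplink
const beforeSeconds = uploadSeconds(beforeBytes, ASSUMED_UPLINK_MBPS);
const afterSeconds = uploadSeconds(afterBytes, ASSUMED_UPLINK_MBPS);
```

Upload time scales linearly with size in this model, so the time saved on the wire follows the same roughly 16x ratio regardless of the uplink chosen.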

1298/1298 vitest green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 21:21:37 +02:00
parent d8f1c0c34e
commit 18b6827b77
3 changed files with 58 additions and 1 deletions


@@ -25,6 +25,41 @@ import { cn } from '@/lib/utils';
import { EXPENSE_CATEGORIES, PAYMENT_METHODS } from '@/lib/constants';
import { runTesseract } from '@/lib/ocr/tesseract-client';
+// Lazy-loaded compression: the worker bundle isn't on the critical path,
+// and most users won't reach this code without first granting camera/file
+// access, by which point the module is already paged in.
+async function compressReceiptIfHeavy(file: File): Promise<File> {
+  // Only compress raster images of 1 MB or more; PDFs, SVGs, and small files
+  // pass through. The check uses the browser-reported MIME type, not magic
+  // bytes: the file comes from the picker, whose output is already trusted.
+  if (!file.type.startsWith('image/') || file.type === 'image/svg+xml') return file;
+  if (file.size < 1 * 1024 * 1024) return file;
+  const { default: imageCompression } = await import('browser-image-compression');
+  try {
+    const compressed = await imageCompression(file, {
+      maxSizeMB: 0.5, // ~500 KB target; plenty of resolution for OCR
+      maxWidthOrHeight: 2000, // tesseract.js's sweet spot for receipt text
+      useWebWorker: true, // off the main thread; UI stays responsive
+      // Auto-rotate to EXIF orientation, strip metadata. Phones often
+      // store the rotation as EXIF rather than rotating pixels; without
+      // this the receipt comes out sideways and OCR confidence tanks.
+      preserveExif: false,
+      fileType: file.type === 'image/png' ? 'image/jpeg' : file.type,
+      initialQuality: 0.85,
+    });
+    // browser-image-compression typings always say `File`, but in some
+    // runtimes the value comes through as a plain Blob. Belt-and-suspenders:
+    // wrap in a File so downstream FormData uses the original filename.
+    const blob = compressed as unknown as Blob;
+    if (typeof File !== 'undefined' && blob instanceof File) return blob;
+    return new File([blob], file.name, { type: blob.type });
+  } catch {
+    // Fall back to the original; we don't want a corner-case compression
+    // bug to block the user from saving an expense.
+    return file;
+  }
+}
// ─── Types ────────────────────────────────────────────────────────────────────
interface ParsedReceipt {
@@ -301,7 +336,13 @@ export function ScanShell() {
};
}, [imagePreview]);
-  async function handleFile(file: File) {
+  async function handleFile(rawFile: File) {
+    // Compress oversized phone photos to ~500 KB in a WebWorker BEFORE
+    // we hand the bytes to tesseract or the server. Receipts from 12MP
+    // cameras are usually 4-8 MB; this drops them to ~250-500 KB without
+    // visible quality loss for text OCR. Mobile bandwidth + the server's
+    // sharp pipeline both benefit.
+    const file = await compressReceiptIfHeavy(rawFile);
if (imagePreview) URL.revokeObjectURL(imagePreview);
setImagePreview(URL.createObjectURL(file));
setState({ kind: 'processing', engine: 'tesseract' });