Files
pn-new-crm/src/lib/services/ocr-providers.ts

197 lines
5.5 KiB
TypeScript
Raw Normal View History

feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
/**
* Receipt OCR provider adapters. Each adapter takes raw image bytes
* and returns a normalized `ParsedReceipt` shape; callers don't care
* which provider produced it.
*/
import OpenAI from 'openai';
import { logger } from '@/lib/logger';
export interface ParsedReceiptLineItem {
description: string;
amount: number;
}
export interface ParsedReceipt {
establishment: string | null;
/** ISO YYYY-MM-DD. */
date: string | null;
amount: number | null;
currency: string | null;
lineItems: ParsedReceiptLineItem[];
/** 0..1; below 0.6 surfaces "verify mode" UI. */
confidence: number;
}
export interface OcrUsage {
inputTokens: number;
outputTokens: number;
requestId: string | null;
}
export interface OcrRunResult {
parsed: ParsedReceipt;
usage: OcrUsage;
}
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
const EMPTY_RESULT: ParsedReceipt = {
establishment: null,
date: null,
amount: null,
currency: null,
lineItems: [],
confidence: 0,
};
const SYSTEM_PROMPT =
'You extract structured data from a marina-business receipt image. Return ONLY a JSON object with these keys: establishment (string), date (ISO YYYY-MM-DD), amount (number, total), currency (3-letter ISO code), lineItems (array of {description, amount}), confidence (number 0-1). If a field cannot be read, return null for that field. Set confidence near 0 if the image is unreadable, near 1 if every field was confidently extracted.';
interface RunArgs {
imageBuffer: Buffer;
mimeType: string;
apiKey: string;
model: string;
}
function safeParse(content: string): ParsedReceipt {
const cleaned = content.replace(/```json\n?|\n?```/g, '').trim();
try {
const obj = JSON.parse(cleaned) as Partial<ParsedReceipt>;
return {
establishment: obj.establishment ?? null,
date: obj.date ?? null,
amount: typeof obj.amount === 'number' ? obj.amount : null,
currency: obj.currency ?? null,
lineItems: Array.isArray(obj.lineItems) ? obj.lineItems : [],
confidence: typeof obj.confidence === 'number' ? obj.confidence : 0,
};
} catch (err) {
logger.warn({ err, contentLen: cleaned.length }, 'OCR provider returned non-JSON');
return EMPTY_RESULT;
}
}
async function runOpenAi({ imageBuffer, mimeType, apiKey, model }: RunArgs): Promise<OcrRunResult> {
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
const client = new OpenAI({ apiKey });
const base64 = imageBuffer.toString('base64');
const response = await client.chat.completions.create({
model,
messages: [
{ role: 'system', content: SYSTEM_PROMPT },
{
role: 'user',
content: [
{ type: 'text', text: 'Extract the receipt as JSON.' },
{
type: 'image_url',
image_url: { url: `data:${mimeType};base64,${base64}` },
},
],
},
],
max_tokens: 1024,
response_format: { type: 'json_object' },
});
const parsed = safeParse(response.choices[0]?.message?.content ?? '{}');
return {
parsed,
usage: {
inputTokens: response.usage?.prompt_tokens ?? 0,
outputTokens: response.usage?.completion_tokens ?? 0,
requestId: response.id ?? null,
},
};
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
}
async function runClaude({ imageBuffer, mimeType, apiKey, model }: RunArgs): Promise<OcrRunResult> {
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
const base64 = imageBuffer.toString('base64');
const res = await fetch('https://api.anthropic.com/v1/messages', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-api-key': apiKey,
'anthropic-version': '2023-06-01',
},
body: JSON.stringify({
model,
max_tokens: 1024,
system: SYSTEM_PROMPT,
messages: [
{
role: 'user',
content: [
{
type: 'image',
source: { type: 'base64', media_type: mimeType, data: base64 },
},
{ type: 'text', text: 'Extract the receipt as JSON.' },
],
},
],
}),
});
if (!res.ok) {
const detail = await res.text().catch(() => '');
throw new Error(`Claude API ${res.status}: ${detail.slice(0, 200)}`);
}
const body = (await res.json()) as {
id?: string;
content?: Array<{ type: string; text?: string }>;
usage?: { input_tokens?: number; output_tokens?: number };
};
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
const text = body.content?.find((c) => c.type === 'text')?.text ?? '{}';
const parsed = safeParse(text);
return {
parsed,
usage: {
inputTokens: body.usage?.input_tokens ?? 0,
outputTokens: body.usage?.output_tokens ?? 0,
requestId: body.id ?? null,
},
};
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
}
export async function runOcr(args: {
provider: 'openai' | 'claude';
imageBuffer: Buffer;
mimeType: string;
apiKey: string;
model: string;
}): Promise<OcrRunResult> {
feat(phase-b): ship analytics dashboard, alerts, scanner PWA, dedup, audit view Phase B (Insights & Alerts) PR4-11 in one drop. Builds on the schema + service skeletons committed in PRs 1-3. PR4 Analytics dashboard — 4 chart types (funnel/timeline/breakdown/source), date-range picker (today/7d/30d/90d), CSV+PNG export per card. PR5 Alert rail UI + /alerts page — topbar bell w/ live count, dashboard right-rail, three-tab page (active/dismissed/resolved), socket-driven invalidation. Bell lazy-loads list on popover open to keep cold pages fast in non-dashboard routes. PR6 EOI queue tab on documents hub — filters to in-flight EOIs, count surfaces in tab label. PR7 Interests-by-berth tab on berth detail — replaces the stub. PR8 Expense duplicate detection — BullMQ job runs scan on create, yellow banner on detail w/ Merge / Not-a-duplicate, transactional merge consolidates receipts and archives the source. PR9 Receipt scanner PWA + multi-provider AI — port-scoped /scan route in its own (scanner) group with no dashboard chrome, dynamic per-port manifest, OpenAI + Claude provider abstraction, admin OCR settings page (port-level + super-admin global default w/ opt-in fallback), test-connection endpoint, manual-entry fallback when no key is configured. Verify form always shown before save — no ghost rows. PR10 Audit log read view — swap to tsvector full-text search on the existing GIN index, cursor pagination, filters for entity/action/user /date range, batched actor-email resolution. PR11 Real-API tests — opt-in receipt-ocr.spec (admin save+test, optional real-receipt parse via REALAPI_RECEIPT_FIXTURE) and alert-engine socket-fanout spec gated behind RUN_ALERT_ENGINE_REALAPI. Both skip cleanly without their gate envs so CI stays green. Test totals: vitest 690 -> 713, smoke 130 -> 138, realapi +2 opt-in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:21:55 +02:00
if (args.provider === 'openai') return runOpenAi(args);
return runClaude(args);
}
/**
* Tiny dummy-image probe used by the admin "Test connection" button.
* Returns the raw HTTP status so callers can render plain-English errors.
*/
export async function testProvider(
provider: 'openai' | 'claude',
apiKey: string,
model: string,
): Promise<{ ok: true } | { ok: false; reason: string }> {
// 1×1 transparent PNG.
const pixelPng = Buffer.from(
'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII=',
'base64',
);
try {
await runOcr({
provider,
imageBuffer: pixelPng,
mimeType: 'image/png',
apiKey,
model,
});
return { ok: true };
} catch (err) {
const reason = err instanceof Error ? err.message : 'Unknown error';
return { ok: false, reason };
}
}
export const OCR_FEATURE = 'ocr_receipt';
export const OCR_ESTIMATED_TOKENS = 2048;