src/lib/services/expense-ocr.service.ts

/**
 * Claude Vision-driven OCR for expense receipts. PR1 stub: types and the
 * service contract. The actual API call wires up in PR9 with prompt
 * caching on the system text and Haiku 4.5 by default.
 */

export interface OcrLineItem {
  description: string;
  quantity?: number;
  unitPrice?: number;
  amount: number;
}

export interface OcrResult {
  vendor: string | null;
  amount: number | null;
  currency: string | null;
  /** ISO date YYYY-MM-DD. */
  date: string | null;
  lineItems: OcrLineItem[];
  /** 0..1; below 0.6 surfaces "verify mode" UI. */
  confidence: number;
}

export interface OcrContext {
  fileId: string;
  fileUrl: string;
  /** Optional MIME hint; the service still detects from bytes. */
  mimeType?: string;
}

/** Cost ceiling per call (Haiku 4.5 + cached system prompt). PR9 enforces. */
export const OCR_MAX_TOKENS = 1024;
/** Results with confidence below this value surface the "verify mode" UI. */
export const OCR_LOW_CONFIDENCE_THRESHOLD = 0.6;

/** Stub - returns "pending" shape so callers can wire UI in PR1 without
 *  Anthropic credentials. */
export async function ocrReceipt(_ctx: OcrContext): Promise<OcrResult> {
  return {
    vendor: null,
    amount: null,
    currency: null,
    date: null,
    lineItems: [],
    confidence: 0,
  };
}
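
A minimal sketch of how a caller might consume this contract while the stub is in place. The `ReviewMode` type and `reviewModeFor` helper are hypothetical names introduced for illustration only; the types and threshold are repeated locally so the block compiles on its own. It assumes that an all-null result with no line items means OCR has not run yet, and that "below 0.6" is a strict comparison.

```typescript
// Local copies of the types and constant defined above, repeated here
// only so the sketch compiles on its own.
interface OcrLineItem {
  description: string;
  quantity?: number;
  unitPrice?: number;
  amount: number;
}

interface OcrResult {
  vendor: string | null;
  amount: number | null;
  currency: string | null;
  date: string | null;
  lineItems: OcrLineItem[];
  confidence: number;
}

const OCR_LOW_CONFIDENCE_THRESHOLD = 0.6;

// Hypothetical UI states; not part of the service contract.
type ReviewMode = "pending" | "verify" | "auto";

// Hypothetical helper: maps an OCR result to the UI state a form could show.
function reviewModeFor(result: OcrResult): ReviewMode {
  // Assumption: the stub's all-null shape means "nothing extracted yet".
  const nothingExtracted =
    result.vendor === null &&
    result.amount === null &&
    result.currency === null &&
    result.date === null &&
    result.lineItems.length === 0;
  if (nothingExtracted) return "pending";

  // Strictly below the threshold asks the user to verify the extracted fields.
  return result.confidence < OCR_LOW_CONFIDENCE_THRESHOLD ? "verify" : "auto";
}
```

With the PR1 stub, every receipt would land in the "pending" state under these assumptions, so the UI can be wired before real OCR output exists.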

feat(insights): Phase B schema + service skeletons

PR1 of Phase B per docs/superpowers/specs/2026-04-28-phase-b-insights-alerts-design.md. Lays the foundation that PRs 2-10 will fill in with behaviour.

Schema (migration 0014):
- alerts table with rule-engine fields (rule_id, severity, link, entity_type/id, fingerprint, fired/dismissed/acknowledged/resolved timestamps, jsonb metadata). Partial-unique fingerprint index keeps one open row per (port, rule, entity); separate indexes power severity-filtered and time-ordered queries.
- analytics_snapshots (port_id, metric_id) -> jsonb cache + computedAt for the 15-min recurring refresh.
- expenses: duplicate_of self-FK, dedup_scanned_at, ocr_status/raw/confidence; partial index on (port, vendor, amount, date) where duplicate_of IS NULL drives the dedup heuristic.
- audit_logs.search_text: GENERATED ALWAYS tsvector over action+entity_type+entity_id+user_id, GIN-indexed (drizzle can't model GENERATED ALWAYS in TS yet, so the migration appends manual ALTER + the GIN index).

Service skeletons in src/lib/services/:
- alerts.service.ts: fingerprintFor, reconcileAlertsForPort (upsert + auto-resolve), dismiss, acknowledge, listAlertsForPort.
- alert-rules.ts: RULE_REGISTRY of 10 rule evaluators (currently no-op); PR2 fills in the bodies.
- analytics.service.ts: readSnapshot/writeSnapshot with 15-min TTL + no-op compute* stubs for the four chart series; PR3 fills behavior.
- expense-dedup.service.ts: scanForDuplicates + markBestDuplicate using the partial dedup index. PR8 wires the BullMQ trigger.
- expense-ocr.service.ts: OcrResult/OcrLineItem types + ocrReceipt stub. PR9 wires Claude Vision (Haiku 4.5 + ephemeral system-prompt cache).
- audit-search.service.ts: tsvector @@ plainto_tsquery + cursor pagination on (createdAt, id). PR10 wires the admin UI.

tsc clean, lint clean, vitest 675/675 (one unrelated AES random-output flake passes solo).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:43:01 +02:00
			`* Claude Vision-driven OCR for expense receipts. PR1 stub: types and the`
			`* service contract. The actual API call wires up in PR9 with prompt`
			`* caching on the system text and Haiku 4.5 by default.`
			`*/`

			`export interface OcrLineItem {`
			`description: string;`
			`quantity?: number;`
			`unitPrice?: number;`
			`amount: number;`
			`}`

			`export interface OcrResult {`
			`vendor: string \| null;`
			`amount: number \| null;`
			`currency: string \| null;`
			`/** ISO date YYYY-MM-DD. */`
			`date: string \| null;`
			`lineItems: OcrLineItem[];`
			`/** 0..1; below 0.6 surfaces "verify mode" UI. */`
			`confidence: number;`
			`}`

			`export interface OcrContext {`
			`fileId: string;`
			`fileUrl: string;`
			`/** Optional MIME hint; the service still detects from bytes. */`
			`mimeType?: string;`
			`}`

			`/** Cost ceiling per call (Haiku 4.5 + cached system prompt). PR9 enforces. */`
			`export const OCR_MAX_TOKENS = 1024;`
			`export const OCR_LOW_CONFIDENCE_THRESHOLD = 0.6;`


chore(style): codebase em-dash sweep + minor layout polish

Replaces every em-dash and en-dash with regular ASCII hyphens across comments, JSX strings, and dev-facing logs. Mostly cosmetic but stops the inconsistent mix that crept in over the last few months (some files used em-dashes in comments, others didn't, some used both).

Bundles two small dashboard-layout tweaks that touch a couple of already-modified files:
- (dashboard)/layout.tsx main padding goes from p-6 to pt-3 px-6 pb-6 so page content sits closer to the topbar.
- Sidebar now receives the ports list it needs for the footer port switcher.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:57:01 +02:00