Adds a token-denominated guardrail in front of every server-side AI call
so a misconfigured port can't run up an unbounded bill. Soft caps surface
a banner; hard caps refuse new requests until the period rolls over.
Usage flows into a feature-typed ledger so future AI surfaces (summary,
embeddings, reply-draft) can drop in without schema changes.
- New table ai_usage_ledger (port, user, feature, provider, model,
input/output/total tokens, request id) with two indexes for rollup
- New service ai-budget.service.ts: getAiBudget/setAiBudget,
checkBudget (pre-flight gate), recordAiUsage, currentPeriodTokens,
periodBreakdown — all token-based, period boundaries in UTC
- runOcr now returns provider usage so the route can record the actual
spend instead of estimating
- Scan-receipt route gates on checkBudget before invoking AI; returns
source: manual / reason: budget-exceeded when blocked, surfaces
softCapWarning on the success path
- Admin UI: new AiBudgetCard on the OCR settings page — shows current
spend, per-feature breakdown, soft/hard cap inputs, period selector
- Permission: admin.manage_settings on both routes
Tests: 766/766 vitest (was 756) — +10 budget tests covering enforce/
disabled/cap-exceed/estimate-exceed/soft-warn/period boundaries/
cross-port isolation/silent ledger failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>