feat(ai): per-port token budgets + usage ledger for AI features

Adds a token-denominated guardrail in front of every server-side AI call so a misconfigured port can't run up an unbounded bill. Soft caps surface a banner; hard caps refuse new requests until the period rolls over. Usage flows into a feature-typed ledger so future AI surfaces (summary, embeddings, reply-draft) can drop in without schema changes.

- New table ai_usage_ledger (port, user, feature, provider, model, input/output/total tokens, request id) with two indexes for rollup
- New service ai-budget.service.ts: getAiBudget/setAiBudget, checkBudget (pre-flight gate), recordAiUsage, currentPeriodTokens, periodBreakdown; all token-based, with period boundaries in UTC
- runOcr now returns provider usage so the route can record the actual spend instead of estimating
- Scan-receipt route gates on checkBudget before invoking AI; returns source: manual / reason: budget-exceeded when blocked, surfaces softCapWarning on the success path
- Admin UI: new AiBudgetCard on the OCR settings page, showing current spend, per-feature breakdown, soft/hard cap inputs, period selector
- Permission: admin.manage_settings on both routes

Tests: 766/766 vitest (was 756): +10 budget tests covering enforce/disabled/cap-exceed/estimate-exceed/soft-warn/period boundaries/cross-port isolation/silent ledger failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:53:09 +02:00
parent 2cf1bd9754
commit e7d23b254c
12 changed files with 10841 additions and 19 deletions
--- a/src/lib/db/schema/ai-usage.ts
+++ b/src/lib/db/schema/ai-usage.ts
@@ -0,0 +1,50 @@
+/**
+ * AI usage ledger.
+ *
+ * Every server-side AI provider call records one row here so admins can
+ * audit spend per port, per feature, per user. Per-port budgets (stored
+ * in `system_settings` under `ai.budget`) read this table to enforce
+ * soft warnings and hard caps.
+ *
+ * Token-denominated rather than dollar-denominated so the cap survives
+ * model price changes — and it's the unit both OpenAI and Anthropic
+ * SDKs return in `response.usage`.
+ */
+
+import { pgTable, text, timestamp, integer, index } from 'drizzle-orm/pg-core';
+
+import { ports } from './ports';
+import { user } from './users';
+
+export const aiUsageLedger = pgTable(
+  'ai_usage_ledger',
+  {
+    id: text('id')
+      .primaryKey()
+      .$defaultFn(() => crypto.randomUUID()),
+    portId: text('port_id')
+      .notNull()
+      .references(() => ports.id, { onDelete: 'cascade' }),
+    /** Optional — system-initiated calls (e.g. scheduled summarizers) won't have a user. */
+    userId: text('user_id').references(() => user.id, { onDelete: 'set null' }),
+    /** Stable feature key: 'ocr', 'summary', 'embedding', 'reply_draft', etc. */
+    feature: text('feature').notNull(),
+    /** 'openai' | 'claude' | 'tesseract' (free, recorded for parity). */
+    provider: text('provider').notNull(),
+    model: text('model').notNull(),
+    inputTokens: integer('input_tokens').notNull().default(0),
+    outputTokens: integer('output_tokens').notNull().default(0),
+    /** input + output. Summed by budget rollup queries; the indexes below cover the port/time filter, not this column. */
+    totalTokens: integer('total_tokens').notNull().default(0),
+    /** Provider-side request id for cross-referencing with provider logs. */
+    requestId: text('request_id'),
+    createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
+  },
+  (table) => [
+    index('idx_ai_usage_port_created').on(table.portId, table.createdAt),
+    index('idx_ai_usage_port_feature_created').on(table.portId, table.feature, table.createdAt),
+  ],
+);
+
+export type AiUsageRow = typeof aiUsageLedger.$inferSelect;
+export type NewAiUsageRow = typeof aiUsageLedger.$inferInsert;
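
Review note: the service file is not part of this hunk, so the following is only a minimal sketch of how the ledger above could feed a pre-flight budget gate. Every name in it (`periodStartUtc`, `sumPeriodTokens`, `decideBudget`, the `AiBudget` shape, the `'day' | 'month'` periods) is hypothetical and chosen for illustration. It works on in-memory rows rather than a Drizzle query, and the soft-cap rule (warn once projected usage reaches the soft cap) is an assumption, not necessarily what ai-budget.service.ts does.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the commit's service code.
type BudgetPeriod = 'day' | 'month';

interface AiBudget {
  enabled: boolean;
  period: BudgetPeriod;
  softCapTokens: number | null; // null means no soft cap
  hardCapTokens: number | null; // null means no hard cap
}

// Only the ledger columns this sketch needs.
interface LedgerRow {
  portId: string;
  totalTokens: number;
  createdAt: Date;
}

type BudgetDecision =
  | { allowed: true; softCapWarning: boolean }
  | { allowed: false; reason: 'budget-exceeded' };

// Start of the current period, computed in UTC so it does not depend on server locale.
function periodStartUtc(now: Date, period: BudgetPeriod): Date {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth();
  return period === 'day'
    ? new Date(Date.UTC(year, month, now.getUTCDate()))
    : new Date(Date.UTC(year, month, 1));
}

// Sum total_tokens for one port since the period start. This is the filter that
// an index on (port_id, created_at) would serve in SQL.
function sumPeriodTokens(rows: LedgerRow[], portId: string, since: Date): number {
  return rows
    .filter((row) => row.portId === portId && row.createdAt.getTime() >= since.getTime())
    .reduce((sum, row) => sum + row.totalTokens, 0);
}

// Pre-flight gate: block if used + estimated tokens would pass the hard cap,
// otherwise allow and flag a warning once the soft cap is reached.
function decideBudget(budget: AiBudget, usedTokens: number, estimatedTokens: number): BudgetDecision {
  if (!budget.enabled) {
    return { allowed: true, softCapWarning: false };
  }
  const projected = usedTokens + estimatedTokens;
  if (budget.hardCapTokens !== null && projected > budget.hardCapTokens) {
    return { allowed: false, reason: 'budget-exceeded' };
  }
  const softCapWarning = budget.softCapTokens !== null && projected >= budget.softCapTokens;
  return { allowed: true, softCapWarning };
}
```

Keeping the decision as a pure function of (budget, used, estimated) is one way to make the enforce/disabled/cap-exceed/soft-warn cases listed in the commit message testable without a database; the actual tests may be structured differently.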