feat(errors): platform-wide request ids + error codes + admin inspector

End-to-end error-handling overhaul. A user hitting any failure now sees
a plain-text message + stable error code + reference id. A super admin
can paste the id into /<port>/admin/errors/<id> for the full request shape,
sanitized body, error stack, and a heuristic likely-cause hint.

REQUEST CONTEXT (AsyncLocalStorage)
- src/lib/request-context.ts mints a per-request frame carrying
  requestId + portId + userId + method + path + start timestamp.
- withAuth wraps every authenticated handler in runWithRequestContext
  and accepts an upstream X-Request-Id header (validated shape) or
  generates a fresh UUID. The id ALWAYS leaves on the X-Request-Id
  response header, including early-return 401/403/4xx paths.
- Pino logger reads from the same context via mixin — every log
  line emitted during the request automatically carries the ids
  with no per-call threading.

ERROR CODE REGISTRY
- src/lib/error-codes.ts defines stable DOMAIN_REASON codes with
  HTTP status + plain-text user-facing message (no jargon, written
  for the rep on the phone with a customer).
- New CodedError class wraps a registered code + optional
  internalMessage (admin-only — never sent to client).
- Existing AppError subclasses got plain-text default rewrites so
  legacy throw sites improve immediately without migration.
- High-impact services migrated to specific codes:
  expenses (RECEIPT_REQUIRED, INVOICE_LINKED), interest-berths
  (CROSS_PORT_LINK_REJECTED), berth-pdf (PDF_MAGIC_BYTE / PDF_EMPTY /
  PDF_TOO_LARGE / VERSION_ALREADY_CURRENT), recommender
  (INTEREST_PORT_MISMATCH).

ERROR ENVELOPE
- errorResponse always sets X-Request-Id header + requestId field.
- 5xx responses include a "Quote error ID …" friendly line.
- 4xx kept clean (validation, permission, not-found don't pollute
  the inspector — they're already in audit log).

PERSISTENCE (error_events table, migration 0040)
- One row per 5xx, keyed on requestId, with method/path/status/error
  name+message/stack head (4KB cap)/sanitized body excerpt (1KB cap;
  password/token/secret/etc keys redacted)/duration/IP/UA/metadata.
- captureErrorEvent extracts Postgres SQLSTATE/severity/cause.code
  so the classifier can recognize FK / unique / NOT NULL / schema-
  drift violations.
- Failure to persist is logged-not-thrown.

LIKELY-CULPRIT CLASSIFIER (src/lib/error-classifier.ts)
- 4-pass heuristic (first match wins):
  1. Postgres SQLSTATE → human reason (23503 FK, 23505 unique,
     42703 schema drift, 53300 connection limit, …)
  2. Error class name (AbortError, TimeoutError, FetchError,
     ZodError)
  3. Stack-path patterns (/lib/storage/, /lib/email/, documenso,
     openai|claude, /queue/workers/)
  4. Free-text message keywords (econnrefused, rate limit, timeout,
     unauthorized|invalid api key)
- Returns { label, hint, subsystem } for the inspector badge.

CLIENT SIDE
- apiFetch throws structured ApiError with message + code + requestId
  + details + retryAfter.
- toastError() helper renders the standard 3-line toast:
  plain message / Error code: X / Reference ID: Y [Copy ID].

ADMIN INSPECTOR
- /<port>/admin/errors lists captured 5xx with status badge + path +
  likely-culprit badge + truncated message + reference id. Filter by
  status code; auto-refresh via TanStack Query.
- /<port>/admin/errors/<requestId> deep-dive: request shape, full
  error name+message+stack, sanitized body excerpt, raw metadata,
  registered-code lookup (so admin can compare to what user saw),
  likely-culprit hint with subsystem tag.
- /<port>/admin/errors/codes is the in-app code reference page —
  every registered code grouped by domain prefix, searchable, with
  HTTP status + user message inline. Linked from inspector header
  so admins can flip to it while triaging.
- Permission: admin.view_audit_log. Super admins see all ports;
  regular admins port-scoped.
- system-monitoring dashboard now surfaces error_events alongside
  permission_denied audit + queue failed jobs (RecentError gains
  source: 'request' variant).

DOCS
- docs/error-handling.md walks through coded errors, plain-text
  message guidelines, client toasting, admin inspector usage,
  persistence rules, classifier internals, pruning, and the
  legacy → CodedError migration path.

MIGRATION SAFETY
- Audit confirmed all 41 migrations (0000-0040) apply cleanly in
  journal order against an empty DB. 0040 references ports(id)
  which exists from 0000. 0035/0038 don't deadlock under sequential
  psql -f. Removed redundant idx_ds_sent_by from 0038 (created in
  0037).

Tests: 1168/1168 vitest passing. tsc clean.
- security-error-responses tests updated for plain-text messages
  + new optional response keys (code/requestId/message).
- berth-pdf-versions tests assert stable error codes via
  toMatchObject({ code }) rather than message regex.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matt Ciaccio
2026-05-05 14:12:59 +02:00
parent c4a41d5f5b
commit 4723994bdc
26 changed files with 2027 additions and 169 deletions

docs/error-handling.md

@@ -0,0 +1,188 @@
# Error handling
## Overview
Every authenticated request runs inside an `AsyncLocalStorage` frame
that carries a `requestId` (UUID) plus the resolved `portId` / `userId`
/ HTTP method / path / start time. The id surfaces:
- as `X-Request-Id` on every response header (success or failure)
- inside every pino log line emitted during the request
- in the JSON error body returned to the client (`requestId` field)
- as the primary key of the `error_events` row written when a 5xx fires
A user who hits a failure can copy the **Reference ID** from the toast
and a super admin can paste it into `/<port>/admin/errors/<requestId>`
to see the full request context, sanitized body, error stack, and a
heuristic "likely culprit" hint.
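
The frame described above can be sketched with Node's built-in `AsyncLocalStorage`. This is a minimal illustration, not the contents of `src/lib/request-context.ts`: the field names follow the frame passed by `withAuth`, while `resolveRequestId` and `logFields` are hypothetical helpers introduced here.

```ts
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

// Assumed frame shape, mirroring the fields listed in the overview.
interface RequestContext {
  requestId: string;
  portId: string;
  userId: string;
  method: string;
  path: string;
  startedAt: number;
}

const storage = new AsyncLocalStorage<RequestContext>();

/** Run `fn` inside a frame; anything it calls (sync or async) can read `ctx`. */
export function runWithRequestContext<T>(ctx: RequestContext, fn: () => T): T {
  return storage.run(ctx, fn);
}

/** Current frame, or undefined when called outside a request. */
export function getRequestContext(): RequestContext | undefined {
  return storage.getStore();
}

/** Hypothetical helper: accept an upstream id only if it has a safe shape. */
export function resolveRequestId(incoming: string | null): string {
  return incoming && /^[A-Za-z0-9-]{8,64}$/.test(incoming) ? incoming : randomUUID();
}

/** Hypothetical helper: the fields a logger mixin could merge into each line. */
export function logFields(): Record<string, string> {
  const ctx = getRequestContext();
  return ctx ? { requestId: ctx.requestId, portId: ctx.portId, userId: ctx.userId } : {};
}
```

Because the store travels with the async call chain, service code never needs a `requestId` parameter; a logger mixin can call something like `logFields()` on every line.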
## Throwing errors from a service
Use `CodedError` with a registered code:
```ts
import { CodedError } from '@/lib/errors';
if (!hasReceipts && !ack) {
  throw new CodedError('EXPENSES_RECEIPT_REQUIRED');
}
```
The code drives:
- the HTTP status (defined in `src/lib/error-codes.ts`)
- the **plain-text user-facing message** (no jargon — written for the
rep on the phone with a customer)
- the stable identifier the user can quote to support
For more verbose internal context — admin-only — use `internalMessage`:
```ts
throw new CodedError('CROSS_PORT_LINK_REJECTED', {
  internalMessage: `interest ${a.id} (port ${a.portId}) ↔ berth ${b.id} (port ${b.portId})`,
});
```
The `internalMessage` lands in the `error_events` row and the admin
inspector but **never** reaches the client.
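
To make the split between client-visible and admin-only fields concrete, here is a minimal sketch of how a registry-backed error class could behave. It is an assumption-laden illustration, not the real `CodedError`: the two registry entries use placeholder statuses and messages, and `toClientBody` is a hypothetical method name.

```ts
// Placeholder entries; the real map lives in src/lib/error-codes.ts.
const ERROR_CODES = {
  EXPENSES_RECEIPT_REQUIRED: {
    status: 400,
    userMessage: 'Please attach a receipt before submitting this expense.',
  },
  CROSS_PORT_LINK_REJECTED: {
    status: 409,
    userMessage: 'That berth belongs to a different port and cannot be linked.',
  },
} as const;

type ErrorCode = keyof typeof ERROR_CODES;

/** Type guard for strings that may or may not be registered codes. */
function isErrorCode(value: string): value is ErrorCode {
  return Object.prototype.hasOwnProperty.call(ERROR_CODES, value);
}

class CodedError extends Error {
  readonly code: ErrorCode;
  readonly status: number;
  /** Admin-only detail; deliberately absent from the client body. */
  readonly internalMessage: string | null;

  constructor(code: ErrorCode, opts: { internalMessage?: string } = {}) {
    super(ERROR_CODES[code].userMessage);
    this.name = 'CodedError';
    this.code = code;
    this.status = ERROR_CODES[code].status;
    this.internalMessage = opts.internalMessage ?? null;
  }

  /** Hypothetical: the JSON shape returned to the browser. */
  toClientBody(requestId: string): { error: string; code: ErrorCode; requestId: string } {
    return { error: this.message, code: this.code, requestId };
  }
}
```

Keeping `internalMessage` out of the serializer, rather than filtering it later, means a new response path cannot leak it by accident.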
## Adding a new error code
1. Open `src/lib/error-codes.ts`.
2. Add an entry to the `ERROR_CODES` map. Convention: `DOMAIN_REASON`
in SCREAMING_SNAKE_CASE.
```ts
FOO_INVALID_BAR: {
  status: 400,
  userMessage: 'That bar value is no good. Please try another.',
},
```
3. Use it: `throw new CodedError('FOO_INVALID_BAR')`.
4. The code, status, and message are now contractually stable —
never rename a code once it has shipped. Documentation, UI, and
external integrations may pin to it.
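
Since codes are contractually stable, it may be worth guarding the naming convention with a small check in a unit test. The sketch below is one possible shape for that; `findRegistryProblems` and the `ErrorCodeDef` interface are hypothetical names, and the regular expression is only an approximation of "`DOMAIN_REASON` in SCREAMING_SNAKE_CASE".

```ts
interface ErrorCodeDef {
  status: number;
  userMessage: string;
}

// At least two uppercase segments joined by underscores, e.g. FOO_INVALID_BAR.
const CODE_SHAPE = /^[A-Z][A-Z0-9]*(_[A-Z0-9]+)+$/;

/** Returns one human-readable problem per violated convention. */
function findRegistryProblems(registry: Record<string, ErrorCodeDef>): string[] {
  const problems: string[] = [];
  for (const [code, def] of Object.entries(registry)) {
    if (!CODE_SHAPE.test(code)) {
      problems.push(`${code}: not DOMAIN_REASON in SCREAMING_SNAKE_CASE`);
    }
    if (!Number.isInteger(def.status) || def.status < 400 || def.status > 599) {
      problems.push(`${code}: status ${def.status} is not a 4xx/5xx`);
    }
    if (def.userMessage.trim() === '') {
      problems.push(`${code}: empty userMessage`);
    }
  }
  return problems;
}
```

A test could then assert that `findRegistryProblems(ERROR_CODES)` is empty, so a malformed entry fails CI before it ships.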
## Plain-text message guidelines
User-facing messages should:
- Avoid internal jargon (no "constraint violation", "FK", "row lock").
- Be written for a rep on the phone with a customer.
- Include the suggested next action when natural ("Ask an admin if you
think you should").
- Not include any technical detail that doesn't help the user — the
request id + error code carry that.
Verbose technical detail belongs in `internalMessage` (admin-only).
## Client side
In a `useMutation`, render errors with the shared helper:
```ts
import { useMutation } from '@tanstack/react-query';
import { apiFetch } from '@/lib/api/client';
import { toastError } from '@/lib/api/toast-error';
const mutation = useMutation({
  mutationFn: () => apiFetch('/api/v1/foo', { method: 'POST', body: { ... } }),
  onSuccess: () => { ... },
  onError: (err) => toastError(err),
});
```
The toast renders three lines:
```
{plain-text message}
Error code: EXPENSES_RECEIPT_REQUIRED
Reference ID: 8f3c-ab12-… [Copy ID]
```
The "Copy ID" action puts the request id on the clipboard so the
user can paste it into a support ticket.
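
The three-line layout can be thought of as a pure function from an error to display lines. The sketch below illustrates that idea only; it is not the real `toastError`. `ApiErrorLike`, `isApiErrorLike`, and `toastLines` are hypothetical names, and the fallback text is a placeholder.

```ts
// Minimal stand-in for the fields of the ApiError thrown by apiFetch.
interface ApiErrorLike {
  message: string;
  code: string | null;
  requestId: string | null;
}

function isApiErrorLike(err: unknown): err is ApiErrorLike {
  if (typeof err !== 'object' || err === null) return false;
  const e = err as Record<string, unknown>;
  return (
    typeof e['message'] === 'string' &&
    (typeof e['code'] === 'string' || e['code'] === null) &&
    (typeof e['requestId'] === 'string' || e['requestId'] === null)
  );
}

/** Build the toast lines; code and reference id are omitted when absent. */
function toastLines(err: unknown): string[] {
  if (!isApiErrorLike(err)) {
    return [err instanceof Error ? err.message : 'Something went wrong.'];
  }
  const lines = [err.message];
  if (err.code) lines.push(`Error code: ${err.code}`);
  if (err.requestId) lines.push(`Reference ID: ${err.requestId}`);
  return lines;
}
```

Separating line construction from rendering keeps the format testable without mounting a toast component.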
## Admin inspector
`/<port>/admin/errors` lists captured 5xx errors:
- Status badge + method + path
- "Likely culprit" badge (heuristic — Postgres SQLSTATE, error name,
stack-path patterns, message keywords)
- Truncated error name + message
- Timestamp + reference id
Click any row for `/<port>/admin/errors/<requestId>` which shows:
- Request shape (method / path / when / duration / port / user / IP / UA)
- Likely culprit + plain-English hint + subsystem tag
- Full error name, message, stack head (first 4 KB)
- Sanitized request body excerpt (max 1 KB; sensitive keys redacted)
- Raw metadata (Postgres SQLSTATE codes, internalMessage, etc.)
Permission: `admin.view_audit_log`. Super admins see every port's
errors; regular admins are scoped to their active port.
## What gets persisted
| Status | error_events row? | Toast shows code? |
| ------ | ----------------- | ----------------- |
| 4xx | No | Yes |
| 5xx | **Yes** | Yes |
4xx errors are user-action mistakes (validation, not-found, permission
denied). They're visible in the audit log but not the error inspector
— that table is reserved for platform faults.
5xx errors hit the `error_events` table via `captureErrorEvent` inside
`errorResponse`, which:
1. Reads the request context from ALS.
2. Sanitizes + truncates the body (1 KB cap, sensitive keys redacted).
3. Pulls Postgres `code` / `severity` / `cause.code` if the underlying
error is a `postgres` driver error.
4. Truncates the stack to 4 KB.
5. Inserts one row keyed on `requestId` with `ON CONFLICT DO NOTHING`.
Failure to persist NEVER throws — the user is already getting an
error response; we don't want a logging-pipeline failure to mask it.
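
Step 2 above (sanitize, then truncate) could look roughly like the following. This is a sketch under assumptions: the key pattern covers only the `password` / `token` / `secret` examples named in this change, the cap is measured in characters rather than bytes, and `redact` / `bodyExcerpt` are hypothetical names.

```ts
// Assumed pattern; the real redaction list may cover more keys.
const SENSITIVE_KEY = /password|token|secret/i;
const MAX_EXCERPT_CHARS = 1024;

/** Recursively replace the values of sensitive-looking keys. */
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : redact(inner);
    }
    return out;
  }
  return value;
}

/** Redact first, then cap the serialized form so secrets are never cut mid-value. */
function bodyExcerpt(body: unknown): string {
  if (body === undefined) return '';
  const json = JSON.stringify(redact(body));
  return json.length > MAX_EXCERPT_CHARS ? json.slice(0, MAX_EXCERPT_CHARS) : json;
}
```

Redacting before truncating matters: truncating first could leave a partial secret in the excerpt if the cut lands inside a sensitive value.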
## Likely-culprit classifier
`src/lib/error-classifier.ts` runs four passes against an
`error_events` row, first match wins:
1. **Postgres SQLSTATE** (from `metadata.code`): 23502 NOT NULL,
23503 FK, 23505 unique, 23514 CHECK, 42703 schema drift, 42P01
missing table, 40001 serialization, 53300 connection limit, …
2. **Error class name**: `AbortError`, `TimeoutError`, `FetchError`,
`ZodError`.
3. **Stack path**: `/lib/storage/`, `/lib/email/`, `documenso`,
`openai|claude`, `/queue/workers/`.
4. **Message free-text**: `econnrefused`, `rate limit`, `timeout`,
`unauthorized|invalid api key`.
Returns `null` when nothing matches; the inspector renders
"Uncategorized" in that case. Adding a new heuristic is a one-line
edit to the relevant array.
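
The first-match-wins ordering can be sketched as four lookups chained with `??`. This is an illustration, not the contents of `src/lib/error-classifier.ts`: the row shape is trimmed to the fields the passes read, and every label, hint, and subsystem string below is a placeholder.

```ts
interface LikelyCulprit {
  label: string;
  hint: string;
  subsystem: string;
}

// Trimmed, assumed row shape: only what the four passes inspect.
interface ErrorRow {
  errorName: string | null;
  errorMessage: string | null;
  errorStack: string | null;
  metadata: { code?: string } | null;
}

type Rule = [RegExp, LikelyCulprit];

const BY_SQLSTATE = new Map<string, LikelyCulprit>([
  ['23503', { label: 'Foreign-key violation', hint: 'A referenced row is missing.', subsystem: 'database' }],
  ['23505', { label: 'Unique violation', hint: 'A duplicate value hit a unique index.', subsystem: 'database' }],
]);
const BY_NAME = new Map<string, LikelyCulprit>([
  ['TimeoutError', { label: 'Timeout', hint: 'An upstream call took too long.', subsystem: 'network' }],
]);
const BY_STACK: Rule[] = [
  [/\/lib\/email\//, { label: 'Email delivery', hint: 'Failure inside the email layer.', subsystem: 'email' }],
];
const BY_MESSAGE: Rule[] = [
  [/econnrefused/i, { label: 'Connection refused', hint: 'A dependency is down or unreachable.', subsystem: 'network' }],
  [/rate limit/i, { label: 'Rate limited', hint: 'An upstream API throttled the call.', subsystem: 'external-api' }],
];

function firstMatch(text: string | null, rules: Rule[]): LikelyCulprit | null {
  if (!text) return null;
  for (const [pattern, culprit] of rules) {
    if (pattern.test(text)) return culprit;
  }
  return null;
}

/** Passes run in order; the first one that produces a culprit wins. */
function classifyError(row: ErrorRow): LikelyCulprit | null {
  const code = row.metadata?.code;
  return (
    (code ? BY_SQLSTATE.get(code) : undefined) ??
    (row.errorName ? BY_NAME.get(row.errorName) : undefined) ??
    firstMatch(row.errorStack, BY_STACK) ??
    firstMatch(row.errorMessage, BY_MESSAGE)
  );
}
```

Ordering the passes from most specific (SQLSTATE) to least specific (free text) keeps a vague keyword such as "timeout" from overriding a precise database code on the same row.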
## Pruning
`error_events` rows are meant to be retained for 90 days and then
deleted by the maintenance worker. TODO: confirm the worker has the
deletion path; if not, add a periodic job that runs
`DELETE FROM error_events WHERE created_at < now() - interval '90 days'`.
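
If the periodic job turns out to be needed, it could be as small as the sketch below. Everything here is hypothetical: `pruneCutoff`, `pruneErrorEvents`, and the `Execute` callback are illustrative names, and the callback stands in for whatever database client the maintenance worker already uses.

```ts
const RETENTION_DAYS = 90;
const DAY_MS = 24 * 60 * 60 * 1000;

/** Rows created before this instant are eligible for deletion. */
function pruneCutoff(now: Date, retentionDays: number = RETENTION_DAYS): Date {
  return new Date(now.getTime() - retentionDays * DAY_MS);
}

// Assumed adapter: runs a parameterized statement and resolves to the affected row count.
type Execute = (sql: string, params: unknown[]) => Promise<number>;

/** Delete expired rows; the cutoff is computed in the app so it is easy to test. */
async function pruneErrorEvents(execute: Execute, now: Date = new Date()): Promise<number> {
  return execute('DELETE FROM error_events WHERE created_at < $1', [pruneCutoff(now).toISOString()]);
}
```

Passing the cutoff as a parameter, instead of relying on `now()` in SQL, lets a test pin the clock without touching the database.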
## Migration path for legacy throws
Existing `NotFoundError` / `ForbiddenError` / `ConflictError` /
`ValidationError` / `RateLimitError` still work — the user-facing
messages on these classes have been rewritten to plain-text defaults.
Migration to `CodedError` happens opportunistically: when touching a
service to fix something else, swap the throw site for a registered
code.
A follow-up audit pass should walk `git grep "throw new ValidationError"`
and migrate the user-impactful ones to specific codes.

@@ -0,0 +1,246 @@
'use client';
import Link from 'next/link';
import { useParams } from 'next/navigation';
import { useQuery } from '@tanstack/react-query';
import { format } from 'date-fns';
import { ArrowLeft, Copy, Wrench } from 'lucide-react';
import { toast } from 'sonner';
import type { Route } from 'next';
import { Badge } from '@/components/ui/badge';
import { ERROR_CODES, isErrorCode } from '@/lib/error-codes';
import { Button } from '@/components/ui/button';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Skeleton } from '@/components/ui/skeleton';
import { apiFetch } from '@/lib/api/client';
import type { ErrorEvent } from '@/lib/db/schema/system';
import type { LikelyCulprit } from '@/lib/error-classifier';
interface DetailResponse {
data: ErrorEvent & { likelyCulprit: LikelyCulprit | null };
}
/**
* Detail view for a single captured error. Shows everything an admin
* needs to triage:
*
* - Request shape: method, path, status, duration, who fired it
* - Error: name, message, full stack head, (sanitized) request body
* - Likely-culprit hint: heuristic-driven plain-English root-cause
* - Raw metadata: pg SQLSTATE codes, internal-message debug strings
*/
export default function ErrorEventDetailPage() {
const params = useParams<{ portSlug: string; requestId: string }>();
const portSlug = params?.portSlug ?? '';
const requestId = params?.requestId ?? '';
const query = useQuery<DetailResponse>({
queryKey: ['admin', 'error-events', requestId],
queryFn: () => apiFetch<DetailResponse>(`/api/v1/admin/error-events/${requestId}`),
enabled: Boolean(requestId),
});
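function copy(text: string, label: string) {
if (typeof navigator === 'undefined' || !navigator.clipboard) return;
// Confirm only after the write succeeds; writeText can reject (e.g. permission denied).
navigator.clipboard
.writeText(text)
.then(() => toast.success(`${label} copied`))
.catch(() => toast.error(`Could not copy ${label}`));
}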
if (query.isLoading) {
return (
<div className="space-y-3">
<Skeleton className="h-8 w-48" />
<Skeleton className="h-32 w-full" />
<Skeleton className="h-64 w-full" />
</div>
);
}
const event = query.data?.data;
if (!event) {
return (
<Card>
<CardContent className="py-12 text-center text-sm text-muted-foreground">
Error event not found. It may have been pruned or you may not have access.
</CardContent>
</Card>
);
}
return (
<div className="space-y-4">
<div>
<Button variant="ghost" size="sm" asChild>
<Link href={`/${portSlug}/admin/errors` as Route}>
<ArrowLeft className="mr-1.5 h-4 w-4" />
Back to error list
</Link>
</Button>
</div>
<div className="flex items-center gap-2 flex-wrap">
<h1 className="text-2xl font-bold">Error {requestId.slice(0, 8)}</h1>
<Badge
variant="outline"
className={
event.statusCode >= 500
? 'border-destructive/40 text-destructive'
: 'border-amber-300 text-amber-800'
}
>
{event.statusCode}
</Badge>
{event.likelyCulprit && (
<Badge variant="secondary" className="gap-1">
<Wrench className="h-3 w-3" />
{event.likelyCulprit.label}
</Badge>
)}
<Button size="sm" variant="ghost" onClick={() => copy(requestId, 'Reference ID')}>
<Copy className="mr-1.5 h-3 w-3" />
Copy ID
</Button>
</div>
{event.likelyCulprit && (
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium flex items-center gap-2">
<Wrench className="h-4 w-4" /> Likely culprit
</CardTitle>
</CardHeader>
<CardContent className="text-sm">
<p className="font-medium">{event.likelyCulprit.label}</p>
<p className="text-muted-foreground mt-1">{event.likelyCulprit.hint}</p>
<p className="text-xs text-muted-foreground mt-2">
Subsystem: <code className="font-mono">{event.likelyCulprit.subsystem}</code>
</p>
</CardContent>
</Card>
)}
{/* If the captured error has a registered code on its metadata,
* surface the canonical user-facing message + status from the
* registry so the admin can compare what the user saw to what
* the system actually did. */}
{(() => {
const meta = (event.metadata ?? {}) as Record<string, unknown>;
const code = typeof meta.code === 'string' ? meta.code : null;
if (!code || !isErrorCode(code)) return null;
const def = ERROR_CODES[code];
return (
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium">Error code</CardTitle>
</CardHeader>
<CardContent className="space-y-1 text-sm">
<div className="flex items-center gap-2">
<Badge variant="outline">{def.status}</Badge>
<code className="font-mono text-xs font-semibold">{code}</code>
</div>
<p className="mt-2">{def.userMessage}</p>
<p className="text-xs text-muted-foreground">
Compare to the message the user saw in their toast.{' '}
<Link
href={`/${portSlug}/admin/errors/codes` as Route}
className="text-primary hover:underline"
>
All codes
</Link>
</p>
</CardContent>
</Card>
);
})()}
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium">Request</CardTitle>
</CardHeader>
<CardContent className="grid grid-cols-1 md:grid-cols-2 gap-3 text-sm">
<KV label="Method" value={event.method} />
<KV label="Path" value={event.path} mono />
<KV label="When" value={format(new Date(event.createdAt), 'PPpp')} />
<KV label="Duration" value={event.durationMs ? `${event.durationMs} ms` : '—'} />
<KV label="Port" value={event.portId ?? '(none)'} mono />
<KV label="User" value={event.userId ?? '(none)'} mono />
<KV label="IP" value={event.ipAddress ?? '—'} mono />
<KV label="User agent" value={event.userAgent ?? '—'} />
</CardContent>
</Card>
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium">Error</CardTitle>
</CardHeader>
<CardContent className="space-y-3 text-sm">
<KV label="Name" value={event.errorName ?? '—'} mono />
<div>
<p className="text-xs text-muted-foreground">Message</p>
<p className="mt-0.5 font-mono whitespace-pre-wrap break-words">
{event.errorMessage ?? '—'}
</p>
</div>
{event.errorStack && (
<div>
<div className="flex items-center justify-between">
<p className="text-xs text-muted-foreground">Stack (truncated)</p>
<Button
size="sm"
variant="ghost"
onClick={() => copy(event.errorStack ?? '', 'Stack')}
>
<Copy className="mr-1.5 h-3 w-3" /> Copy
</Button>
</div>
<pre className="mt-1 max-h-96 overflow-auto rounded bg-muted p-2 text-xs font-mono whitespace-pre-wrap break-words">
{event.errorStack}
</pre>
</div>
)}
</CardContent>
</Card>
{event.requestBodyExcerpt && (
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium">
Request body (sanitized, max 1 KB)
</CardTitle>
</CardHeader>
<CardContent>
<pre className="max-h-64 overflow-auto rounded bg-muted p-2 text-xs font-mono whitespace-pre-wrap break-words">
{event.requestBodyExcerpt}
</pre>
</CardContent>
</Card>
)}
{event.metadata !== null &&
typeof event.metadata === 'object' &&
Object.keys(event.metadata as Record<string, unknown>).length > 0 && (
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium">Metadata</CardTitle>
</CardHeader>
<CardContent>
<pre className="overflow-auto rounded bg-muted p-2 text-xs font-mono">
{JSON.stringify(event.metadata, null, 2)}
</pre>
</CardContent>
</Card>
)}
</div>
);
}
function KV({ label, value, mono }: { label: string; value: string | null; mono?: boolean }) {
return (
<div>
<p className="text-xs text-muted-foreground">{label}</p>
<p className={`mt-0.5 ${mono ? 'font-mono text-xs' : ''}`}>{value ?? '—'}</p>
</div>
);
}

@@ -0,0 +1,134 @@
'use client';
import { useState, useMemo } from 'react';
import Link from 'next/link';
import { useParams } from 'next/navigation';
import { ArrowLeft, BookOpen, Search } from 'lucide-react';
import type { Route } from 'next';
import { Badge } from '@/components/ui/badge';
import { Button } from '@/components/ui/button';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Input } from '@/components/ui/input';
import { ERROR_CODES } from '@/lib/error-codes';
/**
* Error-code reference page surfaced inside the admin section so an
* admin investigating a captured error_events row can flip to this
* tab, look up the code the user reported, and read the canonical
* plain-language meaning + status code without leaving the app.
*
* Pulls directly from `src/lib/error-codes.ts` so it stays in sync
* automatically — adding an entry to the registry adds a row here.
*/
export default function ErrorCodeReferencePage() {
const params = useParams<{ portSlug: string }>();
const portSlug = params?.portSlug ?? '';
const [search, setSearch] = useState('');
const entries = useMemo(() => {
const all = Object.entries(ERROR_CODES) as Array<
[string, (typeof ERROR_CODES)[keyof typeof ERROR_CODES]]
>;
if (!search.trim()) return all;
const q = search.trim().toLowerCase();
return all.filter(
([code, def]) => code.toLowerCase().includes(q) || def.userMessage.toLowerCase().includes(q),
);
}, [search]);
// Group by domain prefix (the part before the first underscore) so
// the table reads naturally — Expenses, Berths, Storage, etc.
const grouped = useMemo(() => {
const groups = new Map<string, typeof entries>();
for (const entry of entries) {
const prefix = entry[0].split('_')[0] ?? 'OTHER';
const bucket = groups.get(prefix) ?? [];
bucket.push(entry);
groups.set(prefix, bucket);
}
return [...groups.entries()].sort(([a], [b]) => a.localeCompare(b));
}, [entries]);
return (
<div className="space-y-4">
<div className="flex items-center gap-2">
<Button variant="ghost" size="sm" asChild>
<Link href={`/${portSlug}/admin/errors` as Route}>
<ArrowLeft className="mr-1.5 h-4 w-4" />
Back to error inspector
</Link>
</Button>
</div>
<div className="flex items-start justify-between gap-4 flex-wrap">
<div>
<h1 className="text-2xl font-bold flex items-center gap-2">
<BookOpen className="h-5 w-5" /> Error code reference
</h1>
<p className="text-muted-foreground text-sm mt-1">
Every error code the platform can return, with its HTTP status and the plain-language
message a user sees. Codes are stable identifiers: once shipped, they never get
renamed.
</p>
</div>
</div>
<div className="relative max-w-md">
<Search className="pointer-events-none absolute left-2 top-1/2 -translate-y-1/2 h-4 w-4 text-muted-foreground" />
<Input
placeholder="Search code or message…"
value={search}
onChange={(e) => setSearch(e.target.value)}
className="pl-8"
/>
</div>
{grouped.length === 0 ? (
<Card>
<CardContent className="py-12 text-center text-sm text-muted-foreground">
No codes match &quot;{search}&quot;.
</CardContent>
</Card>
) : (
<div className="space-y-4">
{grouped.map(([prefix, items]) => (
<Card key={prefix}>
<CardHeader>
<CardTitle className="text-sm font-medium uppercase tracking-wider text-muted-foreground">
{prefix}
</CardTitle>
</CardHeader>
<CardContent className="divide-y">
{items.map(([code, def]) => (
<div key={code} className="flex items-start gap-3 py-3 first:pt-0 last:pb-0">
<Badge
variant="outline"
className={
def.status >= 500
? 'border-destructive/40 text-destructive'
: def.status >= 400
? 'border-amber-300 text-amber-800'
: 'border-muted'
}
>
{def.status}
</Badge>
<div className="flex-1 min-w-0">
<p className="font-mono text-xs font-semibold">{code}</p>
<p className="text-sm mt-0.5">{def.userMessage}</p>
{'hint' in def && typeof def.hint === 'string' && (
<p className="text-xs text-muted-foreground mt-0.5">{def.hint}</p>
)}
</div>
</div>
))}
</CardContent>
</Card>
))}
</div>
)}
</div>
);
}

@@ -0,0 +1,157 @@
'use client';
import { useState } from 'react';
import Link from 'next/link';
import { useParams } from 'next/navigation';
import { useQuery } from '@tanstack/react-query';
import { format, formatDistanceToNow } from 'date-fns';
import { AlertTriangle, BookOpen, Search, Wrench } from 'lucide-react';
import type { Route } from 'next';
import { Badge } from '@/components/ui/badge';
import { Button } from '@/components/ui/button';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Input } from '@/components/ui/input';
import { Skeleton } from '@/components/ui/skeleton';
import { PageHeader } from '@/components/shared/page-header';
import { EmptyState } from '@/components/shared/empty-state';
import { apiFetch } from '@/lib/api/client';
import { classifyError } from '@/lib/error-classifier';
import type { ErrorEvent } from '@/lib/db/schema/system';
interface ListResponse {
data: ErrorEvent[];
}
/**
* Super-admin error inspector.
*
* Shows the most recent captured 5xx errors with: when, where (HTTP
* method + path), what (error name + message), and a heuristic
* "likely culprit" badge driven by `classifyError`. Click into any
* row for the full stack + body excerpt + raw metadata.
*/
export default function AdminErrorsPage() {
const params = useParams<{ portSlug: string }>();
const portSlug = params?.portSlug ?? '';
const [statusFilter, setStatusFilter] = useState('');
const query = useQuery<ListResponse>({
queryKey: ['admin', 'error-events', { statusFilter }],
queryFn: () => {
const search = new URLSearchParams();
if (statusFilter) search.set('statusCode', statusFilter);
return apiFetch<ListResponse>(
`/api/v1/admin/error-events${search.toString() ? `?${search.toString()}` : ''}`,
);
},
});
const events = query.data?.data ?? [];
return (
<div className="space-y-4">
<PageHeader
title="Error inspector"
description="Captured 5xx errors. Click any row for the full stack, request body excerpt, and likely culprit."
actions={
<Button variant="outline" size="sm" asChild>
<Link href={`/${portSlug}/admin/errors/codes` as Route}>
<BookOpen className="mr-1.5 h-4 w-4" />
Code reference
</Link>
</Button>
}
/>
<Card>
<CardHeader>
<CardTitle className="text-sm font-medium flex items-center gap-2">
<Search className="h-4 w-4" /> Filters
</CardTitle>
</CardHeader>
<CardContent className="flex flex-wrap items-end gap-3">
<div className="space-y-1">
<label className="text-xs text-muted-foreground" htmlFor="status">
Status code
</label>
<Input
id="status"
placeholder="e.g. 500"
value={statusFilter}
onChange={(e) => setStatusFilter(e.target.value.replace(/\D/g, ''))}
className="h-8 w-32"
/>
</div>
{statusFilter && (
<Button variant="ghost" size="sm" className="h-8" onClick={() => setStatusFilter('')}>
Clear
</Button>
)}
</CardContent>
</Card>
{query.isLoading ? (
<div className="space-y-2">
{Array.from({ length: 5 }).map((_, i) => (
<Skeleton key={i} className="h-14 w-full" />
))}
</div>
) : events.length === 0 ? (
<EmptyState
icon={AlertTriangle}
title="No captured errors"
description="Nothing has hit a 5xx in the selected window. That's a good thing."
/>
) : (
<div className="rounded-lg border divide-y">
{events.map((event) => {
const culprit = classifyError(event);
return (
<Link
key={event.requestId}
href={`/${portSlug}/admin/errors/${event.requestId}` as Route}
className="flex items-start gap-3 p-3 hover:bg-muted/40"
>
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2">
<Badge
variant="outline"
className={
event.statusCode >= 500
? 'border-destructive/40 text-destructive'
: 'border-amber-300 text-amber-800'
}
>
{event.statusCode}
</Badge>
<span className="text-xs font-mono uppercase text-muted-foreground">
{event.method}
</span>
<span className="text-sm font-medium truncate">{event.path}</span>
{culprit && (
<Badge variant="secondary" className="gap-1 text-xs">
<Wrench className="h-3 w-3" />
{culprit.label}
</Badge>
)}
</div>
<p className="text-xs text-muted-foreground truncate mt-0.5">
{event.errorName ? `${event.errorName}: ` : ''}
{event.errorMessage ?? '(no message)'}
</p>
<p className="text-xs text-muted-foreground mt-0.5">
{formatDistanceToNow(new Date(event.createdAt), { addSuffix: true })} ·{' '}
{format(new Date(event.createdAt), 'MMM d HH:mm:ss')} · ID{' '}
<code className="font-mono">{event.requestId.slice(0, 12)}</code>
</p>
</div>
</Link>
);
})}
</div>
)}
</div>
);
}

@@ -0,0 +1,45 @@
import { NextResponse } from 'next/server';
import { withAuth, withPermission } from '@/lib/api/helpers';
import { errorResponse, NotFoundError } from '@/lib/errors';
import { classifyError } from '@/lib/error-classifier';
import { getErrorEventById } from '@/lib/services/error-events.service';
/**
* GET /api/v1/admin/error-events/[requestId]
*
* Returns a single captured error_events row plus the heuristic
* "likely culprit" classification. Permission: `admin.view_audit_log`.
*
* Tenant: super admins see any port; regular admins only see events
* captured against their active port.
*/
export const GET = withAuth(
withPermission('admin', 'view_audit_log', async (_req, ctx, params) => {
try {
const requestId = params.requestId;
if (!requestId) throw new NotFoundError('Error event');
const event = await getErrorEventById(requestId);
if (!event) throw new NotFoundError('Error event');
// Tenant scoping. A port_id of null on the row means the error
// fired pre-port-resolve (login page, public form, etc.) — those
// are visible to super admins only.
if (!ctx.isSuperAdmin) {
if (!event.portId || event.portId !== ctx.portId) {
throw new NotFoundError('Error event');
}
}
return NextResponse.json({
data: {
...event,
likelyCulprit: classifyError(event),
},
});
} catch (error) {
return errorResponse(error);
}
}),
);

@@ -0,0 +1,43 @@
import { NextResponse } from 'next/server';
import { withAuth, withPermission } from '@/lib/api/helpers';
import { errorResponse } from '@/lib/errors';
import { listErrorEvents } from '@/lib/services/error-events.service';
/**
* GET /api/v1/admin/error-events
*
* Paginated list of captured 5xx error_events rows. Powers the
* super-admin error inspector at `/admin/errors`. Permission:
* `admin.view_audit_log`.
*
* Query params:
* - statusCode: narrow to a single status (e.g. 500)
* - from / to: ISO date strings; defaults to last 7 days server-side
* - limit: defaults to 100; capped at 500 server-side
*
* Super admins see every port; regular admins are scoped to their port.
*/
export const GET = withAuth(
withPermission('admin', 'view_audit_log', async (req, ctx) => {
try {
const url = new URL(req.url);
const statusCode = url.searchParams.get('statusCode');
const from = url.searchParams.get('from') ?? undefined;
const to = url.searchParams.get('to') ?? undefined;
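const limitRaw = url.searchParams.get('limit');
// Number('abc') is NaN, which passes straight through Math.min/Math.max,
// so fall back to the default for non-numeric input.
const limitNum = limitRaw ? Number.parseInt(limitRaw, 10) : Number.NaN;
const limit = Number.isFinite(limitNum) ? Math.min(500, Math.max(1, limitNum)) : 100;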
const events = await listErrorEvents({
portId: ctx.isSuperAdmin ? undefined : ctx.portId,
statusCode: statusCode && /^\d{3}$/.test(statusCode) ? Number(statusCode) : undefined,
from,
to,
limit,
});
return NextResponse.json({ data: events });
} catch (error) {
return errorResponse(error);
}
}),
);

@@ -74,13 +74,57 @@ export async function apiFetch<T = unknown>(url: string, opts: ApiFetchOptions =
 if (!res.ok) {
 const error = await res.json().catch(() => ({ error: res.statusText }));
-throw Object.assign(new Error(error.error ?? 'Request failed'), {
+// Surface the request id so toasts can display "Error ID: …" and
+// the user can copy it to a support ticket. Server-side wrappers
+// always set X-Request-Id, even on early-return 401/403 paths.
+const requestId = error.requestId ?? res.headers.get('x-request-id') ?? null;
+throw new ApiError({
+message: error.error ?? error.message ?? 'Request failed',
 status: res.status,
-code: error.code,
-details: error.details,
+code: error.code ?? null,
+details: error.details ?? null,
+requestId,
+retryAfter: typeof error.retryAfter === 'number' ? error.retryAfter : null,
 });
 }
 if (res.status === 204) return undefined as T;
 return res.json() as Promise<T>;
 }
/**
* Structured client-side error thrown by `apiFetch`. Carries the stable
* fields a toast / error boundary needs to render a useful message:
*
* - `message`: plain-text, ready to show to the user
* - `code`: stable error code from `src/lib/error-codes.ts`
* - `requestId`: paste this to support to find the row in
* `/admin/errors/<requestId>`
*
* Mutations should use the `toastError(err)` helper rather than reading
* these fields directly — that keeps the toast format consistent.
*/
export class ApiError extends Error {
status: number;
code: string | null;
details: unknown;
requestId: string | null;
retryAfter: number | null;
constructor(args: {
message: string;
status: number;
code: string | null;
details: unknown;
requestId: string | null;
retryAfter: number | null;
}) {
super(args.message);
this.name = 'ApiError';
this.status = args.status;
this.code = args.code;
this.details = args.details;
this.requestId = args.requestId;
this.retryAfter = args.retryAfter;
}
}

@@ -1,3 +1,5 @@
import { randomUUID } from 'node:crypto';
import { and, eq } from 'drizzle-orm';
import { NextRequest, NextResponse } from 'next/server';
@@ -8,6 +10,7 @@ import { type RolePermissions } from '@/lib/db/schema/users';
import { createAuditLog } from '@/lib/audit';
import { errorResponse } from '@/lib/errors';
import { logger } from '@/lib/logger';
import { runWithRequestContext, getRequestContext } from '@/lib/request-context';
import {
checkRateLimit,
rateLimiters,
@@ -99,11 +102,36 @@ export function withAuth(
routeContext: { params: Promise<Record<string, string>> },
) => Promise<NextResponse> {
return async (req, routeContext) => {
// Mint or accept a request id BEFORE entering the ALS frame so every
// log line + the response header reference the same value. Clients
// (or upstream proxies) may pre-supply via X-Request-Id; otherwise
// generate a fresh UUID. Pattern-validated so a crafted header can't
// smuggle log-injection chars.
const incomingId = req.headers.get('x-request-id');
const requestId =
incomingId && /^[A-Za-z0-9-]{8,64}$/.test(incomingId) ? incomingId : randomUUID();
/** Stamp `X-Request-Id` onto every response leaving the wrapper. */
const tag = (res: NextResponse): NextResponse => {
res.headers.set('X-Request-Id', requestId);
return res;
};
return runWithRequestContext(
{
requestId,
portId: '',
userId: '',
method: req.method,
path: new URL(req.url).pathname,
startedAt: Date.now(),
},
async () => {
try {
// 1. Validate session via Better Auth.
const session = await auth.api.getSession({ headers: req.headers });
if (!session?.user) {
return NextResponse.json({ error: 'Authentication required' }, { status: 401 });
return tag(NextResponse.json({ error: 'Authentication required' }, { status: 401 }));
}
// 2. Load the CRM user profile (keyed on Better Auth user ID).
@@ -111,13 +139,13 @@ export function withAuth(
where: eq(userProfiles.userId, session.user.id),
});
if (!profile || !profile.isActive) {
return NextResponse.json({ error: 'Account disabled' }, { status: 403 });
return tag(NextResponse.json({ error: 'Account disabled' }, { status: 403 }));
}
// 3. Resolve port context.
// Port ID comes from the X-Port-Id header (set by the client after port
// selection), falling back to the user's default port from preferences.
// It NEVER comes from the request body - SECURITY-GUIDELINES.md §2.1.
// 3. Resolve port context. Port id comes from the X-Port-Id
// header (set by the client after port selection), falling
// back to the user's default port preference. NEVER from the
// request body (SECURITY-GUIDELINES.md §2.1).
const portIdFromHeader = req.headers.get('X-Port-Id');
const portId =
portIdFromHeader ??
@@ -125,7 +153,7 @@ export function withAuth(
null;
if (!portId && !profile.isSuperAdmin) {
return NextResponse.json({ error: 'Port context required' }, { status: 400 });
return tag(NextResponse.json({ error: 'Port context required' }, { status: 400 }));
}
// 4. Resolve effective permissions.
@@ -134,7 +162,10 @@ export function withAuth(
if (!profile.isSuperAdmin && portId) {
const portRole = await db.query.userPortRoles.findFirst({
where: and(eq(userPortRoles.userId, profile.userId), eq(userPortRoles.portId, portId)),
where: and(
eq(userPortRoles.userId, profile.userId),
eq(userPortRoles.portId, portId),
),
with: {
role: true,
port: true,
@@ -142,7 +173,7 @@ export function withAuth(
});
if (!portRole) {
return NextResponse.json({ error: 'No access to this port' }, { status: 403 });
return tag(NextResponse.json({ error: 'No access to this port' }, { status: 403 }));
}
permissions = { ...(portRole.role.permissions as RolePermissions) };
@@ -163,9 +194,7 @@ export function withAuth(
) as RolePermissions;
}
// Per-user residential toggle - flips the residential domain on
// top of whatever the role grants. We never use it to *revoke*
// residential access from a role that already grants it.
// Per-user residential toggle.
if (portRole.residentialAccess && permissions) {
permissions = {
...permissions,
@@ -180,18 +209,23 @@ export function withAuth(
};
}
} else if (profile.isSuperAdmin && portId) {
// Super admin still needs portSlug for response context.
// We also validate the portId actually exists - a super-admin session
// must not be able to operate against a fabricated portId.
const port = await db.query.ports.findFirst({
where: eq(ports.id, portId),
});
if (!port) {
return NextResponse.json({ error: 'Port not found' }, { status: 404 });
return tag(NextResponse.json({ error: 'Port not found' }, { status: 404 }));
}
portSlug = port.slug;
}
// Now that the user + port are resolved, enrich the ALS frame
// so log lines + error_events rows pick up the identifiers.
const frame = getRequestContext();
if (frame) {
frame.userId = profile.userId;
frame.portId = portId ?? '';
}
const ctx: AuthContext = {
userId: profile.userId,
portId: portId ?? '',
@@ -207,10 +241,12 @@ export function withAuth(
};
const params = await routeContext.params;
return await handler(req, ctx, params);
return tag(await handler(req, ctx, params));
} catch (error) {
return errorResponse(error);
return tag(errorResponse(error));
}
},
);
};
}
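
The accept-or-mint step at the top of the wrapper can be exercised in isolation. This is a minimal sketch under assumptions: `resolveRequestId` is a hypothetical helper name (the change inlines the logic in `withAuth`), while the pattern is copied from the wrapper above.

```typescript
import { randomUUID } from 'node:crypto';

// Same shape check as the wrapper: 8 to 64 characters of letters, digits, and hyphens.
const REQUEST_ID_PATTERN = /^[A-Za-z0-9-]{8,64}$/;

/**
 * Hypothetical helper: keep a well-formed upstream id, otherwise mint a
 * fresh UUID so a crafted header value never reaches the logs.
 */
function resolveRequestId(incoming: string | null): string {
  return incoming !== null && REQUEST_ID_PATTERN.test(incoming) ? incoming : randomUUID();
}
```

A UUID produced by `randomUUID()` is 36 characters of hex digits and hyphens, so a minted id passes the same pattern if it is later forwarded to another service that applies this check.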

@@ -0,0 +1,49 @@
'use client';
import { toast } from 'sonner';
import { ApiError } from '@/lib/api/client';
/**
* Render an API error as a toast in the consistent platform format:
*
* ┌─────────────────────────────────────────────┐
* │ {plain-text message} │
* │ │
* │ Error code: EXPENSES_RECEIPT_REQUIRED │
* │ Reference ID: ab12-cd34-… [Copy] │
* └─────────────────────────────────────────────┘
*
* Use this anywhere a `useMutation({ onError })` would otherwise just
* call `toast.error(err.message)`. Falls back gracefully when the error
* isn't an ApiError (network errors, programmer errors, etc.).
*/
export function toastError(err: unknown, fallback = 'Something went wrong.'): void {
if (err instanceof ApiError) {
const lines: string[] = [];
if (err.code) lines.push(`Error code: ${err.code}`);
if (err.requestId) lines.push(`Reference ID: ${err.requestId}`);
toast.error(err.message, {
description: lines.length > 0 ? lines.join('\n') : undefined,
// Long enough to read the message + grab the reference id.
duration: 8_000,
action: err.requestId
? {
label: 'Copy ID',
onClick: () => {
if (typeof navigator !== 'undefined' && navigator.clipboard) {
// Confirm only after the clipboard write actually succeeds.
void navigator.clipboard
.writeText(err.requestId!)
.then(() => toast.success('Reference ID copied'));
}
},
}
: undefined,
});
return;
}
if (err instanceof Error) {
toast.error(err.message || fallback);
return;
}
toast.error(fallback);
}
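
The description formatting can be checked without rendering a toast. A minimal sketch, assuming the logic were pulled into a pure function: `buildToastDescription` and `ToastableError` are hypothetical names that do not exist in this change; the line formats are copied from `toastError` above.

```typescript
/** Minimal shape of the fields the toast reads; a stand-in for `ApiError`. */
interface ToastableError {
  message: string;
  code: string | null;
  requestId: string | null;
}

/** Hypothetical pure helper mirroring how `toastError` builds its description. */
function buildToastDescription(err: ToastableError): string | undefined {
  const lines: string[] = [];
  if (err.code) lines.push(`Error code: ${err.code}`);
  if (err.requestId) lines.push(`Reference ID: ${err.requestId}`);
  // No code and no id: return undefined so the toast shows the message alone.
  return lines.length > 0 ? lines.join('\n') : undefined;
}
```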

@@ -19,5 +19,7 @@ ALTER TABLE document_sends
ADD CONSTRAINT document_sends_sent_by_user_id_user_id_fk
FOREIGN KEY (sent_by_user_id) REFERENCES "user"(id) ON DELETE SET NULL;
CREATE INDEX IF NOT EXISTS idx_ds_sent_by
ON document_sends(sent_by_user_id);
-- Index `idx_ds_sent_by` is created earlier in 0037_missing_fk_indexes.sql
-- (also IF NOT EXISTS so re-running is a no-op). Kept here as a comment
-- reference so a future maintainer reading just this migration knows the
-- index exists rather than thinking it was forgotten.

@@ -0,0 +1,28 @@
-- Per-request error capture table powering the super-admin error
-- inspector. A user pastes the "Error ID: …" they saw on a failed
-- mutation; the admin pulls the full row.
--
-- Pruned at 90 days by the maintenance worker.
CREATE TABLE IF NOT EXISTS error_events (
request_id text PRIMARY KEY,
port_id text REFERENCES ports(id) ON DELETE SET NULL,
user_id text,
status_code integer NOT NULL,
method text NOT NULL,
path text NOT NULL,
error_name text,
error_message text,
error_stack text,
request_body_excerpt text,
user_agent text,
ip_address text,
duration_ms integer,
metadata jsonb DEFAULT '{}'::jsonb,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_error_events_port_created
ON error_events(port_id, created_at);
CREATE INDEX IF NOT EXISTS idx_error_events_status_created
ON error_events(status_code, created_at);
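
The 90-day pruning mentioned in the migration comment is not part of this diff. A minimal sketch of how the cutoff might be computed, assuming a parameterised delete: `pruneCutoff` and `PRUNE_SQL` are hypothetical names, and the actual maintenance worker may do this differently.

```typescript
const RETENTION_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Rows with created_at earlier than the returned instant are eligible for pruning. */
function pruneCutoff(now: Date, retentionDays: number = RETENTION_DAYS): Date {
  // Plain millisecond arithmetic keeps the result independent of the local time zone.
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Hypothetical statement the worker might run; $1 is bound to the cutoff.
const PRUNE_SQL = 'DELETE FROM error_events WHERE created_at < $1';
```

Neither index above leads with `created_at`, so a delete of this shape would likely not use them efficiently; whether that matters depends on how many rows the table accumulates.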

@@ -281,6 +281,13 @@
"when": 1778250000000,
"tag": "0039_expense_trip_label",
"breakpoints": true
},
{
"idx": 40,
"version": "7",
"when": 1778300000000,
"tag": "0040_error_events",
"breakpoints": true
}
]
}

@@ -251,8 +251,60 @@ export const customFieldValues = pgTable(
],
);
/**
* Per-request error capture for the super-admin inspector.
*
* Every unhandled (5xx) failure inside a route handler writes one row
* here so a user who hit "Error ID: ab12-..." can paste that id to a
* super admin who pulls the full context (status, path, body excerpt,
* stack, log lines) without grepping through pino files.
*
* Pruning: rows older than 90 days are dropped by the maintenance worker.
* Row size is bounded by storing only the head of the stack (4 KB cap)
* and a pre-truncated request body (1 KB cap).
*/
export const errorEvents = pgTable(
'error_events',
{
/**
* Equal to the request id minted in `withAuth` and surfaced to the
* client via `X-Request-Id`. Acting as the PK lets us write
* idempotently when duplicate webhook events fire — `ON CONFLICT
* DO NOTHING` skips re-inserting the same error.
*/
requestId: text('request_id').primaryKey(),
/** Resolves null when the error fired pre-port (e.g. login flow). */
portId: text('port_id').references(() => ports.id, { onDelete: 'set null' }),
/** better-auth user id; null when error fired pre-auth. */
userId: text('user_id'),
statusCode: integer('status_code').notNull(),
method: text('method').notNull(),
/** Pathname only (no query string) — keeps PII and tokens out. */
path: text('path').notNull(),
errorName: text('error_name'),
errorMessage: text('error_message'),
/** First 4 KB of the stack — full stacks live in pino, this is for inspector readability. */
errorStack: text('error_stack'),
/** Sanitized request body (max 1 KB) — secret-sounding keys redacted. */
requestBodyExcerpt: text('request_body_excerpt'),
userAgent: text('user_agent'),
ipAddress: text('ip_address'),
/** Request duration in ms when error fired. */
durationMs: integer('duration_ms'),
/** Free-form bag (e.g. parsed zod issues, db error code). */
metadata: jsonb('metadata').default({}),
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
},
(table) => [
index('idx_error_events_port_created').on(table.portId, table.createdAt),
index('idx_error_events_status_created').on(table.statusCode, table.createdAt),
],
);
export type AuditLog = typeof auditLogs.$inferSelect;
export type NewAuditLog = typeof auditLogs.$inferInsert;
export type ErrorEvent = typeof errorEvents.$inferSelect;
export type NewErrorEvent = typeof errorEvents.$inferInsert;
export type Tag = typeof tags.$inferSelect;
export type NewTag = typeof tags.$inferInsert;
export type Webhook = typeof webhooks.$inferSelect;
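
The column comments above describe a redacted, size-capped body excerpt and a truncated stack, but the code that produces them is outside this hunk. A minimal sketch under assumptions: `excerptBody`, `excerptStack`, and the key pattern are hypothetical, and the caps are counted in characters rather than bytes for simplicity.

```typescript
// ASSUMPTION: which key names count as "secret-sounding" is not shown in the source.
const SECRET_KEY_PATTERN = /pass(word)?|secret|token|api[-_]?key|authorization/i;
const BODY_CAP = 1024; // 1 KB, counted in characters here
const STACK_CAP = 4096; // 4 KB, counted in characters here

/** Serialise a request body with secret-looking keys masked, then cap its length. */
function excerptBody(body: unknown): string {
  const json =
    JSON.stringify(body, (key, value) => (SECRET_KEY_PATTERN.test(key) ? '[REDACTED]' : value)) ??
    '';
  return json.slice(0, BODY_CAP);
}

/** Keep only the head of the stack; null when there is no stack at all. */
function excerptStack(stack: string | undefined): string | null {
  return stack ? stack.slice(0, STACK_CAP) : null;
}
```

Because the cap is applied after serialisation, a long body yields an excerpt that is cut mid-value and is no longer valid JSON, which is acceptable for display but not for re-parsing.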

src/lib/error-classifier.ts

@@ -0,0 +1,216 @@
/**
* Heuristic "likely culprit" classifier for the admin error inspector.
*
* Given an `error_events` row, returns a short human-readable label +
* a longer hint pointing at the probable root cause. This is best-effort
* — the goal is to save the admin five minutes of stack reading on the
* common cases (FK violations, schema drift, external service outages,
* timeouts) without giving false confidence on the unusual ones.
*
* The classifier reads from data already on the row (no DB lookups),
* so it can run inside a server-component render with no extra cost.
*/
import type { ErrorEvent } from '@/lib/db/schema/system';
export interface LikelyCulprit {
/** Short label for a badge / column. */
label: string;
/** Longer hint shown in the detail view, with a "next step" suggestion. */
hint: string;
/** Subsystem tag for filtering: 'db' | 'storage' | 'email' | … */
subsystem: string;
}
/**
* Postgres SQLSTATE codes commonly thrown by `postgres` driver wrappers.
* Drizzle bubbles these up unchanged on the `code` field. We translate
* the most-common ones into plain-English admin hints.
*/
const PG_CODE_HINTS: Record<string, LikelyCulprit> = {
'23502': {
label: 'NOT NULL violation',
hint: 'A required column was missing on insert. Check the validator vs the schema — a recently added .notNull() column may not have a default.',
subsystem: 'db',
},
'23503': {
label: 'FK violation',
hint: 'A referenced row no longer exists (or never did). Check whether the parent was archived/deleted, or the FK was created without ON DELETE handling.',
subsystem: 'db',
},
'23505': {
label: 'Unique violation',
hint: 'A duplicate value hit a UNIQUE index. Common causes: duplicate name within the same port, retried writes after a transient error, or a partial unique index that should fire.',
subsystem: 'db',
},
'23514': {
label: 'CHECK violation',
hint: 'A value failed a CHECK constraint (e.g. polymorphic discriminator outside its allowed set). Verify the input matches the schema enum.',
subsystem: 'db',
},
'42703': {
label: 'Schema drift',
hint: 'A column referenced by the query does not exist in the database. The most recent migration probably has not been applied — run pnpm db:push or apply the SQL file.',
subsystem: 'db',
},
'42P01': {
label: 'Missing table',
hint: 'The query referenced a table that does not exist. Either a migration is unapplied or the table was renamed.',
subsystem: 'db',
},
'40001': {
label: 'Serialization failure',
hint: 'Two transactions raced. Retrying the operation usually resolves it; if it persists, look for hot-row contention.',
subsystem: 'db',
},
'57014': {
label: 'Query cancelled',
hint: 'The query exceeded the configured timeout or the client disconnected mid-flight.',
subsystem: 'db',
},
'53300': {
label: 'Connection limit',
hint: 'The database connection pool is exhausted. Look for leaked connections or scale the pool.',
subsystem: 'db',
},
};
/** Classify by error name (stable across providers). */
const ERROR_NAME_HINTS: Record<string, LikelyCulprit> = {
AbortError: {
label: 'Request aborted',
hint: 'The client disconnected (closed the tab, navigated away) before the response finished. Usually benign.',
subsystem: 'request',
},
TimeoutError: {
label: 'Timeout',
hint: 'An upstream call exceeded its time budget. Check the external service health (Documenso, MinIO, OpenAI, SMTP).',
subsystem: 'request',
},
FetchError: {
label: 'External service unreachable',
hint: 'A fetch() call failed. Likely the Documenso/MinIO/OpenAI/SMTP host is down or blocked by firewall.',
subsystem: 'integration',
},
ZodError: {
label: 'Validation',
hint: 'A zod schema rejected the input. The details array on the response shows which fields failed.',
subsystem: 'validation',
},
};
/** Classify by stack-path heuristics. The first match wins. */
const STACK_PATH_HINTS: Array<{ pattern: RegExp; culprit: LikelyCulprit }> = [
{
pattern: /\/lib\/storage\//,
culprit: {
label: 'Storage backend',
hint: 'Failure inside MinIO/S3 or the filesystem proxy. Check storage availability and the backend config in admin > storage.',
subsystem: 'storage',
},
},
{
pattern: /\/lib\/email\//,
culprit: {
label: 'Email subsystem',
hint: 'SMTP/IMAP error. Check the configured account credentials in admin > email and the SMTP provider status.',
subsystem: 'email',
},
},
{
pattern: /documenso/i,
culprit: {
label: 'Documenso integration',
hint: 'Failure talking to Documenso. Check the API host + key in admin > integrations and verify Documenso uptime.',
subsystem: 'integration',
},
},
{
pattern: /openai|claude/i,
culprit: {
label: 'AI provider',
hint: 'OpenAI/Claude call failed. Likely a provider outage, expired key, or rate-limit ceiling. Falls back to template draft when available.',
subsystem: 'integration',
},
},
{
pattern: /\/queue\/workers\//,
culprit: {
label: 'Background worker',
hint: 'A BullMQ job threw. Check Redis health and the worker logs for the failed job id.',
subsystem: 'queue',
},
},
];
/** Classify by free-text scan of the error message — last-resort. */
const MESSAGE_HINTS: Array<{ pattern: RegExp; culprit: LikelyCulprit }> = [
{
pattern: /econnrefused|enotfound|getaddrinfo/i,
culprit: {
label: 'Network connection refused',
hint: 'A network call could not reach its host. Check that the dependency (DB, Redis, MinIO, SMTP) is running and reachable.',
subsystem: 'integration',
},
},
{
pattern: /rate.?limit/i,
culprit: {
label: 'Rate limited',
hint: 'An upstream provider rate-limited us. Common with SMTP, OpenAI, and Documenso. Back off or raise the per-port cap.',
subsystem: 'integration',
},
},
{
pattern: /unauthorized|invalid.?(api.?)?key|401/i,
culprit: {
label: 'Auth failure',
hint: 'A credential was rejected by an upstream service. Check the encrypted secrets in admin > integrations.',
subsystem: 'integration',
},
},
{
pattern: /timeout/i,
culprit: {
label: 'Timeout',
hint: 'An operation exceeded its time budget. May be a slow upstream call or a heavy DB query.',
subsystem: 'request',
},
},
];
/**
* Best-effort culprit classification. Returns null when nothing
* matches — the inspector will display "Uncategorized".
*/
export function classifyError(row: ErrorEvent): LikelyCulprit | null {
// 1. Postgres SQLSTATE on the metadata bag.
const meta = row.metadata as { code?: unknown; pgCode?: unknown } | null;
const pgCode =
(typeof meta?.code === 'string' && meta.code) ||
(typeof meta?.pgCode === 'string' && meta.pgCode) ||
null;
if (pgCode && PG_CODE_HINTS[pgCode]) return PG_CODE_HINTS[pgCode];
// 2. Error class name.
if (row.errorName) {
const named = ERROR_NAME_HINTS[row.errorName];
if (named) return named;
}
// 3. Stack path heuristics.
if (row.errorStack) {
for (const { pattern, culprit } of STACK_PATH_HINTS) {
if (pattern.test(row.errorStack)) return culprit;
}
}
// 4. Message free-text.
if (row.errorMessage) {
for (const { pattern, culprit } of MESSAGE_HINTS) {
if (pattern.test(row.errorMessage)) return culprit;
}
}
return null;
}
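
The lookup order in `classifyError` means a SQLSTATE on the metadata bag outranks anything found in the message text. A reduced sketch of that precedence, with one entry per table: `classify`, `Row`, and `Culprit` are local stand-ins, not the exported API.

```typescript
interface Culprit {
  label: string;
  subsystem: string;
}

interface Row {
  errorMessage: string | null;
  metadata: unknown;
}

// Reduced stand-ins for the tables above: one entry each.
const BY_PG_CODE = new Map<string, Culprit>([
  ['23505', { label: 'Unique violation', subsystem: 'db' }],
]);
const BY_MESSAGE: Array<{ pattern: RegExp; culprit: Culprit }> = [
  {
    pattern: /econnrefused|enotfound|getaddrinfo/i,
    culprit: { label: 'Network connection refused', subsystem: 'integration' },
  },
];

function classify(row: Row): Culprit | null {
  // 1. The structured code wins when present and recognised.
  const meta = row.metadata as { code?: unknown } | null;
  if (typeof meta?.code === 'string') {
    const hit = BY_PG_CODE.get(meta.code);
    if (hit) return hit;
  }
  // 2. Free-text scan of the message is the last resort.
  if (row.errorMessage) {
    for (const { pattern, culprit } of BY_MESSAGE) {
      if (pattern.test(row.errorMessage)) return culprit;
    }
  }
  return null;
}
```

An unrecognised code falls through to the later stages rather than returning early, which matches the behaviour of the full function.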

src/lib/error-codes.ts

@@ -0,0 +1,217 @@
/**
* Error code registry.
*
* Every code is a stable identifier you can quote in a support ticket.
* The catalog drives:
* - the HTTP status returned to the client
* - the user-facing plain-text message (no jargon, no internal terms)
* - the documentation page that lists every code with cause + fix
*
* **Naming convention**: SCREAMING_SNAKE_CASE, prefixed with the domain.
* `EXPENSES_RECEIPT_REQUIRED`
* `BERTHS_PDF_MOORING_MISMATCH`
* `STORAGE_FILE_TOO_LARGE`
*
* **Stability contract**: codes are NEVER renamed once shipped. If the
* underlying meaning shifts, retire the old code by marking it
* deprecated (leave it in the registry forwarding to a new code) and
* add a new one. UI / docs / external integrations may pin to a code.
*
* The plain-text messages are written for the rep on the phone with
* the customer — no "constraint violation", no "FK", no internal
* service names. The error code is the only technical artifact the
* user sees, alongside the request id (`X-Request-Id`).
*/
export interface ErrorCodeEntry {
status: number;
/** Plain-language message shown to end-users (toast / inline). */
userMessage: string;
/** Optional: short hint surfaced under the message in admin views. */
hint?: string;
}
/**
* The full catalog. Adding a new code is a one-line entry — services
* pass the key to `new CodedError('FOO_BAR')` and the rest is automatic.
*/
export const ERROR_CODES = {
// ─── Generic ─────────────────────────────────────────────────────────
INTERNAL: {
status: 500,
userMessage:
'Something went wrong on our end. Please try again, and quote the error ID below if it keeps happening.',
},
UNAUTHORIZED: {
status: 401,
userMessage: 'Please sign in to continue.',
},
SESSION_EXPIRED: {
status: 401,
userMessage: 'Your session has expired. Please sign in again.',
},
FORBIDDEN: {
status: 403,
userMessage: "You don't have permission to do that. Ask an admin if you think you should.",
},
NOT_FOUND: {
status: 404,
userMessage: "We couldn't find what you were looking for. It may have been removed.",
},
RATE_LIMITED: {
status: 429,
userMessage: "You've done that a lot in a short time. Please wait a moment and try again.",
},
// ─── Generic validation ─────────────────────────────────────────────
VALIDATION_ERROR: {
status: 400,
userMessage:
"Some of the information you entered isn't valid. Please check the highlighted fields.",
},
REQUIRED_FIELD_MISSING: {
status: 400,
userMessage: 'A required field is missing.',
},
INVALID_EMAIL: {
status: 400,
userMessage: "That email address doesn't look right.",
},
INVALID_DATE: {
status: 400,
userMessage: "That date doesn't look right.",
},
DUPLICATE_NAME: {
status: 409,
userMessage: 'Something with that name already exists. Try a different name.',
},
// ─── Cross-tenant + auth ────────────────────────────────────────────
PORT_CONTEXT_REQUIRED: {
status: 400,
userMessage: 'Please select a port first.',
},
CROSS_PORT_LINK_REJECTED: {
status: 400,
userMessage: 'You can only link records that belong to the same port.',
},
// ─── Expenses ───────────────────────────────────────────────────────
EXPENSES_RECEIPT_REQUIRED: {
status: 400,
userMessage:
"Please attach a receipt or tick the 'I have no receipt' acknowledgement before saving.",
},
EXPENSES_INVOICE_LINKED: {
status: 409,
userMessage:
"This expense is linked to a non-draft invoice and can't be archived. Detach it from the invoice first.",
},
// ─── Berths ─────────────────────────────────────────────────────────
BERTHS_PDF_MAGIC_BYTE: {
status: 400,
userMessage:
"That file doesn't look like a real PDF. Please re-export it from the original source.",
},
BERTHS_PDF_TOO_LARGE: {
status: 413,
userMessage:
'That PDF is too large. Reduce the file size below the configured upload cap and try again.',
},
BERTHS_PDF_EMPTY: {
status: 400,
userMessage: 'That PDF is empty (0 bytes). Please upload the actual file.',
},
BERTHS_PDF_MOORING_MISMATCH: {
status: 409,
userMessage:
"The mooring number in the PDF doesn't match the berth you're uploading to. Confirm to override or upload to the right berth.",
},
BERTHS_VERSION_ALREADY_CURRENT: {
status: 409,
userMessage: "That PDF version is already the active one — there's nothing to roll back to.",
},
// ─── Recommender ────────────────────────────────────────────────────
RECOMMENDER_INTEREST_PORT_MISMATCH: {
status: 400,
userMessage: "The interest you're trying to recommend berths for belongs to a different port.",
},
// ─── Storage ────────────────────────────────────────────────────────
STORAGE_FILE_TOO_LARGE: {
status: 413,
userMessage: 'That file is too large.',
},
STORAGE_INVALID_FILE_TYPE: {
status: 400,
userMessage: "That file type isn't allowed here.",
},
STORAGE_NOT_FOUND: {
status: 404,
userMessage: "We couldn't find that file. It may have been removed.",
},
STORAGE_PROXY_TOKEN_INVALID: {
status: 403,
userMessage: 'That download link is invalid or has expired.',
},
// ─── Documenso / Documents ──────────────────────────────────────────
DOCUMENT_TEMPLATE_MISSING_FIELD: {
status: 400,
userMessage:
'The document template is missing a required field. Ask an admin to update the template.',
},
DOCUMENT_UNRESOLVED_TOKENS: {
status: 400,
userMessage:
'The document still has unfilled placeholders. Please complete them before sending.',
},
DOCUMENT_TEMPLATE_NOT_FOUND: {
status: 404,
userMessage: 'That document template is missing or has been removed.',
},
// ─── Send-outs / Email ──────────────────────────────────────────────
EMAIL_RECIPIENT_MISSING: {
status: 400,
userMessage:
'No email address on file for this recipient. Add one to the client first, then try again.',
},
EMAIL_BODY_TOO_LARGE: {
status: 400,
userMessage: 'The email body is too long. Please trim it down and try again.',
},
EMAIL_RATE_LIMIT_HOURLY: {
status: 429,
userMessage: "You've hit the hourly send limit. Please wait a bit before sending more.",
},
EMAIL_BROCHURE_ARCHIVED: {
status: 400,
userMessage: 'That brochure is archived and can no longer be sent.',
},
// ─── EOI / Interests ────────────────────────────────────────────────
EOI_NO_BERTH_LINKED: {
status: 400,
userMessage: 'This interest has no berth linked yet. Link a berth before generating the EOI.',
},
INTEREST_INVALID_STAGE_TRANSITION: {
status: 400,
userMessage: "That stage change isn't allowed from the current pipeline stage.",
},
// ─── Public form intake ─────────────────────────────────────────────
PUBLIC_INTAKE_SECRET_MISMATCH: {
status: 403,
userMessage: 'This request was rejected by the security check.',
},
} as const satisfies Record<string, ErrorCodeEntry>;
export type ErrorCode = keyof typeof ERROR_CODES;
/** Type-guard: is `s` one of our registered codes? */
export function isErrorCode(s: string): s is ErrorCode {
return Object.prototype.hasOwnProperty.call(ERROR_CODES, s);
}
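
The stability contract in the header comment says a retired code stays in the registry and forwards to its replacement, but the catalog above has no field for that yet. A minimal sketch of one way it could look: the `replacedBy` field, the `EXAMPLE_*` codes, and `resolveCode` are all hypothetical.

```typescript
interface Entry {
  status: number;
  userMessage: string;
  /** Hypothetical: set on a deprecated code to name its successor. */
  replacedBy?: string;
}

// Illustrative registry; these codes are placeholders, not real catalog entries.
const REGISTRY = new Map<string, Entry>([
  ['EXAMPLE_OLD', { status: 400, userMessage: 'Old wording.', replacedBy: 'EXAMPLE_NEW' }],
  ['EXAMPLE_NEW', { status: 400, userMessage: 'New wording.' }],
]);

/** Follow `replacedBy` links until a live entry is found; null if unknown or cyclic. */
function resolveCode(code: string): { code: string; entry: Entry } | null {
  const seen = new Set<string>();
  let current = code;
  while (!seen.has(current)) {
    seen.add(current);
    const entry = REGISTRY.get(current);
    if (entry === undefined) return null;
    if (entry.replacedBy === undefined) return { code: current, entry };
    current = entry.replacedBy;
  }
  return null; // a forwarding cycle would otherwise loop forever
}
```

With this shape, an integration pinned to the old code keeps resolving to a valid message, while new throw sites use only the successor.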

@@ -2,6 +2,9 @@ import { NextResponse } from 'next/server';
import { ZodError } from 'zod';
import { logger } from '@/lib/logger';
import { getRequestId } from '@/lib/request-context';
import { captureErrorEvent } from '@/lib/services/error-events.service';
import { ERROR_CODES, type ErrorCode } from '@/lib/error-codes';
export class AppError extends Error {
constructor(
@@ -14,20 +17,63 @@ export class AppError extends Error {
}
}
/**
* Throw site for any registered error code. Consolidates the
* status + plain-text message + stable code into one constructor.
*
* throw new CodedError('EXPENSES_RECEIPT_REQUIRED');
*
* Pass `details` for structured payload (e.g. zod validation issues),
* or `internalMessage` for an admin-only string that lands in the
* error_events row but is NEVER returned to the user (the user gets
* the plain-text message from the registry).
*/
export class CodedError extends AppError {
/** Optional structured details surfaced to the client. */
public details?: unknown;
/** Optional verbose message for admin logs only — never sent to client. */
public internalMessage?: string;
constructor(code: ErrorCode, opts: { details?: unknown; internalMessage?: string } = {}) {
const def = ERROR_CODES[code];
super(def.status, def.userMessage, code);
this.name = 'CodedError';
this.details = opts.details;
this.internalMessage = opts.internalMessage;
}
}
/**
* Backwards-compat shims: these existing subclasses are still used in
* lots of places; new throw sites should prefer `CodedError` so the
* code surfaces in the registry.
*
* Messages have been rewritten to plain language (no internal jargon)
* so the user-facing toast reads naturally even before a service is
* migrated to a specific CodedError code.
*/
export class NotFoundError extends AppError {
constructor(entity: string) {
super(404, `${entity} not found`, 'NOT_FOUND');
// Plain-text version of "X not found" — the registered code stays
// generic until callers migrate to specific codes per entity.
super(
404,
`We couldn't find that ${entity.toLowerCase()}. It may have been removed.`,
'NOT_FOUND',
);
}
}
export class ForbiddenError extends AppError {
constructor(message = 'Insufficient permissions') {
constructor(
message = "You don't have permission to do that. Ask an admin if you think you should.",
) {
super(403, message, 'FORBIDDEN');
}
}
export class UnauthorizedError extends AppError {
constructor(message = 'Unauthorized') {
constructor(message = 'Please sign in to continue.') {
super(401, message, 'UNAUTHORIZED');
}
}
@@ -49,44 +95,84 @@ export class ConflictError extends AppError {
export class RateLimitError extends AppError {
constructor(public retryAfter: number) {
super(429, 'Too many requests', 'RATE_LIMITED');
super(
429,
"You've done that a lot in a short time. Please wait a moment and try again.",
'RATE_LIMITED',
);
}
}
/**
* Converts any thrown value into a sanitised NextResponse.
* Never leaks stack traces, internal paths, or database error details to the client.
*
* Always attaches the active `X-Request-Id` to:
* - the response header (so a curl/dev-tools user can see it)
* - the JSON body (so a UI toast can surface "Error ID: …")
*
* For unhandled (5xx) errors, also persists a row to `error_events`
* so a super admin can paste the request id into the inspector and
* pull the full stack + body excerpt + log lines.
*
* Never leaks stack traces, internal paths, or DB error details to
* the client — that data goes to pino + the error_events row only.
*/
export function errorResponse(error: unknown): NextResponse {
const requestId = getRequestId();
const headers = requestId ? { 'X-Request-Id': requestId } : undefined;
if (error instanceof AppError) {
const body: Record<string, unknown> = {
error: error.message,
code: error.code,
};
if (requestId) body.requestId = requestId;
if (error instanceof ValidationError && error.details) {
body.details = error.details;
}
if (error instanceof CodedError && error.details !== undefined) {
body.details = error.details;
}
if (error instanceof RateLimitError) {
body.retryAfter = error.retryAfter;
}
return NextResponse.json(body, { status: error.statusCode });
// 4xx errors are user-action mistakes (validation, not-found,
// permission). They DON'T go to error_events; that table is for
// platform faults the super admin needs to triage. Only 5xx
// responses are persisted. When the error is a CodedError, its
// admin-only internalMessage is stored in the row's metadata.
if (error.statusCode >= 500) {
void captureErrorEvent({
statusCode: error.statusCode,
error,
metadata: error instanceof CodedError ? { internalMessage: error.internalMessage } : {},
});
}
return NextResponse.json(body, { status: error.statusCode, headers });
}
if (error instanceof ZodError) {
return NextResponse.json(
{
const body: Record<string, unknown> = {
error: 'Validation failed',
code: 'VALIDATION_ERROR',
details: error.errors.map((e) => ({
field: e.path.join('.'),
message: e.message,
})),
},
{ status: 400 },
);
};
if (requestId) body.requestId = requestId;
return NextResponse.json(body, { status: 400, headers });
}
// Log full details server-side; never send them to the client.
// Unhandled — full details to pino + persist to error_events.
logger.error({ err: error }, 'Unhandled error');
return NextResponse.json({ error: 'Internal server error' }, { status: 500 });
void captureErrorEvent({ statusCode: 500, error });
const body: Record<string, unknown> = { error: 'Internal server error', code: 'INTERNAL' };
if (requestId) {
body.requestId = requestId;
body.message = `Something went wrong on our end. Quote error ID ${requestId} when reporting this.`;
}
return NextResponse.json(body, { status: 500, headers });
}
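
The envelope rules can be summarised without Next.js: known application errors keep their own message and code, while anything else collapses to a generic 500 that never echoes the thrown message. A minimal sketch, assuming a local `AppErrorLike` stand-in and a hypothetical `toEnvelope` function; persistence and logging are left out.

```typescript
// Local stand-in for `AppError`, so the sketch runs on its own.
class AppErrorLike extends Error {
  statusCode: number;
  code: string;

  constructor(statusCode: number, message: string, code: string) {
    super(message);
    this.statusCode = statusCode;
    this.code = code;
  }
}

interface Envelope {
  status: number;
  body: Record<string, unknown>;
}

function toEnvelope(error: unknown, requestId: string | null): Envelope {
  if (error instanceof AppErrorLike) {
    const body: Record<string, unknown> = { error: error.message, code: error.code };
    if (requestId) body['requestId'] = requestId;
    return { status: error.statusCode, body };
  }
  // Unknown throwables: never copy their message into the response.
  const body: Record<string, unknown> = { error: 'Internal server error', code: 'INTERNAL' };
  if (requestId) {
    body['requestId'] = requestId;
    body['message'] = `Something went wrong on our end. Quote error ID ${requestId} when reporting this.`;
  }
  return { status: 500, body };
}
```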

@@ -1,7 +1,26 @@
import pino from 'pino';
import { getRequestContext } from '@/lib/request-context';
export const logger = pino({
level: process.env.LOG_LEVEL ?? 'info',
/**
* Mix the active request context (request id, port id, user id) into
* EVERY log line emitted within an API request — this is what makes
* the super-admin error inspector usable: paste a request id into the
* search and every log line that fired during that request comes back
* keyed to it. Outside a request (queue jobs, scheduled tasks) the
* mixin returns an empty object so the log line is unchanged.
*/
mixin() {
const ctx = getRequestContext();
if (!ctx) return {};
return {
requestId: ctx.requestId,
portId: ctx.portId || undefined,
userId: ctx.userId || undefined,
};
},
redact: {
paths: [
'password',

@@ -0,0 +1,69 @@
/**
* Per-request context propagated via AsyncLocalStorage.
*
* Every API request carries a {requestId, portId, userId, method, path}
* bag (portId and userId start empty and are filled in by `withAuth`
* once the session resolves) that is available from anywhere in the call stack
* without threading it through every function signature. This is what
* lets `logger.info(...)` deep inside a service call automatically
* stamp the originating request id into the log line, and what lets
* `errorResponse` know which request to attach to a persisted
* `error_events` row.
*
* Why ALS over an explicit context arg: 80% of the codebase is already
* written; threading `RequestContext` through every helper would touch
* hundreds of files and break domain isolation. ALS is the standard
* Node-side pattern (Express + Pino + many production services use
* the exact same approach).
*
* Wiring:
* - The `withAuth` wrapper in `src/lib/api/helpers.ts` calls
* `runWithRequestContext({...}, () => handler(...))` so every code
* path inside the request runs inside the ALS frame.
* - The pino logger in `src/lib/logger.ts` mixes `getRequestContext()`
* into every emitted log line via the `mixin` hook.
* - `errorResponse(err)` reads the same context to build the user-
* facing error envelope and to persist a row to `error_events`.
*/
import { AsyncLocalStorage } from 'node:async_hooks';
export interface RequestContext {
/** UUID — surfaces in `X-Request-Id` response header + every log line. */
requestId: string;
/** Active port for this request (empty string for super-admin pre-port). */
portId: string;
/** better-auth user id (empty string for unauthenticated paths). */
userId: string;
/** HTTP method — recorded for error_events triage. */
method: string;
/** Pathname (no query string) — recorded for error_events triage. */
path: string;
/** Wall-clock ms timestamp at request entry. Used for duration metrics. */
startedAt: number;
}
const store = new AsyncLocalStorage<RequestContext>();
/**
* Run `fn` inside a request-context frame. Every call within the
* resulting call stack, including awaited async work and service-layer
* DB queries started inside `fn`, sees the same context via
* `getRequestContext()`. Work picked up later by a separate queue
* worker runs outside the frame and sees `null`.
*/
export function runWithRequestContext<T>(ctx: RequestContext, fn: () => T): T {
return store.run(ctx, fn);
}
/**
* Read the current request context, or `null` when called outside a
* request (e.g. queue worker, scheduled job). Callers must handle the
* null case — the logger mixin does so gracefully.
*/
export function getRequestContext(): RequestContext | null {
return store.getStore() ?? null;
}
/** Convenience accessor for the most common field. */
export function getRequestId(): string | null {
return store.getStore()?.requestId ?? null;
}
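
The propagation described in the file above can be checked in isolation. The following is a minimal sketch, not the project's code: `demoStore`, `DemoContext`, `deepServiceCall`, and `handleRequest` are hypothetical stand-ins (the last one plays the role of the `withAuth` wrapper), and only `AsyncLocalStorage` itself is a real API.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical, trimmed-down context: the real RequestContext has more fields.
interface DemoContext {
  requestId: string;
  startedAt: number;
}

const demoStore = new AsyncLocalStorage<DemoContext>();

function currentRequestId(): string | null {
  return demoStore.getStore()?.requestId ?? null;
}

// Stands in for a service-layer function that receives no context argument.
async function deepServiceCall(): Promise<string | null> {
  // The await crosses an async boundary; the frame is still visible afterwards.
  await new Promise((resolve) => setTimeout(resolve, 5));
  return currentRequestId();
}

// Hypothetical stand-in for the wrapper that opens the frame per request.
function handleRequest(requestId: string): Promise<string | null> {
  return demoStore.run({ requestId, startedAt: Date.now() }, () => deepServiceCall());
}

async function demo(): Promise<void> {
  // Two overlapping "requests": each one reads back its own id.
  const [a, b] = await Promise.all([handleRequest('req-a'), handleRequest('req-b')]);
  console.log(a, b); // prints "req-a req-b"
  console.log(currentRequestId()); // prints "null" (outside any frame)
}

void demo();
```

The point of the sketch is that `deepServiceCall` never takes the id as a parameter, and concurrent requests do not see each other's context.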

@@ -18,7 +18,7 @@ import { and, desc, eq, isNull, max, sql } from 'drizzle-orm';
import { db } from '@/lib/db';
import { berths, berthPdfVersions } from '@/lib/db/schema/berths';
import { systemSettings } from '@/lib/db/schema/system';
import { ConflictError, NotFoundError, ValidationError } from '@/lib/errors';
import { CodedError, ConflictError, NotFoundError, ValidationError } from '@/lib/errors';
import { logger } from '@/lib/logger';
import { getStorageBackend } from '@/lib/storage';
@@ -218,15 +218,13 @@ export async function uploadBerthPdf(args: UploadBerthPdfArgs): Promise<UploadBe
if (!isPdfMagic(buffer)) {
// Best-effort cleanup if the storage already has a partial.
if (args.storageKey) await backend.delete(args.storageKey).catch(() => undefined);
throw new ValidationError(
'Uploaded file failed PDF magic-byte check (does not start with %PDF-).',
);
throw new CodedError('BERTHS_PDF_MAGIC_BYTE');
}
if (buffer.length === 0) throw new ValidationError('Uploaded PDF is empty (0 bytes).');
if (buffer.length === 0) throw new CodedError('BERTHS_PDF_EMPTY');
if (buffer.length > maxBytes) {
throw new ValidationError(
`PDF exceeds ${maxMb} MB upload cap (got ${(buffer.length / 1024 / 1024).toFixed(1)} MB).`,
);
throw new CodedError('BERTHS_PDF_TOO_LARGE', {
internalMessage: `PDF exceeds ${maxMb} MB upload cap (got ${(buffer.length / 1024 / 1024).toFixed(1)} MB).`,
});
}
const written = await backend.put(storageKey, buffer, { contentType: 'application/pdf' });
storageKey = written.key;
@@ -240,13 +238,13 @@ export async function uploadBerthPdf(args: UploadBerthPdfArgs): Promise<UploadBe
}
if (head.sizeBytes === 0) {
await backend.delete(args.storageKey).catch(() => undefined);
throw new ValidationError('Uploaded PDF is empty (0 bytes).');
throw new CodedError('BERTHS_PDF_EMPTY');
}
if (head.sizeBytes > maxBytes) {
await backend.delete(args.storageKey).catch(() => undefined);
throw new ValidationError(
`PDF exceeds ${maxMb} MB upload cap (got ${(head.sizeBytes / 1024 / 1024).toFixed(1)} MB).`,
);
throw new CodedError('BERTHS_PDF_TOO_LARGE', {
internalMessage: `PDF exceeds ${maxMb} MB upload cap (got ${(head.sizeBytes / 1024 / 1024).toFixed(1)} MB).`,
});
}
if (head.contentType !== 'application/pdf' && head.contentType !== 'application/octet-stream') {
await backend.delete(args.storageKey).catch(() => undefined);
@@ -607,7 +605,7 @@ export async function rollbackToVersion(
if (!berthRow) throw new NotFoundError('Berth');
if (berthRow.currentPdfVersionId === versionId) {
throw new ConflictError('That version is already current; rollback is a no-op.');
throw new CodedError('BERTHS_VERSION_ALREADY_CURRENT');
}
await db

@@ -34,6 +34,7 @@ import { and, eq, inArray, sql } from 'drizzle-orm';
import { db } from '@/lib/db';
import { systemSettings } from '@/lib/db/schema/system';
import { interests } from '@/lib/db/schema/interests';
import { CodedError } from '@/lib/errors';
// ─── Settings ──────────────────────────────────────────────────────────────
@@ -395,9 +396,9 @@ export async function recommendBerths(args: RecommendBerthsArgs): Promise<Recomm
if (!interestInput) return [];
if (interestInput.portId !== args.portId) {
// Defensive: caller passed a port that doesn't own this interest.
throw new Error(
`Recommender: interest ${args.interestId} belongs to port ${interestInput.portId}, not ${args.portId}`,
);
throw new CodedError('RECOMMENDER_INTEREST_PORT_MISMATCH', {
internalMessage: `interest ${args.interestId} belongs to port ${interestInput.portId}, not ${args.portId}`,
});
}
const oversizePct = args.maxOversizePct ?? settings.maxOversizePct;

@@ -0,0 +1,172 @@
/**
* Error event capture + retrieval.
*
* `captureErrorEvent(...)` is called from `errorResponse(...)` whenever
* an unhandled (5xx) error fires inside a route handler. It pulls the
* request context from AsyncLocalStorage, sanitizes the payload, and
* inserts one row into `error_events`. Failure to write must NEVER
* throw — the caller is already in the error path.
*
* `listErrorEvents` / `getErrorEventById` back the super-admin inspector.
*/
import { and, desc, eq, gte, lte } from 'drizzle-orm';
import { db } from '@/lib/db';
import { errorEvents, type ErrorEvent } from '@/lib/db/schema/system';
import { logger } from '@/lib/logger';
import { getRequestContext } from '@/lib/request-context';
const STACK_MAX_BYTES = 4 * 1024;
const BODY_MAX_BYTES = 1 * 1024;
/** Keys whose values are never persisted to the body excerpt. */
const SENSITIVE_KEYS = new Set([
'password',
'newPassword',
'oldPassword',
'token',
'secret',
'apiKey',
'accessKey',
'secretKey',
'creditCard',
'cardNumber',
'cvv',
'ssn',
'authorization',
]);
/** Replace sensitive values with '[REDACTED]' and cap the serialized JSON length. */
function sanitizeBody(body: unknown): string | null {
if (body === null || body === undefined) return null;
let cloned: unknown;
try {
cloned = JSON.parse(JSON.stringify(body));
} catch {
return null;
}
function walk(value: unknown): unknown {
if (Array.isArray(value)) return value.map(walk);
if (value && typeof value === 'object') {
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(value)) {
if (SENSITIVE_KEYS.has(k)) {
out[k] = '[REDACTED]';
} else {
out[k] = walk(v);
}
}
return out;
}
return value;
}
const sanitized = walk(cloned);
let serialized: string;
try {
serialized = JSON.stringify(sanitized);
} catch {
return null;
}
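if (Buffer.byteLength(serialized, 'utf8') > BODY_MAX_BYTES) {
// Truncate by UTF-8 bytes (not UTF-16 code units) so the cap matches the check above.
serialized =
Buffer.from(serialized, 'utf8').subarray(0, BODY_MAX_BYTES).toString('utf8') + '…[truncated]';
}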
return serialized;
}
interface CaptureArgs {
statusCode: number;
error: unknown;
/** Optional structured metadata (e.g. zod issues parsed from a ZodError). */
metadata?: Record<string, unknown>;
/** Request body (JSON-serializable). Passed through `sanitizeBody` before it is persisted. Optional. */
body?: unknown;
}
/**
* Persist an error_events row tied to the active request context.
* Best-effort — silently swallows any DB failure (the caller is
* already returning the user an error response; we do NOT want to
* mask the original error with a logging-pipeline failure).
*/
export async function captureErrorEvent(args: CaptureArgs): Promise<void> {
const ctx = getRequestContext();
if (!ctx) {
// Outside a request context (e.g. queue worker). Skip without logging:
// the queue has its own failure-capture in BullMQ.
return;
}
try {
const err = args.error;
const errorName = err instanceof Error ? err.name : typeof err;
const errorMessage = err instanceof Error ? err.message : err === undefined ? '' : String(err);
const stack = err instanceof Error && err.stack ? err.stack.slice(0, STACK_MAX_BYTES) : null;
const durationMs = Date.now() - ctx.startedAt;
// Pull through any well-known fields the upstream library decorated
// onto the error — Postgres driver uses `code` (SQLSTATE) and
// `severity`, fetch errors carry `cause.code`, etc. The classifier
// reads from `metadata.code` to drive the "likely culprit" badge.
const enriched: Record<string, unknown> = { ...(args.metadata ?? {}) };
if (err && typeof err === 'object') {
const e = err as { code?: unknown; severity?: unknown; cause?: { code?: unknown } };
if (typeof e.code === 'string') enriched.code = e.code;
if (typeof e.severity === 'string') enriched.severity = e.severity;
if (e.cause && typeof e.cause === 'object' && typeof e.cause.code === 'string') {
enriched.causeCode = e.cause.code;
}
}
await db
.insert(errorEvents)
.values({
requestId: ctx.requestId,
portId: ctx.portId || null,
userId: ctx.userId || null,
statusCode: args.statusCode,
method: ctx.method,
path: ctx.path,
errorName,
errorMessage,
errorStack: stack,
requestBodyExcerpt: sanitizeBody(args.body),
metadata: enriched,
durationMs,
})
.onConflictDoNothing();
} catch (writeErr) {
// Logged but never thrown — the caller is in the error path already.
logger.error({ err: writeErr }, 'Failed to persist error_events row');
}
}
export interface ListErrorEventsFilter {
portId?: string;
statusCode?: number;
/** ISO date strings. When omitted, no date bound is applied here; any default window is the caller's responsibility. */
from?: string;
to?: string;
limit?: number;
}
export async function listErrorEvents(filter: ListErrorEventsFilter): Promise<ErrorEvent[]> {
const conditions = [];
if (filter.portId) conditions.push(eq(errorEvents.portId, filter.portId));
if (filter.statusCode) conditions.push(eq(errorEvents.statusCode, filter.statusCode));
if (filter.from) conditions.push(gte(errorEvents.createdAt, new Date(filter.from)));
if (filter.to) conditions.push(lte(errorEvents.createdAt, new Date(filter.to)));
return db
.select()
.from(errorEvents)
.where(conditions.length ? and(...conditions) : undefined)
.orderBy(desc(errorEvents.createdAt))
.limit(filter.limit ?? 100);
}
export async function getErrorEventById(requestId: string): Promise<ErrorEvent | null> {
const row = await db.query.errorEvents.findFirst({
where: eq(errorEvents.requestId, requestId),
});
return row ?? null;
}
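
To make the redaction step in `sanitizeBody` concrete, here is a condensed sketch of the same recursive walk, run on a nested payload. It is an illustration under assumptions: `DEMO_SENSITIVE_KEYS` and `redact` are hypothetical names, the key set is shortened, and the length cap is left out.

```typescript
// Shortened, hypothetical key set; the service above lists more keys.
const DEMO_SENSITIVE_KEYS = new Set(['password', 'token']);

// Recursively copy a JSON-like value, replacing the values of sensitive keys.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = DEMO_SENSITIVE_KEYS.has(k) ? '[REDACTED]' : redact(v);
    }
    return out;
  }
  return value;
}

const redacted = JSON.stringify(
  redact({ email: 'rep@example.com', password: 'hunter2', items: [{ token: 'abc', qty: 2 }] }),
);
console.log(redacted);
// prints {"email":"rep@example.com","password":"[REDACTED]","items":[{"token":"[REDACTED]","qty":2}]}
```

Because the lookup is an exact `Set.has`, key matching is case-sensitive: a body field named `Password` or `Authorization` would pass through this sketch unredacted. Whether that matters depends on the payload shapes the API accepts.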

@@ -7,7 +7,7 @@ import { buildListQuery } from '@/lib/db/query-builder';
import { createAuditLog, type AuditMeta } from '@/lib/audit';
import { diffEntity } from '@/lib/entity-diff';
import { softDelete, restore } from '@/lib/db/utils';
import { NotFoundError, ConflictError } from '@/lib/errors';
import { CodedError, NotFoundError } from '@/lib/errors';
import { emitToRoom } from '@/lib/socket/server';
import { convert } from '@/lib/services/currency';
import { logger } from '@/lib/logger';
@@ -213,9 +213,7 @@ export async function updateExpense(
: existing.noReceiptAcknowledged;
const hasReceipts = Array.isArray(mergedReceiptIds) && mergedReceiptIds.length > 0;
if (!hasReceipts && !mergedAck) {
throw new ConflictError(
'Expense must either link a receipt file or acknowledge the no-receipt warning.',
);
throw new CodedError('EXPENSES_RECEIPT_REQUIRED');
}
const updateData: Record<string, unknown> = { ...data, updatedAt: new Date() };
@@ -292,7 +290,7 @@ export async function archiveExpense(id: string, portId: string, meta: AuditMeta
.limit(1);
if (linkedInvoice.length > 0) {
throw new ConflictError('Cannot archive expense linked to a non-draft invoice');
throw new CodedError('EXPENSES_INVOICE_LINKED');
}
await softDelete(expenses, expenses.id, id);

@@ -21,7 +21,7 @@ import { and, desc, eq, inArray } from 'drizzle-orm';
import { db } from '@/lib/db';
import { interestBerths, interests, type InterestBerth } from '@/lib/db/schema/interests';
import { berths } from '@/lib/db/schema/berths';
import { ValidationError } from '@/lib/errors';
import { CodedError } from '@/lib/errors';
type DbOrTx = typeof db | Parameters<Parameters<typeof db.transaction>[0]>[0];
@@ -215,7 +215,9 @@ export async function upsertInterestBerthTx(
.limit(1);
const side = sides[0];
if (side && side.interestPortId !== side.berthPortId) {
throw new ValidationError('Cannot link an interest and a berth from different ports.');
throw new CodedError('CROSS_PORT_LINK_REJECTED', {
internalMessage: `interest ${interestId} (port ${side.interestPortId}) ↔ berth ${berthId} (port ${side.berthPortId})`,
});
}
if (opts.isPrimary === true) {

@@ -1,5 +1,5 @@
import { db } from '@/lib/db';
import { auditLogs } from '@/lib/db/schema';
import { auditLogs, errorEvents } from '@/lib/db/schema';
import { redis } from '@/lib/redis';
import { minioClient } from '@/lib/minio/index';
import { getQueue, QUEUE_CONFIGS, type QueueName } from '@/lib/queue';
@@ -56,10 +56,17 @@ export interface ConnectionStatus {
export interface RecentError {
id: string;
source: 'audit' | 'queue';
source: 'audit' | 'queue' | 'request';
message: string;
timestamp: Date;
metadata?: Record<string, unknown>;
/** Set for `source: 'request'` rows so the UI can deep-link to
* /admin/errors/<requestId>. */
requestId?: string;
/** Set for `source: 'request'` rows. */
statusCode?: number;
/** Set for `source: 'request'` rows. */
errorCode?: string | null;
}
// ─── Timeout helper ───────────────────────────────────────────────────────────
@@ -364,8 +371,42 @@ export async function getRecentErrors(limit = 20): Promise<RecentError[]> {
.filter((r): r is PromiseFulfilledResult<RecentError[]> => r.status === 'fulfilled')
.flatMap((r) => r.value);
// Captured 5xx requests from the per-request error_events table —
// this is the deepest source: full stack head + body excerpt + path.
// The dedicated /admin/errors page paginates this; here we surface
// the most recent for the dashboard.
const requestErrorRows = await db
.select({
requestId: errorEvents.requestId,
statusCode: errorEvents.statusCode,
method: errorEvents.method,
path: errorEvents.path,
errorName: errorEvents.errorName,
errorMessage: errorEvents.errorMessage,
metadata: errorEvents.metadata,
createdAt: errorEvents.createdAt,
})
.from(errorEvents)
.orderBy(desc(errorEvents.createdAt))
.limit(limit);
const requestErrors: RecentError[] = requestErrorRows.map((row) => {
const meta = (row.metadata as Record<string, unknown>) ?? {};
return {
id: row.requestId,
source: 'request' as const,
message:
`${row.method} ${row.path} ${row.statusCode} ${row.errorMessage ?? row.errorName ?? ''}`.trim(),
timestamp: row.createdAt,
metadata: meta,
requestId: row.requestId,
statusCode: row.statusCode,
errorCode: typeof meta.code === 'string' ? meta.code : null,
};
});
// Merge and sort combined list by timestamp descending
const combined = [...auditResults, ...queueErrors].sort(
const combined = [...auditResults, ...queueErrors, ...requestErrors].sort(
(a, b) => b.timestamp.getTime() - a.timestamp.getTime(),
);

@@ -101,7 +101,9 @@ describe('uploadBerthPdf', () => {
fileName: 'spoof.pdf',
uploadedBy: 'test-user',
}),
).rejects.toThrow(/magic-byte/);
// Plain-text user message replaced "magic-byte" wording; assert the
// stable error code instead so this test survives copy edits.
).rejects.toMatchObject({ code: 'BERTHS_PDF_MAGIC_BYTE' });
});
it('increments versionNumber on the second upload', async () => {
@@ -296,9 +298,11 @@ describe('rollbackToVersion', () => {
fileName: 'v1.pdf',
uploadedBy: 'test',
});
await expect(rollbackToVersion(berth.id, v1.versionId, port.id)).rejects.toThrow(
/already current/,
);
// Plain-text user message replaced "already current" wording; assert
// the stable error code instead.
await expect(rollbackToVersion(berth.id, v1.versionId, port.id)).rejects.toMatchObject({
code: 'BERTHS_VERSION_ALREADY_CURRENT',
});
});
});
@@ -321,11 +325,11 @@ describe('cross-port tenant guard', () => {
// Port B caller passing port A's berth id must hit NotFoundError on
// every entrypoint — including read-only listing, which previously
// returned 15-min presigned download URLs to the foreign port's PDFs.
await expect(listBerthPdfVersions(berthA.id, portB.id)).rejects.toThrow(/Berth/);
await expect(rollbackToVersion(berthA.id, v1.versionId, portB.id)).rejects.toThrow(/Berth/);
await expect(listBerthPdfVersions(berthA.id, portB.id)).rejects.toThrow(/berth/i);
await expect(rollbackToVersion(berthA.id, v1.versionId, portB.id)).rejects.toThrow(/berth/i);
await expect(
applyParseResults(berthA.id, v1.versionId, { lengthFt: 99 }, portB.id),
).rejects.toThrow(/Berth/);
).rejects.toThrow(/berth/i);
await expect(
uploadBerthPdf({
berthId: berthA.id,
@@ -334,13 +338,13 @@ describe('cross-port tenant guard', () => {
fileName: 'B-cross.pdf',
uploadedBy: 'test',
}),
).rejects.toThrow(/Berth/);
).rejects.toThrow(/berth/i);
await expect(
reconcilePdfWithBerth(
berthA.id,
{ engine: 'ocr', fields: {}, meanConfidence: 1, warnings: [] },
portB.id,
),
).rejects.toThrow(/Berth/);
).rejects.toThrow(/berth/i);
});
});
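
The tests above now assert on `code` rather than on message text. The sketch below shows why that is stable, under loud assumptions: `CodedError` and the registry in `src/lib/error-codes.ts` are not part of this excerpt, so `DemoCodedError`, `DEMO_ERROR_CODES`, `toClientBody`, and the status values and message strings are all hypothetical placeholders.

```typescript
// Hypothetical registry: statuses and wording are placeholders, not the real entries.
const DEMO_ERROR_CODES = {
  BERTHS_PDF_EMPTY: { status: 400, userMessage: 'That PDF appears to be empty.' },
  BERTHS_PDF_TOO_LARGE: { status: 413, userMessage: 'That PDF is too large to upload.' },
} as const;

type DemoCode = keyof typeof DEMO_ERROR_CODES;

// Hypothetical stand-in for CodedError: stable code, user message, admin-only detail.
class DemoCodedError extends Error {
  readonly code: DemoCode;
  readonly status: number;
  readonly internalMessage: string | undefined;

  constructor(code: DemoCode, opts: { internalMessage?: string } = {}) {
    super(DEMO_ERROR_CODES[code].userMessage);
    this.name = 'DemoCodedError';
    this.code = code;
    this.status = DEMO_ERROR_CODES[code].status;
    this.internalMessage = opts.internalMessage;
  }
}

// Only the user-facing message and the code are copied into the client body.
function toClientBody(err: DemoCodedError): { error: string; code: DemoCode } {
  return { error: err.message, code: err.code };
}

const err = new DemoCodedError('BERTHS_PDF_TOO_LARGE', {
  internalMessage: 'PDF exceeds 10 MB upload cap (got 12.3 MB).',
});
console.log(toClientBody(err).code); // prints "BERTHS_PDF_TOO_LARGE"
```

A test that matches on `code` keeps passing when the `userMessage` copy is edited, which is the property the updated assertions rely on. In this sketch `internalMessage` stays on the error object and is never copied into the client body.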

@@ -7,7 +7,7 @@
* Rule from SECURITY-GUIDELINES.md:
* "Error responses must NEVER contain stack traces, SQL queries, or internal paths"
*/
import { beforeAll, describe, expect, it, vi } from 'vitest';
import { describe, expect, it, vi } from 'vitest';
// ── Mock next/server before importing the module under test ──────────────────
// NextResponse is a Next.js runtime class unavailable in a plain Node environment.
@@ -68,12 +68,13 @@ describe('Error response security — AppError subclasses', () => {
expect(JSON.stringify(body)).not.toContain('node_modules');
});
it('NotFoundError returns 404 with generic message, not entity internals', async () => {
it('NotFoundError returns 404 with plain-text message, not entity internals', async () => {
const error = new NotFoundError('Client');
const response = errorResponse(error);
expect(response.status).toBe(404);
const body = await response.json();
expect(body.error).toBe('Client not found');
// Message is now plain-text user-facing (no jargon, lowercased entity).
expect(body.error).toBe("We couldn't find that client. It may have been removed.");
expect(body.code).toBe('NOT_FOUND');
expect(JSON.stringify(body)).not.toContain('stack');
});
@@ -111,9 +112,7 @@ describe('Error response security — AppError subclasses', () => {
describe('Error response security — unknown / native errors', () => {
it('native Error with SQL content returns generic 500', async () => {
const error = new Error(
"SELECT * FROM users WHERE id = 1; DROP TABLE users;--",
);
const error = new Error('SELECT * FROM users WHERE id = 1; DROP TABLE users;--');
const response = errorResponse(error);
expect(response.status).toBe(500);
const body = await response.json();
@@ -135,9 +134,7 @@ describe('Error response security — unknown / native errors', () => {
});
it('native Error with node_modules path returns generic 500 without path', async () => {
const error = new Error(
'ENOENT: no such file at /app/node_modules/pg/lib/connection.js',
);
const error = new Error('ENOENT: no such file at /app/node_modules/pg/lib/connection.js');
const response = errorResponse(error);
expect(response.status).toBe(500);
const body = await response.json();
@@ -233,10 +230,17 @@ describe('Error response security — response shape invariants', () => {
}
});
it('500 response body has exactly the error key and nothing else', async () => {
it('500 response body carries error + code (and requestId when in-flight)', async () => {
const response = errorResponse(new Error('db connection refused'));
const body = await response.json();
expect(Object.keys(body)).toEqual(['error']);
// Allowed keys for a 500 response. `code` is always present; `requestId`
// and `message` only appear when an active request context is in scope.
const allowed = new Set(['error', 'code', 'requestId', 'message']);
for (const key of Object.keys(body)) {
expect(allowed.has(key)).toBe(true);
}
expect(body.error).toBe('Internal server error');
expect(body.code).toBe('INTERNAL');
expect(body).not.toHaveProperty('stack');
});
});