docs(spec): env-to-admin migration design

Design spec for moving tenant-configurable env vars into the per-port
admin UI via a settings registry. Covers scope decisions, registry
shape, resolver, encryption, admin UI generation, env catalog by
disposition, migration plan, and testing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:39 +02:00
parent d15f5509ad
commit 397dbd1490


@@ -0,0 +1,391 @@
# Env-to-Admin Migration — Design Spec
**Date:** 2026-05-15
**Status:** Draft (awaiting user review)
**Author:** Brainstorm session, Matt + Claude
## Goal
Move every tenant-configurable environment variable into the per-port admin UI, leaving env exclusively for boot-time / build-time / chicken-and-egg secrets. Eliminate the silent drift that produced two of the audit's findings (S-23 plaintext S3 access key; Documenso API key stored plaintext per its own admin form description).
## Non-goals
- **Not** moving boot-time secrets (DATABASE_URL, BETTER_AUTH_SECRET, etc.) — they're needed before the DB is reachable.
- **Not** building a Google OAuth admin form — feature is not in use.
- **Not** changing the existing per-port `system_settings` storage table — only adding columns / rows.
- **Not** silently mutating `.env` files at runtime (rejected as too footgun-y).
## Scope decisions (from brainstorming)
| Decision | Choice |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| Which env vars move | Anything tenant-configurable (option 2). Boot-time + build-time stay in env. |
| Env-fallback policy | Env stays as runtime fallback when admin field is blank. Vars are commented out in `.env.example`, with dev + prod templates committed to repo. |
| Per-port vs global | Per-port with global fallback (`port_id IS NULL`) for credentials and shared infrastructure. Resolution: port → global → env → registry default. |
| Encryption | All credential-class fields AES-256-GCM via `EMAIL_CREDENTIAL_KEY`. Fixes S-23 + Documenso plaintext as part of this migration. |
| Migration UX | "Using env fallback" badge per field + "Copy current value from env" one-click button. Operator-driven; nothing happens automatically at boot. |
| Implementation | Settings registry + uniform resolver (approach A). |
## Architecture
The current code has 4 places that "know" about each setting:
1. Env validation schema (`src/lib/env.ts`)
2. Per-domain resolver (`src/lib/services/port-config.ts` for Documenso/email; ad-hoc reads for others)
3. Admin form definition (`SettingFieldDef[]` in each `admin/<integration>/page.tsx`)
4. Encryption call site (per service)
These four sites drift independently, which is how the plaintext-credential findings above arose. Replace them with **one registry entry per setting**. The registry is consumed by:
- **Resolver** (`getSetting(key, portId)`): port → global → env → default; decrypts on read if `encrypted: true`.
- **Admin form generator**: renders inputs from `type` + `label` + `description`; auto-attaches the "Using env fallback" badge + "Copy from env" button. Encryption is transparent to the form: for sensitive fields the admin API returns `isSet: true`, never the cleartext (the resolver itself still returns decrypted values to server-side callers).
- **Validator**: Zod schema attached to each entry, used by both the admin write endpoint AND env validation at boot.
- **Encryption helper**: registry says `encrypted: true` → resolver wraps in `encrypt()`/`decrypt()`.
Existing per-port settings table (`system_settings`) stays — no schema migration beyond adding `_encrypted` suffix to a few previously-plaintext columns and one new column for webhook secret.
```
┌─────────────────────────────────────────────────────────┐
│ src/lib/settings/ │
│ ┌──────────────────────┐ ┌─────────────────────┐ │
│ │ registry.ts │ │ resolver.ts │ │
│ │ - one entry per key │───▶│ getSetting(k, port) │ │
│ │ - type, encrypted, │ │ writeSetting(k, v) │ │
│ │ scope, validator │ │ envFallbackFor(k) │ │
│ └──────────────────────┘ └──────────┬──────────┘ │
│ │ │
│ ┌──────────────────────┐ ┌──────────▼──────────┐ │
│ │ encryption.ts │◀───│ system_settings │ │
│ │ AES-256-GCM │ │ (existing table) │ │
│ └──────────────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────┘
┌──────────────────────┴──────────────────────────────────┐
│ RegistryDrivenForm (React component) │
│ Input: { sections: ['documenso.api', ...] } │
│ Output: <Form> with badges + Copy-from-env buttons │
└─────────────────────────────────────────────────────────┘
```
## Registry shape
```ts
// src/lib/settings/registry.ts
import { z } from 'zod';

export interface SettingEntry {
  /** Stable key written to system_settings.key */
  key: string;
  /** Human-readable section the admin form groups by */
  section: string;
  /** UI label */
  label: string;
  /** UI description (markdown allowed); optional, the form skips it when absent */
  description?: string;
  /** Type drives both validation and form input */
  type: 'string' | 'password' | 'number' | 'boolean' | 'select' | 'url' | 'email';
  /** select-only */
  options?: Array<{ value: string; label: string }>;
  /** Zod schema; overrides the type-default validator if provided */
  validator?: z.ZodTypeAny;
  /** Defaults applied when port + global + env all absent */
  defaultValue?: string | number | boolean | null;
  /** Encrypt at rest with AES-256-GCM */
  encrypted?: boolean;
  /** Per-port (default) or global-only (super-admin) */
  scope: 'port' | 'global';
  /** Env var name to consult as fallback when port + global blank */
  envFallback?: string;
  /** Optional value transformer applied after resolution */
  transform?: (raw: unknown) => unknown;
  /** Sensitive: never surface cleartext via admin API; emit `<key>IsSet: boolean` instead */
  sensitive?: boolean;
}

export const REGISTRY: SettingEntry[] = [
  // Documenso
  {
    key: 'documenso_api_url',
    section: 'documenso.api',
    label: 'API URL',
    type: 'url',
    scope: 'port',
    envFallback: 'DOCUMENSO_API_URL',
    description: 'Bare host only; never include /api/v1.',
  },
  {
    key: 'documenso_api_key',
    section: 'documenso.api',
    label: 'API key',
    type: 'password',
    scope: 'port',
    encrypted: true,
    sensitive: true,
    envFallback: 'DOCUMENSO_API_KEY',
    description: 'AES-encrypted at rest.',
  },
  {
    key: 'documenso_api_version',
    section: 'documenso.api',
    label: 'API version',
    type: 'select',
    options: [
      { value: 'v1', label: 'v1' },
      { value: 'v2', label: 'v2' },
    ],
    scope: 'port',
    envFallback: 'DOCUMENSO_API_VERSION',
    defaultValue: 'v1',
  },
  {
    key: 'documenso_webhook_secret',
    section: 'documenso.api',
    label: 'Webhook secret',
    type: 'password',
    scope: 'port',
    encrypted: true,
    sensitive: true,
    envFallback: 'DOCUMENSO_WEBHOOK_SECRET',
    description: 'Used to verify inbound webhook deliveries via X-Documenso-Secret header.',
  },
  // ... continued for every migrated key
];
```
Resolver:
```ts
// src/lib/settings/resolver.ts
import { and, eq, isNull } from 'drizzle-orm';
// `db`, `systemSettings`, `registryFor`, and `decryptIf` are project-local; their imports are omitted here.

export async function getSetting<T = unknown>(
  key: string,
  portId: string | null,
): Promise<T | null> {
  const entry = registryFor(key);
  if (!entry) throw new Error(`Unknown setting: ${key}`);

  // 1. port-specific
  if (portId && entry.scope === 'port') {
    const row = await db.query.systemSettings.findFirst({
      where: and(eq(systemSettings.key, key), eq(systemSettings.portId, portId)),
    });
    if (row?.value != null) return decryptIf(entry, row.value) as T;
  }

  // 2. global (port_id IS NULL)
  const globalRow = await db.query.systemSettings.findFirst({
    where: and(eq(systemSettings.key, key), isNull(systemSettings.portId)),
  });
  if (globalRow?.value != null) return decryptIf(entry, globalRow.value) as T;

  // 3. env fallback; an empty env var counts as unset
  const envValue = entry.envFallback ? process.env[entry.envFallback] : undefined;
  if (envValue) {
    // Apply the optional transformer exactly once, then cast to the caller's type.
    return (entry.transform ? entry.transform(envValue) : envValue) as T;
  }

  // 4. registry default
  return (entry.defaultValue ?? null) as T;
}
```
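The resolver calls `registryFor`, which this spec does not define. One possible shape, sketched under the assumption that the registry is a flat array with unique keys, is a `Map` built once at module load. The names `buildRegistryIndex`, `Entry`, and `SAMPLE_REGISTRY` are hypothetical, and `Entry` is a trimmed stand-in for `SettingEntry`.

```typescript
// Hypothetical sketch: index registry entries by key for constant-time lookup.
// `Entry` is a trimmed stand-in for SettingEntry; only `key` matters here.
interface Entry {
  key: string;
  scope: 'port' | 'global';
}

function buildRegistryIndex(entries: readonly Entry[]): Map<string, Entry> {
  const index = new Map<string, Entry>();
  for (const entry of entries) {
    // Duplicate keys would make resolution ambiguous, so fail at module load.
    if (index.has(entry.key)) {
      throw new Error(`Duplicate setting key: ${entry.key}`);
    }
    index.set(entry.key, entry);
  }
  return index;
}

const SAMPLE_REGISTRY: Entry[] = [
  { key: 'documenso_api_url', scope: 'port' },
  { key: 'documenso_api_key', scope: 'port' },
];
const INDEX = buildRegistryIndex(SAMPLE_REGISTRY);

// Returns undefined for unknown keys; the resolver turns that into a throw.
function registryFor(key: string): Entry | undefined {
  return INDEX.get(key);
}
```

Failing on duplicate keys at load time keeps a copy-paste mistake in the registry from silently shadowing another setting.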
The existing `getPortDocumensoConfig` etc. become thin convenience wrappers that batch a few `getSetting` calls and return a typed object:
```ts
export async function getPortDocumensoConfig(portId: string) {
  const [apiUrl, apiKey, apiVersion, webhookSecret, ...rest] = await Promise.all([
    getSetting<string>('documenso_api_url', portId),
    getSetting<string>('documenso_api_key', portId),
    getSetting<DocumensoApiVersion>('documenso_api_version', portId),
    getSetting<string>('documenso_webhook_secret', portId),
    // ...
  ]);
  return { apiUrl, apiKey, apiVersion, webhookSecret, ...mapRest(rest) };
}
```
## Admin UI generation
```tsx
// src/components/admin/registry-driven-form.tsx
interface Props {
  sections: string[]; // e.g. ['documenso.api', 'documenso.signers']
  portId: string | null; // null = global tab
}

export function RegistryDrivenForm({ sections, portId }: Props) {
  const entries = REGISTRY.filter((e) => sections.includes(e.section));
  // Default to an empty map so the first render (before data arrives) does not throw.
  const { data: resolved = {} } = useResolvedValues(entries, portId);
  return entries.map((entry) => (
    <FormField key={entry.key}>
      <Label>{entry.label}</Label>
      {entry.description && <p className="text-xs text-muted-foreground">{entry.description}</p>}
      <Input
        type={entry.type === 'password' ? 'password' : entry.type}
        value={
          entry.sensitive
            ? resolved[entry.key]?.isSet
              ? '••••••••'
              : ''
            : (resolved[entry.key]?.value ?? '')
        }
      />
      {/* Badge and one-click copy appear only while the value comes from env */}
      {resolved[entry.key]?.source === 'env' && (
        <div className="flex gap-2">
          <Badge>Using env fallback</Badge>
          <Button onClick={() => copyFromEnv(entry.key, portId)}>Copy from env</Button>
        </div>
      )}
    </FormField>
  ));
}
```
The existing per-integration admin pages become thin wrappers:
```tsx
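// admin/documenso/page.tsx (replaces the current 410-line file)
export default function DocumensoAdmin() {
  // NOTE: RegistryDrivenForm also requires `portId` (the active port's id, or
  // null for the global tab); how the page obtains it is not shown here.
  return (
    <>
      <PageHeader title="Documenso" />
      <RegistryDrivenForm
        sections={['documenso.api', 'documenso.signers', 'documenso.templates']}
      />
      <DocumensoTestButton />
    </>
  );
}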
```
## API endpoints
Four endpoints replace the current ad-hoc per-section endpoints:
| Method | Path | Purpose |
| ------ | -------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GET | `/api/v1/admin/settings/resolved?sections=documenso.api,documenso.signers` | Returns `{ key, value, source: 'port' \| 'global' \| 'env' \| 'default', isSet }` per requested entry. Sensitive fields never include cleartext. |
| PUT | `/api/v1/admin/settings/:key` | Body `{ value }`. Validates against registry's Zod schema. Encrypts if `encrypted: true`. Writes to `system_settings`. Audit-logged with `action: 'update'`, `entityType: 'setting'`, `metadata: { key }`, secrets masked. |
| DELETE | `/api/v1/admin/settings/:key` | Removes the row → reverts to global → env → default. |
| POST | `/api/v1/admin/settings/:key/copy-from-env` | One-click migration. Reads env var named in `entry.envFallback`, writes to `system_settings`, returns the resulting resolved state. |
Existing `PUT /api/v1/admin/settings` (the generic upsert) stays for backward compat with the few non-registry writers; new fields use the typed endpoint.
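To make the `resolved` response concrete, here is a minimal sketch of how one response element could be derived. It assumes the caller has already loaded (and, where needed, decrypted) the port row, global row, env value, and registry default. `toResolvedEntry`, `Candidates`, and `ResolvedEntry` are hypothetical names, not existing code.

```typescript
// Hypothetical sketch of one element of the `resolved` response.
type Source = 'port' | 'global' | 'env' | 'default';

interface ResolvedEntry {
  key: string;
  value: unknown; // always null for sensitive entries
  source: Source;
  isSet: boolean;
}

interface Candidates {
  port?: unknown;
  global?: unknown;
  env?: string;
  defaultValue?: unknown;
}

function toResolvedEntry(key: string, sensitive: boolean, c: Candidates): ResolvedEntry {
  // Same precedence as the resolver: port, then global, then env, then default.
  let source: Source = 'default';
  let value: unknown = c.defaultValue ?? null;
  if (c.port != null) {
    source = 'port';
    value = c.port;
  } else if (c.global != null) {
    source = 'global';
    value = c.global;
  } else if (c.env) {
    // An empty env string counts as unset, matching the resolver.
    source = 'env';
    value = c.env;
  }
  const isSet = value != null && value !== '';
  // Sensitive entries report only whether a value exists, never the value.
  return { key, value: sensitive ? null : value, source, isSet };
}
```

The form can then drive the "Using env fallback" badge from `source === 'env'` and the `••••••••` placeholder from `isSet`, without ever receiving a secret.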
## Encryption integration
- Reuse existing `encrypt()` / `decrypt()` from `src/lib/utils/encryption.ts` (AES-256-GCM, random IV per encryption, GCM auth tag).
- Resolver auto-wraps encrypt on write when `entry.encrypted === true`, decrypt on read.
- `system_settings.value` is `JSONB`. For encrypted values, store as `{ ciphertext, iv, tag }` (already the convention in `sales-email-config.service.ts`).
- Sensitive fields surface `<key>IsSet: boolean` in the API response, never the decrypted value. The admin form shows `••••••••` placeholder.
- Audit log integration: when writing to a key with `encrypted: true`, the `newValue` is replaced with `{ value: '[redacted]' }` before audit-log write — fixes audit finding **AU-02** (encrypted ciphertext in audit log) as part of this work.
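For readers unfamiliar with the storage shape, the following sketch shows an AES-256-GCM round trip that produces `{ ciphertext, iv, tag }`, plus the audit redaction rule. It is not the project's `encrypt()` / `decrypt()`: the function names, the base64 encoding, and the 12-byte IV are assumptions made for illustration, and the real helpers in `src/lib/utils/encryption.ts` may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical stand-ins for the project's encrypt()/decrypt().
interface EncryptedValue {
  ciphertext: string; // base64
  iv: string; // base64, fresh random bytes per encryption
  tag: string; // base64 GCM auth tag
}

function encryptValue(plaintext: string, key: Buffer): EncryptedValue {
  const iv = randomBytes(12); // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return {
    ciphertext: ciphertext.toString('base64'),
    iv: iv.toString('base64'),
    tag: cipher.getAuthTag().toString('base64'),
  };
}

function decryptValue(stored: EncryptedValue, key: Buffer): string {
  const decipher = createDecipheriv('aes-256-gcm', key, Buffer.from(stored.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(stored.tag, 'base64'));
  // final() throws if the ciphertext or tag does not authenticate.
  return Buffer.concat([
    decipher.update(Buffer.from(stored.ciphertext, 'base64')),
    decipher.final(),
  ]).toString('utf8');
}

// Audit payload for a write: encrypted keys log neither cleartext nor ciphertext.
function auditNewValue(encrypted: boolean, value: unknown): { value: unknown } {
  return encrypted ? { value: '[redacted]' } : { value };
}
```

Because GCM authenticates the ciphertext, a tampered row fails loudly in `decryptValue`, which is the condition the resolver would audit-log as `decryption_failed`.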
## Env catalog
Every env var, classified:
### A. Stays in env (boot-time / build-time / chicken-and-egg)
| Var | Reason |
| --------------------------- | ----------------------------------------------------------------------------------------------- |
| `DATABASE_URL` | Need DB connection before reading from DB |
| `REDIS_URL` | Same — Redis pre-init |
| `BETTER_AUTH_SECRET` | Cookie/session signing key, read at auth init |
| `BETTER_AUTH_URL` | Auth callback base URL, read at auth init |
| `CSRF_SECRET` | CSRF token signing, read pre-DB |
| `EMAIL_CREDENTIAL_KEY` | The AES key used to encrypt other DB-stored credentials (chicken-and-egg) |
| `NODE_ENV` | Read pre-init by Next.js, logger, etc. |
| `LOG_LEVEL` | Read at logger init pre-DB |
| `PORT` | Listen port, read at server start |
| `NEXT_PUBLIC_APP_URL` | Inlined into client JS bundle at build time |
| `NEXT_PUBLIC_SENTRY_DSN` | Same — client-side Sentry init |
| `MULTI_NODE_DEPLOYMENT` | Used at boot to gate filesystem backend |
| `SKIP_ENV_VALIDATION` | Internal bypass flag |
| `WEBSITE_INTAKE_SECRET` | Boot-time shared secret with marketing site (could go DB but operator-shared, not user-tunable) |
| `EMAIL_REDIRECT_TO` | Dev-only safety net; operator convenience |
| `SENTRY_ENVIRONMENT` | Read at Sentry SDK init pre-DB |
| `SENTRY_TRACES_SAMPLE_RATE` | Same |
### B. Migrates to admin (per-port, encrypted where credential)
| Var | Registry key | Encrypted | Already in admin? |
| ---------------------------------- | ---------------------------------- | ----------------------------- | ----------------------------- |
| `DOCUMENSO_API_URL` | `documenso_api_url` | no | yes (override) |
| `DOCUMENSO_API_KEY` | `documenso_api_key` | **yes** (was plaintext) | yes (override, plaintext bug) |
| `DOCUMENSO_API_VERSION` | `documenso_api_version` | no | yes |
| `DOCUMENSO_WEBHOOK_SECRET` | `documenso_webhook_secret` | **yes** | **no — gap** |
| `DOCUMENSO_TEMPLATE_ID_EOI` | `documenso_eoi_template_id` | no | yes |
| `DOCUMENSO_CLIENT_RECIPIENT_ID` | `documenso_client_recipient_id` | no | yes |
| `DOCUMENSO_DEVELOPER_RECIPIENT_ID` | `documenso_developer_recipient_id` | no | yes |
| `DOCUMENSO_APPROVAL_RECIPIENT_ID` | `documenso_approval_recipient_id` | no | yes |
| `MINIO_ENDPOINT` | `storage_s3_endpoint` | no | yes (storage admin) |
| `MINIO_PORT` | (combined into endpoint URL) | — | yes |
| `MINIO_ACCESS_KEY` | `storage_s3_access_key` | **yes** (was plaintext, S-23) | yes (plaintext bug) |
| `MINIO_SECRET_KEY` | `storage_s3_secret_key` | yes (already) | yes |
| `MINIO_BUCKET` | `storage_s3_bucket` | no | yes |
| `MINIO_USE_SSL` | (combined into endpoint URL) | — | yes |
| `MINIO_AUTO_CREATE_BUCKET` | `storage_s3_auto_create_bucket` | no | new |
| `SMTP_HOST` | `smtp_host_override` | no | yes |
| `SMTP_PORT` | `smtp_port_override` | no | yes |
| `SMTP_USER` | `smtp_user_override` | no | yes |
| `SMTP_PASS` | `smtp_pass_override` | yes (already) | yes |
| `SMTP_FROM` | `email_from_address` | no | yes |
| `OPENAI_API_KEY` | `openai_api_key` | yes (already) | yes |
| `APP_URL` | `app_url` | no | **new** |
| `PUBLIC_SITE_URL` | `public_site_url` | no | **new** |
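`MINIO_PORT` and `MINIO_USE_SSL` are folded into `storage_s3_endpoint`, so the env fallback for that key has to combine three variables rather than read one. A minimal sketch of that combination follows; the helper name is hypothetical, and it assumes `MINIO_USE_SSL` is the literal string `'true'` when enabled, which this spec does not state.

```typescript
// Hypothetical helper: collapse the three legacy MinIO env vars into one URL.
// Assumes MINIO_USE_SSL is the string 'true' when TLS is enabled.
function buildS3EndpointFromEnv(env: Record<string, string | undefined>): string | null {
  const host = env['MINIO_ENDPOINT'];
  if (!host) return null; // nothing to fall back to
  const scheme = env['MINIO_USE_SSL'] === 'true' ? 'https' : 'http';
  const rawPort = env['MINIO_PORT'];
  const port = rawPort ? `:${rawPort}` : '';
  return `${scheme}://${host}${port}`;
}
```

For example, `buildS3EndpointFromEnv(process.env)` could back the env step of the resolver for `storage_s3_endpoint`, or be wired in through the entry's `transform`.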
### C. Skipped (YAGNI)
| Var | Reason |
| ------------------------------------------ | --------------------------------- |
| `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` | OAuth not used and not on roadmap |
## Migration of existing code
1. **Replace `getPortDocumensoConfig` body** to call the new `getSetting` per field (see the wrapper example under Registry shape).
2. **Replace `getSalesEmailConfig` body** the same way.
3. **Replace direct `process.env.X` reads** in: `receipt-scanner.ts:4` (OpenAI client), `documents.service.ts` (any direct env reads), `webhook-event-map.ts` (webhook URL builder), all `src/lib/storage/` backend reads.
4. **Migrate the 5 admin pages** (Documenso, AI, OCR, Email, Storage) to use `RegistryDrivenForm`. Keep page-specific extras (test buttons, status cards, AI budget card, sends log).
5. **Add migrations:**
- One-time data migration: copy any plaintext `documenso_api_key_override` and `storage_s3_access_key` rows into encrypted columns, drop plaintext columns. Reuse `encrypt()`.
- Schema: add `documenso_webhook_secret` row on first registry-resolver init, and any new keys (`app_url`, `public_site_url`).
6. **Update `.env.example`:** comment out everything in category B, add an explanation header pointing operators to `/admin/<integration>` after first super-admin login. Generate `dev.env.example` and `prod.env.example` templates with category-A vars only (the boot-time minimum).
7. **Update `src/lib/env.ts`:** mark all category-B vars as `optional()` (env is fallback, not required for boot). Category-A stays required.
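The one-time data migration in step 5 is safer if it can be re-run without double-encrypting. The sketch below shows one way to decide, per row, whether a value still needs encrypting. `migrateRowValue` and the type names are hypothetical, the `encrypt` callback stands in for the project's helper, and the shape check assumes the `{ ciphertext, iv, tag }` convention described earlier.

```typescript
// Hypothetical shapes; the stored layout follows the { ciphertext, iv, tag } convention.
interface EncryptedValue {
  ciphertext: string;
  iv: string;
  tag: string;
}
type StoredValue = string | EncryptedValue | null;

function isEncrypted(value: StoredValue): value is EncryptedValue {
  return (
    typeof value === 'object' &&
    value !== null &&
    'ciphertext' in value &&
    'iv' in value &&
    'tag' in value
  );
}

// Returns the value to write back, or null when the row needs no change.
function migrateRowValue(
  value: StoredValue,
  encrypt: (plain: string) => EncryptedValue,
): EncryptedValue | null {
  if (value === null || value === '') return null; // nothing stored
  if (isEncrypted(value)) return null; // already migrated, so re-running is a no-op
  return encrypt(value);
}
```

Running the migration twice then touches each plaintext row once and leaves already-encrypted rows alone.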
## Error handling
- **Resolver:** unknown key → throws (programming error). Decryption failure → throws + audit-logged with `action: 'decryption_failed'`. Missing required value → returns `null`, caller decides (e.g. Documenso send fails with a clear error toast).
- **Admin write:** Zod validation failure → 400 with field-level errors via `parseBody`. Encryption failure → 500 + audit `action: 'encryption_failed'`. Permission check at route handler (`admin.manage_settings` or domain-specific permission).
- **Form:** "Copy from env" when env var is empty → toast "no env value to copy". Save with empty cleartext on a sensitive field → DELETE the row (reverts to env/default), don't write empty ciphertext.
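The two form rules above can be expressed as small pure functions that the write and copy-from-env handlers consult before touching the database. This is a sketch only: `planWrite`, `planCopyFromEnv`, and their return shapes are hypothetical. The behaviour for a blank value on a non-sensitive field is not specified here, so the sketch simply passes it through as an upsert.

```typescript
// Hypothetical planning helpers for the admin write path.
type WritePlan =
  | { op: 'delete' }
  | { op: 'upsert'; encrypt: boolean; value: unknown };

interface EntryFlags {
  encrypted?: boolean;
  sensitive?: boolean;
}

function planWrite(entry: EntryFlags, value: unknown): WritePlan {
  const blank = value == null || value === '';
  // Blank cleartext on a sensitive field deletes the row instead of
  // storing empty ciphertext, so resolution falls back to env/default.
  if (blank && entry.sensitive) return { op: 'delete' };
  return { op: 'upsert', encrypt: entry.encrypted === true, value };
}

type CopyPlan = { ok: true; value: string } | { ok: false; message: string };

function planCopyFromEnv(
  envFallback: string | undefined,
  env: Record<string, string | undefined>,
): CopyPlan {
  const value = envFallback ? env[envFallback] : undefined;
  // Missing or empty env var: surface the toast message rather than writing.
  if (!value) return { ok: false, message: 'no env value to copy' };
  return { ok: true, value };
}
```

Keeping these decisions free of I/O makes them straightforward to cover in the unit tests listed below.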
## Testing
Unit tests:
- `getSetting` — port → global → env → default precedence (per-port hits, global hits, env fallback, default fallback)
- `getSetting` — encrypted entry round-trips
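- Resolved-values response: a sensitive entry surfaces the `isSet` boolean only, never cleartext (`getSetting` itself still returns the decrypted value to server-side callers)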
- Registry validators reject malformed values
- Migration script: plaintext → encrypted round-trips correctly
Integration tests:
- `PUT /api/v1/admin/settings/:key` with valid + invalid payloads
- `POST /api/v1/admin/settings/:key/copy-from-env` with present + absent env
- Audit log row written with masked secret value
E2E (Playwright smoke):
- Super-admin opens `/admin/documenso`, sees "Using env fallback" badges on inherited fields, types a value, saves, badge disappears
- Click "Copy from env" → field auto-fills, badge changes to "Set in port"
- Per-port override actually applied: switch port → see different value resolved
## Rollout
Single PR, single migration. Backward compat via env-as-fallback means existing deployments keep working unchanged after deploy (admin DB rows are absent, so resolver falls through to env). Operator opts in to admin-canonical configuration field-by-field.
## Out of scope (separate work)
- Building admin form for OCR / berth-PDF parser tunables (feature settings, not env migration)
- Refactoring all _other_ per-port settings (vocabularies, qualification criteria, custom fields, etc.) into the registry — those already have working bespoke forms; no drift bug there.
- Adding settings versioning / rollback (not requested)
- Multi-tenant settings export/import (not requested)