feat(storage): pluggable s3-or-filesystem backend + migration CLI + admin UI

Phase 6a from docs/berth-recommender-and-pdf-plan.md §4.7a + §14.9a. Lays
the storage groundwork for Phase 6b/7 file-bearing schemas (per-berth PDFs,
brochures) without touching those domains yet.

New files:
- src/lib/storage/index.ts        StorageBackend interface + per-process
                                  factory keyed on system_settings.
- src/lib/storage/s3.ts           S3-compatible backend (MinIO/AWS/B2/R2/
                                  Wasabi/Tigris) wrapping the existing minio
                                  JS client. Includes a healthCheck() used
                                  by the admin "Test connection" button.
- src/lib/storage/filesystem.ts   Local filesystem backend with all §14.9a
                                  mitigations baked in.
- src/lib/storage/migrate.ts      Shared migration core — pg_advisory_lock,
                                  per-row resumable progress markers,
                                  sha256 round-trip verification, atomic
                                  storage_backend flip on success.
- scripts/migrate-storage.ts      Thin CLI shim around runMigration().
- src/app/api/storage/[token]/route.ts
                                  Filesystem proxy GET. Verifies HMAC,
                                  enforces single-use replay protection
                                  via Redis SET NX, streams via NextResponse
                                  ReadableStream with explicit Content-Type
                                  + Content-Disposition. Node runtime only
                                  (replay-guard sketch after this list).
- src/app/api/v1/admin/storage/route.ts
                                  GET status + POST connection test.
- src/app/api/v1/admin/storage/migrate/route.ts
                                  Super-admin-only POST that runs the
                                  exact same runMigration() as the CLI.
- src/app/(dashboard)/[portSlug]/admin/storage/page.tsx
                                  Super-admin admin UI (current backend,
                                  capacity stats, switch button with
                                  dry-run, test connection, backup hint).
- src/components/admin/storage-admin-panel.tsx
                                  Client component for the page above.
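
For the proxy route's replay guard, a minimal sketch of the intended Redis
check (assuming an ioredis-style client; `redis`, `tokenBody`, and the key
prefix are illustrative, not the route's actual wiring):

    const replayKey = `storage:proxy:seen:${tokenBody}`;
    // SET ... NX EX returns null when this token was already redeemed.
    const fresh = await redis.set(replayKey, '1', 'EX', 1800, 'NX');
    if (fresh === null) {
      return new Response('Token already used', { status: 403 });
    }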

§14.9a critical mitigations implemented:
- Path-traversal: storage keys validated against ^[a-zA-Z0-9/_.-]+$;
  `..`, `.`, `//`, leading `/`, and overlength keys rejected.
- Realpath: storage root realpath'd at create time, every per-key
  resolution checked against the realpath'd prefix.
- Storage root created (or chmod'd) to 0o700.
- Multi-node refusal: FilesystemBackend.create() throws when
  MULTI_NODE_DEPLOYMENT=true.
- HMAC token: HMAC-SHA256 over the (key, expiry, nonce, filename,
  content-type) payload, verified with timingSafeEqual; bad-signature,
  expired, and invalid-key payloads all return 403 (format sketch after
  this list).
- Single-use replay: token body cached in Redis SET NX EX 1800s.
- sha256 round-trip: copyAndVerify() re-fetches from the target after
  put() and aborts the migration on any mismatch.
- Free-disk pre-flight: when migrating to filesystem, sums byte counts
  via source.head() and aborts if free space < total * 1.2.
- pg_try_advisory_lock(0xc7000a01) prevents concurrent migrations.
- Resumable: per-row progress markers in _storage_migration_progress.
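
For reference, the proxy token/URL shape produced by signProxyToken() in
filesystem.ts (single-letter payload keys as defined there):

    /api/storage/<b64url(payload JSON)>.<b64url(HMAC-SHA256(body, secret))>
    payload = { k: key, e: expiryEpochSec, n: nonce, f?: filename, c?: contentType }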

system_settings keys read by the factory (jsonb, no schema change):
storage_backend, storage_s3_endpoint, storage_s3_region,
storage_s3_bucket, storage_s3_access_key,
storage_s3_secret_key_encrypted, storage_s3_force_path_style,
storage_filesystem_root, storage_proxy_hmac_secret_encrypted.

Defaults: storage_backend=`s3`, storage_filesystem_root=`./storage`
(./storage added to .gitignore).
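
Example CLI invocations (flags per parseArgs() in migrate.ts; the tsx runner
is an assumption):

    npx tsx scripts/migrate-storage.ts --from s3 --to filesystem --dry-run
    npx tsx scripts/migrate-storage.ts --from s3 --to filesystem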

Tests added (34 tests, all green):
- tests/unit/storage/filesystem-backend.test.ts — key validation
  allow/reject matrix, realpath escape, 0o700 perms, multi-node
  refusal, HMAC token sign/verify/tamper/expire/invalid-key.
- tests/unit/storage/copy-and-verify.test.ts — sha256 mismatch on
  round-trip aborts the migration.
- tests/integration/storage/proxy-route.test.ts — happy path, wrong
  HMAC secret, expired token, replay rejection.

Phase 6a ships zero file-bearing tables — TABLES_WITH_STORAGE_KEYS is
intentionally empty. berth_pdf_versions and brochure_versions land in
Phase 6b and join the list there. The only existing file-key column is
gdpr_export_jobs.storage_key, which already follows the new naming; no
rename needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

src/lib/storage/filesystem.ts
/**
* Local filesystem backend. Stores files at `${root}/<key>` on disk and serves
* downloads via a CRM-internal proxy route (`/api/storage/[token]`) that
* verifies an HMAC token before streaming the bytes. Used for single-VPS
* deployments where running MinIO is overkill.
*
* §14.9a critical mitigations:
*
* - Storage keys are validated against `^[a-zA-Z0-9/_.-]+$`; empty or
* traversal segments and absolute paths are rejected.
* - The resolved path is checked with `path.resolve` against the resolved
* storage root (realpath form) — symlink escapes are blocked.
* - The storage root is created with mode `0o700` (owner only).
* - Refuses to start when `MULTI_NODE_DEPLOYMENT === 'true'` — multi-node
* deployments must use an S3-compatible store.
* - Proxy download URLs carry an HMAC-signed payload (key, expiry, nonce,
* optional filename/content-type); the route refuses to stream a key whose
* token doesn't verify.
*/
import { createHash, createHmac, randomUUID, timingSafeEqual } from 'node:crypto';
import { createReadStream } from 'node:fs';
import * as fs from 'node:fs/promises';
import * as path from 'node:path';
import { Readable } from 'node:stream';
import { env } from '@/lib/env';
import { NotFoundError, ValidationError } from '@/lib/errors';
import { logger } from '@/lib/logger';
import { decrypt } from '@/lib/utils/encryption';
import type { PresignOpts, PutOpts, StorageBackend } from './index';
// ─── key validation ─────────────────────────────────────────────────────────
const VALID_KEY_RE = /^[a-zA-Z0-9/_.-]+$/;
/**
* Validate a storage key. Rejects:
* - empty / non-string input and keys over 1024 chars
* - characters outside `[a-zA-Z0-9/_.-]`
* - empty segments and traversal segments (`.`, `..`)
* - absolute paths (leading `/` or `\`)
*
* Use this both at write time AND at read time — a key fed back from the DB
* could in theory have been tampered with at rest.
*/
export function validateStorageKey(key: string): void {
if (typeof key !== 'string' || key.length === 0) {
throw new ValidationError('Storage key must be a non-empty string');
}
if (key.length > 1024) {
throw new ValidationError('Storage key exceeds 1024 chars');
}
if (key.startsWith('/') || key.startsWith('\\')) {
throw new ValidationError('Storage key must not be absolute');
}
if (!VALID_KEY_RE.test(key)) {
throw new ValidationError('Storage key contains forbidden characters');
}
// Reject any traversal segment in any normalized form.
const segments = key.split('/');
for (const seg of segments) {
if (seg === '..' || seg === '.' || seg === '') {
throw new ValidationError('Storage key has empty or traversal segment');
}
}
}
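// A few illustrative calls (the unit tests cover the full allow/reject matrix):
//   validateStorageKey('ports/abc/berth-4.pdf')  // ok
//   validateStorageKey('../etc/passwd')          // throws (traversal segment)
//   validateStorageKey('/etc/passwd')            // throws (absolute path)
//   validateStorageKey('a//b')                   // throws (empty segment)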
// ─── HMAC token helpers ─────────────────────────────────────────────────────
interface ProxyTokenPayload {
/** Storage key (validated). */
k: string;
/** Expiry epoch seconds. */
e: number;
/** Random nonce so two URLs for the same (key, expiry) differ. */
n: string;
/** Optional download filename. */
f?: string;
/** Optional content-type override. */
c?: string;
}
function b64urlEncode(buf: Buffer): string {
return buf.toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
function b64urlDecode(s: string): Buffer {
const pad = s.length % 4 === 0 ? '' : '='.repeat(4 - (s.length % 4));
return Buffer.from(s.replace(/-/g, '+').replace(/_/g, '/') + pad, 'base64');
}
export function signProxyToken(payload: ProxyTokenPayload, secret: string): string {
const json = JSON.stringify(payload);
const body = b64urlEncode(Buffer.from(json, 'utf8'));
const sig = createHmac('sha256', secret).update(body).digest();
return `${body}.${b64urlEncode(sig)}`;
}
export function verifyProxyToken(
token: string,
secret: string,
): { ok: true; payload: ProxyTokenPayload } | { ok: false; reason: string } {
if (typeof token !== 'string' || !token.includes('.')) {
return { ok: false, reason: 'malformed' };
}
const [body, sigB64] = token.split('.', 2);
if (!body || !sigB64) return { ok: false, reason: 'malformed' };
const expected = createHmac('sha256', secret).update(body).digest();
let provided: Buffer;
try {
provided = b64urlDecode(sigB64);
} catch {
return { ok: false, reason: 'malformed' };
}
if (provided.length !== expected.length) return { ok: false, reason: 'sig-mismatch' };
if (!timingSafeEqual(provided, expected)) return { ok: false, reason: 'sig-mismatch' };
let payload: ProxyTokenPayload;
try {
payload = JSON.parse(b64urlDecode(body).toString('utf8')) as ProxyTokenPayload;
} catch {
return { ok: false, reason: 'malformed-payload' };
}
if (typeof payload.e !== 'number' || payload.e * 1000 < Date.now()) {
return { ok: false, reason: 'expired' };
}
try {
validateStorageKey(payload.k);
} catch {
return { ok: false, reason: 'invalid-key' };
}
return { ok: true, payload };
}
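// Round-trip sketch (illustrative values):
//   const token = signProxyToken(
//     { k: 'exports/x.zip', e: Math.floor(Date.now() / 1000) + 900, n: randomUUID() },
//     secret,
//   );
//   verifyProxyToken(token, secret);        // => { ok: true, payload: { ... } }
//   verifyProxyToken(token + 'A', secret);  // => { ok: false, reason: 'sig-mismatch' }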
// ─── backend ────────────────────────────────────────────────────────────────
interface FilesystemConfig {
root: string;
/** AES-GCM-encrypted HMAC secret. When absent, falls back to a derived secret. */
proxyHmacSecretEncrypted: string | null;
}
export class FilesystemBackend implements StorageBackend {
readonly name = 'filesystem' as const;
private rootResolved: string;
private hmacSecret: string;
private constructor(rootResolved: string, hmacSecret: string) {
this.rootResolved = rootResolved;
this.hmacSecret = hmacSecret;
}
/** Throws if multi-node mode is set or the root isn't writable. */
static async create(cfg: FilesystemConfig): Promise<FilesystemBackend> {
if (process.env.MULTI_NODE_DEPLOYMENT === 'true') {
throw new Error(
'FilesystemBackend cannot start when MULTI_NODE_DEPLOYMENT=true. ' +
'Use an S3-compatible backend for multi-node deployments.',
);
}
const rootInput = cfg.root || './storage';
const rootAbs = path.isAbsolute(rootInput) ? rootInput : path.resolve(process.cwd(), rootInput);
await fs.mkdir(rootAbs, { recursive: true, mode: 0o700 });
// Defensive: re-chmod even if it already existed.
await fs.chmod(rootAbs, 0o700).catch(() => undefined);
// Use realpath so symlinked roots are flattened to their actual location;
// we then compare every per-key resolution against this exact prefix.
const rootResolved = await fs.realpath(rootAbs);
const hmacSecret = resolveHmacSecret(cfg.proxyHmacSecretEncrypted);
logger.info({ root: rootResolved }, 'FilesystemBackend ready');
return new FilesystemBackend(rootResolved, hmacSecret);
}
/**
* Resolve a (validated) storage key to an absolute path under the root.
* Throws if the resolved path escapes the storage root via symlink/.. tricks.
*/
private resolveKey(key: string): string {
validateStorageKey(key);
const joined = path.join(this.rootResolved, key);
const resolved = path.resolve(joined);
if (resolved !== this.rootResolved && !resolved.startsWith(this.rootResolved + path.sep)) {
throw new ValidationError('Storage key escapes storage root');
}
return resolved;
}
async put(
key: string,
body: Buffer | NodeJS.ReadableStream,
opts: PutOpts,
): Promise<{ key: string; sizeBytes: number; sha256: string }> {
const target = this.resolveKey(key);
await fs.mkdir(path.dirname(target), { recursive: true, mode: 0o700 });
const buffer = Buffer.isBuffer(body) ? body : await streamToBuffer(body);
const sha256 = opts.sha256 ?? createHash('sha256').update(buffer).digest('hex');
// Atomic write via temp + rename so partial writes don't leave half-files.
const tmp = `${target}.${randomUUID()}.tmp`;
await fs.writeFile(tmp, buffer, { mode: 0o600 });
// rename() within the same directory is atomic on POSIX, so readers never
// observe a partially written file.
await fs.rename(tmp, target);
return { key, sizeBytes: buffer.length, sha256 };
}
async get(key: string): Promise<NodeJS.ReadableStream> {
const target = this.resolveKey(key);
try {
const stat = await fs.stat(target);
if (!stat.isFile()) throw new NotFoundError(`Storage object ${key}`);
} catch (err) {
const code = (err as NodeJS.ErrnoException).code;
if (code === 'ENOENT') throw new NotFoundError(`Storage object ${key}`);
throw err;
}
return createReadStream(target);
}
async head(key: string): Promise<{ sizeBytes: number; contentType: string } | null> {
const target = this.resolveKey(key);
try {
const stat = await fs.stat(target);
if (!stat.isFile()) return null;
// The filesystem doesn't track content-type, so we sniff it from the file
// extension, falling back to application/octet-stream. Callers that need
// the authoritative type should consult the DB.
return { sizeBytes: stat.size, contentType: extToContentType(target) };
} catch (err) {
const code = (err as NodeJS.ErrnoException).code;
if (code === 'ENOENT') return null;
throw err;
}
}
async delete(key: string): Promise<void> {
const target = this.resolveKey(key);
try {
await fs.unlink(target);
} catch (err) {
const code = (err as NodeJS.ErrnoException).code;
if (code === 'ENOENT') return;
throw err;
}
}
/**
* Filesystem mode never exposes a direct upload URL. The CRM-internal proxy
* accepts uploads via the regular API surface (multipart POST to /api/v1/...
* or PUT to /api/storage/[token]). We return a placeholder PUT URL pointing
* at the proxy so the contract stays uniform.
*/
async presignUpload(
key: string,
opts: PresignOpts,
): Promise<{ url: string; method: 'PUT' | 'POST' }> {
validateStorageKey(key);
const expiresAt = Math.floor(Date.now() / 1000) + (opts.expirySeconds ?? 900);
const token = signProxyToken(
{ k: key, e: expiresAt, n: randomUUID(), c: opts.contentType },
this.hmacSecret,
);
return { url: `/api/storage/${token}`, method: 'PUT' };
}
async presignDownload(key: string, opts: PresignOpts): Promise<{ url: string; expiresAt: Date }> {
validateStorageKey(key);
const expirySec = opts.expirySeconds ?? 900;
const expiresAtSec = Math.floor(Date.now() / 1000) + expirySec;
const token = signProxyToken(
{ k: key, e: expiresAtSec, n: randomUUID(), f: opts.filename, c: opts.contentType },
this.hmacSecret,
);
return {
url: `/api/storage/${token}`,
expiresAt: new Date(expiresAtSec * 1000),
};
}
/** Used by the proxy route — returns the validated absolute path. */
resolveKeyForProxy(key: string): string {
return this.resolveKey(key);
}
/** Used by the proxy route — same HMAC secret as presignDownload. */
getHmacSecret(): string {
return this.hmacSecret;
}
}
// ─── helpers ────────────────────────────────────────────────────────────────
function resolveHmacSecret(encryptedSecret: string | null): string {
if (encryptedSecret) {
try {
return decrypt(encryptedSecret);
} catch (err) {
logger.error({ err }, 'Failed to decrypt storage_proxy_hmac_secret_encrypted');
}
}
// Derive a stable per-process secret from BETTER_AUTH_SECRET so dev mode
// works without explicit configuration. In production the admin UI writes
// an encrypted random secret.
const seed = env.BETTER_AUTH_SECRET ?? 'storage-default';
return createHash('sha256').update(`storage-proxy:${seed}`).digest('hex');
}
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
const chunks: Buffer[] = [];
for await (const chunk of stream as Readable) {
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk as string));
}
return Buffer.concat(chunks);
}
function extToContentType(filename: string): string {
const ext = path.extname(filename).toLowerCase();
switch (ext) {
case '.pdf':
return 'application/pdf';
case '.png':
return 'image/png';
case '.jpg':
case '.jpeg':
return 'image/jpeg';
case '.json':
return 'application/json';
case '.txt':
return 'text/plain';
case '.csv':
return 'text/csv';
case '.zip':
return 'application/zip';
default:
return 'application/octet-stream';
}
}

src/lib/storage/index.ts
/**
* Pluggable storage backend (Phase 6a — see docs/berth-recommender-and-pdf-plan.md §4.7a).
*
* The CRM stores files (per-berth PDFs, brochures, GDPR exports, etc.) through a
* single `StorageBackend` abstraction. The deployment chooses between an
* S3-compatible store (MinIO / AWS S3 / Backblaze B2 / Cloudflare R2 / Wasabi /
* Tigris) and a local filesystem at runtime via `system_settings.storage_backend`.
*
* Callers should always import from this barrel — never from `s3.ts` or
* `filesystem.ts` directly — so the factory wiring stays the single source of
* truth.
*/
import { and, eq, isNull } from 'drizzle-orm';
import { db } from '@/lib/db';
import { systemSettings } from '@/lib/db/schema/system';
import { logger } from '@/lib/logger';
import { FilesystemBackend } from './filesystem';
import { S3Backend } from './s3';
export type StorageBackendName = 's3' | 'filesystem';
export interface PutOpts {
contentType: string;
/** Optional pre-computed sha256 hex — if absent, the backend computes one. */
sha256?: string;
/** Bytes (for streams that don't expose .length); used for capacity pre-flight. */
sizeBytes?: number;
/** Optional content-disposition for downloads (filesystem proxy only). */
contentDisposition?: string;
}
export interface PresignOpts {
/** TTL in seconds. Default 900 (15min) per SECURITY-GUIDELINES §7.1. */
expirySeconds?: number;
contentType?: string;
/** Filename used in Content-Disposition for downloads. */
filename?: string;
}
export interface StorageBackend {
/** Upload bytes. Returns the canonical key, size, and sha256 hex. */
put(
key: string,
body: Buffer | NodeJS.ReadableStream,
opts: PutOpts,
): Promise<{ key: string; sizeBytes: number; sha256: string }>;
/** Stream a file out. Throws NotFoundError if missing. */
get(key: string): Promise<NodeJS.ReadableStream>;
/** Existence + size check without reading the full body. Returns null when missing. */
head(key: string): Promise<{ sizeBytes: number; contentType: string } | null>;
/** Delete. Idempotent — missing keys must not throw. */
delete(key: string): Promise<void>;
/** Generate a short-lived URL the browser can PUT to. */
presignUpload(key: string, opts: PresignOpts): Promise<{ url: string; method: 'PUT' | 'POST' }>;
/** Generate a short-lived URL the browser can GET from. */
presignDownload(key: string, opts: PresignOpts): Promise<{ url: string; expiresAt: Date }>;
readonly name: StorageBackendName;
}
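// Typical call-site sketch (key and options are hypothetical):
//   const storage = await getStorageBackend();
//   await storage.put('gdpr/job-123.zip', buf, { contentType: 'application/zip' });
//   const { url } = await storage.presignDownload('gdpr/job-123.zip', {
//     filename: 'export.zip',
//   });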
// ─── factory ────────────────────────────────────────────────────────────────
interface CachedFactory {
backend: StorageBackend;
/** Resolved at cache-time so we can re-fetch when settings change. */
configFingerprint: string;
}
let cached: CachedFactory | null = null;
/**
* Reset the per-process backend cache. Called after `system_settings` writes
* via the existing pub/sub invalidation hook, and exposed for tests.
*/
export function resetStorageBackendCache(): void {
cached = null;
}
interface StorageConfigSnapshot {
backend: StorageBackendName;
s3?: {
endpoint?: string;
region?: string;
bucket?: string;
accessKey?: string;
secretKeyEncrypted?: string;
forcePathStyle?: boolean;
};
filesystem?: {
root?: string;
proxyHmacSecretEncrypted?: string;
};
}
async function readGlobalSetting<T = unknown>(key: string): Promise<T | null> {
const [row] = await db
.select()
.from(systemSettings)
.where(and(eq(systemSettings.key, key), isNull(systemSettings.portId)));
return (row?.value as T | undefined) ?? null;
}
async function loadStorageConfig(): Promise<StorageConfigSnapshot> {
// Each setting key is a separate row. We read them in parallel.
const keys = [
'storage_backend',
'storage_s3_endpoint',
'storage_s3_region',
'storage_s3_bucket',
'storage_s3_access_key',
'storage_s3_secret_key_encrypted',
'storage_s3_force_path_style',
'storage_filesystem_root',
'storage_proxy_hmac_secret_encrypted',
] as const;
const [
backendRaw,
s3Endpoint,
s3Region,
s3Bucket,
s3AccessKey,
s3SecretKeyEncrypted,
s3ForcePathStyle,
fsRoot,
fsHmacSecretEncrypted,
] = await Promise.all(keys.map((k) => readGlobalSetting<unknown>(k)));
const backend: StorageBackendName = backendRaw === 'filesystem' ? 'filesystem' : 's3';
return {
backend,
s3: {
endpoint: typeof s3Endpoint === 'string' ? s3Endpoint : undefined,
region: typeof s3Region === 'string' ? s3Region : undefined,
bucket: typeof s3Bucket === 'string' ? s3Bucket : undefined,
accessKey: typeof s3AccessKey === 'string' ? s3AccessKey : undefined,
secretKeyEncrypted:
typeof s3SecretKeyEncrypted === 'string' ? s3SecretKeyEncrypted : undefined,
forcePathStyle: Boolean(s3ForcePathStyle),
},
filesystem: {
root: typeof fsRoot === 'string' ? fsRoot : undefined,
proxyHmacSecretEncrypted:
typeof fsHmacSecretEncrypted === 'string' ? fsHmacSecretEncrypted : undefined,
},
};
}
function fingerprint(cfg: StorageConfigSnapshot): string {
return JSON.stringify(cfg);
}
/**
* Resolve the active backend. Caches per-process; the cache is invalidated by
* `resetStorageBackendCache()` (called when `system_settings.storage_backend`
* changes via the migration flow).
*/
export async function getStorageBackend(): Promise<StorageBackend> {
const cfg = await loadStorageConfig();
const fp = fingerprint(cfg);
if (cached && cached.configFingerprint === fp) {
return cached.backend;
}
const backend = await buildBackend(cfg);
cached = { backend, configFingerprint: fp };
logger.info({ backend: backend.name }, 'Storage backend resolved');
return backend;
}
async function buildBackend(cfg: StorageConfigSnapshot): Promise<StorageBackend> {
if (cfg.backend === 'filesystem') {
return FilesystemBackend.create({
root: cfg.filesystem?.root ?? './storage',
proxyHmacSecretEncrypted: cfg.filesystem?.proxyHmacSecretEncrypted ?? null,
});
}
return S3Backend.create({
endpoint: cfg.s3?.endpoint,
region: cfg.s3?.region,
bucket: cfg.s3?.bucket,
accessKey: cfg.s3?.accessKey,
secretKeyEncrypted: cfg.s3?.secretKeyEncrypted,
forcePathStyle: cfg.s3?.forcePathStyle,
});
}
// ─── re-exports ─────────────────────────────────────────────────────────────
export { S3Backend } from './s3';
export { FilesystemBackend, validateStorageKey } from './filesystem';

src/lib/storage/migrate.ts
/**
* Storage backend migration core. The CLI in `scripts/migrate-storage.ts` and
* the admin API at `/api/v1/admin/storage/migrate` both call `runMigration()`
* here, so behaviour is identical regardless of trigger.
*
* See docs/berth-recommender-and-pdf-plan.md §4.7a + §14.9a for the contract.
*/
import { createHash } from 'node:crypto';
import { statfs } from 'node:fs/promises';
import { Readable } from 'node:stream';
import { and, eq, isNull, sql } from 'drizzle-orm';
import { db } from '@/lib/db';
import { systemSettings } from '@/lib/db/schema/system';
import { FilesystemBackend } from './filesystem';
import { resetStorageBackendCache, type StorageBackend, type StorageBackendName } from './index';
import { S3Backend } from './s3';
// ─── tables to walk ─────────────────────────────────────────────────────────
export interface StorageKeyTable {
table: string;
/** Column name holding the storage key (always `storage_key` going forward). */
keyColumn: string;
/** Primary-key column for per-row progress markers. */
pkColumn: string;
/** Optional content-type column (lets the target backend persist Content-Type). */
contentTypeColumn?: string;
}
/**
* Phase 6a ships an empty list — `berth_pdf_versions` and `brochure_versions`
* land in Phase 6b. Add new entries here when new file-bearing tables are
* introduced. The migration script reads each named table via raw SQL so it
* does not need to import every domain's Drizzle schema.
*/
export const TABLES_WITH_STORAGE_KEYS: StorageKeyTable[] = [
// { table: 'berth_pdf_versions', keyColumn: 'storage_key', pkColumn: 'id', contentTypeColumn: 'content_type' },
// { table: 'brochure_versions', keyColumn: 'storage_key', pkColumn: 'id', contentTypeColumn: 'content_type' },
];
const ADVISORY_LOCK_KEY = 0xc7000a01;
// ─── helpers ────────────────────────────────────────────────────────────────
interface CliArgs {
from: StorageBackendName;
to: StorageBackendName;
dryRun: boolean;
}
export function parseArgs(argv: string[]): CliArgs {
const args: Partial<CliArgs> = { dryRun: false };
for (let i = 0; i < argv.length; i++) {
const a = argv[i];
if (a === '--dry-run') args.dryRun = true;
else if (a === '--from') args.from = argv[++i] as StorageBackendName;
else if (a === '--to') args.to = argv[++i] as StorageBackendName;
}
if (!args.from || !args.to || (args.from !== 's3' && args.from !== 'filesystem')) {
throw new Error('Usage: --from s3|filesystem --to s3|filesystem [--dry-run]');
}
if (args.to !== 's3' && args.to !== 'filesystem') {
throw new Error('--to must be s3 or filesystem');
}
if (args.from === args.to) {
throw new Error('--from and --to must differ');
}
return args as CliArgs;
}
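// e.g. parseArgs(['--from', 's3', '--to', 'filesystem', '--dry-run'])
//   => { from: 's3', to: 'filesystem', dryRun: true }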
async function ensureProgressTable(): Promise<void> {
await db.execute(sql`
CREATE TABLE IF NOT EXISTS _storage_migration_progress (
table_name text NOT NULL,
row_pk text NOT NULL,
storage_key text NOT NULL,
sha256 text NOT NULL,
size_bytes bigint NOT NULL,
migrated_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (table_name, row_pk)
)
`);
}
function rowsOf(result: unknown): unknown[] {
if (Array.isArray(result)) return result;
const r = result as { rows?: unknown[] } | null;
return r?.rows ?? [];
}
async function isRowMigrated(tableName: string, pk: string): Promise<boolean> {
const res = await db.execute(sql`
SELECT 1 FROM _storage_migration_progress
WHERE table_name = ${tableName} AND row_pk = ${pk}
LIMIT 1
`);
return rowsOf(res).length > 0;
}
async function markRowMigrated(
tableName: string,
pk: string,
key: string,
sha256: string,
sizeBytes: number,
): Promise<void> {
await db.execute(sql`
INSERT INTO _storage_migration_progress (table_name, row_pk, storage_key, sha256, size_bytes)
VALUES (${tableName}, ${pk}, ${key}, ${sha256}, ${sizeBytes})
ON CONFLICT (table_name, row_pk) DO NOTHING
`);
}
interface RowRef {
tableName: string;
pk: string;
key: string;
contentType: string;
}
async function listKeysFor(tbl: StorageKeyTable): Promise<RowRef[]> {
const ctSelect = tbl.contentTypeColumn ? `, ${tbl.contentTypeColumn} as content_type` : '';
const result = await db.execute(
sql.raw(
`SELECT ${tbl.pkColumn} as pk, ${tbl.keyColumn} as key${ctSelect}
FROM ${tbl.table}
WHERE ${tbl.keyColumn} IS NOT NULL`,
),
);
const rows = rowsOf(result) as Array<{ pk: unknown; key: unknown; content_type?: unknown }>;
return rows.map((r) => ({
tableName: tbl.table,
pk: String(r.pk),
key: String(r.key),
contentType:
typeof r.content_type === 'string' && r.content_type.length > 0
? r.content_type
: 'application/octet-stream',
}));
}
// ─── streaming + sha256 verify ──────────────────────────────────────────────
/**
* Stream a file from `source` -> `target` while computing sha256 of the bytes
* actually written. Re-fetches the target object and verifies a second time
* to catch storage-side corruption.
*/
export async function copyAndVerify(
source: StorageBackend,
target: StorageBackend,
ref: RowRef,
): Promise<{ sha256: string; sizeBytes: number }> {
const stream = await source.get(ref.key);
const chunks: Buffer[] = [];
for await (const chunk of stream as Readable) {
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk as string));
}
const buffer = Buffer.concat(chunks);
const sha256 = createHash('sha256').update(buffer).digest('hex');
const putResult = await target.put(ref.key, buffer, {
contentType: ref.contentType,
sha256,
sizeBytes: buffer.length,
});
// Backends echo a caller-provided sha256, so this guard only fires when the
// backend recomputed the digest itself; the re-fetch below is the real check.
if (putResult.sha256 !== sha256) {
throw new Error(`sha256 mismatch on put for ${ref.tableName}/${ref.pk}`);
}
// Re-fetch from the target and verify a second time.
const verifyStream = await target.get(ref.key);
const verifyChunks: Buffer[] = [];
for await (const chunk of verifyStream as Readable) {
verifyChunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk as string));
}
const verifyBuf = Buffer.concat(verifyChunks);
const verifySha = createHash('sha256').update(verifyBuf).digest('hex');
if (verifySha !== sha256) {
throw new Error(`sha256 mismatch after round-trip for ${ref.tableName}/${ref.pk} (${ref.key})`);
}
return { sha256, sizeBytes: buffer.length };
}
// ─── pre-flight ─────────────────────────────────────────────────────────────
async function freeBytesAt(rootPath: string): Promise<number> {
const s = await statfs(rootPath);
return Number(s.bavail) * Number(s.bsize);
}
async function flipBackendSetting(target: StorageBackendName, userId: string): Promise<void> {
const existing = await db.query.systemSettings.findFirst({
where: and(eq(systemSettings.key, 'storage_backend'), isNull(systemSettings.portId)),
});
if (existing) {
await db
.update(systemSettings)
.set({ value: target, updatedBy: userId, updatedAt: new Date() })
.where(and(eq(systemSettings.key, 'storage_backend'), isNull(systemSettings.portId)));
} else {
await db.insert(systemSettings).values({
key: 'storage_backend',
value: target,
portId: null,
updatedBy: userId,
});
}
resetStorageBackendCache();
}
// ─── main ───────────────────────────────────────────────────────────────────
export interface MigrationOptions {
from: StorageBackendName;
to: StorageBackendName;
dryRun: boolean;
/** Override for tests. */
source?: StorageBackend;
target?: StorageBackend;
/** Audit user id. */
userId?: string;
}
export interface MigrationResult {
rowsConsidered: number;
rowsMigrated: number;
rowsSkippedAlreadyDone: number;
totalBytes: number;
flipped: boolean;
dryRun: boolean;
}
export async function runMigration(opts: MigrationOptions): Promise<MigrationResult> {
const lockResult = await db.execute(sql`SELECT pg_try_advisory_lock(${ADVISORY_LOCK_KEY}) as ok`);
const lockRows = rowsOf(lockResult) as Array<{ ok: boolean }>;
if (!lockRows[0]?.ok) {
throw new Error('Could not acquire storage migration advisory lock');
}
try {
await ensureProgressTable();
const source = opts.source ?? (await buildBackendForMigration(opts.from));
const target = opts.target ?? (await buildBackendForMigration(opts.to));
let rowsConsidered = 0;
let rowsMigrated = 0;
let rowsSkippedAlreadyDone = 0;
let totalBytes = 0;
for (const tbl of TABLES_WITH_STORAGE_KEYS) {
const refs = await listKeysFor(tbl);
rowsConsidered += refs.length;
// Pre-flight free-disk check when target is filesystem. Runs per table,
// so the 20% margin is applied per table rather than cumulatively.
if (opts.to === 'filesystem' && target instanceof FilesystemBackend) {
const heads = await Promise.all(
refs.map((r) => source.head(r.key).then((h) => h?.sizeBytes ?? 0)),
);
const sumBytes = heads.reduce((a, b) => a + b, 0);
// statfs against cwd assumes the storage root shares its filesystem; the
// backend doesn't expose its resolved root here.
const free = await freeBytesAt(process.cwd());
if (free < sumBytes * 1.2) {
throw new Error(
`Insufficient disk: need ${Math.round(sumBytes / 1e6)}MB + 20% margin, have ${Math.round(free / 1e6)}MB free`,
);
}
}
for (const ref of refs) {
if (await isRowMigrated(ref.tableName, ref.pk)) {
rowsSkippedAlreadyDone += 1;
continue;
}
if (opts.dryRun) {
const head = await source.head(ref.key);
totalBytes += head?.sizeBytes ?? 0;
continue;
}
const { sha256, sizeBytes } = await copyAndVerify(source, target, ref);
await markRowMigrated(ref.tableName, ref.pk, ref.key, sha256, sizeBytes);
rowsMigrated += 1;
totalBytes += sizeBytes;
}
}
let flipped = false;
if (!opts.dryRun) {
await flipBackendSetting(opts.to, opts.userId ?? 'cli:migrate-storage');
flipped = true;
}
return {
rowsConsidered,
rowsMigrated,
rowsSkippedAlreadyDone,
totalBytes,
flipped,
dryRun: opts.dryRun,
};
} finally {
await db.execute(sql`SELECT pg_advisory_unlock(${ADVISORY_LOCK_KEY})`);
}
}
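// Usage sketch (both the CLI shim and the admin route call this):
//   const result = await runMigration({ from: 's3', to: 'filesystem', dryRun: true });
//   // => { rowsConsidered, rowsMigrated: 0, rowsSkippedAlreadyDone, totalBytes,
//   //      flipped: false, dryRun: true }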
async function buildBackendForMigration(name: StorageBackendName): Promise<StorageBackend> {
if (name === 'filesystem') {
return FilesystemBackend.create({
root: process.env.STORAGE_FILESYSTEM_ROOT ?? './storage',
proxyHmacSecretEncrypted: null,
});
}
return S3Backend.create({});
}

src/lib/storage/s3.ts
/**
* S3-compatible backend (MinIO / AWS S3 / Backblaze B2 / Cloudflare R2 /
* Wasabi / Tigris). Wraps the existing `minio` JS client so we keep one
* dependency for every S3-shaped service.
*
* Configuration is sourced from `system_settings` rows (passed in by the
* factory in `index.ts`). When system_settings are missing we fall back to
* the legacy `MINIO_*` env vars, so an upgrade path doesn't require admins
* to flip the new switches before the storage_backend admin UI lands.
*/
import { createHash } from 'node:crypto';
import { Readable } from 'node:stream';
import { Client } from 'minio';
import { env } from '@/lib/env';
import { NotFoundError } from '@/lib/errors';
import { logger } from '@/lib/logger';
import { decrypt } from '@/lib/utils/encryption';
import type { PresignOpts, PutOpts, StorageBackend } from './index';
interface S3BackendConfig {
endpoint?: string;
region?: string;
bucket?: string;
accessKey?: string;
secretKeyEncrypted?: string;
/** Accepted for config parity; resolveConfig does not use it yet (minio-js defaults to path-style). */
forcePathStyle?: boolean;
}
interface ResolvedConfig {
endpoint: string;
port: number;
useSSL: boolean;
region: string;
bucket: string;
accessKey: string;
secretKey: string;
}
function decryptIfPresent(stored: string | undefined): string | undefined {
if (!stored) return undefined;
try {
return decrypt(stored);
} catch (err) {
logger.error({ err }, 'Failed to decrypt S3 secret key');
return undefined;
}
}
/**
* Convert a possible URL ("https://s3.amazonaws.com:443") to a {host, port,
* useSSL} triple suitable for the minio Client.
*/
function parseEndpoint(endpoint: string | undefined): {
endPoint: string;
port: number;
useSSL: boolean;
} {
if (!endpoint) {
return { endPoint: env.MINIO_ENDPOINT, port: env.MINIO_PORT, useSSL: env.MINIO_USE_SSL };
}
try {
const url = new URL(endpoint);
const useSSL = url.protocol === 'https:';
const port = url.port ? Number(url.port) : useSSL ? 443 : 80;
return { endPoint: url.hostname, port, useSSL };
} catch {
// Not a URL — treat as bare hostname and use defaults.
return { endPoint: endpoint, port: env.MINIO_PORT, useSSL: env.MINIO_USE_SSL };
}
}
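// e.g. parseEndpoint('https://s3.amazonaws.com')  => { endPoint: 's3.amazonaws.com', port: 443, useSSL: true }
//      parseEndpoint('minio.internal')            => bare hostname + MINIO_PORT / MINIO_USE_SSL defaults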
function resolveConfig(cfg: S3BackendConfig): ResolvedConfig {
const ep = parseEndpoint(cfg.endpoint);
return {
endpoint: ep.endPoint,
port: ep.port,
useSSL: ep.useSSL,
region: cfg.region ?? 'us-east-1',
bucket: cfg.bucket ?? env.MINIO_BUCKET,
accessKey: cfg.accessKey ?? env.MINIO_ACCESS_KEY,
secretKey: decryptIfPresent(cfg.secretKeyEncrypted) ?? env.MINIO_SECRET_KEY,
};
}
export class S3Backend implements StorageBackend {
readonly name = 's3' as const;
private client: Client;
private bucket: string;
private constructor(client: Client, bucket: string) {
this.client = client;
this.bucket = bucket;
}
static async create(cfg: S3BackendConfig): Promise<S3Backend> {
const resolved = resolveConfig(cfg);
const client = new Client({
endPoint: resolved.endpoint,
port: resolved.port,
useSSL: resolved.useSSL,
accessKey: resolved.accessKey,
secretKey: resolved.secretKey,
region: resolved.region,
});
return new S3Backend(client, resolved.bucket);
}
async put(
key: string,
body: Buffer | NodeJS.ReadableStream,
opts: PutOpts,
): Promise<{ key: string; sizeBytes: number; sha256: string }> {
// We need both the upload and a sha256 of the bytes. For a Buffer this is
// trivial; streams are fully buffered in memory first. The memory pressure
// is acceptable here: typical files are under 50MB and the alternative is
// a temp-file dance.
const buffer = Buffer.isBuffer(body) ? body : await streamToBuffer(body);
const sha256 = opts.sha256 ?? createHash('sha256').update(buffer).digest('hex');
await this.client.putObject(this.bucket, key, buffer, buffer.length, {
'Content-Type': opts.contentType,
});
return { key, sizeBytes: buffer.length, sha256 };
}
async get(key: string): Promise<NodeJS.ReadableStream> {
try {
return await this.client.getObject(this.bucket, key);
} catch (err) {
const code = (err as { code?: string } | null)?.code;
if (code === 'NoSuchKey' || code === 'NotFound') {
throw new NotFoundError(`Storage object ${key}`);
}
throw err;
}
}
async head(key: string): Promise<{ sizeBytes: number; contentType: string } | null> {
try {
const stat = await this.client.statObject(this.bucket, key);
const meta = (stat.metaData ?? {}) as Record<string, string>;
const contentType =
meta['content-type'] ?? meta['Content-Type'] ?? 'application/octet-stream';
return { sizeBytes: stat.size, contentType };
} catch (err) {
const code = (err as { code?: string } | null)?.code;
if (code === 'NotFound' || code === 'NoSuchKey') return null;
throw err;
}
}
async delete(key: string): Promise<void> {
try {
await this.client.removeObject(this.bucket, key);
} catch (err) {
const code = (err as { code?: string } | null)?.code;
if (code === 'NotFound' || code === 'NoSuchKey') return;
throw err;
}
}
async presignUpload(
key: string,
opts: PresignOpts,
): Promise<{ url: string; method: 'PUT' | 'POST' }> {
const expiry = opts.expirySeconds ?? 900;
const url = await this.client.presignedPutObject(this.bucket, key, expiry);
return { url, method: 'PUT' };
}
async presignDownload(key: string, opts: PresignOpts): Promise<{ url: string; expiresAt: Date }> {
const expiry = opts.expirySeconds ?? 900;
const url = await this.client.presignedGetObject(this.bucket, key, expiry);
return { url, expiresAt: new Date(Date.now() + expiry * 1000) };
}
/** Used by the admin UI's "Test connection" button. */
async healthCheck(): Promise<{ ok: true } | { ok: false; error: string }> {
const sentinelKey = `_health/${Date.now()}.txt`;
const payload = Buffer.from('ok', 'utf8');
try {
await this.client.putObject(this.bucket, sentinelKey, payload, payload.length, {
'Content-Type': 'text/plain',
});
const stat = await this.client.statObject(this.bucket, sentinelKey);
if (stat.size !== payload.length) {
return { ok: false, error: 'sentinel size mismatch' };
}
await this.client.removeObject(this.bucket, sentinelKey);
return { ok: true };
} catch (err) {
return { ok: false, error: (err as Error).message };
}
}
}
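// Admin "Test connection" sketch (illustrative; create({}) falls back to MINIO_* env vars):
//   const backend = await S3Backend.create({});
//   const res = await backend.healthCheck();  // { ok: true } | { ok: false, error }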
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
const chunks: Buffer[] = [];
for await (const chunk of stream as Readable) {
chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk as string));
}
return Buffer.concat(chunks);
}