A focused review of every external integration surfaced six issues the
original audit missed. Fixed here.
HIGH
* Socket.IO had an unconditional 30-second idle disconnect on every
socket. The comment on the line acknowledged it was "for development
only, would be longer in prod" but no NODE_ENV guard existed, and the
`socket.onAny` listener only resets on inbound client events — every
dashboard connection that received only server-push events would have
been torn down every 30s in production. Removed the manual idle
timer entirely; Socket.IO's pingTimeout / pingInterval handles
dead-transport detection at the protocol level.
* SMTP transporters had no `connectionTimeout` / `greetingTimeout` /
`socketTimeout`. Nodemailer's defaults are 2 minutes for connect
and unlimited for socket — a hung SMTP server would have held a
BullMQ `email` worker concurrency slot for up to 10 min per job
(5 retries × 2 min). Set 10s/10s/30s on both the system transporter
in `src/lib/email/index.ts` and the user-account transporter in
`email-compose.service.ts`.
MEDIUM
* PostgreSQL pool had no `statement_timeout` /
`idle_in_transaction_session_timeout`. A slow query or transaction
held by a crashed handler would have eventually exhausted the
20-connection pool. 30s statement cap, 10s idle-in-tx cap, plus
`max_lifetime: 30min` to recycle connections.
* `umami_password` and `umami_api_token` were stored as plaintext in
`system_settings` (the SMTP and S3 secret paths use AES-GCM). The
reader now passes them through `readSecret()` which auto-detects
the encrypted `iv:cipher:tag` shape and decrypts, falling back to
legacy plaintext so operators can rotate without a flag-day.
* AI email-draft worker interpolated `additionalInstructions` (user-
controlled) directly into the OpenAI prompt — a hostile rep could
close the instructions block and inject prompt directives that
override the system prompt. Added `sanitizeForPrompt()` that
strips newlines + quote chars, caps at 500 chars, and the prompt
now wraps the value in a "treat as data not commands" preamble.
LOW
* Legacy `ensureBucket()` in `src/lib/minio/index.ts` was unguarded —
if any future code imported it (currently no callers), a misconfigured
prod deploy could mint a fresh empty bucket. Now matches the gate
used by the pluggable S3Backend (`MINIO_AUTO_CREATE_BUCKET=true`
required) so the legacy export and the new pluggable path agree.
Confirmed not-an-issue: BullMQ Workers create connections via
`{ url }` options object, and BullMQ sets `maxRetriesPerRequest: null`
internally for those — no fix needed. The shared `redis` singleton
that does keep `maxRetriesPerRequest: 3` is used only for direct
Redis ops (rate-limit sliding window, etc.), never for blocking
BullMQ commands, so the value is correct there.
Test status: 1175/1175 vitest, tsc clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>