Production Deployment Plan — Port Nimara CRM
Status: DRAFT · pre-deployment · 2026-05-31
Target: https://crm.portnimara.com on the PN Cloud server.
Companion: docs/launch-readiness.md (Initiative 5: cutover).
Credentials live in private/deployment-creds.md (gitignored). Never put secrets in this file.
⛔ Guardrails (non-negotiable)
- No change to anything on the prod server without Matt's explicit per-action approval. Recon/reads are fine; every sudo, every file write, every docker mutation, every certbot run is approved individually before it runs.
- Documenso is VITAL. It has broken on past upgrades. Nothing touches the Documenso DB, volumes, or container until a verified backup + S3↔DB reconciliation exists AND the upgrade step is explicitly approved.
- Work one phase at a time; verify before moving on. Keep a rollback for each mutating step.
Access (established 2026-05-31)
| What | Detail | Verified |
|---|---|---|
| Prod server (SSH) | 45.142.177.246:22022, user stefan, key id_ed25519_2026 (macOS keychain) | ✅ connected, key auth |
| Gitea API | https://code.letsbe.solutions as matt (admin): reads build status, warnings, errors | ✅ v1.25.5, repo letsbe/pn-new-crm |
| Container registry | code.letsbe.solutions/letsbe/pn-new-crm/{crm-app,crm-worker} | ✅ CI pushes :latest + :<sha> |
Notes:
- stefan is unprivileged (uid 1000, not in the docker group; sudo prompts for a password). Every docker/nginx/certbot/cert-read step needs sudo (root pass in private/deployment-creds.md; VERIFY: the per-server creds file had MOPC's pass by mistake).
- Reading build logs: GET /api/v1/repos/letsbe/pn-new-crm/actions/tasks (run status) + per-job logs; latest main build is success.
How builds reach prod
git push origin main → Gitea Actions .gitea/workflows/build.yml:
- lint job: pnpm lint + pnpm exec tsc --noEmit.
- build-and-push job (main only): builds Dockerfile → crm-app and Dockerfile.worker → crm-worker, pushes :latest + :<sha> to the Gitea registry.
Prod pulls those images — it does not build. So a deploy is:
push → wait for green CI → docker compose pull + up -d on the server.
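The pull-and-restart half of that flow can be sketched as a small dry-run wrapper. This is a sketch under assumptions, not the agreed procedure: the `run` and `deploy` helper names are made up for illustration, and `/opt/pn-crm` is the deploy dir that is still to be confirmed. By default it only prints the commands, which fits the per-action approval guardrail.

```shell
#!/bin/sh
# Sketch only: prints the deploy commands by default; set RUN=1 to execute.
# DEPLOY_DIR is the still-unconfirmed location from this plan (assumption).
DEPLOY_DIR="${DEPLOY_DIR:-/opt/pn-crm}"
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.prod.yml}"

# run CMD...: echo the command; execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# deploy: pull the images CI pushed, then recreate the containers.
deploy() {
  run sudo docker compose -f "$DEPLOY_DIR/$COMPOSE_FILE" pull
  run sudo docker compose -f "$DEPLOY_DIR/$COMPOSE_FILE" up -d
}
```

Calling `deploy` with no environment set prints the two commands for review; `RUN=1 deploy` would execute them once each has been approved.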
Prod stack (docker-compose.prod.yml)
| Service | Image | Notes |
|---|---|---|
| postgres | postgres:16-alpine | self-contained, volume pgdata |
| redis | redis:7-alpine | self-contained, volume redisdata (BullMQ + socket.io adapter) |
| crm-app | registry crm-app:latest | host 7100 → container 3000 |
| crm-worker | registry crm-worker:latest | BullMQ worker |
- Storage: no MinIO service in the compose; the CRM uses external MinIO via system_settings.storage_backend + getStorageBackend(). The existing prod MinIO (:9000, s3.conf/minio.conf nginx vhosts) is the backend. Confirm bucket + keys (creds file §3).
- Decision needed: does the CRM get its own Postgres (the compose default, isolated pgdata) or reuse an existing prod Postgres instance? Default = the compose's own Postgres (cleanest isolation). Confirm.
Phase 1 — crm.portnimara.com go-live
DNS already points crm.portnimara.com at the server. No crm.portnimara
nginx vhost exists yet (fresh setup). Template: portnimara_dev.conf
(reverse-proxy + Certbot pattern already in use on this box).
Pre-flight (no approval needed — prep only)
- Assemble the prod .env for the CRM. Source of truth: src/lib/env.ts (Zod schema) + .env.example. Critical keys:
  - APP_URL=https://crm.portnimara.com
  - DATABASE_URL (compose Postgres), REDIS_*
  - storage / MinIO (endpoint, access/secret, bucket): creds file §3
  - DOCUMENSO_API_URL (bare host, no /api/v1), DOCUMENSO_API_VERSION, API key
  - better-auth secret, WEBSITE_INTAKE_SECRET, SMTP/IMAP
  - EMAIL_REDIRECT_TO MUST be unset in prod.
- Server can pull from the registry: docker login code.letsbe.solutions with a registry token (creds file §2: generate a Gitea token; do not bake the account password into the server).
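A quick pre-flight check of the assembled .env could look like the sketch below. It is illustrative only: `check_env` is a hypothetical helper, the `REQUIRED` list is a subset of the keys named above (the Zod schema in src/lib/env.ts remains the source of truth), and the plain KEY=value line format is an assumption.

```shell
#!/bin/sh
# Illustrative subset of the critical keys listed in the plan.
REQUIRED="APP_URL DATABASE_URL DOCUMENSO_API_URL DOCUMENSO_API_VERSION WEBSITE_INTAKE_SECRET"

# check_env FILE: exit non-zero if a required key is missing or empty,
# if EMAIL_REDIRECT_TO appears at all, or if DOCUMENSO_API_URL ends in /api/v1.
check_env() (
  file="$1"; rc=0
  for key in $REQUIRED; do
    # A key counts as present only with a non-empty value after "=".
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing: $key"; rc=1
    fi
  done
  if grep -Eq '^EMAIL_REDIRECT_TO=' "$file"; then
    echo "must be unset in prod: EMAIL_REDIRECT_TO"; rc=1
  fi
  # The plan asks for the bare host here, without the /api/v1 suffix.
  if grep -Eq "^DOCUMENSO_API_URL=.*/api/v1/?[\"']?\$" "$file"; then
    echo "DOCUMENSO_API_URL must not end in /api/v1"; rc=1
  fi
  exit "$rc"
)
```

Running `check_env /path/to/.env` before the first `up -d` prints one line per problem and returns a non-zero status if any were found.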
Step 1 — nginx vhost (⚠ approval)
- Create /etc/nginx/sites-available/crm_portnimara.conf modelled on portnimara_dev.conf: port-80 → 443 redirect + .well-known/acme-challenge location; port-443 server proxy_pass http://127.0.0.1:7100 with the same header block (Host, X-Real-IP, CF-Connecting-IP, X-Forwarded-*, websocket Upgrade/Connection for socket.io), client_max_body_size 64M, proxy_read_timeout 300, buffering off. HTTP-only first (no ssl_* lines yet) so Certbot can complete the challenge.
- Symlink into sites-enabled/.
- sudo nginx -t must pass. Then sudo systemctl reload nginx.
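As a starting point, the HTTP-only first pass of the vhost might look like the output of the sketch below. Treat it as an approximation: the real header block should be copied from portnimara_dev.conf, and the acme webroot path, the exact header values, and the `render_vhost` helper name are assumptions. It only prints to stdout; writing it under /etc/nginx remains an approved step.

```shell
#!/bin/sh
# render_vhost: print an HTTP-only vhost for crm.portnimara.com to stdout.
# Assumptions: the acme webroot and header values are illustrative;
# copy the real ones from portnimara_dev.conf.
render_vhost() {
  cat <<'EOF'
server {
    listen 80;
    server_name crm.portnimara.com;

    client_max_body_size 64M;

    location /.well-known/acme-challenge/ {
        root /var/www/html;  # assumed webroot
    }

    location / {
        proxy_pass http://127.0.0.1:7100;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header CF-Connecting-IP $http_cf_connecting_ip;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300;
        proxy_buffering off;
    }
}
EOF
}
```

`render_vhost > /tmp/crm_portnimara.conf` gives a file to review and diff against portnimara_dev.conf before anything is placed in sites-available.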
Step 2 — TLS cert (⚠ approval)
- sudo certbot --nginx -d crm.portnimara.com: pulls + installs the cert, rewrites the vhost with the managed ssl_certificate lines + 80→443 redirect. Re-run sudo nginx -t + reload.
Step 3 — bring up the container (⚠ approval)
- Place docker-compose.prod.yml + the prod .env in the deploy dir (e.g. /opt/pn-crm; confirm location).
- sudo docker login code.letsbe.solutions (registry token).
- sudo docker compose -f docker-compose.prod.yml pull.
- sudo docker compose -f docker-compose.prod.yml up -d.
- Watch for errors: sudo docker compose -f docker-compose.prod.yml logs -f crm-app crm-worker.
- Apply schema: migrations via psql (per CLAUDE.md, db:migrate is broken) or the app's push path; confirm the prod migration approach.
- Seed/bootstrap the port + admin user as needed.
Verify
- curl -fsS https://crm.portnimara.com/api/public/health → {status:"ok"...}
- Authenticated health w/ X-Intake-Secret → {checks:{db,redis}}
- Login loads, branding renders, a berth list + a deal render.
- socket.io realtime connects (websocket upgrade through nginx works).
- No 42703 column errors (restart crm-app after any schema change).
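The public health check can be scripted so the result is a pass/fail exit code rather than an eyeballed response. A minimal sketch, assuming the endpoint returns JSON with a top-level "status" field as the plan suggests; the helper names are hypothetical and the authenticated check is left out because its path is not stated here.

```shell
#!/bin/sh
BASE_URL="${BASE_URL:-https://crm.portnimara.com}"

# health_ok: read a JSON body on stdin; succeed if it reports status "ok".
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# check_public_health: fetch the public endpoint and test the body.
# curl -f makes HTTP errors fail the pipeline's left side with an empty body,
# which health_ok then rejects.
check_public_health() {
  curl -fsS "$BASE_URL/api/public/health" | health_ok
}
```

`check_public_health && echo healthy` is read-only, so it falls under recon rather than a mutating step.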
Phase 2 — Documenso v1.13.1 → v2.x upgrade (VITAL — execute SOBER, heavily gated)
Do not execute while impaired. This is the production signing system. Every mutating step needs an explicit, sober go/no-go. The runbook below is reference; the actual run is a scheduled session.
Verified facts (2026-05-31 recon + research)
| Item | Value |
|---|---|
| Current version | documenso/documenso:v1.13.1 (Oct 2025 — last v1) |
| Latest version | v2.11.0 (May 2026). Path: 1.13.1 → 2.0.0 → … → 2.11.0 (major jump) |
| Compose | /root/docker-compose/documenso/docker-compose.yml (project documenso-production, services documenso + database) |
| DB | postgres:15, db documenso_db, user admin, vol documenso-production_documenso-database → /var/lib/postgresql/data |
| App port | container 3000 → host 3020; served at https://signatures.portnimara.dev (nginx documenso.conf, direct — no Cloudflare) |
| Storage | external MinIO, bucket signatures @ s3.portnimara.com, region eu-central-1 |
| Signing cert | /opt/documenso/certificate.p12 (+ passphrase in env) |
Research conclusions (sources in chat):
- v1 API survives in v2 — "API V1 is stable but deprecated; nothing breaks." So the CRM keeps working on v1 API; flip to v2 later. (Will be explicitly re-tested against the clone in Phase 0 before committing.)
- Postgres 15 is v2's official DB — no DB-engine upgrade needed.
- Env vars carry over unchanged; only NEXTAUTH_URL is dropped in v2 (auth now derives from NEXT_PUBLIC_WEBAPP_URL, already set correctly), so it is a harmless leftover.
- Upgrade = pull new image + restart; prisma migrate deploy auto-runs all pending migrations on startup.
- Known migration-failure history (issue #1880: NOT-NULL column added without backfill). 1.13.1 is past that one, but it's the failure pattern to expect, hence the clone dry-run.
- The login bounce (non-Secure cookie / NEXTAUTH_URL quirk) is plausibly fixed in v2's reworked auth, but treat that as a hoped-for bonus, not the goal.
Locked decisions (per Matt, 2026-05-31)
- Dry-run on a clone first: yes. Target latest v2.11.0, staged through v2.0.0.
- No-downtime caveat: true zero-downtime is not possible (migrations run on restart). Goal = brief + pre-rehearsed: validate fully on the clone, pre-pull the image, then a fast prod cutover in a low-traffic window.
- CRM stays on Documenso v1 API after upgrade.
- Backups: pg_dump + cert + compose/env pulled to the Mac (private/documenso-backups/, gitignored) and a cold volume snapshot kept on-server for fastest rollback.
- Privilege: root via su (stefan isn't in the docker group; sudo needs a password we don't have; root pass works for su).
Phase 0 — Dry-run on a disposable clone (zero prod risk)
- pg_dump -Fc documenso_db (live, no downtime) → restore into a throwaway postgres:15 + documenso:v2.11.0 stack on a different compose project + port, with a copy of the signing cert.
- Watch prisma migrate deploy run the full 1.13.1→2.11.0 chain. Confirm: all migrations succeed, app boots, login works, existing documents render.
- Re-test the CRM's v1 API calls against the clone → expect 200s.
- If a migration fails: capture it, fix forward (or decide a target version that's clean) BEFORE touching prod.
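The dump-and-restore half of the dry run could be written down as a printed plan, roughly as below. Nothing in the sketch executes against prod: it only echoes the commands. The compose path, service name, DB name, and user come from the facts table above; the dump filename, the clone container name `documenso-dryrun-db`, and the `phase0_plan` helper are assumptions for illustration.

```shell
#!/bin/sh
# Prints the Phase 0 dump/restore commands; it does not run them.
# Assumed names: dump path and the clone's DB container name.
PROD_COMPOSE=/root/docker-compose/documenso/docker-compose.yml
DUMP="${DUMP:-documenso_db.dump}"

phase0_plan() {
  # 1. Live custom-format dump from the prod "database" service (read-only).
  echo "docker compose -f $PROD_COMPOSE exec -T database pg_dump -Fc -U admin documenso_db > $DUMP"
  # 2. Restore into the throwaway postgres:15 (container name is hypothetical;
  #    the target database must already exist in the clone).
  echo "docker exec -i documenso-dryrun-db pg_restore --no-owner -U admin -d documenso_db < $DUMP"
}
```

`phase0_plan` gives two lines to paste into the session log for approval; the restore reads the custom-format dump from stdin, which pg_restore accepts when no filename is given.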
Phase A — Prod backups (after Phase 0 passes; verified before any change)
- pg_dump -Fc documenso_db → pull to private/documenso-backups/ on the Mac (off-box). Plus a plain SQL dump.
- Cold volume snapshot: stop stack → tar documenso-production_documenso-database → keep on-server + copy off. (This is the gold rollback: Prisma migrations aren't reversible.)
- Copy compose file + env + /opt/documenso/{certificate.p12,private.key,certificate.crt}.
- MinIO signatures: read-only object inventory ({key,size,lastModified,etag}) + DB→storage-key mapping export (Document/DocumentData → storage key) so files can be re-matched if linkage breaks.
- Test-restore the dump into a throwaway PG15; record SHA-256s.
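Recording and re-checking the SHA-256s can be done with a couple of small helpers, sketched here under assumptions: the function names and the SHA256SUMS filename are made up, and the tool fallback exists because the backups land on a Mac (shasum) while the server is Linux (sha256sum).

```shell
#!/bin/sh
# sha256_of FILE: print the hex digest using whichever tool is installed
# (sha256sum on Linux, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# record_checksums DIR: write "digest  name" for each regular file in DIR
# to DIR/SHA256SUMS (the manifest itself is skipped).
record_checksums() (
  cd "$1" || exit 1
  : > SHA256SUMS.tmp
  for f in *; do
    [ -f "$f" ] || continue
    case "$f" in SHA256SUMS|SHA256SUMS.tmp) continue ;; esac
    printf '%s  %s\n' "$(sha256_of "$f")" "$f" >> SHA256SUMS.tmp
  done
  mv SHA256SUMS.tmp SHA256SUMS
)

# verify_checksums DIR: fail if any recorded digest no longer matches.
verify_checksums() (
  cd "$1" || exit 1
  while read -r digest name; do
    [ "$(sha256_of "$name")" = "$digest" ] || { echo "MISMATCH: $name"; exit 1; }
  done < SHA256SUMS
)
```

Running `record_checksums` on the server copy and `verify_checksums` on the Mac copy (after transferring SHA256SUMS alongside the files) confirms the off-box backup arrived intact.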
Phase B — Collation pre-fix (low risk; validate need on the clone first)
- REFRESH COLLATION VERSION on documenso_db (+ template1/postgres) + reindex, so the libc 2.36→2.41 mismatch can't interfere with migration index ops.
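For reference, the statements involved can be generated per database as in the sketch below. It only prints SQL. The reindex-then-refresh order is my reading of the PostgreSQL guidance to rebuild affected objects before refreshing the recorded version, so treat it as an assumption to confirm on the clone; the `collation_fix_sql` helper name is hypothetical.

```shell
#!/bin/sh
# collation_fix_sql DB: print the statements for one database.
# REINDEX DATABASE has to be run while connected to that same database.
collation_fix_sql() {
  echo "REINDEX DATABASE $1;"
  echo "ALTER DATABASE $1 REFRESH COLLATION VERSION;"
}

# Print the plan for the three databases named in Phase B.
for db in documenso_db template1 postgres; do
  echo "-- connect to $db first"
  collation_fix_sql "$db"
done
```

The output is meant to be reviewed and run by hand in psql against the clone first, which also validates whether the pre-fix is needed at all.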
Phase C — Prod upgrade (staged, pinned tags, low-traffic window)
- Pre-pull images. Edit compose: v1.13.1 → v2.0.0 → up -d → watch migration logs → verify.
- Then v2.0.0 → v2.11.0 → verify. Keep postgres:15.
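The compose edit in each stage is a one-line tag change, which can be prepared and diffed ahead of the window. A minimal sketch, assuming the image is referenced as documenso/documenso:<tag> in the compose file (as in the facts table); `bump_tag` is a hypothetical helper that writes to stdout and never edits the file in place.

```shell
#!/bin/sh
# bump_tag FILE FROM TO: print FILE with the documenso image tag FROM
# replaced by TO. Dots in FROM are escaped so they match literally.
bump_tag() (
  from_re="$(printf '%s' "$2" | sed 's/\./\\./g')"
  sed "s#\(documenso/documenso:\)${from_re}#\1$3#" "$1"
)
```

For example, `bump_tag docker-compose.yml v1.13.1 v2.0.0 > docker-compose.v2.0.0.yml` produces a candidate file whose diff against the original should show only the image line; the postgres:15 line is left untouched.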
Phase D — Verify
- Login works; an existing completed envelope's PDF resolves from MinIO; send a test envelope; webhook reaches the CRM (X-Documenso-Secret, idempotent handleDocumentCompleted); reminders/void work.
- CRM unchanged (still v1 API).
Phase E — Rollback (any failure)
- Revert image tag + restore the volume snapshot (and/or DB dump) → back to v1.13.1 exactly.
Until Phase 0 passes AND a sober Phase A/C is explicitly approved step-by-step, do not touch the Documenso container, DB, volumes, or /opt/documenso.
Open decisions / what I need from you
- ✅ MinIO creds filled; Documenso DB creds filled (creds file §3/§4). Still need the Documenso API token + webhook secret (generate after login as matt@portnimara.com).
- Verify the root/sudo password (value kept in private/deployment-creds.md, not in this file; confirmed it works for su to root; it is not stefan's sudo password).
- CRM Postgres: own (compose default) or reuse an existing instance?
- Deploy dir for the CRM on the server (/opt/pn-crm?).
- Registry pull token: Gitea token for docker login on the server.
- ✅ Documenso target = v2.11.0, staged, clone-validated first.
- Maintenance window for the (brief, unavoidable) Documenso restart downtime.
- Off-box backup destination confirmed = Mac private/documenso-backups/ + on-server volume snapshot.
Progress log
- 2026-05-31: Access established (SSH + Gitea API). Read-only recon done (nginx templates, prod compose, host port 7100). CRM deploy plan drafted. Documenso fully diagnosed read-only (v1.13.1, healthy app+DB, login issue = wrong email @letsbe vs @portnimara.com + a non-Secure-cookie quirk; 5432 publicly exposed + brute-forced; libc collation mismatch). Researched v2 upgrade (v2.11.0 latest, PG15 ok, env vars carry over, v1 API survives). Upgrade runbook drafted. No prod changes made; no backups taken.
- 2026-06-01: Phase 0 dry-run PASSED (local, zero prod impact). Read-only pg_dump of prod (3.5 MB, metadata only) → restored into a throwaway postgres:15 → booted documenso:v2.11.0 against it. Result: full v1.13.1→v2.11.0 chain applied cleanly ("All migrations have been successfully applied", 140→157, none unfinished), app boots (home 302, signin 200, v2 api 200), and v1 API still answers (400 not 404) → CRM safe. Dump saved at private/documenso-backups/ (off-box backup). Dry-run stack torn down 2026-06-01 after the pass (docker compose -p documenso-dryrun down -v: containers + anonymous volume + network removed; restored clone gone, off-box dump retained). Compose file kept at private/documenso-dryrun/docker-compose.yml for a re-run. Prod still untouched.