Production Deployment Plan — Port Nimara CRM
Status: DRAFT · pre-deployment · 2026-05-31
Target: `https://crm.portnimara.com` on the PN Cloud server. Companion: `docs/launch-readiness.md` (Initiative 5, cutover).
Credentials live in `private/deployment-creds.md` (gitignored); never put secrets in this file.
⛔ Guardrails (non-negotiable)
- No change to anything on the prod server without Matt's explicit per-action approval. Recon/reads are fine; every `sudo`, every file write, every `docker` mutation, every `certbot` run is approved individually before it runs.
- Documenso is VITAL. It has broken on past upgrades. Nothing touches the Documenso DB, volumes, or container until a verified backup + S3↔DB reconciliation exists AND the upgrade step is explicitly approved.
- Work one phase at a time; verify before moving on. Keep a rollback for each mutating step.
Access (established 2026-05-31)
| What | Detail | Verified |
|---|---|---|
| Prod server (SSH) | `45.142.177.246:22022`, user `stefan`, key `id_ed25519_2026` (macOS keychain) | ✅ connected, key auth |
| Gitea API | `https://code.letsbe.solutions` as `matt` (admin); reads build status, warnings, errors | ✅ v1.25.5, repo `letsbe/pn-new-crm` |
| Container registry | `code.letsbe.solutions/letsbe/pn-new-crm/{crm-app,crm-worker}` | ✅ CI pushes `:latest` + `:<sha>` |
Notes:
- `stefan` is unprivileged (uid 1000, not in the `docker` group; `sudo` prompts for a password). Every `docker` / `nginx` / `certbot` / cert-read step needs `sudo` (root pass in `private/deployment-creds.md`; VERIFY, because the per-server creds file had MOPC's pass by mistake).
- Reading build logs: `GET /api/v1/repos/letsbe/pn-new-crm/actions/tasks` (run status) + per-job logs; latest `main` build is success.
How builds reach prod
`git push origin main` → Gitea Actions `.gitea/workflows/build.yml`:
- lint job: `pnpm lint` + `pnpm exec tsc --noEmit`.
- build-and-push job (`main` only): builds `Dockerfile` → `crm-app` and `Dockerfile.worker` → `crm-worker`, pushes `:latest` + `:<sha>` to the Gitea registry.
Prod pulls those images — it does not build. So a deploy is:
push → wait for green CI → docker compose pull + up -d on the server.
Prod stack (docker-compose.prod.yml)
| Service | Image | Notes |
|---|---|---|
| `postgres` | `postgres:16-alpine` | self-contained, volume `pgdata` |
| `redis` | `redis:7-alpine` | self-contained, volume `redisdata` (BullMQ + socket.io adapter) |
| `crm-app` | registry `crm-app:latest` | host 7100 → container 3000 |
| `crm-worker` | registry `crm-worker:latest` | BullMQ worker |
- Storage: there is no MinIO service in the compose; the CRM uses external MinIO via `system_settings.storage_backend` + `getStorageBackend()`. The existing prod MinIO (`:9000`, `s3.conf`/`minio.conf` nginx vhosts) is the backend. Confirm bucket + keys (creds file §3).
- Decision needed: does the CRM get its own Postgres (the compose default, isolated `pgdata`) or reuse an existing prod Postgres instance? Default = the compose's own Postgres (cleanest isolation). Confirm.
Phase 1 — crm.portnimara.com go-live
DNS already points crm.portnimara.com at the server. No crm.portnimara
nginx vhost exists yet (fresh setup). Template: portnimara_dev.conf
(reverse-proxy + Certbot pattern already in use on this box).
Pre-flight (no approval needed — prep only)
- Assemble the prod `.env` for the CRM. Source of truth: `src/lib/env.ts` (Zod schema) + `.env.example`. Critical keys:
  - `APP_URL=https://crm.portnimara.com`
  - `DATABASE_URL` (compose Postgres), `REDIS_*`
  - storage / MinIO (endpoint, access/secret, bucket): creds file §3
  - `DOCUMENSO_API_URL` (bare host, no `/api/v1`), `DOCUMENSO_API_VERSION`, API key
  - better-auth secret, `WEBSITE_INTAKE_SECRET`, SMTP/IMAP
  - `EMAIL_REDIRECT_TO` MUST be unset in prod.
- Server can pull from the registry: `docker login code.letsbe.solutions` with a registry token (creds file §2; generate a Gitea token, do not bake the account password into the server).
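The key list above can be checked mechanically before the file goes anywhere near the server. The following is a minimal sketch, assuming the `.env` uses plain unquoted `KEY=value` lines; the function name `check_prod_env` and the short required-key subset are illustrative assumptions, and the authoritative list remains the Zod schema in `src/lib/env.ts`.

```bash
# Hypothetical helper: sanity-check a prod .env before it is copied to the server.
# Assumes simple KEY=value lines; the real source of truth is src/lib/env.ts.
check_prod_env() {
  local env_file="$1" key rc=0
  # Illustrative subset of the critical keys named in the pre-flight list.
  for key in APP_URL DATABASE_URL WEBSITE_INTAKE_SECRET DOCUMENSO_API_URL; do
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}"
      rc=1
    fi
  done
  # EMAIL_REDIRECT_TO is a dev-only reroute; flag the key if it appears at all.
  if grep -Eq '^EMAIL_REDIRECT_TO=' "$env_file"; then
    echo "EMAIL_REDIRECT_TO must be unset in prod"
    rc=1
  fi
  # DOCUMENSO_API_URL is a bare host: no /api/v1 suffix.
  if grep -Eq '^DOCUMENSO_API_URL=.*/api/v1/?$' "$env_file"; then
    echo "DOCUMENSO_API_URL must not include /api/v1"
    rc=1
  fi
  return "$rc"
}
```

Usage would be something like `check_prod_env .env && echo "env looks sane"`, run locally as prep; it reads the file only and changes nothing.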
Step 1 — nginx vhost (⚠ approval)
- Create `/etc/nginx/sites-available/crm_portnimara.conf` modelled on `portnimara_dev.conf`: port-80 → 443 redirect + `.well-known/acme-challenge` location; port-443 server `proxy_pass http://127.0.0.1:7100` with the same header block (Host, X-Real-IP, CF-Connecting-IP, X-Forwarded-*, websocket `Upgrade`/`Connection` for socket.io), `client_max_body_size 64M`, `proxy_read_timeout 300`, buffering off. HTTP-only first (no `ssl_*` lines yet) so Certbot can complete the challenge.
- Symlink into `sites-enabled/`.
- `sudo nginx -t` must pass. Then `sudo systemctl reload nginx`.
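To make the HTTP-only first pass concrete, here is a hedged sketch of a shell function that prints a candidate vhost. The header block is reconstructed from the description above, and the ACME `root /var/www/html` path is an assumption; the real values should be copied from `portnimara_dev.conf` on the server. The function name `render_crm_vhost` is hypothetical.

```bash
# Hypothetical helper: print an HTTP-only vhost for crm.portnimara.com.
# The header block and the ACME root path are assumptions; copy the real
# ones from portnimara_dev.conf rather than trusting this sketch.
render_crm_vhost() {
  cat <<'EOF'
server {
    listen 80;
    server_name crm.portnimara.com;

    client_max_body_size 64M;

    # Let Certbot answer the HTTP-01 challenge.
    location /.well-known/acme-challenge/ {
        root /var/www/html;
    }

    location / {
        proxy_pass http://127.0.0.1:7100;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header CF-Connecting-IP $http_cf_connecting_ip;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # websocket upgrade for socket.io
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 300;
        proxy_buffering off;
    }
}
EOF
}
```

Printing the file keeps this step read-only: the output can be reviewed locally, and writing it to `/etc/nginx/sites-available/` remains a separate, individually approved action.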
Step 2 — TLS cert (⚠ approval)
- `sudo certbot --nginx -d crm.portnimara.com`: pulls + installs the cert, rewrites the vhost with the managed `ssl_certificate` lines + 80→443 redirect. Re-run `sudo nginx -t` + reload.
Step 3 — bring up the container (⚠ approval)
- Place `docker-compose.prod.yml` + the prod `.env` in the deploy dir (e.g. `/opt/pn-crm`; confirm location).
- `sudo docker login code.letsbe.solutions` (registry token).
- `sudo docker compose -f docker-compose.prod.yml pull`.
- `sudo docker compose -f docker-compose.prod.yml up -d`.
- Watch for errors: `sudo docker compose -f docker-compose.prod.yml logs -f crm-app crm-worker`.
- Apply schema: migrations via `psql` (per CLAUDE.md, `db:migrate` is broken) or the app's push path; confirm the prod migration approach.
- Seed/bootstrap the port + admin user as needed.
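Because every mutating command needs individual approval, it can help to print the pull and start commands before running them. The sketch below is one way to do that; `run`, `deploy_crm`, and the `DRY_RUN` variable are hypothetical names, and dry-run is the default here so that nothing executes by accident.

```bash
# Hypothetical wrapper around the pull/up commands of Step 3. With DRY_RUN=1
# (the default here) it only prints what it would run, so each command can be
# approved before it is executed for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_crm() {
  local compose_file="${1:-docker-compose.prod.yml}"
  run sudo docker compose -f "$compose_file" pull &&
    run sudo docker compose -f "$compose_file" up -d
}
```

Calling `deploy_crm` prints the two commands; `DRY_RUN=0 deploy_crm` would execute them from the deploy dir once approved.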
Verify
- `curl -fsS https://crm.portnimara.com/api/public/health` → `{status:"ok"...}`
- Authenticated health w/ `X-Intake-Secret` → `{checks:{db,redis}}`
- Login loads, branding renders, a berth list + a deal render.
- socket.io realtime connects (websocket upgrade through nginx works).
- No `42703` column errors (restart `crm-app` after any schema change).
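The first check can be turned into a pass/fail gate. This is a minimal sketch that assumes the health endpoint returns JSON containing a `"status": "ok"` pair, as the shorthand above suggests; the helper name `health_ok` is hypothetical and the exact response shape should be confirmed against the running app.

```bash
# Hypothetical helper: succeed only if the JSON on stdin reports status "ok".
# The response shape is an assumption based on the {status:"ok"...} note.
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

A typical use would be `curl -fsS https://crm.portnimara.com/api/public/health | health_ok && echo healthy`, which is a read-only request and needs no approval.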
Phase 2 — Documenso v1.13.1 → v2.x upgrade (VITAL — execute SOBER, heavily gated)
Do not execute while impaired. This is the production signing system. Every mutating step needs an explicit, sober go/no-go. The runbook below is reference; the actual run is a scheduled session.
Verified facts (2026-05-31 recon + research)
| Item | Value |
|---|---|
| Current version | documenso/documenso:v1.13.1 (Oct 2025 — last v1) |
| Latest version | v2.11.0 (May 2026). Path: 1.13.1 → 2.0.0 → … → 2.11.0 (major jump) |
| Compose | /root/docker-compose/documenso/docker-compose.yml (project documenso-production, services documenso + database) |
| DB | postgres:15, db documenso_db, user admin, vol documenso-production_documenso-database → /var/lib/postgresql/data |
| App port | container 3000 → host 3020; served at https://signatures.portnimara.dev (nginx documenso.conf, direct — no Cloudflare) |
| Storage | external MinIO, bucket signatures @ s3.portnimara.com, region eu-central-1 |
| Signing cert | /opt/documenso/certificate.p12 (+ passphrase in env) |
Research conclusions (sources in chat):
- v1 API survives in v2 — "API V1 is stable but deprecated; nothing breaks." So the CRM keeps working on v1 API; flip to v2 later. (Will be explicitly re-tested against the clone in Phase 0 before committing.)
- Postgres 15 is v2's official DB — no DB-engine upgrade needed.
- Env vars carry over unchanged; only `NEXTAUTH_URL` is dropped in v2 (auth now derives from `NEXT_PUBLIC_WEBAPP_URL`, already set correctly), so it is a harmless leftover.
- Upgrade = pull new image + restart; `prisma migrate deploy` auto-runs all pending migrations on startup.
- Known migration-failure history (issue #1880: NOT-NULL column added without backfill). 1.13.1 is past that one, but it's the failure pattern to expect, hence the clone dry-run.
- The login bounce (non-`Secure` cookie / `NEXTAUTH_URL` quirk) is plausibly fixed in v2's reworked auth, but treat that as a hoped-for bonus, not the goal.
Locked decisions (per Matt, 2026-05-31)
- Dry-run on a clone first: yes. Target latest v2.11.0, staged through v2.0.0.
- No-downtime caveat: true zero-downtime is not possible (migrations run on restart). Goal = brief + pre-rehearsed: validate fully on the clone, pre-pull the image, then a fast prod cutover in a low-traffic window.
- CRM stays on Documenso v1 API after upgrade.
- Backups: `pg_dump` + cert + compose/env pulled to the Mac (`private/documenso-backups/`, gitignored) and a cold volume snapshot kept on-server for fastest rollback.
- Privilege: root via `su` (stefan isn't in the docker group; sudo needs a password we don't have, but the root pass works for `su`).
Phase 0 — Dry-run on a disposable clone (zero prod risk)
- `pg_dump -Fc documenso_db` (live, no downtime) → restore into a throwaway `postgres:15` + `documenso:v2.11.0` stack on a different compose project + port, with a copy of the signing cert.
- Watch `prisma migrate deploy` run the full 1.13.1→2.11.0 chain. Confirm: all migrations succeed, app boots, login works, existing documents render.
- Re-test the CRM's v1 API calls against the clone → expect 200s.
- If a migration fails: capture it, fix forward (or decide a target version that's clean) BEFORE touching prod.
Phase A — Prod backups (after Phase 0 passes; verified before any change)
- `pg_dump -Fc documenso_db` → pull to `private/documenso-backups/` on the Mac (off-box). Plus a plain SQL dump.
- Cold volume snapshot: stop stack → `tar` `documenso-production_documenso-database` → keep on-server + copy off. (This is the gold rollback, because Prisma migrations aren't reversible.)
- Copy compose file + env + `/opt/documenso/{certificate.p12,private.key,certificate.crt}`.
- MinIO `signatures`: read-only object inventory (`{key,size,lastModified,etag}`) + DB→storage-key mapping export (Document/DocumentData → storage key) so files can be re-matched if linkage breaks.
- Test-restore the dump into a throwaway PG15; record SHA-256s.
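Recording SHA-256s is only useful if they can be re-checked after the files are copied off-box. The sketch below shows one way to write and verify a checksum manifest; the function names are hypothetical, and it assumes either `sha256sum` (typical on Linux) or `shasum` (typical on macOS) is installed.

```bash
# Hypothetical helpers for recording and re-checking backup checksums.
# Uses sha256sum where available (Linux) and falls back to shasum (macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Append "<hash>  <file>" for each file argument to a manifest file.
record_sha256() {
  local manifest="$1" f
  shift
  for f in "$@"; do
    printf '%s  %s\n' "$(sha256_of "$f")" "$f" >> "$manifest"
  done
}

# Re-hash every file listed in the manifest; non-zero exit on any mismatch.
verify_sha256() {
  local manifest="$1" want file rc=0
  while read -r want file; do
    [ "$(sha256_of "$file")" = "$want" ] || { echo "MISMATCH: $file"; rc=1; }
  done < "$manifest"
  return "$rc"
}
```

For example, `record_sha256 SHA256SUMS documenso.dump` on the server followed by `verify_sha256 SHA256SUMS` on the Mac (from a directory with the same relative paths) would confirm the copy arrived intact; the file names here are placeholders.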
Phase B — Collation pre-fix (low risk; validate need on the clone first)
- `REFRESH COLLATION VERSION` on `documenso_db` (+ `template1`/`postgres`) + reindex, so the libc 2.36→2.41 mismatch can't interfere with migration index ops.
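As a hedged illustration of what the pre-fix amounts to, the function below prints the two statements for a given database. The helper name is hypothetical. It emits the reindex before the version refresh, on the understanding that the refresh only records the new collation version and should follow the rebuild; confirm the order and the need on the clone first.

```bash
# Hypothetical helper: print the collation pre-fix SQL for one database.
# The statements must be run while connected to that same database.
collation_fix_sql() {
  local db="$1"
  # Rebuild indexes first, then record the current collation version.
  printf 'REINDEX DATABASE "%s";\n' "$db"
  printf 'ALTER DATABASE "%s" REFRESH COLLATION VERSION;\n' "$db"
}
```

The output is meant to be reviewed and then piped to `psql` against the matching database (once for `documenso_db`, and again for `template1` and `postgres` if the clone shows they need it).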
Phase C — Prod upgrade (staged, pinned tags, low-traffic window)
- Pre-pull images. Edit compose: `v1.13.1 → v2.0.0` → `up -d` → watch migration logs → verify.
- Then `v2.0.0 → v2.11.0` → verify. Keep `postgres:15`.
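The tag edits in the staged upgrade can be scripted so each stage is a single reviewed change. This is a sketch only: it assumes the compose file contains an unquoted line of the form `image: documenso/documenso:<tag>`, and the helper name `set_documenso_tag` is hypothetical.

```bash
# Hypothetical helper: pin the Documenso image tag in a compose file.
# Assumes an unquoted "image: documenso/documenso:<tag>" line.
set_documenso_tag() {
  local compose_file="$1" new_tag="$2" tmp
  tmp="$(mktemp)" || return 1
  # Keep a copy of the previous file so the tag change is easy to revert.
  cp "$compose_file" "${compose_file}.bak" || return 1
  sed -E "s#(image:[[:space:]]*documenso/documenso:)[^[:space:]]+#\1${new_tag}#" \
    "$compose_file" > "$tmp" && cat "$tmp" > "$compose_file"
  rm -f "$tmp"
  # Succeed only if the new tag is actually present afterwards.
  grep -q "documenso/documenso:${new_tag}" "$compose_file"
}
```

The first stage would then be `set_documenso_tag docker-compose.yml v2.0.0`, followed by the approved `up -d`; the second stage repeats with `v2.11.0`. Other image lines such as `postgres:15` are left untouched.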
Phase D — Verify
- Login works; an existing completed envelope's PDF resolves from MinIO; send a test envelope; webhook reaches the CRM (`X-Documenso-Secret`, idempotent `handleDocumentCompleted`); reminders/void work.
- CRM unchanged (still v1 API).
Phase E — Rollback (any failure)
- Revert image tag + restore the volume snapshot (and/or DB dump) → back to v1.13.1 exactly.
Until Phase 0 passes AND a sober Phase A/C is explicitly approved step-by-step, do not touch the Documenso container, DB, volumes, or `/opt/documenso`.
Open decisions / what I need from you
- ✅ MinIO creds filled; Documenso DB creds filled (creds file §3/§4). Still need the Documenso API token + webhook secret (generate after login as `matt@portnimara.com`).
- Verify the root/sudo password (value kept in `private/deployment-creds.md`, not in this file; confirmed it works for `su` to root; it is not stefan's sudo password).
- CRM Postgres: own (compose default) or reuse an existing instance?
- Deploy dir for the CRM on the server (`/opt/pn-crm`?).
- Registry pull token: Gitea token for `docker login` on the server.
- ✅ Documenso target = v2.11.0, staged, clone-validated first.
- Maintenance window for the (brief, unavoidable) Documenso restart downtime.
- Off-box backup destination confirmed = Mac `private/documenso-backups/` + on-server volume snapshot.
Progress log
- 2026-05-31: Access established (SSH + Gitea API). Read-only recon done (nginx templates, prod compose, host port 7100). CRM deploy plan drafted. Documenso fully diagnosed read-only (v1.13.1, healthy app+DB, login issue = wrong email `@letsbe` vs `@portnimara.com` + a non-Secure-cookie quirk; 5432 publicly exposed + brute-forced; libc collation mismatch). Researched v2 upgrade (v2.11.0 latest, PG15 ok, env vars carry over, v1 API survives). Upgrade runbook drafted. No prod changes made; no backups taken.
- 2026-06-01: Phase 0 dry-run PASSED (local, zero prod impact). Read-only `pg_dump` of prod (3.5 MB, metadata only) → restored into a throwaway `postgres:15` → booted `documenso:v2.11.0` against it. Result: full v1.13.1→v2.11.0 chain applied cleanly ("All migrations have been successfully applied", 140→157, none unfinished), app boots (home 302, signin 200, v2 api 200), and v1 API still answers (400 not 404) → CRM safe. Dump saved at `private/documenso-backups/` (off-box backup). Dry-run stack torn down 2026-06-01 after the pass (`docker compose -p documenso-dryrun down -v`; containers + anonymous volume + network removed; restored clone gone, off-box dump retained). Compose file kept at `private/documenso-dryrun/docker-compose.yml` for a re-run. Prod still untouched.
Environment variables — initial deployment + cutover
Single source of truth for the env each instance needs for the website<->CRM integration (added 2026-06-02). Every website-side CRM var is a no-op when unset, so the marketing site behaves exactly as today until these are filled at cutover. Full CRM schema: `src/lib/env.ts`.
CRM instance (crm.portnimara.com)
| Var | Value | Notes |
|---|---|---|
| `APP_URL` | `https://crm.portnimara.com` | Absolute URLs + email links (the inquiry sales-alert "Open in CRM" button). |
| `WEBSITE_INTAKE_SECRET` | shared secret | MUST equal the website's `CRM_INTAKE_SECRET`. If unset, `/api/public/website-inquiries` returns 503 and refuses all intake. |
| `EMAIL_REDIRECT_TO` | unset in prod | Dev-only reroute; the prod build guard fails if it is set. |
| `DATABASE_URL`, `REDIS_*`, storage/MinIO, `DOCUMENSO_*`, `SMTP_*`, better-auth secret | per `.env` | Standard (see Phase 1 Pre-flight). |
Per-port settings (stored in `system_settings`, set via Admin UI, NOT env):
- `website_intake_email_enabled` (boolean, default OFF). Flip ON at cutover so the CRM sends the registrant confirmation + staff alert for website inquiries (berth / residence / contact), reusing the branded templates + per-port From. Keep OFF until the website's own sending is turned off (see `WEBSITE_INQUIRY_EMAILS_DISABLED`) to avoid double-sends.
- `inquiry_notification_recipients` (JSON string[]): staff who receive berth + contact-form inquiry alerts.
- `residential_notification_recipients` (JSON string[]): staff who receive residence inquiry alerts.
- `inquiry_contact_email` (string): fallback alert recipient + reply-to.
Website instance (Nuxt marketing site — repo ron/website.git)
New vars for the CRM integration (read via process.env in Nitro;
all no-op when unset → site unchanged):
| Var | Value | Enables | Set when |
|---|---|---|---|
| `CRM_INTAKE_URL` | `https://crm.portnimara.com` (bare host, no trailing slash) | Inquiry dual-write delivery + base URL for the berth feed | Cutover (safe earlier; just starts populating `website_submissions`) |
| `CRM_INTAKE_SECRET` | shared secret | Auth for the dual-write (`X-Webhook-Secret`); MUST equal CRM `WEBSITE_INTAKE_SECRET` | With `CRM_INTAKE_URL` |
| `CRM_BERTHS_ENABLED` | `1` (or `true`/`yes`) | Switches the public berth map/list to read from CRM `/api/public/berths` instead of NocoDB (requires `CRM_INTAKE_URL`) | Cutover, after CRM berth data is migrated + verified |
| `WEBSITE_INQUIRY_EMAILS_DISABLED` | `1` | Turns OFF the website's own Gmail confirmation + alert emails, handing email ownership to the CRM | Cutover, flipped together with CRM `website_intake_email_enabled` = ON |
UTM: no env var — cookieless; the client plugin reads utm_* from the
landing URL and forwards them via an x-utm header.
Existing website env (keep, unchanged): NocoDB url/token, SMTP user/pass,
alertRecipientsBerths/Residences/Contact, RECAPTCHA_SECRET,
NUXT_PUBLIC_RECAPTCHA_SITE_KEY, Directus url. NocoDB stays as the berth
fallback + the dual-write's primary target until the old system is retired;
SMTP + alert recipients stay until WEBSITE_INQUIRY_EMAILS_DISABLED is set.
Cutover env-flip sequence (website)
- Confirm CRM is up, berth data migrated, and `WEBSITE_INTAKE_SECRET` set on the CRM.
- Set website `CRM_INTAKE_URL` + `CRM_INTAKE_SECRET` → verify a test inquiry lands in `website_submissions`.
- Flip CRM `website_intake_email_enabled = ON` and website `WEBSITE_INQUIRY_EMAILS_DISABLED = 1` together → CRM is the single email owner.
- Set website `CRM_BERTHS_ENABLED = 1` → public map reads from the CRM.
- Watch errors; rollback = unset the website vars (instant revert to NocoDB + website email).
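Before each flip, the dependencies between the website vars can be checked against the candidate values. The sketch below is a hedged reading of the notes in the website env table (no trailing slash, secret set together with the URL, berth feed requires the URL); it is not the website's actual validation, and `check_cutover_env` is a hypothetical name.

```bash
# Hypothetical pre-flip check for the website env, run in a shell where the
# candidate values are set. The rules are a reading of the env table's notes,
# not the website's own validation logic.
check_cutover_env() {
  local rc=0
  case "${CRM_INTAKE_URL:-}" in
    */) echo "CRM_INTAKE_URL must not end with a slash"; rc=1 ;;
  esac
  if [ -n "${CRM_INTAKE_URL:-}" ] && [ -z "${CRM_INTAKE_SECRET:-}" ]; then
    echo "CRM_INTAKE_SECRET must be set together with CRM_INTAKE_URL"; rc=1
  fi
  case "${CRM_BERTHS_ENABLED:-}" in
    1|true|yes)
      [ -n "${CRM_INTAKE_URL:-}" ] || { echo "CRM_BERTHS_ENABLED requires CRM_INTAKE_URL"; rc=1; }
      ;;
  esac
  return "$rc"
}
```

It only inspects variables, so it can run at any point in the sequence; it cannot confirm that the two secrets match across instances, which still needs a test inquiry.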
Progress log (cont.)
- 2026-06-02: Website integration prep (local only; no prod changes, nothing pushed). Website repo (`main`, uncommitted): env-gated berth feed (`CRM_BERTHS_ENABLED`), cookieless UTM forwarding (no env), inquiry dual-write (pre-existing). Website email kill-switch added (`WEBSITE_INQUIRY_EMAILS_DISABLED`). CRM repo: flag-gated email ownership (`website_intake_email_enabled`, default OFF) reusing the inquiry + residential templates plus a new contact-form alert template, hooked into `/api/public/website-inquiries`. New website env vars documented above. CRM tsc-clean + unit test added; website berth/UTM vue-tsc-clean. Nothing deployed.