The 2026-05-03 migration pipeline (src/lib/dedup/*) predates the 9→7 pipeline-stage refactor; its STAGE_MAP emitted invalid stages (open/details_sent/eoi_sent/…) that would write bad pipeline_stage values on --apply. Remap to the current PIPELINE_STAGES (enquiry/qualified/nurturing/eoi/reservation/deposit_paid/contract) + a deposit-received → deposit_paid override. Frozen-fixture test expectations updated (17/17 pass). Validated: live --dry-run = 239 clients / 255 interests / 41 EOI docs (matches independent snapshot analysis; pipeline is more conservative and flags 3 borderline pairs for review). Adds the migration design spec (source map, scope lock to Port Nimara + Expenses bases, EOI coverage 48/48, in-flight Documenso state, remaining gaps: interest eoiStatus, expenses, doc-blob backfill).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Legacy → New CRM Data Migration — Design Spec
Status: DRAFT (2026-06-01) · scope locked · awaiting stage-map sign-off

Goal: Translate all live legacy data + reconnect documents/EOIs so the new CRM "picks up exactly where we left off."

Companion: docs/launch-readiness.md Initiative 5 · docs/deployment-plan.md

Source snapshot: read-only pg_dump of prod NocoDB at private/nocodb-snapshot/ (gitignored), restored locally as nocodb_legacy.
1. Source landscape (verified 2026-06-01)
Legacy data is spread across these systems (portal has no DB of its own):
| System | What | Migrate? |
|---|---|---|
| NocoDB "Port Nimara" base plplouets5zw1um | Interests (255), Berths (117), Residences (45), multi-berth junction _nc_m2m_Berths_Interests (83), Website subs (Interest 64 / Contact 50 / BerthEOI 1), Newsletter (69), reminder/alert settings | ✅ |
| NocoDB "Expenses" base p3hq2fxdevqcaq8 | Expenses (165); invoices empty | ✅ |
| MinIO bucket client-portal | EOIs, berth PDFs, receipts, business cards, general files | ✅ (Phase 2) |
| MinIO bucket signatures | Documenso signed PDFs | ✅ (Phase 2) |
| Documenso v1.13.1 | Signing envelopes, linked per-deal by documensoID | ✅ (Phase 2) |
| 9 other NocoDB bases (Customer_List, Registered Interest, Form Submissions, 2nd Residential, Image Uploads, EOI Queue, …) | Old imports/experiments/backups | ❌ excluded: zero code refs; stale 7–14 months |
| Gmail (IMAP), Keycloak | Email archive, portal auth | ❌ out of scope (per Matt) |
Authority for scope: the live portal + website code reference table IDs in
only the two active bases above; the recency check confirms Interests is
the only actively-written table (last write 2026-05-21).
Legacy has no Company entity (everything is attributed to a person), so the migration creates clients + yachts (client-owned) + deals — no companies.
2. Key linking facts
- Client + yacht are inline on each Interests row → extract + dedup.
- documensoID (e.g. "82") on each deal → resolves to DocumensoEnvelope.secondaryId = 'document_' || documensoID (verified: deal doc=114 → envelope document_114). The envelope's completed PDF = the signed EOI. (Prod Documenso = v1.13.1, 140 migrations, confirmed.)
- Berth Number (mooring, e.g. D31) + the _nc_m2m_Berths_Interests junction → multi-berth links.
- Notes = inline Internal Notes + Extra Comments (+ 5 rows in nc_comments).
- Dedup key for people: lowercased email → fallback canonical phone.
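
The documensoID → envelope link above can be sketched as a small pure function. This is a minimal illustration, assuming documensoID arrives as a numeric string that may be blank or null; the helper name envelopeSecondaryId is hypothetical and not part of the existing codebase.

```typescript
// Hypothetical helper: builds the Documenso envelope secondaryId for a legacy deal.
// Assumption: documensoID is a numeric string such as "82", or blank/null when absent.
function envelopeSecondaryId(documensoID: string | null | undefined): string | null {
  const trimmed = (documensoID ?? "").trim();
  // Accept plain digits only, so malformed values go to manual review
  // instead of producing a lookup key that can never match.
  if (!/^\d+$/.test(trimmed)) return null;
  return `document_${trimmed}`;
}
```

Returning null rather than throwing keeps a single bad row from aborting the extract; such rows can be routed to the manual-review output.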
3. Phase 1 — NocoDB → new CRM (data)
Build against the local nocodb_legacy snapshot; idempotent; every new row
stamped with its legacy_nocodb_id (add a nullable column or a side mapping
table migration_id_map(entity, legacy_id, new_id)).
Import order (FK-safe): clients → yachts → interests → interest_berths → notes → residential → expenses → website_submissions → settings.
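
One way to picture the idempotency requirement is a get-or-create lookup keyed on (entity, legacy_id). The sketch below is an in-memory stand-in for the migration_id_map side table; the class name, the Entity union and the synchronous create callback are all assumptions made for illustration, and a real run would persist the mapping in the database.

```typescript
// Assumption: a subset of entity names, taken from the import order above.
type Entity = "client" | "yacht" | "interest";

// In-memory stand-in for migration_id_map(entity, legacy_id, new_id).
class MigrationIdMap {
  private readonly rows = new Map<string, string>();

  // Returns the recorded new_id if this legacy row was already loaded;
  // otherwise runs create() once and records the result.
  getOrCreate(entity: Entity, legacyId: number, create: () => string): string {
    const key = `${entity}:${legacyId}`;
    const existing = this.rows.get(key);
    if (existing !== undefined) return existing;
    const newId = create();
    this.rows.set(key, newId);
    return newId;
  }
}
```

Re-running the load with the same map never calls create() twice for the same legacy row, which is the property the import relies on.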
3.1 Clients (from Interests, deduped)
Source fields → clients: Full Name→fullName (title-cased via the legacy
normalizePersonName rule), Email Address→primary email, Phone Number→
canonical phone, Address+Place of Residence→address/locality,
Contact Method Preferred→preferredContactMethod, Source→source,
Lead Category→(deal-level, see below). Dedup: group all 255 interests by
lowercased email (fallback canonical phone); one client per unique person,
N deals.
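
The grouping rule can be sketched as follows. This is a hedged illustration only: the LegacyInterest shape is simplified, and canonicalPhone assumes "canonical" means digits with an optional leading "+", which this document does not specify.

```typescript
interface LegacyInterest {
  id: number;
  email: string | null;
  phone: string | null;
}

// Assumption: canonical phone = digits only, keeping a single leading "+".
function canonicalPhone(raw: string | null): string | null {
  if (!raw) return null;
  const cleaned = raw.replace(/[^\d+]/g, "").replace(/(?!^)\+/g, "");
  return /\d/.test(cleaned) ? cleaned : null;
}

// Dedup key: lowercased email, falling back to canonical phone.
function dedupKey(row: LegacyInterest): string | null {
  const email = row.email?.trim().toLowerCase();
  if (email) return `email:${email}`;
  const phone = canonicalPhone(row.phone);
  return phone ? `phone:${phone}` : null;
}

// One bucket per person; rows with neither email nor phone stay separate
// so they are never merged by accident.
function groupByPerson(rows: LegacyInterest[]): Map<string, LegacyInterest[]> {
  const groups = new Map<string, LegacyInterest[]>();
  for (const row of rows) {
    const key = dedupKey(row) ?? `unkeyed:${row.id}`;
    const bucket = groups.get(key) ?? [];
    bucket.push(row);
    groups.set(key, bucket);
  }
  return groups;
}
```

Each bucket becomes one client and each row inside it becomes one deal for that client.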
3.2 Yachts (from Interests)
Yacht Name→name (skip TBC/blank), Length/Width/Depth→dims. Unit
note: legacy stores strings like "50ft" — parse number + unit, convert ft→m
to match the berth/yacht numeric schema (store original string in a note if
ambiguous). Owner = the deduped client (polymorphic client).
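
A minimal sketch of the unit parsing, assuming values look like "50ft", "50 ft", "15.2m" or "15,2 m". The accepted unit spellings and the two-decimal rounding are assumptions; a bare number is treated as ambiguous and left unparsed so the original string can go into a note.

```typescript
interface ParsedLength {
  metres: number | null; // null when the string could not be parsed
  original: string;      // kept so ambiguous values can be copied into a note
}

const FEET_TO_METRES = 0.3048;

function parseLength(raw: string): ParsedLength {
  const match = raw
    .trim()
    .toLowerCase()
    .match(/^(\d+(?:[.,]\d+)?)\s*(ft|feet|'|m|meters|metres)$/);
  if (!match) return { metres: null, original: raw };

  const [, num = "", unit = ""] = match;
  const value = Number(num.replace(",", "."));
  const isFeet = unit === "ft" || unit === "feet" || unit === "'";
  const metres = isFeet ? value * FEET_TO_METRES : value;
  // Round to centimetres to avoid floating-point noise in stored values.
  return { metres: Math.round(metres * 100) / 100, original: raw };
}
```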
3.3 Interests / deals
- Stage: map Sales Process Level (8) → new 7-stage pipeline; see §4 (needs sign-off).
- Lead Category (General / Friends and Family) → leadCategory, Source → source.
- Statuses: EOI Status, Deposit 10% Status, Contract Status, Contract Sent Status, Berth Info Sent Status → drive stage + the new EOI/contract/deposit fields; Deposit 10% Status='Received' → a payments row (deposit) + auto-advance.
- Dates: Date Added/Created At → createdAt (DD-MM-YYYY → ISO; many are null, so fall back to Documenso/earliest signal), EOI Time Sent, Time LOI Sent.
- documensoID → stored for Phase 2 EOI relink.
- Outcome: Sales Process Level='Contract Signed' + deposit/contract complete → won; otherwise open. (No explicit "lost" in legacy.)
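
The date rule above might look like the sketch below. It assumes the legacy value is a date-only "DD-MM-YYYY" string and that the fallback signals have already been normalised to ISO date strings elsewhere; both function names are hypothetical.

```typescript
// Converts "DD-MM-YYYY" to "YYYY-MM-DD"; returns null for blank or invalid input.
function legacyDateToIso(raw: string | null | undefined): string | null {
  const match = (raw ?? "").trim().match(/^(\d{2})-(\d{2})-(\d{4})$/);
  if (!match) return null;
  const [, dd = "", mm = "", yyyy = ""] = match;
  const date = new Date(Date.UTC(Number(yyyy), Number(mm) - 1, Number(dd)));
  // Reject rolled-over dates such as 31-02-2025.
  if (date.getUTCMonth() !== Number(mm) - 1 || date.getUTCDate() !== Number(dd)) return null;
  return `${yyyy}-${mm}-${dd}`;
}

// Picks createdAt: the legacy date if valid, otherwise the earliest fallback.
// Assumption: fallbacks are ISO date strings, so lexicographic sort is chronological.
function resolveCreatedAt(legacy: string | null, fallbacks: Array<string | null>): string | null {
  const primary = legacyDateToIso(legacy);
  if (primary) return primary;
  const candidates = fallbacks.filter((d): d is string => d !== null).sort();
  return candidates[0] ?? null;
}
```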
3.4 interest_berths (multi-berth)
From _nc_m2m_Berths_Interests (83 links) → interest_berths via
interestBerthsService. is_primary = the Berth Number plain-text mooring
(or first link); is_in_eoi_bundle = true for signed/sent EOIs. Resolve berth
by mooring against the migrated 117 berths.
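
The is_primary rule can be sketched as below. The BerthLink shape is simplified and the case-insensitive, trimmed comparison of moorings is an assumption; the real load goes through interestBerthsService, which this sketch does not call.

```typescript
interface BerthLink {
  interestId: number;
  mooring: string; // e.g. "D31"
}

interface InterestBerth extends BerthLink {
  isPrimary: boolean;
}

// Marks exactly one link as primary: the one matching the plain-text
// Berth Number if present, otherwise the first link.
function markPrimary(links: BerthLink[], berthNumber: string | null): InterestBerth[] {
  const wanted = berthNumber?.trim().toUpperCase() ?? "";
  const hit = wanted
    ? links.findIndex((l) => l.mooring.trim().toUpperCase() === wanted)
    : -1;
  const primaryIndex = hit >= 0 ? hit : 0;
  return links.map((l, i) => ({ ...l, isPrimary: i === primaryIndex }));
}
```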
3.5 Notes
Internal Notes + Extra Comments (and nc_comments) → interestNotes via
notes.service, preserving original timestamps where present.
3.6 Residential
Interests (Residences) (45) → residential_clients + residential_interests
(dedup by email). The 2nd residential base (16 rows) is excluded (stale).
3.7 Expenses
Expenses base (165) → the expenses module. Map Time→date, Payer→payer,
Category→category, Price (string "€1,234")→numeric+currency. Receipts linked
in Phase 2 (the Receipts images live in MinIO).
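
A minimal sketch of the Price parsing, assuming "," is a thousands separator and "." the decimal point. Only "€" appears in this document; the other symbols in the lookup table are assumptions added for illustration.

```typescript
interface Money {
  amount: number;
  currency: string; // ISO 4217 code
}

// Assumption: symbol-to-code mapping; only "€" is confirmed by the source data.
const CURRENCY_BY_SYMBOL: Record<string, string> = { "€": "EUR", "$": "USD", "£": "GBP" };

// Parses strings like "€1,234" or "€12.50"; returns null when the format is unexpected.
function parsePrice(raw: string): Money | null {
  const match = raw.trim().match(/^([€$£])\s*(\d{1,3}(?:,\d{3})*|\d+)(\.\d{1,2})?$/);
  if (!match) return null;
  const [, symbol = "", whole = "", fraction = ""] = match;
  const currency = CURRENCY_BY_SYMBOL[symbol];
  if (!currency) return null;
  const amount = Number(whole.replace(/,/g, "") + fraction);
  return { amount, currency };
}
```

Rows that return null can be counted as parse failures in the reconcile report.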
3.8 Website submissions + settings
Website Interest/Contact/BerthEOI subs → website_submissions. reminder_settings
/alert_settings → best-effort into system_settings.
4. Stage mapping (8 → 7) — NEEDS SIGN-OFF
Legacy Sales Process Level → new pipeline stage (proposed):
| Legacy | New stage |
|---|---|
| General Qualified Interest | qualified |
| Specific Qualified Interest | nurturing |
| EOI and NDA Sent | eoi |
| Signed EOI and NDA | eoi (EOI signed) |
| Made Reservation | reservation |
| Contract Negotiation | reservation → contract? |
| Contract Negotiations Finalized | contract |
| Contract Signed | contract (won) |
Open questions for Matt: (a) is "General Qualified Interest" really qualified
or should some map to enquiry? (b) does "Contract Negotiation" belong in
reservation or contract? (c) treat Contract Signed as a closed-won
outcome?
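
The proposed table can be expressed as a typed lookup so that an unknown legacy value is surfaced for review rather than written as an invalid pipeline_stage. This is a sketch of the proposal only: the stage list is the current PIPELINE_STAGES named in the commit message, "Contract Negotiation" is provisionally mapped to reservation pending question (b), and the deposit-received → deposit_paid override is not modelled here.

```typescript
const PIPELINE_STAGES = [
  "enquiry", "qualified", "nurturing", "eoi", "reservation", "deposit_paid", "contract",
] as const;
type PipelineStage = (typeof PIPELINE_STAGES)[number];

// Proposed mapping (awaiting sign-off). A Map avoids accidental hits on
// inherited object keys such as "constructor".
const STAGE_MAP = new Map<string, PipelineStage>([
  ["General Qualified Interest", "qualified"],
  ["Specific Qualified Interest", "nurturing"],
  ["EOI and NDA Sent", "eoi"],
  ["Signed EOI and NDA", "eoi"],
  ["Made Reservation", "reservation"],
  ["Contract Negotiation", "reservation"], // provisional, see question (b)
  ["Contract Negotiations Finalized", "contract"],
  ["Contract Signed", "contract"],
]);

// Returns the new stage, or null so unmapped values go to manual review.
function mapStage(legacyLevel: string | null): PipelineStage | null {
  if (!legacyLevel) return null;
  return STAGE_MAP.get(legacyLevel.trim()) ?? null;
}
```

Because the value type is PipelineStage, a typo in the right-hand column fails at compile time instead of at --apply.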
5. Phase 2 — documents & EOIs (MinIO inventoried 2026-06-01)
Documents live in three MinIO buckets (verified):
- client-portal (248 objects, 240 MB), cleanly foldered: Berth-PDFs/ (114, mooring in filename), EOIs/ (95 signed EOIs foldered by client name), Client Documents/ (6), Legal/ (14), expense-sheets/ (2), client-emails/ (3 sent-email JSONs keyed interest-<id>).
- signatures (323): Documenso's raw per-envelope store (many test dupes; secondary source).
- database: NocoDB's own attachment store at database/nc/uploads/noco/plplouets5zw1um/mbs9hjauug4eseo/cjzx7y2h9sxwd0n/… (field cjzx7y2h9sxwd0n = EOI_Document). This is where the pre-Documenso ("before/aside") signed EOIs live, as NocoDB attachments.
EOI coverage — verified, no missing signed EOI. Of 255 interests, 48 are EOI-signed; every one resolves to a recoverable PDF:
- ~38 via documensoID → Envelope.secondaryId='document_'||id → completed PDF (+ curated copy in client-portal/EOIs/<name>/).
- ~10 old LOI-process deals (no documensoID, LOI=Signing Complete) → EOI_Document attachment in the database bucket.
- 3 via explicit S3_Documenso_Path → client-portal/EOIs/.
Backfill order per deal: prefer the curated client-portal/EOIs/ copy → fall
back to Documenso (by secondaryId) → then the NocoDB database attachment. Each
→ store via getStorageBackend() → files+documents rows → ensureEntityFolder.
Still run a file↔deal reconciliation to flag orphan EOI files + confirm each
envelope PDF actually downloads.
- Berth PDFs: client-portal/Berth-PDFs/ (114) → berth_pdf_versions (mooring parsed from filename).
- Receipts / business cards: NOT in client-portal; likely in forms/images/directus buckets (OpnForm uploads). Hunt only if wanted.
- Unresolved → manual-review CSV.
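
The per-deal backfill order for signed EOIs (curated copy, then Documenso, then the NocoDB attachment) can be sketched as a pure selection step ahead of any download. The EoiCandidates shape and all field names are hypothetical; the sketch does not call getStorageBackend() or any storage API.

```typescript
type EoiSource = "curated" | "documenso" | "nocodb_attachment";

// Hypothetical per-deal summary of what was found for each source.
interface EoiCandidates {
  curatedKey: string | null;          // object key under client-portal/EOIs/
  envelopeSecondaryId: string | null; // e.g. "document_114"
  attachmentKey: string | null;       // object key in the database bucket
}

// Picks the first available source in the stated preference order.
// A null result means the deal belongs in the manual-review CSV.
function pickEoiSource(c: EoiCandidates): { source: EoiSource; ref: string } | null {
  if (c.curatedKey) return { source: "curated", ref: c.curatedKey };
  if (c.envelopeSecondaryId) return { source: "documenso", ref: c.envelopeSecondaryId };
  if (c.attachmentKey) return { source: "nocodb_attachment", ref: c.attachmentKey };
  return null;
}
```

Keeping the choice separate from the download makes it straightforward to log which source each deal used in the reconcile report.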
⚠ Crossover gate — in-flight Documenso signings
Documenso currently holds 6 PENDING (sent, awaiting signature) + 6 DRAFT envelopes (of 58 total; 46 COMPLETED). PENDING: Thomas Nemic (2026-02-04), Davy Morée (2025-11-28), Matthew Ciaccio (2025-11-24), Ben Sturge (2025-10-11), Van der Merwe (2025-10-02), Charles Davis (2025-08-22) — most stale/likely abandoned, only one from 2026. Before the Documenso upgrade/crossover, review these: void the dead ones, let any genuine one finish — don't strand an active signature.
6. Verification & reconcile
Validated run (2026-06-01, extract-nocodb.ts): 255 interests → 232
unique clients (1.10×; 21 with >1 deal roll up correctly), 39 yachts, 84
deal↔berth links (12 multi-berth), 63 notes. Stages 8→7: qualified 171 · eoi 51
· nurturing 30 · reservation 2 · contract 1. EOI coverage 48/48 resolvable.
Signing state (Documenso-authoritative): signed 48 · awaiting_signature 3
(interests 581/633/639 → migrate as "awaiting" + keep envelope link + display
pending) · none 204. Duplicate review: 1 exact-name (Etiennette Clamouze ×2), 0
fuzzy. Residential 45→35. Expenses 165 (0 parse fails). Output →
private/migration-output/ (gitignored).
In-flight signing display: the 3 awaiting_signature deals load with the
interest's EOI state = sent/awaiting + the Documenso envelope linked, so the new
CRM's webhook/poll completes them and the UI shows "Waiting for signatures."
Reconcile the 6 Documenso PENDING: 3 link to deals (in-flight above); 3 are
abandoned re-sends of already-signed deals → void-review before crossover.
Remaining: spot-check 5 deals end-to-end after load.
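
A cheap reconcile check is that each breakdown accounts for every interest. The sketch below uses the counts from the validated run above; the function name partitionGap is hypothetical.

```typescript
// Returns total minus the sum of the parts; 0 means the breakdown is complete.
function partitionGap(total: number, parts: Record<string, number>): number {
  const sum = Object.values(parts).reduce((acc, n) => acc + n, 0);
  return total - sum;
}

// Counts copied from the validated run (2026-06-01).
const TOTAL_INTERESTS = 255;
const stageCounts = { qualified: 171, eoi: 51, nurturing: 30, reservation: 2, contract: 1 };
const signingCounts = { signed: 48, awaiting_signature: 3, none: 204 };

const stageGap = partitionGap(TOTAL_INTERESTS, stageCounts);
const signingGap = partitionGap(TOTAL_INTERESTS, signingCounts);
```

Both gaps are 0 for the figures reported here; a non-zero gap after a re-run would indicate rows dropped or double-counted by the transform.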
7. Deliverables (scripts/migration/)
- probe-minio.ts: bucket inventory (Phase 2 sizing; answers "are the business cards there?").
- extract-nocodb.ts: read the snapshot, emit normalized JSON per entity.
- transform-load.ts: dedup + map + load via service helpers, idempotent.
- backfill-documents.ts: Phase 2 EOI/PDF/receipt backfill.
- reconcile.ts: final report.
8. Decisions locked (2026-06-01)
- Scope = the 2 active bases only; 9 others excluded; email/Keycloak out.
- Extract via read-only pg_dump snapshot (done).
- No company entities (legacy has none).
- Idempotent, keyed on legacy_nocodb_id.