From 7dba1a47bb21a07fb4f3c02fa4caa34f77848c8f Mon Sep 17 00:00:00 2001
From: Matt
Date: Mon, 1 Jun 2026 19:03:32 +0200
Subject: [PATCH] =?UTF-8?q?fix(migration):=20modernize=20stale=20NocoDB?=
 =?UTF-8?q?=E2=86=92CRM=20pipeline=20stage=20map=20to=20current=207=20stag?=
 =?UTF-8?q?es?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The 2026-05-03 migration pipeline (src/lib/dedup/*) predates the 9→7
pipeline-stage refactor; its STAGE_MAP emitted invalid stages
(open/details_sent/eoi_sent/…) that would write bad pipeline_stage
values on --apply. Remap to the current PIPELINE_STAGES (enquiry/qualified/
nurturing/eoi/reservation/deposit_paid/contract) + a deposit-received →
deposit_paid override. Frozen-fixture test expectations updated (17/17 pass).

Validated: live --dry-run = 239 clients / 255 interests / 41 EOI docs
(matches independent snapshot analysis; pipeline is more conservative and
flags 3 borderline pairs for review).

Adds the migration design spec (source map, scope lock to Port Nimara +
Expenses bases, EOI coverage 48/48, in-flight Documenso state, remaining
gaps: interest eoiStatus, expenses, doc-blob backfill).
Co-Authored-By: Claude Opus 4.8 (1M context) --- ...2026-06-01-legacy-data-migration-design.md | 212 ++++++++++++++++++ src/lib/dedup/migration-transform.ts | 28 ++- tests/unit/dedup/migration-transform.test.ts | 8 +- 3 files changed, 235 insertions(+), 13 deletions(-) create mode 100644 docs/superpowers/specs/2026-06-01-legacy-data-migration-design.md diff --git a/docs/superpowers/specs/2026-06-01-legacy-data-migration-design.md b/docs/superpowers/specs/2026-06-01-legacy-data-migration-design.md new file mode 100644 index 00000000..90607eb7 --- /dev/null +++ b/docs/superpowers/specs/2026-06-01-legacy-data-migration-design.md @@ -0,0 +1,212 @@ +# Legacy → New CRM Data Migration — Design Spec + +> **Status:** DRAFT (2026-06-01) · scope locked · awaiting stage-map sign-off +> **Goal:** Translate all live legacy data + reconnect documents/EOIs so the +> new CRM "picks up exactly where we left off." +> **Companion:** `docs/launch-readiness.md` Initiative 5 · `docs/deployment-plan.md` +> **Source snapshot:** read-only `pg_dump` of prod NocoDB at +> `private/nocodb-snapshot/` (gitignored), restored locally as `nocodb_legacy`. + +## 1. Source landscape (verified 2026-06-01) + +Legacy data is spread across these systems (portal has **no DB of its own**): + +| System | What | Migrate? 
| +| ------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------- | +| **NocoDB "Port Nimara"** base `plplouets5zw1um` | Interests (255), Berths (117), Residences (45), multi-berth junction `_nc_m2m_Berths_Interests` (83), Website subs (Interest 64 / Contact 50 / BerthEOI 1), Newsletter (69), reminder/alert settings | ✅ | +| **NocoDB "Expenses"** base `p3hq2fxdevqcaq8` | Expenses (165); `invoices` empty | ✅ | +| **MinIO bucket `client-portal`** | EOIs, berth PDFs, receipts, business cards, general files | ✅ (Phase 2) | +| **MinIO bucket `signatures`** | Documenso signed PDFs | ✅ (Phase 2) | +| **Documenso v1.13.1** | Signing envelopes, linked per-deal by `documensoID` | ✅ (Phase 2) | +| 9 other NocoDB bases (Customer_List, Registered Interest, Form Submissions, 2nd Residential, Image Uploads, EOI Queue, …) | Old imports/experiments/backups | ❌ **excluded** — zero code refs; stale 7–14 months | +| Gmail (IMAP), Keycloak | Email archive, portal auth | ❌ out of scope (per Matt) | + +**Authority for scope:** the live portal + website code reference table IDs in +**only** the two active bases above; the recency check confirms `Interests` is +the only actively-written table (last write 2026-05-21). + +**Legacy has no Company entity** (everything is attributed to a person), so the +migration creates **clients + yachts (client-owned) + deals** — no companies. + +## 2. Key linking facts + +- **Client + yacht are inline on each Interests row** → extract + dedup. +- **`documensoID`** (e.g. `"82"`) on each deal → resolves to Documenso + `Envelope.secondaryId = 'document_' || documensoID` (verified: deal + `doc=114` → envelope `document_114`). 
The envelope's completed PDF = the + signed EOI. (Prod Documenso = v1.13.1, 140 migrations — confirmed.) +- **`Berth Number`** (mooring, e.g. `D31`) + the `_nc_m2m_Berths_Interests` + junction → multi-berth links. +- **Notes** = inline `Internal Notes` + `Extra Comments` (+ 5 rows in + `nc_comments`). +- Dedup key for people: **lowercased email → fallback canonical phone**. + +## 3. Phase 1 — NocoDB → new CRM (data) + +Build against the local `nocodb_legacy` snapshot; idempotent; every new row +stamped with its `legacy_nocodb_id` (add a nullable column or a side mapping +table `migration_id_map(entity, legacy_id, new_id)`). + +**Import order (FK-safe):** clients → yachts → interests → interest_berths → +notes → residential → expenses → website_submissions → settings. + +### 3.1 Clients (from Interests, deduped) + +Source fields → `clients`: `Full Name`→fullName (title-cased via the legacy +`normalizePersonName` rule), `Email Address`→primary email, `Phone Number`→ +canonical phone, `Address`+`Place of Residence`→address/locality, +`Contact Method Preferred`→preferredContactMethod, `Source`→source, +`Lead Category`→(deal-level, see below). **Dedup:** group all 255 interests by +lowercased email (fallback canonical phone); one client per unique person, +N deals. + +### 3.2 Yachts (from Interests) + +`Yacht Name`→name (skip `TBC`/blank), `Length`/`Width`/`Depth`→dims. **Unit +note:** legacy stores strings like `"50ft"` — parse number + unit, convert ft→m +to match the berth/yacht numeric schema (store original string in a note if +ambiguous). Owner = the deduped client (polymorphic `client`). + +### 3.3 Interests / deals + +- **Stage:** map `Sales Process Level` (8) → new 7-stage pipeline — **see §4 + (needs sign-off).** +- `Lead Category` (General / Friends and Family)→leadCategory, `Source`→source. 
+- Statuses: `EOI Status`, `Deposit 10% Status`, `Contract Status`, + `Contract Sent Status`, `Berth Info Sent Status` → drive stage + the new + EOI/contract/deposit fields; `Deposit 10% Status='Received'` → a `payments` + row (deposit) + auto-advance. +- Dates: `Date Added`/`Created At`→createdAt (DD-MM-YYYY → ISO; many are null — + fall back to Documenso/earliest signal), `EOI Time Sent`, `Time LOI Sent`. +- `documensoID` → stored for Phase 2 EOI relink. +- **Outcome:** `Sales Process Level='Contract Signed'` + deposit/contract + complete → won; otherwise open. (No explicit "lost" in legacy.) + +### 3.4 interest_berths (multi-berth) + +From `_nc_m2m_Berths_Interests` (83 links) → `interest_berths` via +`interestBerthsService`. `is_primary` = the `Berth Number` plain-text mooring +(or first link); `is_in_eoi_bundle` = true for signed/sent EOIs. Resolve berth +by mooring against the migrated 117 berths. + +### 3.5 Notes + +`Internal Notes` + `Extra Comments` (and `nc_comments`) → `interestNotes` via +`notes.service`, preserving original timestamps where present. + +### 3.6 Residential + +`Interests (Residences)` (45) → `residential_clients` + `residential_interests` +(dedup by email). The 2nd residential base (16 rows) is **excluded** (stale). + +### 3.7 Expenses + +`Expenses` base (165) → the expenses module. Map Time→date, Payer→payer, +Category→category, Price (string `"€1,234"`)→numeric+currency. Receipts linked +in Phase 2 (the `Receipts` images live in MinIO). + +### 3.8 Website submissions + settings + +Website Interest/Contact/BerthEOI subs → `website_submissions`. `reminder_settings` +/`alert_settings` → best-effort into `system_settings`. + +## 4. 
Stage mapping (8 → 7) — NEEDS SIGN-OFF + +Legacy `Sales Process Level` → new pipeline stage (proposed): + +| Legacy | New stage | +| ------------------------------- | --------------------------- | +| General Qualified Interest | `qualified` | +| Specific Qualified Interest | `nurturing` | +| EOI and NDA Sent | `eoi` | +| Signed EOI and NDA | `eoi` (EOI signed) | +| Made Reservation | `reservation` | +| Contract Negotiation | `reservation` → `contract`? | +| Contract Negotiations Finalized | `contract` | +| Contract Signed | `contract` (won) | + +Open questions for Matt: (a) is "General Qualified Interest" really `qualified` +or should some map to `enquiry`? (b) does "Contract Negotiation" belong in +`reservation` or `contract`? (c) treat `Contract Signed` as a closed-won +outcome? + +## 5. Phase 2 — documents & EOIs (MinIO inventoried 2026-06-01) + +Documents live in **three** MinIO buckets (verified): + +- **`client-portal`** (248 objects, 240 MB) — cleanly foldered: `Berth-PDFs/` + (114, mooring in filename), `EOIs/` (95 signed EOIs foldered by client name), + `Client Documents/` (6), `Legal/` (14), `expense-sheets/` (2), + `client-emails/` (3 sent-email JSONs keyed `interest-`). +- **`signatures`** (323) — Documenso's raw per-envelope store (many test dupes — + secondary source). +- **`database`** — NocoDB's own attachment store at + `database/nc/uploads/noco/plplouets5zw1um/mbs9hjauug4eseo/cjzx7y2h9sxwd0n/…` + (field `cjzx7y2h9sxwd0n` = `EOI_Document`). **This is where the pre-Documenso + ("before/aside") signed EOIs live**, as NocoDB attachments. + +**EOI coverage — verified, no missing signed EOI.** Of 255 interests, 48 are +EOI-signed; every one resolves to a recoverable PDF: + +1. **~38 via `documensoID`** → `Envelope.secondaryId='document_'||id` → + completed PDF (+ curated copy in `client-portal/EOIs//`). +2. **~10 old LOI-process deals** (no documensoID, `LOI=Signing Complete`) → + `EOI_Document` attachment in the **`database`** bucket. +3. 
**3 via explicit `S3_Documenso_Path`** → `client-portal/EOIs/`. + +Backfill order per deal: prefer the curated `client-portal/EOIs/` copy → fall +back to Documenso (by secondaryId) → then the NocoDB `database` attachment. Each +→ store via `getStorageBackend()` → `files`+`documents` rows → `ensureEntityFolder`. +Still run a file↔deal reconciliation to flag orphan EOI files + confirm each +envelope PDF actually downloads. + +4. **Berth PDFs:** `client-portal/Berth-PDFs/` (114) → `berth_pdf_versions` + (mooring parsed from filename). +5. **Receipts / business cards:** NOT in `client-portal` — likely in `forms`/ + `images`/`directus` buckets (OpnForm uploads). Hunt only if wanted. +6. Unresolved → manual-review CSV. + +### ⚠ Crossover gate — in-flight Documenso signings + +Documenso currently holds **6 PENDING** (sent, awaiting signature) + **6 DRAFT** +envelopes (of 58 total; 46 COMPLETED). PENDING: Thomas Nemic (2026-02-04), Davy +Morée (2025-11-28), Matthew Ciaccio (2025-11-24), Ben Sturge (2025-10-11), Van +der Merwe (2025-10-02), Charles Davis (2025-08-22) — most stale/likely abandoned, +only one from 2026. **Before the Documenso upgrade/crossover, review these:** void +the dead ones, let any genuine one finish — don't strand an active signature. + +## 6. Verification & reconcile + +**Validated run (2026-06-01, `extract-nocodb.ts`):** 255 interests → **232 +unique clients** (1.10×; 21 with >1 deal roll up correctly), 39 yachts, 84 +deal↔berth links (12 multi-berth), 63 notes. Stages 8→7: qualified 171 · eoi 51 +· nurturing 30 · reservation 2 · contract 1. **EOI coverage 48/48 resolvable.** +Signing state (Documenso-authoritative): signed 48 · **awaiting_signature 3** +(interests 581/633/639 → migrate as "awaiting" + keep envelope link + display +pending) · none 204. Duplicate review: 1 exact-name (Etiennette Clamouze ×2), 0 +fuzzy. Residential 45→35. Expenses 165 (0 parse fails). Output → +`private/migration-output/` (gitignored). 
+ +**In-flight signing display:** the 3 `awaiting_signature` deals load with the +interest's EOI state = sent/awaiting + the Documenso envelope linked, so the new +CRM's webhook/poll completes them and the UI shows "Waiting for signatures." +Reconcile the 6 Documenso PENDING: 3 link to deals (in-flight above); 3 are +abandoned re-sends of already-signed deals → void-review before crossover. + +Remaining: spot-check 5 deals end-to-end after load. + +## 7. Deliverables (scripts/migration/) + +- `probe-minio.ts` — bucket inventory (Phase 2 sizing; answers "are the + business cards there?"). +- `extract-nocodb.ts` — read the snapshot, emit normalized JSON per entity. +- `transform-load.ts` — dedup + map + load via service helpers, idempotent. +- `backfill-documents.ts` — Phase 2 EOI/PDF/receipt backfill. +- `reconcile.ts` — final report. + +## 8. Decisions locked (2026-06-01) + +- Scope = the 2 active bases only; 9 others excluded; email/Keycloak out. +- Extract via read-only pg_dump snapshot (done). +- No company entities (legacy has none). +- Idempotent, keyed on `legacy_nocodb_id`. diff --git a/src/lib/dedup/migration-transform.ts b/src/lib/dedup/migration-transform.ts index 99a7b655..da4f7b18 100644 --- a/src/lib/dedup/migration-transform.ts +++ b/src/lib/dedup/migration-transform.ts @@ -201,15 +201,19 @@ const DEFAULT_OPTIONS: TransformOptions = { // ─── Stage mapping ────────────────────────────────────────────────────────── +// Updated 2026-06-01 to the current 7-stage pipeline. The prior map targeted +// the pre-(9→7)-refactor vocab (open/details_sent/eoi_sent/…) and would have +// written invalid `pipeline_stage` values. Current stages live in +// `src/lib/constants.ts` PIPELINE_STAGES. 
const STAGE_MAP: Record = { - 'General Qualified Interest': 'open', - 'Specific Qualified Interest': 'details_sent', - 'EOI and NDA Sent': 'eoi_sent', - 'Signed EOI and NDA': 'eoi_signed', - 'Made Reservation': 'deposit_10pct', - 'Contract Negotiation': 'contract_sent', - 'Contract Negotiations Finalized': 'contract_sent', - 'Contract Signed': 'contract_signed', + 'General Qualified Interest': 'qualified', + 'Specific Qualified Interest': 'nurturing', + 'EOI and NDA Sent': 'eoi', + 'Signed EOI and NDA': 'eoi', + 'Made Reservation': 'reservation', + 'Contract Negotiation': 'contract', + 'Contract Negotiations Finalized': 'contract', + 'Contract Signed': 'contract', }; const LEAD_CATEGORY_MAP: Record = { @@ -622,6 +626,12 @@ function buildPlannedClient( function buildPlannedInterest(row: NocoDbRow, clientTempId: string): PlannedInterest { const stage = (row['Sales Process Level'] as string | undefined) ?? ''; const cat = (row['Lead Category'] as string | undefined) ?? ''; + // Deposit received overrides the mapped stage → deposit_paid (unless the deal + // is already at contract). Default for unmapped/blank legacy stage = enquiry. + const depositReceived = + ((row['Deposit 10% Status'] as string | undefined) ?? '').trim() === 'Received'; + let mappedStage = STAGE_MAP[stage] ?? 'enquiry'; + if (depositReceived && mappedStage !== 'contract') mappedStage = 'deposit_paid'; const notesParts: string[] = []; const internalNotes = row['Internal Notes'] as string | undefined; @@ -634,7 +644,7 @@ function buildPlannedInterest(row: NocoDbRow, clientTempId: string): PlannedInte return { sourceId: row.Id, clientTempId, - pipelineStage: STAGE_MAP[stage] ?? 'open', + pipelineStage: mappedStage, leadCategory: LEAD_CATEGORY_MAP[cat] ?? null, source: ((row['Source'] as string | undefined) ?? 
null) || null, notes: notesParts.join('\n\n') || null, diff --git a/tests/unit/dedup/migration-transform.test.ts b/tests/unit/dedup/migration-transform.test.ts index a38ce173..be8bd674 100644 --- a/tests/unit/dedup/migration-transform.test.ts +++ b/tests/unit/dedup/migration-transform.test.ts @@ -183,10 +183,10 @@ describe('transformSnapshot - fixture regression', () => { it('maps the legacy 8-stage enum to new pipeline stages', () => { const plan = transformSnapshot(FIXTURE); const stagesById = new Map(plan.interests.map((i) => [i.sourceId, i.pipelineStage])); - expect(stagesById.get(681)).toBe('open'); // General Qualified Interest - expect(stagesById.get(682)).toBe('details_sent'); // Specific Qualified Interest - expect(stagesById.get(336)).toBe('contract_signed'); // Contract Signed - expect(stagesById.get(585)).toBe('eoi_signed'); // Signed EOI and NDA + expect(stagesById.get(681)).toBe('qualified'); // General Qualified Interest + expect(stagesById.get(682)).toBe('nurturing'); // Specific Qualified Interest + expect(stagesById.get(336)).toBe('contract'); // Contract Signed + expect(stagesById.get(585)).toBe('eoi'); // Signed EOI and NDA }); it('attaches different yachts to one merged Constanzo client', () => {
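
Reviewer note: the stage resolution added to `buildPlannedInterest` can be sketched in isolation as below. This is a minimal illustration, not the code in `migration-transform.ts`: `resolveStage`, `asString`, `LegacyRow`, and the local `PipelineStage` union are hypothetical names introduced here, and the union is assumed from the seven stages listed in the commit message rather than taken from `src/lib/constants.ts`.

```typescript
// Hypothetical standalone sketch of the mapping + deposit override in this patch.
// The stage names come from the commit message; the type itself is an assumption.
type PipelineStage =
  | 'enquiry'
  | 'qualified'
  | 'nurturing'
  | 'eoi'
  | 'reservation'
  | 'deposit_paid'
  | 'contract';

// Same legacy → new pairs as the STAGE_MAP in the diff above.
const STAGE_MAP: Record<string, PipelineStage> = {
  'General Qualified Interest': 'qualified',
  'Specific Qualified Interest': 'nurturing',
  'EOI and NDA Sent': 'eoi',
  'Signed EOI and NDA': 'eoi',
  'Made Reservation': 'reservation',
  'Contract Negotiation': 'contract',
  'Contract Negotiations Finalized': 'contract',
  'Contract Signed': 'contract',
};

// Illustrative row shape: legacy column name → raw value.
type LegacyRow = Record<string, unknown>;

// Treat anything that is not a string (null, undefined, numbers) as blank.
function asString(value: unknown): string {
  return typeof value === 'string' ? value : '';
}

function resolveStage(row: LegacyRow): PipelineStage {
  const legacyStage = asString(row['Sales Process Level']);
  const depositReceived = asString(row['Deposit 10% Status']).trim() === 'Received';

  // Unmapped or blank legacy stages default to 'enquiry'.
  const mapped: PipelineStage = STAGE_MAP[legacyStage] ?? 'enquiry';

  // A received deposit wins over the mapped stage, except for deals already at contract.
  return depositReceived && mapped !== 'contract' ? 'deposit_paid' : mapped;
}

// Usage: a reservation with a received deposit resolves to 'deposit_paid'.
console.log(
  resolveStage({ 'Sales Process Level': 'Made Reservation', 'Deposit 10% Status': 'Received' }),
);
```

One consequence worth checking during sign-off: because the override only exempts `contract`, a row with a blank `Sales Process Level` and `Deposit 10% Status = 'Received'` resolves to `deposit_paid` rather than `enquiry`.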