Files
pn-new-crm/NOCODB-MIGRATION-MAPPING.md
Matt 67d7e6e3d5
Some checks failed
Build & Push Docker Images / build-and-push (push) Has been cancelled
Build & Push Docker Images / deploy (push) Has been cancelled
Build & Push Docker Images / lint (push) Has been cancelled
Initial commit: Port Nimara CRM (Layers 0-4)
Full CRM rebuild with Next.js 15, TypeScript, Tailwind, Drizzle ORM,
PostgreSQL, Redis, BullMQ, MinIO, and Socket.io. Includes 461 source
files covering clients, berths, interests/pipeline, documents/EOI,
expenses/invoices, email, notifications, dashboard, admin, and
client portal. CI/CD via Gitea Actions with Docker builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:52:51 +01:00

63 KiB
Raw Blame History

NocoDB → PostgreSQL Migration Mapping

Purpose: Field-by-field mapping from the current NocoDB tables to the new PostgreSQL schema, including transformation rules, data quality fixes, deduplication logic, and migration order. References: 07-DATABASE-SCHEMA.md, IMPL-L6-MIGRATION.md, annotated source utils/types.ts, server/utils/nocodb.ts, server/utils/nocodb-invoice.ts


Source System Summary

The current NocoDB system has four primary tables:

NocoDB Table Table ID Est. Records Destination Tables
Interests mbs9hjauug4eseo ~200500 clients, client_contacts, interests, interest_notes, documents, document_signers
Berths mczgos9hr3oa9qc ~100200 berths, berth_map_data
Expenses mxfcefkk4dqs6uq ~5002000 expenses, files (receipts)
Invoices mvyvz0lpc30p01s ~200500 invoices, invoice_line_items, invoice_expenses

Key structural change: NocoDB's Interests table is a "mega-table" that combines client identity data, contact details, vessel specifications, pipeline tracking, milestone dates, and document references into a single 60+ field row. The new schema normalizes this into separate clients, client_contacts, interests, interest_notes, and documents tables.


Migration Order (Respects Foreign Keys)

1. ports              ← Seed from L0 (already exists)
2. clients            ← Derived from NocoDB Interests (dedup)
3. client_contacts    ← Derived from NocoDB Interests (dedup)
4. berths             ← From NocoDB Berths
5. berth_map_data     ← From NocoDB Berths (if SVG data exists)
6. interests          ← From NocoDB Interests (references clients + berths)
7. interest_notes     ← From NocoDB Interests "Extra Comments" field
8. documents          ← From NocoDB Interests (EOI/contract references)
9. document_signers   ← From NocoDB Interests (signing link fields)
10. expenses          ← From NocoDB Expenses
11. files             ← From MinIO scan (receipts, EOI PDFs, contracts)
12. invoices          ← From NocoDB Invoices
13. invoice_line_items ← Extracted from NocoDB Invoice data
14. invoice_expenses  ← Cross-reference invoices ↔ expenses
15. tags              ← Derived from any categorical data worth tagging
16. audit_logs        ← Synthetic historical entries (created_at events)

Table 1: NocoDB Interests → clients

Deduplication Strategy

The Interests table has one row per interest, but multiple interests can belong to the same client. Client identity must be deduplicated before insertion.

Primary match key: Email Address (case-insensitive, trimmed) Secondary match key: Full Name (normalized) + Place of Residence (when email is null) On conflict: Merge contacts, keep earliest Created At, concatenate notes with separator

normalizeClientName(name: string): string
  → trim whitespace
  → collapse multiple spaces
  → title case
  → strip honorifics (Mr., Mrs., Dr., Capt., etc.)
  → return normalized name

Field Mapping

NocoDB Interests Field Type (NocoDB) New Table.Column Type (PG) Transformation
Full Name string clients.full_name TEXT normalizeClientName() — trim, title-case, strip honorifics
— (no company field) clients.company_name TEXT NULL — NocoDB has no company field; leave empty
— (no nationality) clients.nationality TEXT NULL
Yacht Name string clients.yacht_name TEXT Trim, title-case
Length string clients.yacht_length_ft NUMERIC parseFloat() — NocoDB stores as string. If contains "m" suffix, convert m→ft (×3.28084). If contains "ft", strip suffix.
Width string clients.yacht_width_ft NUMERIC Same string→number parsing as Length
Depth string clients.yacht_draft_ft NUMERIC Same string→number parsing as Length
Length string clients.yacht_length_m NUMERIC If originally in ft, convert ft→m (÷3.28084). If in m, use directly.
Width string clients.yacht_width_m NUMERIC Same dual-unit logic
Depth string clients.yacht_draft_m NUMERIC Same dual-unit logic
Berth Size Desired string clients.berth_size_desired TEXT Direct copy, trim
Contact Method Preferred string clients.preferred_contact_method TEXT Map: "Email""email", "Phone""phone", "WhatsApp""whatsapp". Default "email".
Source string clients.source TEXT Map: "Website""website", "Referral""referral", "Direct""manual", "Broker""broker". Default "manual".
Place of Residence string clients.source_details TEXT Store as additional context (may indicate referral origin)
Created At datetime clients.created_at TIMESTAMPTZ Use earliest Created At from all interests belonging to this client
Created At datetime clients.updated_at TIMESTAMPTZ Use latest Updated At from all interests belonging to this client
clients.port_id UUID Constant: Port Nimara UUID (from seed data)
clients.is_proxy BOOLEAN false (no proxy data in NocoDB)
clients.archived_at TIMESTAMPTZ NULL

Unit parsing helper:

function parseDimension(value: string | null): { ft: number | null; m: number | null } {
  if (!value || value.trim() === '') return { ft: null, m: null };
  const cleaned = value.replace(/['"]/g, '').trim();
  const num = parseFloat(cleaned);
  if (isNaN(num)) return { ft: null, m: null };

  if (cleaned.toLowerCase().endsWith('m')) {
    return { ft: Math.round(num * 3.28084 * 100) / 100, m: num };
  }
  // Default: assume feet (most common in NocoDB data)
  return { ft: num, m: Math.round((num / 3.28084) * 100) / 100 };
}

Table 2: NocoDB Interests → client_contacts

Each unique client may have up to 3 contacts extracted from the Interests table. Deduplicate across all interests belonging to the same client.

NocoDB Interests Field New Table.Column Type (PG) Transformation
Email Address client_contacts.value TEXT Trim, lowercase. Channel = "email". Label = "primary".
Phone Number client_contacts.value TEXT Trim, normalize to E.164 if possible. Channel = "phone". Label = "primary".
Address client_contacts.value TEXT Trim. Channel = "other". Label = "address".
client_contacts.client_id UUID From dedup map: NocoDB Interest → client UUID
client_contacts.is_primary BOOLEAN true for first email, first phone. false for subsequent.
client_contacts.channel TEXT Determined by source field (email/phone/other)

Deduplication rules:

  • If the same email appears across multiple interests for the same client → create one contact record
  • If different emails appear across interests for the same client → create multiple records, first is is_primary=true
  • Skip empty/null values

Table 3: NocoDB Interests → interests

One NocoDB Interest row = one new interests row (after client dedup, the interest still exists as a pipeline record).

NocoDB Interests Field Type (NocoDB) New Table.Column Type (PG) Transformation
Id number interests.metadata JSONB Store as { "nocodb_id": value } for reference
interests.client_id UUID From client dedup map
Berths_id number interests.berth_id UUID From berth ID map (NocoDB numeric ID → new UUID). Nullable if Berths_id is null.
Sales Process Level string interests.pipeline_stage TEXT See Pipeline Stage Mapping below
Lead Category string interests.lead_category TEXT Map: "General""general_interest", "Specific""specific_qualified", "Hot""hot_lead".
Source string interests.source TEXT Same mapping as clients.source
EOI Status string interests.eoi_status TEXT Map: "Pending""waiting_for_signatures", "Signed""signed", "Expired""expired", "Sent""waiting_for_signatures". NULL if empty.
documensoID string interests.documenso_id TEXT Direct copy. Trim.
Contract Status string interests.contract_status TEXT Lowercase, snake_case conversion
Deposit 10% Status string interests.deposit_status TEXT Lowercase, snake_case conversion
Date Added datetime interests.date_first_contact TIMESTAMPTZ Parse date string → ISO 8601
EOI Time Sent datetime interests.date_eoi_sent TIMESTAMPTZ Parse date string → ISO 8601
Time LOI Sent datetime interests.date_eoi_signed TIMESTAMPTZ NocoDB uses "LOI" but this is actually EOI signed date. Parse date → ISO 8601.
Contract Sent Status string interests.date_contract_sent TIMESTAMPTZ If status indicates "Sent", use Updated At as approximate date. Otherwise NULL.
Extra Comments string interests.notes TEXT Direct copy, trim
Berth Number string Cross-reference field; use for berth_id resolution if Berths_id is null
Request More Information boolean interests.metadata JSONB Store in metadata: { "request_more_info": true/false }
Request Form Sent string interests.metadata JSONB Store in metadata
Borth Info Sent Status string interests.metadata JSONB Store in metadata (note: field name has typo "Borth")
Created At datetime interests.created_at TIMESTAMPTZ Direct copy
interests.port_id UUID Constant: Port Nimara UUID
interests.archived_at TIMESTAMPTZ NULL
interests.reminder_enabled BOOLEAN false (configure post-migration)

Pipeline Stage Mapping

NocoDB Sales Process Level New pipeline_stage Notes
General Qualified Interest open Broadest inquiry stage
Specific Qualified Interest details_sent Client interested in specific berth(s)
EOI and NDA Sent in_communication Active document exchange
Signed EOI and NDA signed_eoi_nda EOI/NDA completed
Made Reservation deposit_10pct Reservation ≈ deposit stage
Contract Negotiation contract Active contract discussion
Contract Negotiations Finalized contract Same stage, later substatus
Contract Signed completed Deal closed
NULL or empty open Default for missing data

Important: The NocoDB schema uses 8 stage names while the new schema uses 8 stage codes. "Contract Negotiation" and "Contract Negotiations Finalized" both map to contract — distinguish them via date_contract_sent (null = negotiation, not null = finalized).


Table 4: NocoDB Interests → interest_notes

The Extra Comments field on each Interest becomes an interest_notes record if it contains content.

Source New Table.Column Transformation
Extra Comments (if non-empty) interest_notes.content Trim. Prefix with "[Migrated from NocoDB] "
interest_notes.author_id System migration user ID
interest_notes.interest_id From interest UUID map
interest_notes.is_locked true (historical, non-editable)
Created At interest_notes.created_at From parent interest Created At

Table 5: NocoDB Interests → documents + document_signers

Interests with EOI/contract document references create documents and document_signers records.

Trigger: Any interest where EOI Document is not null, or LOI-NDA Document is not null, or documensoID is not null.

NocoDB Interests Field New Table.Column Transformation
EOI Document documents.file_id Match to MinIO file → files table entry
LOI-NDA Document documents.file_id Match to MinIO file → files table entry
documensoID documents.documenso_id Direct copy
EOI Status documents.status Map: "Signed""completed", "Pending""sent", "Expired""expired"
documents.document_type "eoi" for EOI documents, "nda" for LOI-NDA documents
documents.port_id Port Nimara UUID
documents.interest_id From interest UUID map
documents.client_id From client UUID map
documents.title Generate: "EOI - {client_name} - {berth_number}"
documents.created_by System migration user ID

Signing links → document_signers:

NocoDB stores up to 9 signing link fields per interest. These map to document_signers entries.

NocoDB Field Pattern document_signers Column Transformation
Signing link fields (1-9) signer_email Extract email from Documenso signing URL if available
signer_role "client" (default for most), infer from context
signing_order 1-based index of the link field
status If EOI Status = "Signed" → "signed", else "pending"
signing_url Direct copy of Documenso signing link URL

Table 6: NocoDB Berths → berths

One NocoDB Berth row = one new berths row. Berths have cleaner data than Interests.

NocoDB Berths Field Type (NocoDB) New Table.Column Type (PG) Transformation
Id number Store in berth ID map: NocoDB ID → new UUID
Mooring Number string berths.mooring_number TEXT Direct copy, trim
Area string (enum) berths.area TEXT Direct copy. NocoDB enum A-E; new system supports A-G. Validate.
Status string (enum) berths.status TEXT Map: "Available""available", "UnderOffer""under_offer", "Sold""sold". Add missing: "Reserved""under_offer", "Not Available""sold", "Draft""available".
status_override_mode boolean Not migrated (new system handles status rules differently)
Nominal Boat Size string berths.nominal_boat_size TEXT Direct copy (already in ft format)
Nominal Boat Size_metric string berths.nominal_boat_size_m TEXT Direct copy (m format)
Length number berths.length_ft NUMERIC Direct copy (NocoDB berths store as numbers, not strings)
Length_metric number berths.length_m NUMERIC Direct copy. If missing, calculate: length_ft / 3.28084
Width number berths.width_ft NUMERIC Direct copy
Width_metric number berths.width_m NUMERIC Direct copy. If missing, calculate.
Draft number berths.draft_ft NUMERIC Direct copy
Draft_metric number berths.draft_m NUMERIC Direct copy. If missing, calculate.
Water Depth number berths.water_depth NUMERIC Direct copy (ft)
Water Depth_metric number berths.water_depth_m NUMERIC Direct copy. If missing, calculate.
Side Pontoon string (enum) berths.side_pontoon TEXT Lowercase: "Port""port", "Starboard""starboard", "Both""both"
Power Capacity string berths.power_capacity TEXT Direct copy
Voltage string berths.voltage TEXT Direct copy
Mooring Type string (enum) berths.mooring_type TEXT Lowercase snake_case: "Med Mooring""med_mooring", "Alongside""alongside"
Cleat Type/Capacity string berths.cleat_type + cleat_capacity TEXT Split on delimiter if combined. NocoDB may store as single field or separate enum.
Bollard Type/Capacity string berths.bollard_type + bollard_capacity TEXT Same split logic
Bow Facing string berths.bow_facing TEXT Direct copy, lowercase
Access string (enum) berths.access TEXT Lowercase: "Walk On""walk_on", "Dinghy""dinghy"
Price number berths.price NUMERIC Direct copy
berths.price_currency TEXT Default "USD"
berths.tenure_type TEXT Default "permanent" (most Port Nimara berths are permanent tenure)
Created At datetime berths.created_at TIMESTAMPTZ Direct copy
Updated At datetime berths.updated_at TIMESTAMPTZ Direct copy
berths.port_id UUID Constant: Port Nimara UUID
berths.berth_approved BOOLEAN true (all existing berths are approved)
Interested Parties link field Not migrated directly; relationship is rebuilt via interests.berth_id

Berth Area Expansion

NocoDB BerthArea enum: A | B | C | D | E New system supports: A | B | C | D | E | F | G

Berths in areas F and G exist in production but the NocoDB enum doesn't define them. The extraction script should treat area as a free-text field and not validate against the old enum.

Berth Status Mapping

NocoDB Status New Status Notes
Available available
UnderOffer under_offer
Sold sold
Reserved under_offer NocoDB may have this in data but not in enum
Not Available sold Treat as sold/occupied
Draft available Draft berths become available
NULL or empty available Default

Table 7: NocoDB Berths → berth_map_data

If the NocoDB berths table or a related data source contains SVG positioning data for the marina map, extract it.

Source New Table.Column Transformation
SVG path data (if exists) berth_map_data.svg_path Direct copy
X coordinate berth_map_data.x Direct copy
Y coordinate berth_map_data.y Direct copy
Transform data berth_map_data.transform Direct copy
berth_map_data.berth_id From berth UUID map

Note: The current berth map may be rendered from a static SVG with hardcoded positions. If no per-berth coordinate data exists in NocoDB, the berth_map_data table will be populated manually post-migration or via the L4 berth spec import tool.


Table 8: NocoDB Expenses → expenses

One NocoDB Expense row = one new expenses row.

NocoDB Expenses Field Type (NocoDB) New Table.Column Type (PG) Transformation
Id number Store in expense ID map: NocoDB ID → new UUID
Establishment Name string expenses.establishment_name TEXT Trim
Price string expenses.amount NUMERIC parseFloat() — NocoDB stores price as STRING. Strip currency symbols ($, €, £). Strip commas. Parse to number.
currency string expenses.currency TEXT Uppercase: "usd""USD", "eur""EUR". Default "USD" if missing.
expenses.amount_usd NUMERIC If currency ≠ USD, convert using historical exchange rate for the expense date. Otherwise same as amount.
Payment Method string (enum) expenses.payment_method TEXT Lowercase snake_case: "Credit Card""credit_card", "Cash""cash", "Bank Transfer""bank_transfer"
Category string (enum) expenses.category TEXT Lowercase snake_case: "Fuel""fuel", "Maintenance""maintenance", "Office Supplies""office_supplies", etc.
Payer string expenses.payer TEXT Trim
Time datetime expenses.expense_date TIMESTAMPTZ Parse date string → ISO 8601
Contents string expenses.description TEXT Trim
Receipt file attachment(s) expenses.receipt_file_ids UUID[] For each receipt file: create files table entry, store MinIO path. Collect UUIDs into array.
Paid boolean expenses.payment_status TEXT true"paid", false"unpaid", NULL → "unpaid"
payment_date date (optional) expenses.payment_date DATE Direct copy if present
payment_reference string (optional) expenses.payment_reference TEXT Direct copy if present
payment_notes string (optional) expenses.payment_notes TEXT Direct copy if present
CreatedAt datetime expenses.created_at TIMESTAMPTZ Direct copy. Note: NocoDB uses CreatedAt (camelCase), not Created At (spaced).
UpdatedAt datetime expenses.updated_at TIMESTAMPTZ Direct copy
expenses.port_id UUID Constant: Port Nimara UUID
expenses.created_by TEXT System migration user ID
expenses.archived_at TIMESTAMPTZ NULL

Price parsing helper:

function parsePrice(value: string | null): number | null {
  if (!value || value.trim() === '') return null;
  // Strip currency symbols and thousand separators
  const cleaned = value.replace(/[$€£¥,\s]/g, '').trim();
  const num = parseFloat(cleaned);
  return isNaN(num) ? null : Math.round(num * 100) / 100;
}

Expense category normalization:

const CATEGORY_MAP: Record<string, string> = {
  Fuel: 'fuel',
  Maintenance: 'maintenance',
  'Office Supplies': 'office_supplies',
  Utilities: 'utilities',
  'Marina Operations': 'marina_operations',
  Travel: 'travel',
  Equipment: 'equipment',
  Insurance: 'insurance',
  Legal: 'legal',
  Other: 'other',
};

function normalizeCategory(value: string | null): string {
  if (!value) return 'other';
  return CATEGORY_MAP[value.trim()] ?? value.toLowerCase().replace(/\s+/g, '_');
}

Table 9: NocoDB Invoices → invoices

NocoDB Invoices Field Type (NocoDB) New Table.Column Type (PG) Transformation
Id number Store in invoice ID map
Invoice number string invoices.invoice_number TEXT Preserve existing numbering (format: INV-YYYYMM-XYZ). Verify uniqueness.
Client reference string/link invoices.client_name TEXT Resolve client name from interest/client data
Client email string invoices.billing_email TEXT From client contacts
Client address string invoices.billing_address TEXT From client contacts
Due date date invoices.due_date DATE Direct copy
Currency string invoices.currency TEXT Uppercase. Default "USD"
Subtotal number invoices.subtotal NUMERIC Direct copy or calculate from line items
Discount number invoices.discount_pct + discount_amount NUMERIC Parse from invoice data
Fee number invoices.fee_pct + fee_amount NUMERIC Parse from invoice data
Total number invoices.total NUMERIC Direct copy or calculate: subtotal - discount + fee
Status string invoices.status TEXT Map: "Draft""draft", "Sent""sent", "Paid""paid", "Overdue""overdue", "Cancelled""cancelled"
Payment info object invoices.payment_status, payment_date, payment_method, payment_reference mixed Extract from payment fields
PDF reference file invoices.pdf_file_id UUID Create files table entry for stored PDF
Notes string invoices.notes TEXT Trim
Created at datetime invoices.created_at TIMESTAMPTZ Direct copy
invoices.port_id UUID Constant: Port Nimara UUID
invoices.created_by TEXT System migration user ID
invoices.payment_terms TEXT Default "net30" (can be refined post-migration)

Invoice → Line Items

NocoDB invoices may store line item data inline or as linked records. Each expense linked to an invoice becomes an invoice_line_items entry.

Source New Table.Column Transformation
Expense description invoice_line_items.description From linked expense Contents / Establishment Name
Quantity invoice_line_items.quantity Default 1 (NocoDB doesn't track quantity separately)
Amount invoice_line_items.unit_price From linked expense Price
invoice_line_items.total quantity × unit_price
invoice_line_items.sort_order Auto-increment from 0

Invoice ↔ Expense Junction

For each expense linked to an invoice in NocoDB, create an invoice_expenses junction record:

Source New Table Transformation
Invoice NocoDB ID invoice_expenses.invoice_id From invoice UUID map
Expense NocoDB ID invoice_expenses.expense_id From expense UUID map

Table 10: MinIO Files → files

Scan MinIO storage and create files table entries. Map files to clients where possible.

MinIO Metadata New Table.Column Transformation
Object key (path) files.storage_path Direct copy (current path). Will be reorganized post-load.
files.storage_bucket "crm-files"
Filename from path files.filename Extract from object key
Original filename files.original_name Same as filename (original names may not be preserved in MinIO)
Content-Type files.mime_type From MinIO object metadata
Content-Length files.size_bytes From MinIO object metadata
Client association files.client_id AI-assisted matching: filename patterns, folder structure, PDF text extraction. Log unmatched to CSV.
File category files.category Infer from filename/path: *eoi*"eoi", *contract*"contract", *receipt*"receipt", *nda*"eoi", else "misc"
files.port_id Port Nimara UUID
files.uploaded_by System migration user ID

New MinIO path structure (post-migration reorganization):

{portSlug}/clients/{clientId}/eoi/filename.pdf
{portSlug}/clients/{clientId}/contracts/filename.pdf
{portSlug}/clients/{clientId}/receipts/filename.jpg
{portSlug}/clients/{clientId}/correspondence/filename.pdf
{portSlug}/clients/{clientId}/misc/filename.pdf
{portSlug}/invoices/{invoiceId}/invoice.pdf
{portSlug}/expenses/{expenseId}/receipt.jpg

Data Quality Issues & Fixes

Issue 1: String Dimensions on Interests

Problem: Interest table stores vessel dimensions (Length, Width, Depth) as strings. Values may include unit suffixes ("45ft", "12m"), be empty strings, or contain non-numeric characters.

Fix: parseDimension() helper (defined above) handles all cases. Log any unparseable values to migration-warnings.csv.

Issue 2: Mixed Naming Conventions

Problem: NocoDB fields use inconsistent naming: "Full Name" (Title Spaced), "documensoID" (camelCase), "status_override_mode" (snake_case), CreatedAt (PascalCase on Expenses vs Created At on Interests).

Fix: Access each field by its exact NocoDB name. The extraction layer handles the naming; transformation maps to consistent snake_case.

Issue 3: LOI → EOI Terminology

Problem: The codebase has remnants of "LOI" (Letter of Intent) terminology that was renamed to "EOI" (Expression of Interest). Fields like LOI-NDA Document and Time LOI Sent actually refer to EOI documents.

Fix: Map all LOI references to EOI in the new schema. LOI-NDA Documentdocuments with document_type = "eoi". Time LOI Sentinterests.date_eoi_signed.

Issue 4: Incomplete Enums

Problem: BerthStatus enum (Available | UnderOffer | Sold) is missing Reserved, Not Available, Draft which may exist in production data. BerthArea enum covers A-E but production has F and G.

Fix: Status mapping table handles all known values (see Table 6). Area is treated as free text. Any unrecognized values are logged and defaulted.

Issue 5: Price as String on Expenses

Problem: Expense Price field is typed as string in the NocoDB interface. Values may contain currency symbols, commas, or be formatted inconsistently.

Fix: parsePrice() helper strips non-numeric characters and parses to NUMERIC. NULL values logged.

Issue 6: Berth References

Problem: Interests reference berths via both Berths_id (numeric link field) and Berth Number (string display field). These may be inconsistent.

Fix: Primary resolution via Berths_id → berth UUID map. Fallback: match Berth Number string against berths.mooring_number. If both fail, set berth_id = NULL and log.

Issue 7: Invoice Number Collision Risk

Problem: The current generateInvoiceNumber() function uses format INV-YYYYMM-XYZ with a random 3-char suffix, which has collision risk.

Fix: During migration, preserve all existing invoice numbers exactly. Post-migration, the new system uses a sequential counter per port per month, eliminating collision risk.

Issue 8: Duplicate Clients

Problem: The same person may appear as multiple NocoDB Interest rows with slightly different name spellings or different email addresses.

Fix: Three-pass dedup:

  1. Exact email match (case-insensitive) → same client
  2. Fuzzy name match (Levenshtein distance ≤ 2 on normalized name) + same residence → candidate merge (log for manual review)
  3. Remaining → create as separate clients (can be merged manually in new system)

All fuzzy-match candidates logged to dedup-candidates.csv for Matt's review before committing.


Synthetic Records (Created During Migration)

These records don't exist in NocoDB but are generated to provide a complete dataset in the new system:

Record Type Source Logic
audit_logs (historical) All migrated records For each migrated entity, create a synthetic action: "create" audit log entry with created_at matching the original record's creation date, user_id = system migration user, metadata = { "source": "nocodb_migration" }
tags Lead categories, sources Create tags for common categories: "Hot Lead", "Website Lead", "Referral", "Broker", etc.
client_tags Interest Lead Category Auto-tag clients based on their highest lead category across all interests
interest_tags Interest Source Auto-tag interests by source
client_notes Dedup merge info When two interests are merged into one client, create a note documenting the merge

ID Mapping Tables (In-Memory During Migration)

The migration script maintains these maps for cross-referencing:

interface MigrationMaps {
  // NocoDB numeric ID → new PostgreSQL UUID
  clientMap: Map<string, string>; // emailOrNameKey → client UUID
  berthMap: Map<number, string>; // NocoDB Berth Id → berth UUID
  interestMap: Map<number, string>; // NocoDB Interest Id → interest UUID
  expenseMap: Map<number, string>; // NocoDB Expense Id → expense UUID
  invoiceMap: Map<number, string>; // NocoDB Invoice Id → invoice UUID

  // Reverse lookups
  interestToClient: Map<number, string>; // NocoDB Interest Id → client UUID
  berthNumberToId: Map<string, string>; // Mooring number string → berth UUID
}

Validation Checklist

After migration completes, run these validation checks:

Check Query Expected
Client count SELECT COUNT(*) FROM clients ≤ NocoDB interest count (dedup reduces)
Contact count SELECT COUNT(*) FROM client_contacts ≥ client count (most clients have ≥1 contact)
Interest count SELECT COUNT(*) FROM interests = NocoDB interest count (1:1)
Berth count SELECT COUNT(*) FROM berths = NocoDB berth count (1:1)
Expense count SELECT COUNT(*) FROM expenses = NocoDB expense count (1:1)
Invoice count SELECT COUNT(*) FROM invoices = NocoDB invoice count (1:1)
Orphan interests SELECT COUNT(*) FROM interests WHERE client_id NOT IN (SELECT id FROM clients) 0
Orphan contacts SELECT COUNT(*) FROM client_contacts WHERE client_id NOT IN (SELECT id FROM clients) 0
NULL berth refs SELECT COUNT(*) FROM interests WHERE berth_id IS NOT NULL AND berth_id NOT IN (SELECT id FROM berths) 0
Invoice-expense links SELECT COUNT(*) FROM invoice_expenses WHERE expense_id NOT IN (SELECT id FROM expenses) 0
File records SELECT COUNT(*) FROM files ≥ (receipt files + EOI docs + contract docs)
Pipeline distribution SELECT pipeline_stage, COUNT(*) FROM interests GROUP BY 1 Reasonable distribution, no unexpected NULLs
Duplicate emails SELECT value, COUNT(*) FROM client_contacts WHERE channel='email' GROUP BY 1 HAVING COUNT(*) > 1 Review each — may be valid (shared email)
Amount sanity SELECT COUNT(*) FROM expenses WHERE amount <= 0 0 (or review outliers)

Migration Script Configuration

// scripts/migrate-nocodb/config.ts
export const MIGRATION_CONFIG = {
  // NocoDB connection
  nocodb: {
    apiUrl: process.env.NOCODB_API_URL, // e.g., 'https://nocodb.portnimara.com'
    token: process.env.NOCODB_AUTH_TOKEN,
    projectId: process.env.NOCODB_PROJECT_ID,
    tables: {
      interests: 'mbs9hjauug4eseo',
      berths: 'mczgos9hr3oa9qc',
      expenses: 'mxfcefkk4dqs6uq',
      invoices: 'mvyvz0lpc30p01s',
    },
  },

  // Target PostgreSQL
  postgres: {
    connectionString: process.env.DATABASE_URL,
  },

  // MinIO
  minio: {
    endpoint: process.env.MINIO_ENDPOINT,
    accessKey: process.env.MINIO_ACCESS_KEY,
    secretKey: process.env.MINIO_SECRET_KEY,
    bucket: 'crm-files',
  },

  // Migration defaults
  defaults: {
    portId: process.env.PORT_NIMARA_UUID, // seeded in L0
    portSlug: 'port-nimara',
    migrationUserId: 'system-migration',
    defaultCurrency: 'USD',
    defaultTimezone: 'America/Anguilla',
  },

  // Deduplication thresholds
  dedup: {
    nameLevenshteinThreshold: 2, // max edit distance for fuzzy name match
    requireResidenceMatch: true, // secondary match requires same residence
    logCandidates: true, // write fuzzy candidates to CSV
  },

  // Output files
  output: {
    warningsFile: 'migration-warnings.csv', // unparseable values, unexpected data
    dedupCandidatesFile: 'dedup-candidates.csv', // fuzzy match candidates for review
    unmatchedFilesFile: 'unmatched-files.csv', // MinIO files that couldn't be linked to a client
    validationReport: 'validation-report.json', // final counts and check results
  },
};

Post-Migration Manual Tasks

Task Owner When
Review dedup-candidates.csv — approve or reject client merges Matt Saturday morning
Review unmatched-files.csv — manually assign files to clients or mark as misc Matt Saturday morning
Review migration-warnings.csv — fix any data issues Matt Saturday morning
Spot-check 5 high-profile clients — verify all data migrated correctly Matt Saturday after validation
Spot-check 5 berths — verify dimensions, status, map data Matt Saturday after validation
Spot-check 3 invoices — verify amounts, line items, linked expenses Matt Saturday after validation
Create user accounts and assign roles in new system Matt Saturday afternoon
Configure SMTP (Poste.io) connection Matt Saturday afternoon
Test Documenso API connection Matt Saturday afternoon
Populate berth_map_data if not extracted from NocoDB Matt Post-migration week