Files
pn-new-crm/NOCODB-MIGRATION-MAPPING.md

629 lines
63 KiB
Markdown
Raw Permalink Normal View History

# NocoDB → PostgreSQL Migration Mapping
**Purpose:** Field-by-field mapping from the current NocoDB tables to the new PostgreSQL schema, including transformation rules, data quality fixes, deduplication logic, and migration order.
**References:** `07-DATABASE-SCHEMA.md`, `IMPL-L6-MIGRATION.md`, annotated source `utils/types.ts`, `server/utils/nocodb.ts`, `server/utils/nocodb-invoice.ts`
---
## Source System Summary
The current NocoDB system has four primary tables:
| NocoDB Table | Table ID | Est. Records | Destination Tables |
| ------------ | ----------------- | ------------ | -------------------------------------------------------------------------------------------- |
| Interests | `mbs9hjauug4eseo` | ~200500 | `clients`, `client_contacts`, `interests`, `interest_notes`, `documents`, `document_signers` |
| Berths | `mczgos9hr3oa9qc` | ~100200 | `berths`, `berth_map_data` |
| Expenses | `mxfcefkk4dqs6uq` | ~5002000 | `expenses`, `files` (receipts) |
| Invoices | `mvyvz0lpc30p01s` | ~200500 | `invoices`, `invoice_line_items`, `invoice_expenses` |
**Key structural change:** NocoDB's Interests table is a "mega-table" that combines client identity data, contact details, vessel specifications, pipeline tracking, milestone dates, and document references into a single 60+ field row. The new schema normalizes this into separate `clients`, `client_contacts`, `interests`, `interest_notes`, and `documents` tables.
---
## Migration Order (Respects Foreign Keys)
```
1. ports ← Seed from L0 (already exists)
2. clients ← Derived from NocoDB Interests (dedup)
3. client_contacts ← Derived from NocoDB Interests (dedup)
4. berths ← From NocoDB Berths
5. berth_map_data ← From NocoDB Berths (if SVG data exists)
6. interests ← From NocoDB Interests (references clients + berths)
7. interest_notes ← From NocoDB Interests "Extra Comments" field
8. documents ← From NocoDB Interests (EOI/contract references)
9. document_signers ← From NocoDB Interests (signing link fields)
10. expenses ← From NocoDB Expenses
11. files ← From MinIO scan (receipts, EOI PDFs, contracts)
12. invoices ← From NocoDB Invoices
13. invoice_line_items ← Extracted from NocoDB Invoice data
14. invoice_expenses ← Cross-reference invoices ↔ expenses
15. tags ← Derived from any categorical data worth tagging
16. audit_logs ← Synthetic historical entries (created_at events)
```
---
## Table 1: NocoDB Interests → `clients`
### Deduplication Strategy
The Interests table has one row per interest, but multiple interests can belong to the same client. Client identity must be deduplicated before insertion.
**Primary match key:** `Email Address` (case-insensitive, trimmed)
**Secondary match key:** `Full Name` (normalized) + `Place of Residence` (when email is null)
**On conflict:** Merge contacts, keep earliest `Created At`, concatenate notes with separator
```
normalizeClientName(name: string): string
→ trim whitespace
→ collapse multiple spaces
→ title case
→ strip honorifics (Mr., Mrs., Dr., Capt., etc.)
→ return normalized name
```
### Field Mapping
| NocoDB Interests Field | Type (NocoDB) | → | New Table.Column | Type (PG) | Transformation |
| -------------------------- | ------------- | --- | ---------------------------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| `Full Name` | string | → | `clients.full_name` | TEXT | `normalizeClientName()` — trim, title-case, strip honorifics |
| — (no company field) | — | → | `clients.company_name` | TEXT | NULL — NocoDB has no company field; leave empty |
| — (no nationality) | — | → | `clients.nationality` | TEXT | NULL |
| `Yacht Name` | string | → | `clients.yacht_name` | TEXT | Trim, title-case |
| `Length` | string | → | `clients.yacht_length_ft` | NUMERIC | `parseFloat()` — NocoDB stores as string. If contains "m" suffix, convert m→ft (×3.28084). If contains "ft", strip suffix. |
| `Width` | string | → | `clients.yacht_width_ft` | NUMERIC | Same string→number parsing as Length |
| `Depth` | string | → | `clients.yacht_draft_ft` | NUMERIC | Same string→number parsing as Length |
| `Length` | string | → | `clients.yacht_length_m` | NUMERIC | If originally in ft, convert ft→m (÷3.28084). If in m, use directly. |
| `Width` | string | → | `clients.yacht_width_m` | NUMERIC | Same dual-unit logic |
| `Depth` | string | → | `clients.yacht_draft_m` | NUMERIC | Same dual-unit logic |
| `Berth Size Desired` | string | → | `clients.berth_size_desired` | TEXT | Direct copy, trim |
| `Contact Method Preferred` | string | → | `clients.preferred_contact_method` | TEXT | Map: `"Email"``"email"`, `"Phone"``"phone"`, `"WhatsApp"``"whatsapp"`. Default `"email"`. |
| `Source` | string | → | `clients.source` | TEXT | Map: `"Website"``"website"`, `"Referral"``"referral"`, `"Direct"``"manual"`, `"Broker"``"broker"`. Default `"manual"`. |
| `Place of Residence` | string | → | `clients.source_details` | TEXT | Store as additional context (may indicate referral origin) |
| `Created At` | datetime | → | `clients.created_at` | TIMESTAMPTZ | Use earliest `Created At` from all interests belonging to this client |
| `Created At` | datetime | → | `clients.updated_at` | TIMESTAMPTZ | Use latest `Updated At` from all interests belonging to this client |
| — | — | → | `clients.port_id` | UUID | Constant: Port Nimara UUID (from seed data) |
| — | — | → | `clients.is_proxy` | BOOLEAN | `false` (no proxy data in NocoDB) |
| — | — | → | `clients.archived_at` | TIMESTAMPTZ | NULL |
**Unit parsing helper:**
```typescript
function parseDimension(value: string | null): { ft: number | null; m: number | null } {
if (!value || value.trim() === '') return { ft: null, m: null };
const cleaned = value.replace(/['"]/g, '').trim();
const num = parseFloat(cleaned);
if (isNaN(num)) return { ft: null, m: null };
if (cleaned.toLowerCase().endsWith('m')) {
return { ft: Math.round(num * 3.28084 * 100) / 100, m: num };
}
// Default: assume feet (most common in NocoDB data)
return { ft: num, m: Math.round((num / 3.28084) * 100) / 100 };
}
```
---
## Table 2: NocoDB Interests → `client_contacts`
Each unique client may have up to 3 contacts extracted from the Interests table. Deduplicate across all interests belonging to the same client.
| NocoDB Interests Field | → | New Table.Column | Type (PG) | Transformation |
| ---------------------- | --- | ---------------------------- | --------- | ------------------------------------------------------------------------------- |
| `Email Address` | → | `client_contacts.value` | TEXT | Trim, lowercase. Channel = `"email"`. Label = `"primary"`. |
| `Phone Number` | → | `client_contacts.value` | TEXT | Trim, normalize to E.164 if possible. Channel = `"phone"`. Label = `"primary"`. |
| `Address` | → | `client_contacts.value` | TEXT | Trim. Channel = `"other"`. Label = `"address"`. |
| — | → | `client_contacts.client_id` | UUID | From dedup map: NocoDB Interest → client UUID |
| — | → | `client_contacts.is_primary` | BOOLEAN | `true` for first email, first phone. `false` for subsequent. |
| — | → | `client_contacts.channel` | TEXT | Determined by source field (email/phone/other) |
**Deduplication rules:**
- If the same email appears across multiple interests for the same client → create one contact record
- If different emails appear across interests for the same client → create multiple records, first is `is_primary=true`
- Skip empty/null values
---
## Table 3: NocoDB Interests → `interests`
One NocoDB Interest row = one new `interests` row (after client dedup, the interest still exists as a pipeline record).
| NocoDB Interests Field | Type (NocoDB) | → | New Table.Column | Type (PG) | Transformation |
| -------------------------- | ------------- | --- | ------------------------------ | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Id` | number | → | `interests.metadata` | JSONB | Store as `{ "nocodb_id": value }` for reference |
| — | — | → | `interests.client_id` | UUID | From client dedup map |
| `Berths_id` | number | → | `interests.berth_id` | UUID | From berth ID map (NocoDB numeric ID → new UUID). Nullable if `Berths_id` is null. |
| `Sales Process Level` | string | → | `interests.pipeline_stage` | TEXT | **See Pipeline Stage Mapping below** |
| `Lead Category` | string | → | `interests.lead_category` | TEXT | Map: `"General"``"general_interest"`, `"Specific"``"specific_qualified"`, `"Hot"``"hot_lead"`. |
| `Source` | string | → | `interests.source` | TEXT | Same mapping as clients.source |
| `EOI Status` | string | → | `interests.eoi_status` | TEXT | Map: `"Pending"``"waiting_for_signatures"`, `"Signed"``"signed"`, `"Expired"``"expired"`, `"Sent"``"waiting_for_signatures"`. NULL if empty. |
| `documensoID` | string | → | `interests.documenso_id` | TEXT | Direct copy. Trim. |
| `Contract Status` | string | → | `interests.contract_status` | TEXT | Lowercase, snake_case conversion |
| `Deposit 10% Status` | string | → | `interests.deposit_status` | TEXT | Lowercase, snake_case conversion |
| `Date Added` | datetime | → | `interests.date_first_contact` | TIMESTAMPTZ | Parse date string → ISO 8601 |
| `EOI Time Sent` | datetime | → | `interests.date_eoi_sent` | TIMESTAMPTZ | Parse date string → ISO 8601 |
| `Time LOI Sent` | datetime | → | `interests.date_eoi_signed` | TIMESTAMPTZ | NocoDB uses "LOI" but this is actually EOI signed date. Parse date → ISO 8601. |
| `Contract Sent Status` | string | → | `interests.date_contract_sent` | TIMESTAMPTZ | If status indicates "Sent", use `Updated At` as approximate date. Otherwise NULL. |
| `Extra Comments` | string | → | `interests.notes` | TEXT | Direct copy, trim |
| `Berth Number` | string | → | — | — | Cross-reference field; use for berth_id resolution if `Berths_id` is null |
| `Request More Information` | boolean | → | `interests.metadata` | JSONB | Store in metadata: `{ "request_more_info": true/false }` |
| `Request Form Sent` | string | → | `interests.metadata` | JSONB | Store in metadata |
| `Borth Info Sent Status` | string | → | `interests.metadata` | JSONB | Store in metadata (note: field name has typo "Borth") |
| `Created At` | datetime | → | `interests.created_at` | TIMESTAMPTZ | Direct copy |
| — | — | → | `interests.port_id` | UUID | Constant: Port Nimara UUID |
| — | — | → | `interests.archived_at` | TIMESTAMPTZ | NULL |
| — | — | → | `interests.reminder_enabled` | BOOLEAN | `false` (configure post-migration) |
### Pipeline Stage Mapping
| NocoDB `Sales Process Level` | → | New `pipeline_stage` | Notes |
| --------------------------------- | --- | -------------------- | -------------------------------------- |
| `General Qualified Interest` | → | `open` | Broadest inquiry stage |
| `Specific Qualified Interest` | → | `details_sent` | Client interested in specific berth(s) |
| `EOI and NDA Sent` | → | `in_communication` | Active document exchange |
| `Signed EOI and NDA` | → | `signed_eoi_nda` | EOI/NDA completed |
| `Made Reservation` | → | `deposit_10pct` | Reservation ≈ deposit stage |
| `Contract Negotiation` | → | `contract` | Active contract discussion |
| `Contract Negotiations Finalized` | → | `contract` | Same stage, later substatus |
| `Contract Signed` | → | `completed` | Deal closed |
| NULL or empty | → | `open` | Default for missing data |
**Important:** The NocoDB schema uses 8 stage names while the new schema uses 8 stage codes. "Contract Negotiation" and "Contract Negotiations Finalized" both map to `contract` — distinguish them via `date_contract_sent` (null = negotiation, not null = finalized).
---
## Table 4: NocoDB Interests → `interest_notes`
The `Extra Comments` field on each Interest becomes an `interest_notes` record if it contains content.
| Source | → | New Table.Column | Transformation |
| ------------------------------- | --- | ---------------------------- | --------------------------------------------- |
| `Extra Comments` (if non-empty) | → | `interest_notes.content` | Trim. Prefix with `"[Migrated from NocoDB] "` |
| — | → | `interest_notes.author_id` | System migration user ID |
| — | → | `interest_notes.interest_id` | From interest UUID map |
| — | → | `interest_notes.is_locked` | `true` (historical, non-editable) |
| `Created At` | → | `interest_notes.created_at` | From parent interest `Created At` |
---
## Table 5: NocoDB Interests → `documents` + `document_signers`
Interests with EOI/contract document references create `documents` and `document_signers` records.
**Trigger:** Any interest where `EOI Document` is not null, or `LOI-NDA Document` is not null, or `documensoID` is not null.
| NocoDB Interests Field | → | New Table.Column | Transformation |
| ---------------------- | --- | ------------------------- | ---------------------------------------------------------------------------------- |
| `EOI Document` | → | `documents.file_id` | Match to MinIO file → `files` table entry |
| `LOI-NDA Document` | → | `documents.file_id` | Match to MinIO file → `files` table entry |
| `documensoID` | → | `documents.documenso_id` | Direct copy |
| `EOI Status` | → | `documents.status` | Map: `"Signed"``"completed"`, `"Pending"``"sent"`, `"Expired"``"expired"` |
| — | → | `documents.document_type` | `"eoi"` for EOI documents, `"nda"` for LOI-NDA documents |
| — | → | `documents.port_id` | Port Nimara UUID |
| — | → | `documents.interest_id` | From interest UUID map |
| — | → | `documents.client_id` | From client UUID map |
| — | → | `documents.title` | Generate: `"EOI - {client_name} - {berth_number}"` |
| — | → | `documents.created_by` | System migration user ID |
**Signing links → `document_signers`:**
NocoDB stores up to 9 signing link fields per interest. These map to `document_signers` entries.
| NocoDB Field Pattern | → | `document_signers` Column | Transformation |
| ------------------------- | --- | ------------------------- | ------------------------------------------------------- |
| Signing link fields (1-9) | → | `signer_email` | Extract email from Documenso signing URL if available |
| — | → | `signer_role` | `"client"` (default for most), infer from context |
| — | → | `signing_order` | 1-based index of the link field |
| — | → | `status` | If EOI Status = "Signed" → `"signed"`, else `"pending"` |
| — | → | `signing_url` | Direct copy of Documenso signing link URL |
---
## Table 6: NocoDB Berths → `berths`
One NocoDB Berth row = one new `berths` row. Berths have cleaner data than Interests.
| NocoDB Berths Field | Type (NocoDB) | → | New Table.Column | Type (PG) | Transformation |
| -------------------------- | ------------- | --- | ------------------------------------------ | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `Id` | number | → | — | — | Store in berth ID map: NocoDB ID → new UUID |
| `Mooring Number` | string | → | `berths.mooring_number` | TEXT | Direct copy, trim |
| `Area` | string (enum) | → | `berths.area` | TEXT | Direct copy. NocoDB enum A-E; new system supports A-G. Validate. |
| `Status` | string (enum) | → | `berths.status` | TEXT | Map: `"Available"``"available"`, `"UnderOffer"``"under_offer"`, `"Sold"``"sold"`. Add missing: `"Reserved"``"under_offer"`, `"Not Available"``"sold"`, `"Draft"``"available"`. |
| `status_override_mode` | boolean | → | — | — | Not migrated (new system handles status rules differently) |
| `Nominal Boat Size` | string | → | `berths.nominal_boat_size` | TEXT | Direct copy (already in ft format) |
| `Nominal Boat Size_metric` | string | → | `berths.nominal_boat_size_m` | TEXT | Direct copy (m format) |
| `Length` | number | → | `berths.length_ft` | NUMERIC | Direct copy (NocoDB berths store as numbers, not strings) |
| `Length_metric` | number | → | `berths.length_m` | NUMERIC | Direct copy. If missing, calculate: `length_ft / 3.28084` |
| `Width` | number | → | `berths.width_ft` | NUMERIC | Direct copy |
| `Width_metric` | number | → | `berths.width_m` | NUMERIC | Direct copy. If missing, calculate. |
| `Draft` | number | → | `berths.draft_ft` | NUMERIC | Direct copy |
| `Draft_metric` | number | → | `berths.draft_m` | NUMERIC | Direct copy. If missing, calculate. |
| `Water Depth` | number | → | `berths.water_depth` | NUMERIC | Direct copy (ft) |
| `Water Depth_metric` | number | → | `berths.water_depth_m` | NUMERIC | Direct copy. If missing, calculate. |
| `Side Pontoon` | string (enum) | → | `berths.side_pontoon` | TEXT | Lowercase: `"Port"``"port"`, `"Starboard"``"starboard"`, `"Both"``"both"` |
| `Power Capacity` | string | → | `berths.power_capacity` | TEXT | Direct copy |
| `Voltage` | string | → | `berths.voltage` | TEXT | Direct copy |
| `Mooring Type` | string (enum) | → | `berths.mooring_type` | TEXT | Lowercase snake_case: `"Med Mooring"``"med_mooring"`, `"Alongside"``"alongside"` |
| `Cleat Type/Capacity` | string | → | `berths.cleat_type` + `cleat_capacity` | TEXT | Split on delimiter if combined. NocoDB may store as single field or separate enum. |
| `Bollard Type/Capacity` | string | → | `berths.bollard_type` + `bollard_capacity` | TEXT | Same split logic |
| `Bow Facing` | string | → | `berths.bow_facing` | TEXT | Direct copy, lowercase |
| `Access` | string (enum) | → | `berths.access` | TEXT | Lowercase: `"Walk On"``"walk_on"`, `"Dinghy"``"dinghy"` |
| `Price` | number | → | `berths.price` | NUMERIC | Direct copy |
| — | — | → | `berths.price_currency` | TEXT | Default `"USD"` |
| — | — | → | `berths.tenure_type` | TEXT | Default `"permanent"` (most Port Nimara berths are permanent tenure) |
| `Created At` | datetime | → | `berths.created_at` | TIMESTAMPTZ | Direct copy |
| `Updated At` | datetime | → | `berths.updated_at` | TIMESTAMPTZ | Direct copy |
| — | — | → | `berths.port_id` | UUID | Constant: Port Nimara UUID |
| — | — | → | `berths.berth_approved` | BOOLEAN | `true` (all existing berths are approved) |
| `Interested Parties` | link field | → | — | — | Not migrated directly; relationship is rebuilt via `interests.berth_id` |
### Berth Area Expansion
NocoDB BerthArea enum: `A | B | C | D | E`
New system supports: `A | B | C | D | E | F | G`
Berths in areas F and G exist in production but the NocoDB enum doesn't define them. The extraction script should treat area as a free-text field and not validate against the old enum.
### Berth Status Mapping
| NocoDB Status | → | New Status | Notes |
| --------------- | --- | ------------- | -------------------------------------------- |
| `Available` | → | `available` | |
| `UnderOffer` | → | `under_offer` | |
| `Sold` | → | `sold` | |
| `Reserved` | → | `under_offer` | NocoDB may have this in data but not in enum |
| `Not Available` | → | `sold` | Treat as sold/occupied |
| `Draft` | → | `available` | Draft berths become available |
| NULL or empty | → | `available` | Default |
---
## Table 7: NocoDB Berths → `berth_map_data`
If the NocoDB berths table or a related data source contains SVG positioning data for the marina map, extract it.
| Source | → | New Table.Column | Transformation |
| ------------------------- | --- | -------------------------- | ------------------- |
| SVG path data (if exists) | → | `berth_map_data.svg_path` | Direct copy |
| X coordinate | → | `berth_map_data.x` | Direct copy |
| Y coordinate | → | `berth_map_data.y` | Direct copy |
| Transform data | → | `berth_map_data.transform` | Direct copy |
| — | → | `berth_map_data.berth_id` | From berth UUID map |
**Note:** The current berth map may be rendered from a static SVG with hardcoded positions. If no per-berth coordinate data exists in NocoDB, the `berth_map_data` table will be populated manually post-migration or via the L4 berth spec import tool.
---
## Table 8: NocoDB Expenses → `expenses`
One NocoDB Expense row = one new `expenses` row.
| NocoDB Expenses Field | Type (NocoDB) | → | New Table.Column | Type (PG) | Transformation |
| --------------------- | ------------------ | --- | ----------------------------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------- |
| `Id` | number | → | — | — | Store in expense ID map: NocoDB ID → new UUID |
| `Establishment Name` | string | → | `expenses.establishment_name` | TEXT | Trim |
| `Price` | string | → | `expenses.amount` | NUMERIC | `parseFloat()` — NocoDB stores price as STRING. Strip currency symbols ($, €, £). Strip commas. Parse to number. |
| `currency` | string | → | `expenses.currency` | TEXT | Uppercase: `"usd"``"USD"`, `"eur"``"EUR"`. Default `"USD"` if missing. |
| — | — | → | `expenses.amount_usd` | NUMERIC | If currency ≠ USD, convert using historical exchange rate for the expense date. Otherwise same as amount. |
| `Payment Method` | string (enum) | → | `expenses.payment_method` | TEXT | Lowercase snake_case: `"Credit Card"``"credit_card"`, `"Cash"``"cash"`, `"Bank Transfer"``"bank_transfer"` |
| `Category` | string (enum) | → | `expenses.category` | TEXT | Lowercase snake_case: `"Fuel"``"fuel"`, `"Maintenance"``"maintenance"`, `"Office Supplies"``"office_supplies"`, etc. |
| `Payer` | string | → | `expenses.payer` | TEXT | Trim |
| `Time` | datetime | → | `expenses.expense_date` | TIMESTAMPTZ | Parse date string → ISO 8601 |
| `Contents` | string | → | `expenses.description` | TEXT | Trim |
| `Receipt` | file attachment(s) | → | `expenses.receipt_file_ids` | UUID[] | For each receipt file: create `files` table entry, store MinIO path. Collect UUIDs into array. |
| `Paid` | boolean | → | `expenses.payment_status` | TEXT | `true``"paid"`, `false``"unpaid"`, NULL → `"unpaid"` |
| `payment_date` | date (optional) | → | `expenses.payment_date` | DATE | Direct copy if present |
| `payment_reference` | string (optional) | → | `expenses.payment_reference` | TEXT | Direct copy if present |
| `payment_notes` | string (optional) | → | `expenses.payment_notes` | TEXT | Direct copy if present |
| `CreatedAt` | datetime | → | `expenses.created_at` | TIMESTAMPTZ | Direct copy. Note: NocoDB uses `CreatedAt` (camelCase), not `Created At` (spaced). |
| `UpdatedAt` | datetime | → | `expenses.updated_at` | TIMESTAMPTZ | Direct copy |
| — | — | → | `expenses.port_id` | UUID | Constant: Port Nimara UUID |
| — | — | → | `expenses.created_by` | TEXT | System migration user ID |
| — | — | → | `expenses.archived_at` | TIMESTAMPTZ | NULL |
**Price parsing helper:**
```typescript
function parsePrice(value: string | null): number | null {
if (!value || value.trim() === '') return null;
// Strip currency symbols and thousand separators
const cleaned = value.replace(/[$€£¥,\s]/g, '').trim();
const num = parseFloat(cleaned);
return isNaN(num) ? null : Math.round(num * 100) / 100;
}
```
**Expense category normalization:**
```typescript
const CATEGORY_MAP: Record<string, string> = {
Fuel: 'fuel',
Maintenance: 'maintenance',
'Office Supplies': 'office_supplies',
Utilities: 'utilities',
'Marina Operations': 'marina_operations',
Travel: 'travel',
Equipment: 'equipment',
Insurance: 'insurance',
Legal: 'legal',
Other: 'other',
};
function normalizeCategory(value: string | null): string {
if (!value) return 'other';
return CATEGORY_MAP[value.trim()] ?? value.toLowerCase().replace(/\s+/g, '_');
}
```
---
## Table 9: NocoDB Invoices → `invoices`
| NocoDB Invoices Field | Type (NocoDB) | → | New Table.Column | Type (PG) | Transformation |
| --------------------- | ------------- | --- | -------------------------------------------------------------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `Id` | number | → | — | — | Store in invoice ID map |
| Invoice number | string | → | `invoices.invoice_number` | TEXT | Preserve existing numbering (format: `INV-YYYYMM-XYZ`). Verify uniqueness. |
| Client reference | string/link | → | `invoices.client_name` | TEXT | Resolve client name from interest/client data |
| Client email | string | → | `invoices.billing_email` | TEXT | From client contacts |
| Client address | string | → | `invoices.billing_address` | TEXT | From client contacts |
| Due date | date | → | `invoices.due_date` | DATE | Direct copy |
| Currency | string | → | `invoices.currency` | TEXT | Uppercase. Default `"USD"` |
| Subtotal | number | → | `invoices.subtotal` | NUMERIC | Direct copy or calculate from line items |
| Discount | number | → | `invoices.discount_pct` + `discount_amount` | NUMERIC | Parse from invoice data |
| Fee | number | → | `invoices.fee_pct` + `fee_amount` | NUMERIC | Parse from invoice data |
| Total | number | → | `invoices.total` | NUMERIC | Direct copy or calculate: `subtotal - discount + fee` |
| Status | string | → | `invoices.status` | TEXT | Map: `"Draft"``"draft"`, `"Sent"``"sent"`, `"Paid"``"paid"`, `"Overdue"``"overdue"`, `"Cancelled"``"cancelled"` |
| Payment info | object | → | `invoices.payment_status`, `payment_date`, `payment_method`, `payment_reference` | mixed | Extract from payment fields |
| PDF reference | file | → | `invoices.pdf_file_id` | UUID | Create `files` table entry for stored PDF |
| Notes | string | → | `invoices.notes` | TEXT | Trim |
| Created at | datetime | → | `invoices.created_at` | TIMESTAMPTZ | Direct copy |
| — | — | → | `invoices.port_id` | UUID | Constant: Port Nimara UUID |
| — | — | → | `invoices.created_by` | TEXT | System migration user ID |
| — | — | → | `invoices.payment_terms` | TEXT | Default `"net30"` (can be refined post-migration) |
### Invoice → Line Items
NocoDB invoices may store line item data inline or as linked records. Each expense linked to an invoice becomes an `invoice_line_items` entry.
| Source | → | New Table.Column | Transformation |
| ------------------- | --- | -------------------------------- | ------------------------------------------------------ |
| Expense description | → | `invoice_line_items.description` | From linked expense `Contents` / `Establishment Name` |
| Quantity | → | `invoice_line_items.quantity` | Default `1` (NocoDB doesn't track quantity separately) |
| Amount | → | `invoice_line_items.unit_price` | From linked expense `Price` |
| — | → | `invoice_line_items.total` | `quantity × unit_price` |
| — | → | `invoice_line_items.sort_order` | Auto-increment from 0 |
### Invoice ↔ Expense Junction
For each expense linked to an invoice in NocoDB, create an `invoice_expenses` junction record:
| Source | → | New Table | Transformation |
| ----------------- | --- | ----------------------------- | --------------------- |
| Invoice NocoDB ID | → | `invoice_expenses.invoice_id` | From invoice UUID map |
| Expense NocoDB ID | → | `invoice_expenses.expense_id` | From expense UUID map |
---
## Table 10: MinIO Files → `files`
Scan MinIO storage and create `files` table entries. Map files to clients where possible.
| MinIO Metadata | → | New Table.Column | Transformation |
| ------------------ | --- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| Object key (path) | → | `files.storage_path` | Direct copy (current path). Will be reorganized post-load. |
| — | → | `files.storage_bucket` | `"crm-files"` |
| Filename from path | → | `files.filename` | Extract from object key |
| Original filename | → | `files.original_name` | Same as filename (original names may not be preserved in MinIO) |
| Content-Type | → | `files.mime_type` | From MinIO object metadata |
| Content-Length | → | `files.size_bytes` | From MinIO object metadata |
| Client association | → | `files.client_id` | AI-assisted matching: filename patterns, folder structure, PDF text extraction. Log unmatched to CSV. |
| File category | → | `files.category` | Infer from filename/path: `*eoi*``"eoi"`, `*contract*``"contract"`, `*receipt*``"receipt"`, `*nda*``"eoi"`, else `"misc"` |
| — | → | `files.port_id` | Port Nimara UUID |
| — | → | `files.uploaded_by` | System migration user ID |
**New MinIO path structure (post-migration reorganization):**
```
{portSlug}/clients/{clientId}/eoi/filename.pdf
{portSlug}/clients/{clientId}/contracts/filename.pdf
{portSlug}/clients/{clientId}/receipts/filename.jpg
{portSlug}/clients/{clientId}/correspondence/filename.pdf
{portSlug}/clients/{clientId}/misc/filename.pdf
{portSlug}/invoices/{invoiceId}/invoice.pdf
{portSlug}/expenses/{expenseId}/receipt.jpg
```
---
## Data Quality Issues & Fixes
### Issue 1: String Dimensions on Interests
**Problem:** Interest table stores vessel dimensions (Length, Width, Depth) as strings. Values may include unit suffixes ("45ft", "12m"), be empty strings, or contain non-numeric characters.
**Fix:** `parseDimension()` helper (defined above) handles all cases. Log any unparseable values to `migration-warnings.csv`.
### Issue 2: Mixed Naming Conventions
**Problem:** NocoDB fields use inconsistent naming: `"Full Name"` (Title Spaced), `"documensoID"` (camelCase), `"status_override_mode"` (snake_case), `CreatedAt` (PascalCase on Expenses vs `Created At` on Interests).
**Fix:** Access each field by its exact NocoDB name. The extraction layer handles the naming; transformation maps to consistent snake_case.
### Issue 3: LOI → EOI Terminology
**Problem:** The codebase has remnants of "LOI" (Letter of Intent) terminology that was renamed to "EOI" (Expression of Interest). Fields like `LOI-NDA Document` and `Time LOI Sent` actually refer to EOI documents.
**Fix:** Map all LOI references to EOI in the new schema. `LOI-NDA Document``documents` with `document_type = "eoi"`. `Time LOI Sent``interests.date_eoi_signed`.
### Issue 4: Incomplete Enums
**Problem:** BerthStatus enum (`Available | UnderOffer | Sold`) is missing `Reserved`, `Not Available`, `Draft` which may exist in production data. BerthArea enum covers A-E but production has F and G.
**Fix:** Status mapping table handles all known values (see Table 6). Area is treated as free text. Any unrecognized values are logged and defaulted.
### Issue 5: Price as String on Expenses
**Problem:** Expense `Price` field is typed as `string` in the NocoDB interface. Values may contain currency symbols, commas, or be formatted inconsistently.
**Fix:** `parsePrice()` helper strips non-numeric characters and parses to NUMERIC. NULL values logged.
### Issue 6: Berth References
**Problem:** Interests reference berths via both `Berths_id` (numeric link field) and `Berth Number` (string display field). These may be inconsistent.
**Fix:** Primary resolution via `Berths_id` → berth UUID map. Fallback: match `Berth Number` string against `berths.mooring_number`. If both fail, set `berth_id = NULL` and log.
### Issue 7: Invoice Number Collision Risk
**Problem:** The current `generateInvoiceNumber()` function uses format `INV-YYYYMM-XYZ` with a random 3-char suffix, which has collision risk.
**Fix:** During migration, preserve all existing invoice numbers exactly. Post-migration, the new system uses a sequential counter per port per month, eliminating collision risk.
### Issue 8: Duplicate Clients
**Problem:** The same person may appear as multiple NocoDB Interest rows with slightly different name spellings or different email addresses.
**Fix:** Three-pass dedup:
1. Exact email match (case-insensitive) → same client
2. Fuzzy name match (Levenshtein distance ≤ 2 on normalized name) + same residence → candidate merge (log for manual review)
3. Remaining → create as separate clients (can be merged manually in new system)
All fuzzy-match candidates logged to `dedup-candidates.csv` for Matt's review before committing.
---
## Synthetic Records (Created During Migration)
These records don't exist in NocoDB but are generated to provide a complete dataset in the new system:
| Record Type | Source | Logic |
| ------------------------- | ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `audit_logs` (historical) | All migrated records | For each migrated entity, create a synthetic `action: "create"` audit log entry with `created_at` matching the original record's creation date, `user_id` = system migration user, `metadata = { "source": "nocodb_migration" }` |
| `tags` | Lead categories, sources | Create tags for common categories: "Hot Lead", "Website Lead", "Referral", "Broker", etc. |
| `client_tags` | Interest `Lead Category` | Auto-tag clients based on their highest lead category across all interests |
| `interest_tags` | Interest `Source` | Auto-tag interests by source |
| `client_notes` | Dedup merge info | When two interests are merged into one client, create a note documenting the merge |
---
## ID Mapping Tables (In-Memory During Migration)
The migration script maintains these maps for cross-referencing:
```typescript
interface MigrationMaps {
// NocoDB numeric ID → new PostgreSQL UUID
clientMap: Map<string, string>; // emailOrNameKey → client UUID
berthMap: Map<number, string>; // NocoDB Berth Id → berth UUID
interestMap: Map<number, string>; // NocoDB Interest Id → interest UUID
expenseMap: Map<number, string>; // NocoDB Expense Id → expense UUID
invoiceMap: Map<number, string>; // NocoDB Invoice Id → invoice UUID
// Reverse lookups
interestToClient: Map<number, string>; // NocoDB Interest Id → client UUID
berthNumberToId: Map<string, string>; // Mooring number string → berth UUID
}
```
---
## Validation Checklist
After migration completes, run these validation checks:
| Check | Query | Expected |
| --------------------- | ------------------------------------------------------------------------------------------------------- | --------------------------------------------- |
| Client count | `SELECT COUNT(*) FROM clients` | ≤ NocoDB interest count (dedup reduces) |
| Contact count | `SELECT COUNT(*) FROM client_contacts` | ≥ client count (most clients have ≥1 contact) |
| Interest count | `SELECT COUNT(*) FROM interests` | = NocoDB interest count (1:1) |
| Berth count | `SELECT COUNT(*) FROM berths` | = NocoDB berth count (1:1) |
| Expense count | `SELECT COUNT(*) FROM expenses` | = NocoDB expense count (1:1) |
| Invoice count | `SELECT COUNT(*) FROM invoices` | = NocoDB invoice count (1:1) |
| Orphan interests | `SELECT COUNT(*) FROM interests WHERE client_id NOT IN (SELECT id FROM clients)` | 0 |
| Orphan contacts | `SELECT COUNT(*) FROM client_contacts WHERE client_id NOT IN (SELECT id FROM clients)` | 0 |
| NULL berth refs | `SELECT COUNT(*) FROM interests WHERE berth_id IS NOT NULL AND berth_id NOT IN (SELECT id FROM berths)` | 0 |
| Invoice-expense links | `SELECT COUNT(*) FROM invoice_expenses WHERE expense_id NOT IN (SELECT id FROM expenses)` | 0 |
| File records | `SELECT COUNT(*) FROM files` | ≥ (receipt files + EOI docs + contract docs) |
| Pipeline distribution | `SELECT pipeline_stage, COUNT(*) FROM interests GROUP BY 1` | Reasonable distribution, no unexpected NULLs |
| Duplicate emails | `SELECT value, COUNT(*) FROM client_contacts WHERE channel='email' GROUP BY 1 HAVING COUNT(*) > 1` | Review each — may be valid (shared email) |
| Amount sanity | `SELECT COUNT(*) FROM expenses WHERE amount <= 0` | 0 (or review outliers) |
---
## Migration Script Configuration
```typescript
// scripts/migrate-nocodb/config.ts
export const MIGRATION_CONFIG = {
// NocoDB connection
nocodb: {
apiUrl: process.env.NOCODB_API_URL, // e.g., 'https://nocodb.portnimara.com'
token: process.env.NOCODB_AUTH_TOKEN,
projectId: process.env.NOCODB_PROJECT_ID,
tables: {
interests: 'mbs9hjauug4eseo',
berths: 'mczgos9hr3oa9qc',
expenses: 'mxfcefkk4dqs6uq',
invoices: 'mvyvz0lpc30p01s',
},
},
// Target PostgreSQL
postgres: {
connectionString: process.env.DATABASE_URL,
},
// MinIO
minio: {
endpoint: process.env.MINIO_ENDPOINT,
accessKey: process.env.MINIO_ACCESS_KEY,
secretKey: process.env.MINIO_SECRET_KEY,
bucket: 'crm-files',
},
// Migration defaults
defaults: {
portId: process.env.PORT_NIMARA_UUID, // seeded in L0
portSlug: 'port-nimara',
migrationUserId: 'system-migration',
defaultCurrency: 'USD',
defaultTimezone: 'America/Anguilla',
},
// Deduplication thresholds
dedup: {
nameLevenshteinThreshold: 2, // max edit distance for fuzzy name match
requireResidenceMatch: true, // secondary match requires same residence
logCandidates: true, // write fuzzy candidates to CSV
},
// Output files
output: {
warningsFile: 'migration-warnings.csv', // unparseable values, unexpected data
dedupCandidatesFile: 'dedup-candidates.csv', // fuzzy match candidates for review
unmatchedFilesFile: 'unmatched-files.csv', // MinIO files that couldn't be linked to a client
validationReport: 'validation-report.json', // final counts and check results
},
};
```
---
## Post-Migration Manual Tasks
| Task | Owner | When |
| ------------------------------------------------------------------------------- | ----- | ------------------------- |
| Review `dedup-candidates.csv` — approve or reject client merges | Matt | Saturday morning |
| Review `unmatched-files.csv` — manually assign files to clients or mark as misc | Matt | Saturday morning |
| Review `migration-warnings.csv` — fix any data issues | Matt | Saturday morning |
| Spot-check 5 high-profile clients — verify all data migrated correctly | Matt | Saturday after validation |
| Spot-check 5 berths — verify dimensions, status, map data | Matt | Saturday after validation |
| Spot-check 3 invoices — verify amounts, line items, linked expenses | Matt | Saturday after validation |
| Create user accounts and assign roles in new system | Matt | Saturday afternoon |
| Configure SMTP (Poste.io) connection | Matt | Saturday afternoon |
| Test Documenso API connection | Matt | Saturday afternoon |
| Populate `berth_map_data` if not extracted from NocoDB | Matt | Post-migration week |