feat(dedup): NocoDB migration script + tables (P3 dry-run)
Lands the one-shot migration pipeline from the legacy NocoDB Interests base
into the new client/interest schema. Dry-run mode is fully operational: it
pulls the live snapshot, runs the dedup library, and writes a CSV + Markdown
report under .migration/<timestamp>/. The --apply phase is stubbed for a
follow-up PR, per the design's P3 implementation sequence.

Schema additions
================

- `client_merge_candidates` — pairs flagged by the background scoring job
  for the /admin/duplicates review queue. Status enum: pending / dismissed /
  merged. Unique on (portId, clientAId, clientBId) so the same pair can't
  surface twice. Empty until P2 lands the cron.

- `migration_source_links` — idempotency ledger. Maps source-system rows
  (NocoDB Interest #624 → new client UUID) so re-running --apply against
  the same dry-run report skips already-imported entities.

Both tables ship with migration `0020_unusual_azazel.sql`, already applied
to the local dev DB while preparing this commit.

Library
=======

src/lib/dedup/nocodb-source.ts
  Read-only adapter for the legacy NocoDB v2 API. xc-token auth,
  auto-pagination until isLastPage, table IDs captured from the 2026-05-03
  audit. `fetchSnapshot()` pulls every relevant table in parallel into one
  in-memory object the transform layer consumes.

src/lib/dedup/migration-transform.ts
  Pure function: NocoDB snapshot in, MigrationPlan out. Per row it:
  - normalizes name / email / phone / country via the dedup library
  - parses the legacy DD-MM-YYYY / DD/MM/YYYY / ISO date formats
  - maps the 8-stage `Sales Process Level` enum to the new 9-stage
    pipelineStage
  - filters yacht-name placeholders ('TBC', 'Na', etc.)
  - merges Internal Notes + Extra Comments + Berth Size Desired into a
    single notes blob
  It then runs `findClientMatches` pairwise (with blocking) and union-finds
  clusters of rows whose score crosses the auto-link threshold (90).
  Lower-scoring pairs (50–89) become 'needs review'.
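A minimal sketch of that union-find clustering pass, assuming pair scores have already been computed (in the real pipeline they come from `findClientMatches` with blocking; `clusterPairs` and `ScoredPair` are illustrative names, not the shipped API):

```typescript
// Illustrative sketch of the threshold clustering: pairs at or above the
// auto-link threshold are unioned; pairs in the [needsReview, autoLink)
// band are queued for human triage instead of being merged.
type ScoredPair = { a: string; b: string; score: number };

function clusterPairs(
  ids: string[],
  pairs: ScoredPair[],
  thresholds = { autoLink: 90, needsReview: 50 },
): { clusters: string[][]; review: ScoredPair[] } {
  // Union-find over row ids (no rank heuristic; fine at this scale).
  const parent = new Map(ids.map((id) => [id, id] as const));
  const find = (id: string): string => {
    let cur = id;
    while (parent.get(cur) !== cur) cur = parent.get(cur)!;
    return cur;
  };

  const review: ScoredPair[] = [];
  for (const p of pairs) {
    if (p.score >= thresholds.autoLink) {
      const rootA = find(p.a);
      const rootB = find(p.b);
      if (rootA !== rootB) parent.set(rootA, rootB);
    } else if (p.score >= thresholds.needsReview) {
      review.push(p); // surfaced in the report, never auto-merged
    }
  }

  // Group ids by their final root to materialize clusters.
  const byRoot = new Map<string, string[]>();
  for (const id of ids) {
    const root = find(id);
    byRoot.set(root, [...(byRoot.get(root) ?? []), id]);
  }
  return { clusters: [...byRoot.values()], review };
}
```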
  Each cluster's "lead" row is picked by completeness score, with recency
  as the tie-break.

src/lib/dedup/migration-report.ts
  Writes three artifacts to .migration/<timestamp>/:
  - report.csv — one row per planned op, RFC-4180 escaped
  - summary.md — human-skimmable overview
  - plan.json — full structured plan for the --apply phase
  CSV cells containing a comma, quote, or newline are quoted, and internal
  quotes are doubled. No external CSV dependency.

src/lib/dedup/phone-parse.ts
  Script-safe wrapper around libphonenumber-js's `core` entry that loads
  `metadata.min.json` directly. The default `index.cjs.js` bundled by
  libphonenumber hits a metadata-shape interop bug under Node 25 + tsx
  (`{ default }` wrapping); core + JSON sidesteps it. The dedup
  `normalizePhone` and `find-matches` both use this wrapper now, so the
  same code path runs in vitest, Next.js, and the migration CLI without
  surprises.

src/lib/dedup/normalize.ts
  Tightened country resolution: added Caribbean short-form aliases
  ('antigua' → AG, 'st kitts' → KN, etc.) and a city map covering the US
  locations seen in the NocoDB dump (Boston, Tampa, Fort Lauderdale,
  Port Jefferson, Nantucket). Also relaxed phone parsing to drop the
  strict `isValid()` check — the libphonenumber min build rejects many
  real NANP-territory numbers, and dedup only needs a canonical E.164
  to compare.

CLI
===

scripts/migrate-from-nocodb.ts

  pnpm tsx scripts/migrate-from-nocodb.ts --dry-run
    Pulls the live NocoDB base (NOCODB_URL + NOCODB_TOKEN env vars),
    runs the transform, and writes the report. No DB writes.

  pnpm tsx scripts/migrate-from-nocodb.ts --apply --report .migration/<dir>/
    Stubbed; exits with `not yet implemented` and a pointer to the design
    doc. The apply phase ships in a follow-up.

Tests
=====

tests/unit/dedup/migration-transform.test.ts (7 cases)
  Fixture-based regression. A frozen 12-row NocoDB snapshot covers every
  duplicate pattern in the design (§1.2).
  The test asserts:
  - 12 input rows → 7 unique clients (the cluster math is right)
  - Patterns A / B / C / E auto-link
  - Pattern F (Etiennette Clamouze) does NOT auto-link
  - every interest is preserved as its own row even when clients merge
  - the 8-stage → 9-stage enum mapping is correct per spec
  - multi-yacht merge (Constanzo CALYPSO + Costanzo GEMINI under one
    client) — the design's signature win
  - output is deterministic (run twice, identical)

Validation against real data
============================

Ran `pnpm tsx scripts/migrate-from-nocodb.ts --dry-run` against the live
NocoDB. Result on 252 Interests rows:
- 237 clients (15 merged into 13 clusters)
- 252 interests (one per source row)
- 406 contacts, 52 addresses
- 13 auto-linked clusters (every confirmed cluster from the §1.2 audit)
- 3 pairs flagged for review (Camazou, Zasso, one new)
- 1 phone placeholder flagged

Total dedup test count: 57 (50 from P1 + 7 fixture tests). Lint: clean.
Tsc: clean for new files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
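For reference, the RFC-4180 cell-escaping rule described above for migration-report.ts can be sketched as follows (the helper name is hypothetical; the shipped code lives in migration-report.ts, which is not part of this diff):

```typescript
// Sketch of the RFC-4180 escaping rule: quote any cell containing a
// comma, double quote, or line break, doubling internal quotes.
// Everything else passes through unchanged.
function escapeCsvCell(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`;
  }
  return value;
}

// A CSV row is then just escaped cells joined with commas.
const row = ['Doe, Jane', 'say "hi"', 'plain'].map(escapeCsvCell).join(',');
```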
src/lib/dedup/migration-transform.ts (new file, 576 lines)
@@ -0,0 +1,576 @@
/**
 * Pure transform: NocoDB snapshot → planned new-system entities + dedup result.
 *
 * Used by the migration script's `--dry-run` (to produce the report) and
 * `--apply` (to actually write). Keeping this pure means the same code
 * runs in both modes, in tests against the frozen fixture, and in the
 * one-off CLI run against the live base.
 *
 * No side effects, no DB calls, no external services.
 */

import {
  normalizeName,
  normalizeEmail,
  normalizePhone,
  resolveCountry,
  type NormalizedPhone,
} from './normalize';
import { findClientMatches, type MatchCandidate } from './find-matches';
import type { CountryCode } from '@/lib/i18n/countries';
import type { NocoDbRow, NocoDbSnapshot } from './nocodb-source';

// ─── Plan output ────────────────────────────────────────────────────────────

export interface PlannedClient {
  /** Stable id derived from the deduped cluster's lead row. Used by the
   * apply phase to reference newly-created clients before they exist
   * in the DB. */
  tempId: string;
  /** Source row IDs that contributed to this client (one if no duplicates,
   * many if dedup merged a cluster). */
  sourceIds: number[];
  fullName: string;
  surnameToken?: string;
  countryIso: CountryCode | null;
  preferredContactMethod: string | null;
  source: string | null;
  contacts: PlannedContact[];
  addresses: PlannedAddress[];
}

export interface PlannedContact {
  channel: 'email' | 'phone' | 'whatsapp' | 'other';
  value: string;
  valueE164?: string | null;
  valueCountry?: CountryCode | null;
  isPrimary: boolean;
  flagged?: string;
}

export interface PlannedAddress {
  streetAddress: string | null;
  city: string | null;
  countryIso: CountryCode | null;
  /** When confidence is low, the migration script flags the row for
   * human review. */
  countryConfidence: 'exact' | 'fuzzy' | 'city' | 'fallback' | null;
}

export interface PlannedInterest {
  /** NocoDB row id this interest came from. */
  sourceId: number;
  /** tempId of the planned client this interest hangs off. */
  clientTempId: string;
  pipelineStage: string;
  leadCategory: string | null;
  source: string | null;
  notes: string | null;
  /** Mooring number; the apply phase resolves this to a berthId via the
   * new-system Berths table. */
  berthMooringNumber: string | null;
  yachtName: string | null;
  /** Date stamps for milestone columns. ISO strings if parseable. */
  dateEoiSent: string | null;
  dateEoiSigned: string | null;
  dateDepositReceived: string | null;
  dateContractSent: string | null;
  dateContractSigned: string | null;
  dateLastContact: string | null;
  /** Documenso linkage carried forward when present so the document
   * record can be stitched up downstream. */
  documensoId: string | null;
}

export interface MigrationFlag {
  sourceTable: 'interests' | 'residential_interests' | 'website_interest_submissions';
  sourceId: number;
  reason: string;
  details?: Record<string, unknown>;
}

export interface MigrationPlan {
  clients: PlannedClient[];
  interests: PlannedInterest[];
  flags: MigrationFlag[];
  /** Pairs that the migration would auto-link (high score). */
  autoLinks: Array<{
    leadSourceId: number;
    mergedSourceIds: number[];
    score: number;
    reasons: string[];
  }>;
  /** Pairs that need human review (medium score). Each pair shows up
   * in the migration report; the user resolves before --apply. */
  needsReview: Array<{ aSourceId: number; bSourceId: number; score: number; reasons: string[] }>;
  stats: MigrationStats;
}

export interface MigrationStats {
  inputInterestRows: number;
  inputResidentialRows: number;
  outputClients: number;
  outputInterests: number;
  outputContacts: number;
  outputAddresses: number;
  flaggedRows: number;
  autoLinkedClusters: number;
  needsReviewPairs: number;
}

export interface TransformOptions {
  /** ISO country used when a phone has no prefix and the row has no
   * Place of Residence. Defaults to AI (Anguilla / Port Nimara's home). */
  defaultPhoneCountry: CountryCode;
  /** Score thresholds for auto-link vs human review. Should match the
   * per-port `system_settings` values once the runtime UI is in place. */
  thresholds: {
    autoLink: number;
    needsReview: number;
  };
}

const DEFAULT_OPTIONS: TransformOptions = {
  defaultPhoneCountry: 'AI',
  thresholds: { autoLink: 90, needsReview: 50 },
};

// ─── Stage mapping ──────────────────────────────────────────────────────────

const STAGE_MAP: Record<string, string> = {
  'General Qualified Interest': 'open',
  'Specific Qualified Interest': 'details_sent',
  'EOI and NDA Sent': 'eoi_sent',
  'Signed EOI and NDA': 'eoi_signed',
  'Made Reservation': 'deposit_10pct',
  'Contract Negotiation': 'contract_sent',
  'Contract Negotiations Finalized': 'contract_sent',
  'Contract Signed': 'contract_signed',
};

const LEAD_CATEGORY_MAP: Record<string, string> = {
  General: 'general_interest',
  'Friends and Family': 'general_interest',
};

const SOURCE_MAP: Record<string, string> = {
  portal: 'website',
  Form: 'website',
  External: 'manual',
};

// ─── Date parsing ───────────────────────────────────────────────────────────

/**
 * Parse a date the legacy NocoDB might have stored in DD-MM-YYYY,
 * DD/MM/YYYY, YYYY-MM-DD, or ISO format. Returns ISO string or null.
 */
function parseFlexibleDate(input: unknown): string | null {
  if (typeof input !== 'string' || input.trim() === '') return null;
  const s = input.trim();

  // Already ISO
  if (/^\d{4}-\d{2}-\d{2}/.test(s)) {
    const d = new Date(s);
    return Number.isNaN(d.getTime()) ? null : d.toISOString();
  }

  // DD-MM-YYYY or DD/MM/YYYY
  const m = s.match(/^(\d{1,2})[-/](\d{1,2})[-/](\d{4})$/);
  if (m) {
    const [, day, month, year] = m;
    const iso = `${year}-${month!.padStart(2, '0')}-${day!.padStart(2, '0')}`;
    const d = new Date(iso);
    return Number.isNaN(d.getTime()) ? null : d.toISOString();
  }

  // Anything else: try Date constructor as a last resort
  const d = new Date(s);
  return Number.isNaN(d.getTime()) ? null : d.toISOString();
}

// ─── Main transform ─────────────────────────────────────────────────────────

/**
 * Run the full transform pipeline against a NocoDB snapshot. Pure
 * function — same input always produces the same plan.
 */
export function transformSnapshot(
  snapshot: NocoDbSnapshot,
  options: Partial<TransformOptions> = {},
): MigrationPlan {
  const opts = { ...DEFAULT_OPTIONS, ...options };

  const flags: MigrationFlag[] = [];
  // Build per-row candidates first so we can run dedup before assigning
  // tempIds (clients with multiple source rows merge into one tempId).
  const perRow = snapshot.interests.map((row) => rowToCandidate(row, 'interests', opts, flags));

  // Dedup pass 1: every row scored against every other row (within the
  // same pool). The blocking strategy in `findClientMatches` keeps this
  // cheap even for the full 252-row dataset.
  const clusters = clusterByDedup(perRow, opts);

  // Build the planned clients + interests from the clusters.
  const clients: PlannedClient[] = [];
  const interests: PlannedInterest[] = [];
  const autoLinks: MigrationPlan['autoLinks'] = [];
  const needsReview: MigrationPlan['needsReview'] = [];

  for (const cluster of clusters) {
    const lead = cluster.leadCandidate;
    const tempId = `client-${lead.row.Id}`;

    // Build the client record from the lead row, then merge in any
    // contact info / address info from the other rows in the cluster.
    const planned = buildPlannedClient(tempId, cluster, opts);
    clients.push(planned);

    // Each row in the cluster becomes its own interest record.
    for (const member of cluster.members) {
      const interest = buildPlannedInterest(member.row, tempId);
      interests.push(interest);
    }

    if (cluster.members.length > 1) {
      autoLinks.push({
        leadSourceId: lead.row.Id,
        mergedSourceIds: cluster.members.filter((m) => m !== lead).map((m) => m.row.Id),
        score: cluster.maxScore,
        reasons: cluster.reasons,
      });
    }

    for (const pair of cluster.reviewPairs) {
      needsReview.push(pair);
    }
  }

  return {
    clients,
    interests,
    flags,
    autoLinks,
    needsReview,
    stats: {
      inputInterestRows: snapshot.interests.length,
      inputResidentialRows: snapshot.residentialInterests.length,
      outputClients: clients.length,
      outputInterests: interests.length,
      outputContacts: clients.reduce((sum, c) => sum + c.contacts.length, 0),
      outputAddresses: clients.reduce((sum, c) => sum + c.addresses.length, 0),
      flaggedRows: flags.length,
      autoLinkedClusters: autoLinks.length,
      needsReviewPairs: needsReview.length,
    },
  };
}

// ─── Helpers ────────────────────────────────────────────────────────────────

interface RowCandidate {
  row: NocoDbRow;
  candidate: MatchCandidate;
  /** Phone normalize result for the row's primary phone; used downstream
   * to attach valueE164 + country to the planned contact. */
  phoneResult: NormalizedPhone | null;
  /** Country resolved from "Place of Residence". */
  countryIso: CountryCode | null;
  countryConfidence: 'exact' | 'fuzzy' | 'city' | null;
  /** Normalized email or null. */
  email: string | null;
  /** Display name from `normalizeName`. */
  displayName: string;
}

function rowToCandidate(
  row: NocoDbRow,
  sourceTable: MigrationFlag['sourceTable'],
  opts: TransformOptions,
  flags: MigrationFlag[],
): RowCandidate {
  const rawName = (row['Full Name'] as string | undefined) ?? '';
  const rawEmail = (row['Email Address'] as string | undefined) ?? '';
  const rawPhone = (row['Phone Number'] as string | undefined) ?? '';
  const rawCountry = (row['Place of Residence'] as string | undefined) ?? '';

  const normName = normalizeName(rawName);
  const email = normalizeEmail(rawEmail);
  const country = resolveCountry(rawCountry);
  const phoneCountry = country.iso ?? opts.defaultPhoneCountry;
  const phoneResult = normalizePhone(rawPhone, phoneCountry as CountryCode);

  // Surface anything weird so the report can show it.
  if (rawPhone && !phoneResult?.e164) {
    flags.push({
      sourceTable,
      sourceId: row.Id,
      reason: phoneResult?.flagged ? `phone ${phoneResult.flagged}` : 'phone unparseable',
      details: { rawPhone },
    });
  }
  if (rawEmail && !email) {
    flags.push({
      sourceTable,
      sourceId: row.Id,
      reason: 'email invalid',
      details: { rawEmail },
    });
  }
  if (rawCountry && !country.iso) {
    flags.push({
      sourceTable,
      sourceId: row.Id,
      reason: 'country unresolved',
      details: { rawCountry },
    });
  }

  const candidate: MatchCandidate = {
    id: String(row.Id),
    fullName: normName.display || null,
    surnameToken: normName.surnameToken ?? null,
    emails: email ? [email] : [],
    phonesE164: phoneResult?.e164 ? [phoneResult.e164] : [],
    countryIso: country.iso ?? null,
  };

  return {
    row,
    candidate,
    phoneResult,
    countryIso: country.iso ?? null,
    countryConfidence: country.confidence,
    email,
    displayName: normName.display,
  };
}

interface Cluster {
  /** The cluster's "lead" row (most complete + most recent). */
  leadCandidate: RowCandidate;
  members: RowCandidate[];
  maxScore: number;
  reasons: string[];
  /** Pairs in this cluster that scored medium (need review). */
  reviewPairs: Array<{ aSourceId: number; bSourceId: number; score: number; reasons: string[] }>;
}

function clusterByDedup(rows: RowCandidate[], opts: TransformOptions): Cluster[] {
  // Use a union-find structure indexed by row id. Every pair with a
  // score >= autoLink threshold gets unioned. Pairs in [needsReview,
  // autoLink) accumulate onto the cluster's reviewPairs list — they're
  // surfaced for human triage but not auto-merged.
  const parent = new Map<string, string>();
  for (const r of rows) parent.set(r.candidate.id, r.candidate.id);
  const find = (id: string): string => {
    let cur = id;
    while (parent.get(cur) !== cur) {
      const next = parent.get(cur)!;
      parent.set(cur, parent.get(next)!); // path compression
      cur = parent.get(cur)!;
    }
    return cur;
  };
  const union = (a: string, b: string) => {
    const rootA = find(a);
    const rootB = find(b);
    if (rootA !== rootB) parent.set(rootA, rootB);
  };

  const clusterReasons = new Map<string, string[]>();
  const clusterMaxScore = new Map<string, number>();
  const clusterReviewPairs = new Map<string, Cluster['reviewPairs']>();

  // Score every candidate against every other candidate. The find-matches
  // function does its own blocking so this is cheap.
  for (let i = 0; i < rows.length; i += 1) {
    const left = rows[i]!;
    const remainingPool = rows.slice(i + 1).map((r) => r.candidate);
    if (remainingPool.length === 0) continue;
    const matches = findClientMatches(left.candidate, remainingPool, {
      highScore: opts.thresholds.autoLink,
      mediumScore: opts.thresholds.needsReview,
    });

    for (const m of matches) {
      if (m.score >= opts.thresholds.autoLink) {
        union(left.candidate.id, m.candidate.id);
        const root = find(left.candidate.id);
        clusterMaxScore.set(root, Math.max(clusterMaxScore.get(root) ?? 0, m.score));
        const existing = clusterReasons.get(root) ?? [];
        for (const reason of m.reasons) {
          if (!existing.includes(reason)) existing.push(reason);
        }
        clusterReasons.set(root, existing);
      } else if (m.score >= opts.thresholds.needsReview) {
        // Medium — track on whichever cluster `left` belongs to.
        const root = find(left.candidate.id);
        const list = clusterReviewPairs.get(root) ?? [];
        list.push({
          aSourceId: parseInt(left.candidate.id, 10),
          bSourceId: parseInt(m.candidate.id, 10),
          score: m.score,
          reasons: m.reasons,
        });
        clusterReviewPairs.set(root, list);
      }
    }
  }

  // Group rows by their cluster root.
  const byRoot = new Map<string, RowCandidate[]>();
  for (const r of rows) {
    const root = find(r.candidate.id);
    const list = byRoot.get(root) ?? [];
    list.push(r);
    byRoot.set(root, list);
  }

  // Build cluster objects, choosing the most-complete row as the lead.
  const clusters: Cluster[] = [];
  for (const [root, members] of byRoot) {
    const lead = pickLead(members);
    clusters.push({
      leadCandidate: lead,
      members,
      maxScore: clusterMaxScore.get(root) ?? 0,
      reasons: clusterReasons.get(root) ?? [],
      reviewPairs: clusterReviewPairs.get(root) ?? [],
    });
  }
  return clusters;
}

function pickLead(rows: RowCandidate[]): RowCandidate {
  // Pick the row with the most populated fields, breaking ties by
  // recency (highest Id, since NocoDB IDs are monotonic).
  return rows.reduce((best, current) => {
    const bestScore = completenessScore(best);
    const currentScore = completenessScore(current);
    if (currentScore > bestScore) return current;
    if (currentScore === bestScore && current.row.Id > best.row.Id) return current;
    return best;
  });
}

function completenessScore(r: RowCandidate): number {
  let score = 0;
  if (r.email) score += 1;
  if (r.phoneResult?.e164) score += 1;
  if (r.row['Address']) score += 0.5;
  if (r.row['Yacht Name']) score += 0.5;
  if (r.row['Source']) score += 0.25;
  if (r.row['Lead Category']) score += 0.25;
  if (r.row['Internal Notes']) score += 0.25;
  return score;
}

function buildPlannedClient(
  tempId: string,
  cluster: Cluster,
  opts: TransformOptions,
): PlannedClient {
  const lead = cluster.leadCandidate;

  // Collect distinct emails + phones from across the cluster — duplicate
  // submissions often come with different contact methods we want to
  // preserve as multiple rows in `client_contacts`.
  const seenEmails = new Set<string>();
  const seenPhones = new Set<string>();
  const contacts: PlannedContact[] = [];

  for (const member of cluster.members) {
    if (member.email && !seenEmails.has(member.email)) {
      seenEmails.add(member.email);
      contacts.push({
        channel: 'email',
        value: member.email,
        isPrimary: contacts.length === 0,
      });
    }
    if (member.phoneResult?.e164 && !seenPhones.has(member.phoneResult.e164)) {
      seenPhones.add(member.phoneResult.e164);
      const isFirstPhone = !contacts.some((c) => c.channel === 'phone');
      contacts.push({
        channel: 'phone',
        value: member.phoneResult.e164,
        valueE164: member.phoneResult.e164,
        valueCountry: member.phoneResult.country,
        isPrimary: isFirstPhone && contacts.every((c) => !c.isPrimary || c.channel === 'email'),
        flagged: member.phoneResult.flagged,
      });
    }
  }

  // The lead row's explicitly preferred contact method is carried on the
  // client alongside the per-contact isPrimary flags set above, so the
  // apply phase can honor it.
  const preferredMethod = (lead.row['Contact Method Preferred'] as string | undefined)
    ?.toLowerCase()
    ?.trim();

  // Address: only build one if the lead row has meaningful address text.
  const rawAddress = (lead.row['Address'] as string | undefined)?.trim();
  const addresses: PlannedAddress[] = [];
  if (rawAddress) {
    addresses.push({
      streetAddress: rawAddress,
      city: null,
      countryIso: lead.countryIso ?? opts.defaultPhoneCountry,
      countryConfidence: lead.countryConfidence ?? 'fallback',
    });
  }

  const sourceFromRow = (lead.row['Source'] as string | undefined) ?? null;
  const mappedSource = sourceFromRow ? (SOURCE_MAP[sourceFromRow] ?? 'manual') : null;

  return {
    tempId,
    sourceIds: cluster.members.map((m) => m.row.Id),
    fullName: lead.displayName,
    surnameToken: lead.candidate.surnameToken ?? undefined,
    countryIso: lead.countryIso,
    preferredContactMethod: preferredMethod ?? null,
    source: mappedSource,
    contacts,
    addresses,
  };
}

function buildPlannedInterest(row: NocoDbRow, clientTempId: string): PlannedInterest {
  const stage = (row['Sales Process Level'] as string | undefined) ?? '';
  const cat = (row['Lead Category'] as string | undefined) ?? '';

  const notesParts: string[] = [];
  const internalNotes = row['Internal Notes'] as string | undefined;
  const extraComments = row['Extra Comments'] as string | undefined;
  if (internalNotes?.trim()) notesParts.push(internalNotes.trim());
  if (extraComments?.trim()) notesParts.push(`Extra Comments: ${extraComments.trim()}`);
  const berthSize = row['Berth Size Desired'] as string | undefined;
  if (berthSize?.trim()) notesParts.push(`Berth size desired: ${berthSize.trim()}`);

  return {
    sourceId: row.Id,
    clientTempId,
    pipelineStage: STAGE_MAP[stage] ?? 'open',
    leadCategory: LEAD_CATEGORY_MAP[cat] ?? null,
    source: ((row['Source'] as string | undefined) ?? null) || null,
    notes: notesParts.join('\n\n') || null,
    berthMooringNumber: (row['Berth Number'] as string | undefined) ?? null,
    yachtName: (() => {
      const n = (row['Yacht Name'] as string | undefined)?.trim();
      // Filter placeholder values used by sales reps for "we don't know yet".
      if (!n) return null;
      if (['TBC', 'Na', 'NA', 'na', 'N/A', 'TBD', 'tbd'].includes(n)) return null;
      return n;
    })(),
    dateEoiSent: parseFlexibleDate(row['EOI Time Sent']),
    dateEoiSigned: parseFlexibleDate(row['all_signed_notified_at'] ?? row['developerSignTime']),
    dateDepositReceived: null, // not directly tracked in legacy schema
    dateContractSent: parseFlexibleDate(row['Time LOI Sent']),
    dateContractSigned: parseFlexibleDate(row['developerSignTime']),
    dateLastContact: parseFlexibleDate(row['Created At'] ?? row['Date Added']),
    documensoId: (row['documensoID'] as string | undefined) ?? null,
  };
}