Adds the live dedup pipeline on top of the P1 library + P3 migration
script. The new `client/interest` model now actively prevents duplicate
client records at creation time and gives admins a queue to triage
the borderline pairs the at-create check missed.
Three layers, per design §7:
Layer 1 — At-create suggestion
==============================
`GET /api/v1/clients/match-candidates`
Accepts free-text email / phone / name from the in-flight client
form, normalizes them via the dedup library, and returns scored
matches against the port's live client pool. Filters out
low-confidence noise (the background scoring queue picks those up
separately). Strict port scoping; never leaks across tenants.
`<DedupSuggestionPanel>` (`src/components/clients/dedup-suggestion-panel.tsx`)
Debounced React Query hook. Renders nothing for short inputs or
no useful match. On a high-confidence match it interrupts visually
with an amber-tinted card and a "Use this client" primary button.
Medium confidence falls back to a softer "possible match — check
before creating" treatment.
`<ClientForm>`
Renders the panel above the form (create path only — skipped on
edit). New `onUseExistingClient` callback fires when the user
picks the existing client; the form closes and the parent decides
what to do (typically: navigate to that client's detail page or
open the create-interest dialog pre-filled).
Layer 2 — Merge service
=======================
`mergeClients` (`src/lib/services/client-merge.service.ts`)
The atomic merge primitive that everything else calls. Single
transaction. Per §6 of the design:
- Locks both rows (FOR UPDATE) so concurrent merges of the same
loser fail with a clear error rather than racing.
- Snapshots the full loser state (contacts / addresses / notes /
tags / interest+reservation IDs / relationship rows) into the
`client_merge_log.merge_details` JSONB column for the eventual
undo flow.
- Reattaches every loser-side row to the winner: interests,
reservations, contacts (skipping duplicates by `(channel, value)`),
addresses, notes, tags (deduped), relationships.
- Optional `fieldChoices` — per-scalar overrides letting the user
keep the loser's value for fullName / nationality / preferences /
timezone / source.
- Marks the loser archived with `mergedIntoClientId` set (a redirect
pointer for stragglers; never hard-deleted within the undo window).
- Resolves any matching `client_merge_candidates` row to status='merged'.
- Writes audit log entry.
Schema additions:
- `clients.merged_into_client_id` (nullable text, indexed) — the
redirect pointer set on archive.
Tests: 6 cases against a real DB — happy path moves rows + writes log;
self-merge / cross-port / already-merged refused; duplicate-contact
deduped on reattach; fieldChoices copies loser values to winner.
Layer 3 — Admin review queue
============================
`GET /api/v1/admin/duplicates`
Pending merge candidates (status='pending') for the current port,
with both client summaries hydrated for side-by-side rendering.
Skips pairs where one side is already archived/merged.
`POST /api/v1/admin/duplicates/[id]/merge`
Confirms a candidate. Body picks the winner; the other side
becomes the loser. Calls into `mergeClients` — the only path that
writes `client_merge_log`.
`POST /api/v1/admin/duplicates/[id]/dismiss`
Marks the candidate dismissed. Future scoring runs skip the same
pair until a score change recreates the row.
`<DuplicatesReviewQueue>` (`/admin/duplicates`)
Side-by-side card UI for each pending pair. Click a card to pick
the winner; the other side is automatically the loser. Toolbar:
"Merge into selected" + "Dismiss". No per-field merge editor in
this PR — that's a future polish; the simple "pick the better row"
flow handles ~80% of cases.
Test coverage
=============
11 new integration tests (76 added in this branch total):
- 6 mergeClients (atomicity, refusal cases, contact dedup,
fieldChoices)
- 5 match-candidates API (shape, port scoping, confidence tiers,
Pattern F false-positive guard)
Full vitest: 926/926 passing (was 858 before the dedup branch).
Lint: clean. tsc: clean for new files (only pre-existing errors in
unrelated `tests/integration/` files remain, same as before this PR).
Out of scope, deferred
======================
- Background scoring cron that populates `client_merge_candidates`
(the queue is empty until this lands; manual seeding works for
now via the at-create flow).
- Side-by-side per-field merge editor with checkboxes (the simple
"pick the winner" UX shipped here covers ~80% of real cases).
- Admin settings UI for tuning the dedup thresholds. Defaults from
the design (90 / 50) are baked in for now.
- `unmergeClients` (the snapshot is captured in client_merge_log;
the undo endpoint just hasn't been wired yet).
These are all natural follow-up PRs that don't block shipping the
runtime UX.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
250 lines
9.5 KiB
TypeScript
250 lines
9.5 KiB
TypeScript
import {
|
|
pgTable,
|
|
text,
|
|
boolean,
|
|
integer,
|
|
timestamp,
|
|
jsonb,
|
|
index,
|
|
uniqueIndex,
|
|
primaryKey,
|
|
} from 'drizzle-orm/pg-core';
|
|
import { sql } from 'drizzle-orm';
|
|
import { ports } from './ports';
|
|
|
|
export const clients = pgTable(
|
|
'clients',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
portId: text('port_id')
|
|
.notNull()
|
|
.references(() => ports.id),
|
|
fullName: text('full_name').notNull(),
|
|
/** ISO-3166-1 alpha-2 nationality code. */
|
|
nationalityIso: text('nationality_iso'),
|
|
preferredContactMethod: text('preferred_contact_method'), // email, phone, whatsapp
|
|
preferredLanguage: text('preferred_language'),
|
|
/** IANA timezone, e.g. 'Europe/Warsaw'. Validated client + server. */
|
|
timezone: text('timezone'),
|
|
source: text('source'), // website, manual, referral, broker
|
|
sourceDetails: text('source_details'),
|
|
archivedAt: timestamp('archived_at', { withTimezone: true }),
|
|
/** When this client was merged into another (the "loser" of a dedup
|
|
* merge), this points at the surviving client. Used by the
|
|
* /admin/duplicates review queue to redirect any stragglers, and by
|
|
* the unmerge flow to restore. Null for live clients. */
|
|
mergedIntoClientId: text('merged_into_client_id'),
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [
|
|
index('idx_clients_port').on(table.portId),
|
|
index('idx_clients_name').on(table.portId, table.fullName),
|
|
index('idx_clients_archived').on(table.portId, table.archivedAt),
|
|
index('idx_clients_nationality_iso').on(table.nationalityIso),
|
|
index('idx_clients_merged_into').on(table.mergedIntoClientId),
|
|
],
|
|
);
|
|
|
|
export const clientContacts = pgTable(
|
|
'client_contacts',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
clientId: text('client_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
channel: text('channel').notNull(), // email, phone, whatsapp, other
|
|
value: text('value').notNull(),
|
|
/** E.164-normalized phone number (only set when channel='phone'/'whatsapp'). */
|
|
valueE164: text('value_e164'),
|
|
/** ISO-3166-1 alpha-2 of the country this number was parsed against. */
|
|
valueCountry: text('value_country'),
|
|
label: text('label'), // primary, secondary, work, personal, broker, assistant
|
|
isPrimary: boolean('is_primary').notNull().default(false),
|
|
notes: text('notes'),
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [
|
|
index('idx_cc_client').on(table.clientId),
|
|
index('idx_cc_email')
|
|
.on(table.channel, table.value)
|
|
.where(sql`${table.channel} = 'email'`),
|
|
index('idx_cc_phone')
|
|
.on(table.channel, table.value)
|
|
.where(sql`${table.channel} = 'phone'`),
|
|
],
|
|
);
|
|
|
|
export const clientRelationships = pgTable(
|
|
'client_relationships',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
portId: text('port_id')
|
|
.notNull()
|
|
.references(() => ports.id),
|
|
clientAId: text('client_a_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
clientBId: text('client_b_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
relationshipType: text('relationship_type').notNull(), // referred_by, broker_for, family_member, same_vessel, custom
|
|
description: text('description'),
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [index('idx_cr_port').on(table.portId)],
|
|
);
|
|
|
|
export const clientNotes = pgTable(
|
|
'client_notes',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
clientId: text('client_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
authorId: text('author_id').notNull(), // user ID
|
|
content: text('content').notNull(),
|
|
mentions: text('mentions').array(), // array of mentioned user IDs
|
|
isLocked: boolean('is_locked').notNull().default(false),
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [index('idx_cn_client').on(table.clientId)],
|
|
);
|
|
|
|
export const clientTags = pgTable(
|
|
'client_tags',
|
|
{
|
|
clientId: text('client_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
tagId: text('tag_id').notNull(), // references tags.id — defined later in system.ts
|
|
},
|
|
(table) => [primaryKey({ columns: [table.clientId, table.tagId] })],
|
|
);
|
|
|
|
export const clientMergeLog = pgTable(
|
|
'client_merge_log',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
portId: text('port_id')
|
|
.notNull()
|
|
.references(() => ports.id),
|
|
survivingClientId: text('surviving_client_id')
|
|
.notNull()
|
|
.references(() => clients.id),
|
|
mergedClientId: text('merged_client_id').notNull(), // the client that was merged away (may no longer exist)
|
|
mergedBy: text('merged_by').notNull(), // user ID
|
|
mergeDetails: jsonb('merge_details').notNull(), // which fields were kept from which record
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [index('idx_cml_port').on(table.portId)],
|
|
);
|
|
|
|
/**
|
|
* Pairs of clients flagged by the background scoring job as potential
|
|
* duplicates. The `/admin/duplicates` review queue reads from here.
|
|
*
|
|
* Lifecycle:
|
|
* - Background job inserts a row when a pair scores >= the
|
|
* `dedup_review_queue_threshold` system setting.
|
|
* - User reviews in the admin UI and either merges (status='merged')
|
|
* or dismisses (status='dismissed').
|
|
* - Subsequent runs of the scoring job skip pairs already
|
|
* `dismissed` so the same false-positive doesn't keep reappearing.
|
|
* A future score increase recreates the row.
|
|
*
|
|
* Pairs are stored canonically with `clientAId < clientBId` (string
|
|
* comparison) so the same pair only generates one row regardless of
|
|
* scoring direction.
|
|
*/
|
|
export const clientMergeCandidates = pgTable(
|
|
'client_merge_candidates',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
portId: text('port_id')
|
|
.notNull()
|
|
.references(() => ports.id),
|
|
clientAId: text('client_a_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
clientBId: text('client_b_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
score: integer('score').notNull(),
|
|
/** Human-readable rule list, e.g. ["email match", "phone match"]. */
|
|
reasons: jsonb('reasons').notNull(),
|
|
status: text('status').notNull().default('pending'), // pending | dismissed | merged
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
resolvedAt: timestamp('resolved_at', { withTimezone: true }),
|
|
resolvedBy: text('resolved_by'),
|
|
},
|
|
(table) => [
|
|
index('idx_cmc_port_status').on(table.portId, table.status),
|
|
// Same pair shouldn't surface twice — enforce uniqueness on the
|
|
// canonical (a < b) ordering.
|
|
uniqueIndex('idx_cmc_pair').on(table.portId, table.clientAId, table.clientBId),
|
|
],
|
|
);
|
|
|
|
export const clientAddresses = pgTable(
|
|
'client_addresses',
|
|
{
|
|
id: text('id')
|
|
.primaryKey()
|
|
.$defaultFn(() => crypto.randomUUID()),
|
|
clientId: text('client_id')
|
|
.notNull()
|
|
.references(() => clients.id, { onDelete: 'cascade' }),
|
|
portId: text('port_id')
|
|
.notNull()
|
|
.references(() => ports.id, { onDelete: 'cascade' }),
|
|
label: text('label').notNull().default('Primary'),
|
|
streetAddress: text('street_address'),
|
|
city: text('city'),
|
|
/** ISO 3166-2 subdivision code (e.g. 'PL-MZ', 'US-CA'). Optional. */
|
|
subdivisionIso: text('subdivision_iso'),
|
|
postalCode: text('postal_code'),
|
|
/** ISO-3166-1 alpha-2 country code. */
|
|
countryIso: text('country_iso'),
|
|
isPrimary: boolean('is_primary').notNull().default(true),
|
|
createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
|
|
updatedAt: timestamp('updated_at', { withTimezone: true }).notNull().defaultNow(),
|
|
},
|
|
(table) => [
|
|
index('idx_ca_client').on(table.clientId),
|
|
index('idx_ca_port').on(table.portId),
|
|
uniqueIndex('idx_ca_primary')
|
|
.on(table.clientId)
|
|
.where(sql`${table.isPrimary} = true`),
|
|
],
|
|
);
|
|
|
|
export type Client = typeof clients.$inferSelect;
|
|
export type NewClient = typeof clients.$inferInsert;
|
|
export type ClientContact = typeof clientContacts.$inferSelect;
|
|
export type NewClientContact = typeof clientContacts.$inferInsert;
|
|
export type ClientRelationship = typeof clientRelationships.$inferSelect;
|
|
export type NewClientRelationship = typeof clientRelationships.$inferInsert;
|
|
export type ClientNote = typeof clientNotes.$inferSelect;
|
|
export type NewClientNote = typeof clientNotes.$inferInsert;
|
|
export type ClientMergeLog = typeof clientMergeLog.$inferSelect;
|
|
export type NewClientMergeLog = typeof clientMergeLog.$inferInsert;
|
|
export type ClientAddress = typeof clientAddresses.$inferSelect;
|
|
export type NewClientAddress = typeof clientAddresses.$inferInsert;
|
|
export type ClientMergeCandidate = typeof clientMergeCandidates.$inferSelect;
|
|
export type NewClientMergeCandidate = typeof clientMergeCandidates.$inferInsert;
|