feat(dedup): runtime surfaces — merge service, at-create suggestion, admin queue (P2)

Adds the live dedup pipeline on top of the P1 library + P3 migration
script. The new `client/interest` model now actively prevents duplicate
client records at creation time and gives admins a queue to triage
the borderline pairs the at-create check missed.

Three layers, per design §7:

Layer 1 — At-create suggestion
==============================

`GET /api/v1/clients/match-candidates`
  Accepts free-text email / phone / name from the in-flight client
  form, normalizes them via the dedup library, and returns scored
  matches against the port's live client pool. Filters out
  low-confidence noise (the background scoring queue picks those up
  separately). Strict port scoping; never leaks across tenants.
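
  A minimal sketch of the tiering and noise filter described above. It
  assumes the design defaults (90 / 50) are inclusive lower bounds for
  the high and medium tiers and that scores run 0-100; `ScoredMatch`,
  `tierFor`, and `actionableMatches` are hypothetical names, not the
  dedup library's API.

```typescript
// Hypothetical sketch: the names, the 0-100 score range, and the
// inclusive bounds are assumptions, not the shipped implementation.
type Confidence = 'high' | 'medium' | 'low';

interface ScoredMatch {
  clientId: string;
  score: number;
}

const HIGH_THRESHOLD = 90;
const MEDIUM_THRESHOLD = 50;

// Map a raw score onto a confidence tier.
function tierFor(score: number): Confidence {
  if (score >= HIGH_THRESHOLD) return 'high';
  if (score >= MEDIUM_THRESHOLD) return 'medium';
  return 'low';
}

// Label every match, drop low-confidence noise, best match first.
function actionableMatches(
  matches: ScoredMatch[],
): Array<ScoredMatch & { confidence: Confidence }> {
  return matches
    .map((m) => ({ ...m, confidence: tierFor(m.score) }))
    .filter((m) => m.confidence !== 'low')
    .sort((a, b) => b.score - a.score);
}
```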

`<DedupSuggestionPanel>` (`src/components/clients/dedup-suggestion-panel.tsx`)
  Debounced React Query hook. Renders nothing for short inputs or
  no useful match. On a high-confidence match it interrupts visually
  with an amber-tinted card and a "Use this client" primary button.
  Medium confidence falls back to a softer "possible match — check
  before creating" treatment.
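
  The render decision above can be sketched as a pure function. The
  minimum input length of 3 and every name here (`FormInput`,
  `PanelTreatment`, `treatmentFor`) are assumptions for illustration;
  the commit does not state the actual cutoff.

```typescript
// Hypothetical sketch of the panel's render decision.
// MIN_INPUT_LENGTH and all names are assumptions.
type Confidence = 'high' | 'medium' | 'low';
type PanelTreatment = 'hidden' | 'interrupt' | 'soft-warning';

interface FormInput {
  email?: string;
  phone?: string;
  name?: string;
}

interface Match {
  clientId: string;
  confidence: Confidence;
}

const MIN_INPUT_LENGTH = 3;

// True once at least one field is long enough to be worth querying.
function hasUsefulInput(input: FormInput): boolean {
  return [input.email, input.phone, input.name].some(
    (v) => (v ?? '').trim().length >= MIN_INPUT_LENGTH,
  );
}

function treatmentFor(input: FormInput, matches: Match[]): PanelTreatment {
  if (!hasUsefulInput(input)) return 'hidden';
  if (matches.some((m) => m.confidence === 'high')) return 'interrupt';
  if (matches.some((m) => m.confidence === 'medium')) return 'soft-warning';
  return 'hidden';
}
```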

`<ClientForm>`
  Renders the panel above the form (create path only — skipped on
  edit). New `onUseExistingClient` callback fires when the user
  picks the existing client; the form closes and the parent decides
  what to do (typically: navigate to that client's detail page or
  open the create-interest dialog pre-filled).

Layer 2 — Merge service
=======================

`mergeClients` (`src/lib/services/client-merge.service.ts`)
  The atomic merge primitive that everything else calls. Single
  transaction. Per §6 of the design:

  - Locks both rows (FOR UPDATE) so concurrent merges of the same
    loser fail with a clear error rather than racing.
  - Snapshots the full loser state (contacts / addresses / notes /
    tags / interest+reservation IDs / relationship rows) into the
    `client_merge_log.merge_details` JSONB column for the eventual
    undo flow.
  - Reattaches every loser-side row to the winner: interests,
    reservations, contacts (skipping duplicates by `(channel, value)`),
    addresses, notes, tags (deduped), relationships.
  - Optional `fieldChoices` — per-scalar overrides letting the user
    keep the loser's value for fullName / nationality / preferences /
    timezone / source.
  - Marks the loser archived with `mergedIntoClientId` set (a redirect
    pointer for stragglers; never hard-deleted within the undo window).
  - Resolves any matching `client_merge_candidates` row to status='merged'.
  - Writes an audit log entry.
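
  As an illustration of the contact reattach step, the sketch below
  splits the loser's contacts into rows to move and rows to drop, keyed
  by `(channel, value)`. It works on in-memory arrays and assumes exact
  string equality; `Contact` and `partitionLoserContacts` are
  hypothetical names, and the real service does this inside the merge
  transaction.

```typescript
// Hypothetical sketch: in-memory stand-in for the SQL reattach step.
interface Contact {
  id: string;
  clientId: string;
  channel: string;
  value: string;
}

// Build a collision-free key for the (channel, value) pair.
const contactKey = (c: Pick<Contact, 'channel' | 'value'>): string =>
  JSON.stringify([c.channel, c.value]);

function partitionLoserContacts(
  winnerContacts: Contact[],
  loserContacts: Contact[],
): { toMove: Contact[]; toDrop: Contact[] } {
  const seen = new Set(winnerContacts.map(contactKey));
  const toMove: Contact[] = [];
  const toDrop: Contact[] = [];
  for (const contact of loserContacts) {
    const key = contactKey(contact);
    if (seen.has(key)) {
      // Winner already has this contact (or an earlier loser row did).
      toDrop.push(contact);
    } else {
      seen.add(key);
      toMove.push(contact);
    }
  }
  return { toMove, toDrop };
}
```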

Schema additions:
  - `clients.merged_into_client_id` (nullable text, indexed) — the
    redirect pointer set on archive.

Tests: 6 cases against a real DB — happy path moves rows + writes log;
self-merge / cross-port / already-merged refused; duplicate-contact
deduped on reattach; fieldChoices copies loser values to winner.
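
The three refusal cases could be guarded roughly as follows. This is a
sketch only: `ClientRow` and `assertMergeable` are hypothetical, and the
error messages are invented to be consistent with the patterns the tests
match (`/itself/i`, `/different ports/i`, `/already merged/i`).

```typescript
// Hypothetical sketch of the pre-merge guards; the real service runs
// its checks after locking both rows inside the transaction.
interface ClientRow {
  id: string;
  portId: string;
  mergedIntoClientId: string | null;
}

function assertMergeable(winner: ClientRow, loser: ClientRow): void {
  if (winner.id === loser.id) {
    throw new Error('Cannot merge a client into itself');
  }
  if (winner.portId !== loser.portId) {
    throw new Error('Cannot merge clients from different ports');
  }
  if (loser.mergedIntoClientId !== null) {
    throw new Error('Loser was already merged into another client');
  }
}
```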

Layer 3 — Admin review queue
============================

`GET /api/v1/admin/duplicates`
  Pending merge candidates (status='pending') for the current port,
  with both client summaries hydrated for side-by-side rendering.
  Skips pairs where one side is already archived/merged.

`POST /api/v1/admin/duplicates/[id]/merge`
  Confirms a candidate. Body picks the winner; the other side
  becomes the loser. Calls into `mergeClients` — the only path that
  writes `client_merge_log`.
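
  A sketch of how the handler might derive the loser from the chosen
  winner before calling `mergeClients`. The `clientAId` / `clientBId`
  field names and `resolveWinnerAndLoser` are assumptions; the commit
  does not show the candidate row's shape.

```typescript
// Hypothetical sketch: candidate field names are assumptions.
interface MergeCandidate {
  id: string;
  clientAId: string;
  clientBId: string;
}

// The body names the winner; whichever side is left is the loser.
function resolveWinnerAndLoser(
  candidate: MergeCandidate,
  winnerId: string,
): { winnerId: string; loserId: string } {
  if (winnerId === candidate.clientAId) {
    return { winnerId, loserId: candidate.clientBId };
  }
  if (winnerId === candidate.clientBId) {
    return { winnerId, loserId: candidate.clientAId };
  }
  throw new Error(`Client ${winnerId} is not part of candidate ${candidate.id}`);
}
```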

`POST /api/v1/admin/duplicates/[id]/dismiss`
  Marks the candidate dismissed. Future scoring runs skip the same
  pair until a score change recreates the row.

`<DuplicatesReviewQueue>` (`/admin/duplicates`)
  Side-by-side card UI for each pending pair. Click a card to pick
  the winner; the other side is automatically the loser. Toolbar:
  "Merge into selected" + "Dismiss". No per-field merge editor in
  this PR; that is deferred polish. The simple "pick the better row"
  flow handles ~80% of cases.

Test coverage
=============

11 new integration tests (76 added in this branch total):
  - 6 mergeClients (atomicity, refusal cases, contact dedup,
    fieldChoices)
  - 5 match-candidates API (shape, port scoping, confidence tiers,
    Pattern F false-positive guard)

Full vitest: 926/926 passing (was 858 before the dedup branch).
Lint: clean. tsc: clean for new files (only pre-existing errors in
unrelated `tests/integration/` files remain, same as before this PR).

Out of scope, deferred
======================

- Background scoring cron that populates `client_merge_candidates`
  (the queue is empty until this lands; manual seeding works for
  now via the at-create flow).
- Side-by-side per-field merge editor with checkboxes (the simple
  "pick the winner" UX shipped here covers ~80% of real cases).
- Admin settings UI for tuning the dedup thresholds. Defaults from
  the design (90 / 50) are baked in for now.
- `unmergeClients` (the snapshot is captured in `client_merge_log`;
  the undo endpoint just hasn't been wired yet).

These are all natural follow-up PRs that don't block shipping the
runtime UX.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matt Ciaccio
2026-05-03 14:59:04 +02:00
parent 18e5c124b0
commit 4bcc7f8be6
17 changed files with 12018 additions and 1 deletions

@@ -0,0 +1,183 @@
/**
 * Client merge service: end-to-end integration test.
 *
 * Spins up two real clients in a real port via the factory helpers,
 * attaches a few satellites (interest, contact, note), merges them,
 * and asserts everything survived in the right place with the merge
 * log written.
 */
import { describe, expect, it } from 'vitest';
import { eq } from 'drizzle-orm';
import { db } from '@/lib/db';
import { clients, clientContacts, clientNotes, clientMergeLog } from '@/lib/db/schema/clients';
import { interests } from '@/lib/db/schema/interests';
import { mergeClients } from '@/lib/services/client-merge.service';
import { makeClient, makePort, makeBerth } from '../../helpers/factories';

describe('mergeClients', () => {
  it('moves interests and contacts from loser to winner; archives loser; writes merge log', async () => {
    const port = await makePort();
    const winner = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus Laurent' },
    });
    const loser = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus Laurent (dup)' },
    });

    // Attach a contact, a note, and an interest to the loser.
    await db.insert(clientContacts).values({
      clientId: loser.id,
      channel: 'email',
      value: 'marcus@example.com',
      isPrimary: true,
    });
    await db.insert(clientNotes).values({
      clientId: loser.id,
      authorId: 'test-user',
      content: 'Loser-side note',
    });
    const berth = await makeBerth({ portId: port.id });
    await db.insert(interests).values({
      portId: port.id,
      clientId: loser.id,
      berthId: berth.id,
      pipelineStage: 'open',
      leadCategory: 'general_interest',
    });

    // ── Merge ─────────────────────────────────────────────────────────────
    const result = await mergeClients({
      winnerId: winner.id,
      loserId: loser.id,
      mergedBy: 'test-user',
    });
    expect(result.movedRows.interests).toBe(1);
    expect(result.movedRows.contacts).toBe(1);
    expect(result.movedRows.notes).toBe(1);

    // ── Loser should be archived with mergedIntoClientId set ──────────────
    const [archivedLoser] = await db.select().from(clients).where(eq(clients.id, loser.id));
    expect(archivedLoser?.archivedAt).not.toBeNull();
    expect(archivedLoser?.mergedIntoClientId).toBe(winner.id);

    // ── All loser-side rows now point at the winner ───────────────────────
    const winnerInterests = await db
      .select()
      .from(interests)
      .where(eq(interests.clientId, winner.id));
    expect(winnerInterests).toHaveLength(1);
    const winnerContacts = await db
      .select()
      .from(clientContacts)
      .where(eq(clientContacts.clientId, winner.id));
    expect(winnerContacts.find((c) => c.value === 'marcus@example.com')).toBeDefined();
    const winnerNotes = await db
      .select()
      .from(clientNotes)
      .where(eq(clientNotes.clientId, winner.id));
    expect(winnerNotes.find((n) => n.content === 'Loser-side note')).toBeDefined();

    // ── Merge log row exists with snapshot ────────────────────────────────
    const [log] = await db
      .select()
      .from(clientMergeLog)
      .where(eq(clientMergeLog.id, result.mergeLogId));
    expect(log?.survivingClientId).toBe(winner.id);
    expect(log?.mergedClientId).toBe(loser.id);
    expect(log?.mergedBy).toBe('test-user');
    expect(log?.mergeDetails).toBeDefined();
  });

  it('refuses to merge a client into itself', async () => {
    const port = await makePort();
    const c = await makeClient({ portId: port.id });
    await expect(mergeClients({ winnerId: c.id, loserId: c.id, mergedBy: 'u' })).rejects.toThrow(
      /itself/i,
    );
  });

  it('refuses to merge across different ports', async () => {
    const portA = await makePort();
    const portB = await makePort();
    const a = await makeClient({ portId: portA.id });
    const b = await makeClient({ portId: portB.id });
    await expect(mergeClients({ winnerId: a.id, loserId: b.id, mergedBy: 'u' })).rejects.toThrow(
      /different ports/i,
    );
  });

  it('refuses to merge a client that has already been merged', async () => {
    const port = await makePort();
    const winner = await makeClient({ portId: port.id });
    const loser = await makeClient({ portId: port.id });
    // First merge succeeds.
    await mergeClients({ winnerId: winner.id, loserId: loser.id, mergedBy: 'u' });
    // Second merge of the same loser should refuse.
    const winner2 = await makeClient({ portId: port.id });
    await expect(
      mergeClients({ winnerId: winner2.id, loserId: loser.id, mergedBy: 'u' }),
    ).rejects.toThrow(/already merged/i);
  });

  it('drops duplicate contact rows during reattach', async () => {
    const port = await makePort();
    const winner = await makeClient({ portId: port.id });
    const loser = await makeClient({ portId: port.id });
    // Both have the same email contact.
    await db.insert(clientContacts).values({
      clientId: winner.id,
      channel: 'email',
      value: 'same@example.com',
      isPrimary: true,
    });
    await db.insert(clientContacts).values({
      clientId: loser.id,
      channel: 'email',
      value: 'same@example.com',
      isPrimary: true,
    });
    const result = await mergeClients({
      winnerId: winner.id,
      loserId: loser.id,
      mergedBy: 'u',
    });
    expect(result.movedRows.contacts).toBe(0); // duplicate dropped
    const winnerEmails = await db
      .select()
      .from(clientContacts)
      .where(eq(clientContacts.clientId, winner.id));
    // Winner kept exactly one copy of the shared email.
    expect(winnerEmails.filter((c) => c.value === 'same@example.com')).toHaveLength(1);
  });

  it('applies fieldChoices to copy loser values onto the winner', async () => {
    const port = await makePort();
    const winner = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus L.' },
    });
    const loser = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus Laurent' },
    });
    await mergeClients({
      winnerId: winner.id,
      loserId: loser.id,
      mergedBy: 'u',
      fieldChoices: { fullName: 'loser' },
    });
    const [updatedWinner] = await db.select().from(clients).where(eq(clients.id, winner.id));
    expect(updatedWinner?.fullName).toBe('Marcus Laurent');
  });
});

@@ -0,0 +1,157 @@
/**
 * Match-candidates API: integration test.
 *
 * Exercises the GET /api/v1/clients/match-candidates handler against a
 * real port + clients pool. Verifies the dedup library's at-create
 * suggestion path returns the right candidates and confidence tiers
 * for the "use existing client?" form interruption.
 */
import { describe, expect, it } from 'vitest';

import { db } from '@/lib/db';
import { clientContacts } from '@/lib/db/schema/clients';
import { getMatchCandidatesHandler } from '@/app/api/v1/clients/match-candidates/handlers';
import { makeMockCtx, makeMockRequest } from '../../helpers/route-tester';
import { makeClient, makePort } from '../../helpers/factories';

// Shape of one entry in the handler's `data` array.
interface MatchData {
  clientId: string;
  fullName: string;
  score: number;
  confidence: 'high' | 'medium' | 'low';
  reasons: string[];
  interestCount: number;
}

// Build the query string, invoke the handler, and unwrap `data`.
async function callHandler(
  ctx: ReturnType<typeof makeMockCtx>,
  query: Record<string, string>,
): Promise<MatchData[]> {
  const url = new URL('http://localhost/api/v1/clients/match-candidates');
  for (const [k, v] of Object.entries(query)) url.searchParams.set(k, v);
  const req = makeMockRequest('GET', url.toString());
  const res = await getMatchCandidatesHandler(req, ctx);
  expect(res.status).toBe(200);
  const body = await res.json();
  return body.data as MatchData[];
}

describe('GET /api/v1/clients/match-candidates', () => {
  it('returns empty when nothing actionable was provided', async () => {
    const port = await makePort();
    const ctx = makeMockCtx({ portId: port.id });
    const data = await callHandler(ctx, {});
    expect(data).toEqual([]);
  });

  it('finds an existing client by exact email match (high confidence)', async () => {
    const port = await makePort();
    const ctx = makeMockCtx({ portId: port.id });
    const existing = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus Laurent' },
    });
    await db.insert(clientContacts).values({
      clientId: existing.id,
      channel: 'email',
      value: 'marcus@example.com',
      isPrimary: true,
    });
    await db.insert(clientContacts).values({
      clientId: existing.id,
      channel: 'phone',
      value: '+15551234567',
      valueE164: '+15551234567',
      isPrimary: true,
    });
    const data = await callHandler(ctx, {
      email: 'Marcus@example.com',
      phone: '+15551234567',
      name: 'Marcus Laurent',
    });
    expect(data).toHaveLength(1);
    expect(data[0]!.clientId).toBe(existing.id);
    expect(data[0]!.confidence).toBe('high');
    expect(data[0]!.reasons).toEqual(expect.arrayContaining(['email match', 'phone match']));
  });

  it('does not surface unrelated clients in the same port', async () => {
    const port = await makePort();
    const ctx = makeMockCtx({ portId: port.id });
    const target = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Marcus Laurent' },
    });
    await db.insert(clientContacts).values({
      clientId: target.id,
      channel: 'email',
      value: 'marcus@example.com',
      isPrimary: true,
    });
    // An unrelated client.
    const unrelated = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Bob Smith' },
    });
    await db.insert(clientContacts).values({
      clientId: unrelated.id,
      channel: 'email',
      value: 'bob@example.org',
      isPrimary: true,
    });
    const data = await callHandler(ctx, { email: 'marcus@example.com' });
    expect(data.map((d) => d.clientId)).toEqual([target.id]);
  });

  it('returns medium-confidence partial matches', async () => {
    // Same name, different contact info: Pattern F territory.
    const port = await makePort();
    const ctx = makeMockCtx({ portId: port.id });
    const existing = await makeClient({
      portId: port.id,
      overrides: { fullName: 'Etiennette Clamouze' },
    });
    await db.insert(clientContacts).values({
      clientId: existing.id,
      channel: 'email',
      value: 'clamouze.etiennette@gmail.com',
      isPrimary: true,
    });
    const data = await callHandler(ctx, {
      // Different email, same name (no phone supplied).
      email: 'etiennette@the-manoah.com',
      name: 'Etiennette Clamouze',
    });
    // Either no match (low confidence filtered out) or a medium one;
    // both are fine. Critically, no returned match may be high.
    for (const match of data) {
      expect(match.confidence).not.toBe('high');
    }
  });

  it('does not leak across ports', async () => {
    const portA = await makePort();
    const portB = await makePort();
    const ctxA = makeMockCtx({ portId: portA.id });
    const inB = await makeClient({
      portId: portB.id,
      overrides: { fullName: 'In Port B' },
    });
    await db.insert(clientContacts).values({
      clientId: inB.id,
      channel: 'email',
      value: 'b@example.com',
      isPrimary: true,
    });
    // Caller is in port A, asking for an email that lives in port B.
    const data = await callHandler(ctxA, { email: 'b@example.com' });
    expect(data).toEqual([]);
  });
});