feat(import): commit runner + undo + wired BullMQ worker

Third importer increment — the write path, fully testable without UI.

- commit.ts: commitBatch streams classified rows, applies insert/update per
  the conflict policy via the adapter (each row in its own try/catch so valid
  rows still land), records every action in import_batch_rows, and keeps live
  counts on the batch header. undoBatch hard-deletes a batch's inserted rows
  (port-scoped); a delete blocked by a dependent FK is reported, not forced,
  and the batch flips to `undone` only when every inserted row was removed.
- import worker: replaced the no-op placeholder with the real processor —
  loads the batch, re-reads the uploaded file from storage, parses, and runs
  commitBatch under the batch's mapping + policy. Marks the batch failed on
  error. Concurrency 1 so imports don't race each other's dedup lookups.

Tests: commit (skip/insert/error counts + per-row ledger + real inserted
entity), undo (removes exactly the inserted row, flips status), and
update-matches overwrite. 2 passing.

Engine is now functional end-to-end at the service layer: parse → map →
dry-run → commit → undo. Remaining: 4 FK adapters, API routes + permission,
wizard UI + history.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-01 22:36:42 +02:00
parent 3cf12b3015
commit c7325010e6
3 changed files with 392 additions and 26 deletions

View File

@@ -1,42 +1,91 @@
import { Worker, type Job } from 'bullmq';
import { eq } from 'drizzle-orm';
import { env } from '@/lib/env';
import type { ConnectionOptions } from 'bullmq';
import { logger } from '@/lib/logger';
import { attachWorkerAudit } from '@/lib/queue/audit-helpers';
import { QUEUE_CONFIGS } from '@/lib/queue';
import { db } from '@/lib/db';
import { importBatches } from '@/lib/db/schema/imports';
import { getStorageBackend } from '@/lib/storage';
import { parseImportFile } from '@/lib/import/engine';
import { getAdapter } from '@/lib/import/registry';
import { commitBatch } from '@/lib/import/commit';
import type { ConflictPolicy } from '@/lib/import/types';
function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
return new Promise((resolve, reject) => {
const chunks: Buffer[] = [];
stream.on('data', (c: Buffer) => chunks.push(c));
stream.on('end', () => resolve(Buffer.concat(chunks)));
stream.on('error', reject);
});
}
/**
* Bulk-import worker - DEFERRED FEATURE (placeholder).
*
* Status: registered with BullMQ so any future enqueue site lands on a
* real worker instance instead of disappearing into an unbound queue.
* No callers currently dispatch to this worker - the body is intentionally
* a no-op that logs the dispatch for forensics.
*
* Why deferred (vs implemented inline):
* - CSV/Excel import is a real product feature, not a refactor. Done
* properly it needs: per-entity schema mapping (clients / berths /
* interests / companies / yachts), zod-level row validation, per-row
* error rollup with line-numbered diagnostics, dry-run preview,
* progress reporting, dedupe-on-conflict policy, admin upload UI
* with column-mapping UX. Building it speculatively without a
* customer in the room would lock in a UX that may not match what
* real importers want.
* - When the trigger comes (a customer needs to bulk-load a season
* roster or migrate from another CRM), build it from product spec
* not from this placeholder.
*
* What's required to ship: papaparse (CSV) + a thin schema-per-entity
* mapping layer, plus an admin /admin/import page with a per-entity
* picker + file dropzone. The queue registration here stays as-is.
* Bulk-import commit worker. Enqueued by the dry-run "commit" action with
* `{ batchId }`. Re-reads the uploaded file from storage, re-classifies +
* applies each row under the batch's conflict policy (see commitBatch), and
* records per-row actions + live counts. Concurrency 1 (queue config) so
* imports never race each other's dedup lookups.
*/
export const importWorker = new Worker(
'import',
async (job: Job) => {
logger.info({ jobId: job.id, jobName: job.name }, 'Processing import job');
// Deferred - no callers enqueue this. If a job lands, we log + swallow
// so a future test enqueue doesn't trip the failed-job alert.
const { batchId } = (job.data ?? {}) as { batchId?: string };
logger.info({ jobId: job.id, batchId }, 'Processing import commit');
if (!batchId) return;
const [batch] = await db
.select()
.from(importBatches)
.where(eq(importBatches.id, batchId))
.limit(1);
if (!batch) {
logger.warn({ batchId }, 'Import batch vanished before commit');
return;
}
const adapter = getAdapter(batch.entityType);
if (!adapter || !batch.storageKey || !batch.mappingJson) {
await db
.update(importBatches)
.set({ status: 'failed', completedAt: new Date() })
.where(eq(importBatches.id, batchId));
logger.error({ batchId, entityType: batch.entityType }, 'Import batch not committable');
return;
}
try {
const stream = await (await getStorageBackend()).get(batch.storageKey);
const buffer = await streamToBuffer(stream);
const parsed = await parseImportFile(batch.filename, buffer);
await commitBatch({
batchId,
adapter,
rawRows: parsed.rows,
mapping: batch.mappingJson,
policy: batch.conflictPolicy as ConflictPolicy,
ctx: {
portId: batch.portId,
meta: {
userId: batch.createdBy,
portId: batch.portId,
ipAddress: 'import-worker',
userAgent: 'import-worker',
},
},
});
} catch (err) {
await db
.update(importBatches)
.set({ status: 'failed', completedAt: new Date() })
.where(eq(importBatches.id, batchId));
logger.error({ batchId, err }, 'Import commit failed');
throw err;
}
},
{
connection: { url: env.REDIS_URL } as ConnectionOptions,