fix(audit): import cluster — M27 (commit idempotency), M25 (in-file dedup preview), M26 (undo destructive-update reporting), L33 (mapping/mooring), L35 (port-auth doc)

The M25 DB unique-index backstop is deferred: it needs a migration (column +
backfill + insert-stamp trigger + dedup) and is tracked as a follow-up. The
classify in-file dedup (preview accuracy) ships now.
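
For M27, the two layers described in the worker comment (a cheap early return
in the worker, plus a conditional status transition inside commitBatch) can be
sketched roughly as below. This is an in-memory illustration only, not the
code in this commit: the Map stands in for the batches table, and
`claimForCommit` and `processJob` are hypothetical names. Only the status
values are taken from the diff.

```typescript
// Hypothetical sketch of the M27 idempotency guard. All names except the
// status values are illustrative assumptions, not the project's real API.
type BatchStatus =
  | 'uploaded'
  | 'dry_run'
  | 'committing'
  | 'completed'
  | 'failed'
  | 'undone';

interface Batch {
  id: string;
  status: BatchStatus;
}

// Only batches still awaiting commit may pass the gate.
const COMMITTABLE: ReadonlySet<BatchStatus> = new Set<BatchStatus>([
  'uploaded',
  'dry_run',
]);

// In-memory stand-in for the batches table.
const batches = new Map<string, Batch>();

// Models a conditional UPDATE of the form
//   UPDATE ... SET status = 'committing'
//   WHERE id = ? AND status IN ('uploaded', 'dry_run')
// and returns true only when a row was actually transitioned.
function claimForCommit(id: string): boolean {
  const batch = batches.get(id);
  if (!batch || !COMMITTABLE.has(batch.status)) return false;
  batch.status = 'committing';
  return true;
}

function processJob(id: string): 'committed' | 'skipped' {
  const batch = batches.get(id);
  // Layer 1: early return, so a duplicate job skips the file re-read + parse.
  if (!batch || !COMMITTABLE.has(batch.status)) return 'skipped';
  // Layer 2: the conditional transition decides the winner if two workers
  // both got past the early check.
  if (!claimForCommit(id)) return 'skipped';
  // The real worker would read the file and write the ledger rows here.
  batch.status = 'completed';
  return 'committed';
}
```

With a real database the second layer would rely on the affected-row count of
the UPDATE rather than a read followed by a write, since that is what closes
the TOCTOU window between workers.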

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:41:00 +02:00
parent 9305c030de
commit 25988dbfad
6 changed files with 269 additions and 24 deletions

@@ -48,6 +48,27 @@ export const importWorker = new Worker(
      return;
    }
    // Idempotency guard (audit M27): only a batch still awaiting commit may be
    // committed. A re-enqueue (future commit endpoint, operator re-trigger, or
    // any stray duplicate job) of an already-committing/completed/failed/undone
    // batch must NOT re-run — re-processing appends a second full set of
    // import_batch_rows, so undo later sees both run-1 inserts and run-2 skips
    // and the header counts no longer reconcile with the ledger undo trusts.
    // commitBatch itself also gates the status transition with a conditional
    // UPDATE (defense in depth against a TOCTOU race with another worker), but
    // this early return avoids the wasted file re-read + parse in the common
    // case. NOTE for the future authorization boundary (audit L35): when a
    // commit/dry-run API route lands it MUST re-derive portId from the session
    // and assert batch.portId === session.portId before enqueuing, and gate on
    // an `import` permission — this worker trusts batch.portId wholesale.
    if (batch.status !== 'dry_run' && batch.status !== 'uploaded') {
      logger.warn(
        { batchId, status: batch.status },
        'Import batch already past the commit gate — skipping re-run',
      );
      return;
    }
    const adapter = getAdapter(batch.entityType);
    if (!adapter || !batch.storageKey || !batch.mappingJson) {
      await db
@@ -69,6 +90,12 @@ export const importWorker = new Worker(
      mapping: batch.mappingJson,
      policy: batch.conflictPolicy as ConflictPolicy,
      ctx: {
        // Trust boundary (audit L35): portId is taken from the persisted
        // batch and trusted. Safe only because batches are created
        // server-side with no client-supplied portId today. The commit/
        // dry-run API route, when it lands, MUST re-derive portId from the
        // session, assert batch.portId === session.portId, and gate on an
        // `import` permission before enqueuing this job. See ImportCtx.portId.
        portId: batch.portId,
        meta: {
          userId: batch.createdBy,