feat(documents): importer for organized S3/filesystem buckets

One-shot script that walks an existing organized bucket tree, builds
matching document_folders rows mirroring the path, then inserts
documents + files rows pointing at the existing storage keys verbatim
— no path rewrite. For migrating from a legacy MinIO bucket whose
folder structure is already the source of truth.

Idempotency:
  • Folders: sibling-name unique index swallows duplicate creates;
    we reuse the row on ConflictError.
  • Documents: skipped when (port_id, fileStoragePath) already exists.

Adds StorageBackend.listByPrefix (recursive readdir on filesystem;
listObjectsV2 stream-drain on s3) — the first one-shot caller, not
a hot path. Pure parseImportPath helper extracted to its own module
and unit-tested for trailing slashes, empty intermediate segments,
prefix mismatch, and special-character folder names (8 tests).

Audit log per imported doc carries source='organized-bucket-importer',
storageKey, and folderSegments so the documents inspector can filter
on imports later.

CLI:
  pnpm tsx scripts/import-organized-documents.ts \\
      --port-slug <slug> \\
      --bucket-prefix "legacy-imports/" \\
      (--dry-run | --apply) [--uploaded-by <userId>]

Folds in Prettier post-hook drift on documents.service.ts +
download handler — same lint-staged formatting the earlier commits
already absorbed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-10 16:53:51 +02:00
parent e790ff708b
commit ef63e86fde
8 changed files with 495 additions and 9 deletions

View File

@@ -72,6 +72,15 @@ export interface StorageBackend {
/** Generate a short-lived URL the browser can GET from. */
presignDownload(key: string, opts: PresignOpts): Promise<{ url: string; expiresAt: Date }>;
/**
* Recursively list keys under `prefix`. Returns the relative key for each
* object, sorted alphabetically. Empty prefix means "the entire bucket /
* storage root". Used by one-shot importers (e.g. organized-bucket
* document import) that need to walk a flat key namespace; not meant for
* runtime hot paths.
*/
listByPrefix(prefix: string): Promise<string[]>;
readonly name: StorageBackendName;
}