Files
pn-new-crm/tests/unit/document-import.test.ts
Matt ef63e86fde feat(documents): importer for organized S3/filesystem buckets
One-shot script that walks an existing organized bucket tree, builds
matching document_folders rows mirroring the path, then inserts
documents + files rows pointing at the existing storage keys verbatim
— no path rewrite. For migrating from a legacy MinIO bucket whose
folder structure is already the source of truth.

Idempotency:
  • Folders: sibling-name unique index swallows duplicate creates;
    we reuse the row on ConflictError.
  • Documents: skipped when (port_id, fileStoragePath) already exists.

Adds StorageBackend.listByPrefix (recursive readdir on filesystem;
listObjectsV2 stream-drain on s3) — the first one-shot caller, not
a hot path. Pure parseImportPath helper extracted to its own module
and unit-tested for trailing slashes, empty intermediate segments,
prefix mismatch, and special-character folder names (8 tests).

Audit log per imported doc carries source='organized-bucket-importer',
storageKey, and folderSegments so the documents inspector can filter
on imports later.

CLI:
  pnpm tsx scripts/import-organized-documents.ts \\
      --port-slug <slug> \\
      --bucket-prefix "legacy-imports/" \\
      (--dry-run | --apply) [--uploaded-by <userId>]

Folds in Prettier post-hook drift on documents.service.ts +
download handler — same lint-staged formatting the earlier commits
already absorbed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 16:53:51 +02:00

61 lines
1.9 KiB
TypeScript

import { describe, expect, it } from 'vitest';
import { parseImportPath } from '@/lib/services/document-import';
describe('parseImportPath', () => {
it('splits a nested key into folders + filename', () => {
expect(parseImportPath('legacy', 'legacy/Deals 2026/Q1/contract.pdf')).toEqual({
folderSegments: ['Deals 2026', 'Q1'],
filename: 'contract.pdf',
});
});
it('returns empty folderSegments for a file at the prefix root', () => {
expect(parseImportPath('legacy', 'legacy/index.pdf')).toEqual({
folderSegments: [],
filename: 'index.pdf',
});
});
it('tolerates trailing slashes on the prefix', () => {
expect(parseImportPath('legacy/', 'legacy/Deals/x.pdf')).toEqual({
folderSegments: ['Deals'],
filename: 'x.pdf',
});
expect(parseImportPath('legacy///', 'legacy/Deals/x.pdf')).toEqual({
folderSegments: ['Deals'],
filename: 'x.pdf',
});
});
it('collapses empty intermediate segments', () => {
expect(parseImportPath('legacy', 'legacy/a//b/c.pdf')).toEqual({
folderSegments: ['a', 'b'],
filename: 'c.pdf',
});
});
it('handles an empty prefix as "list-the-whole-bucket"', () => {
expect(parseImportPath('', 'Folder/file.pdf')).toEqual({
folderSegments: ['Folder'],
filename: 'file.pdf',
});
});
it('preserves special characters in folder names', () => {
expect(parseImportPath('', "Q1 — Year's End/contract & rider.pdf")).toEqual({
folderSegments: ["Q1 — Year's End"],
filename: 'contract & rider.pdf',
});
});
it('throws when the key is not under the prefix', () => {
expect(() => parseImportPath('legacy', 'other/x.pdf')).toThrow(/not under prefix/);
});
it('throws when the relative path has no filename', () => {
expect(() => parseImportPath('legacy', 'legacy/')).toThrow(/no filename/);
expect(() => parseImportPath('', '')).toThrow(/no filename/);
});
});