Super long zip file fails to extract 1 file from directory #345

@15baraniana

We have a 12 GB zip file and use the unzipper package to find a specific file in it and extract that file to the file system. The package lists all the files in the archive correctly (50 files), but when I select the one I want by path and try to extract it, the extraction fails.

Here is a simplified version of my code.

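    import * as fs from 'fs';
    import { Open, CentralDirectory } from 'unzipper';

    // Read the central directory of the archive (zipPath points at the 12 GB zip).
    const directory: CentralDirectory = await Open.file(zipPath);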

    console.log('Files in zip:');
    directory.files.forEach((file) => {
      console.log(file.path);
    });

    const targetFileName = 'test.txt';
    const file = directory.files.find((f) => f.path === targetFileName);

    if (!file) {
      console.error(`File ${targetFileName} not found in zip`);
      return;
    }
    console.log('file', file);
    console.log(`Found ${targetFileName}, extracting...`);

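    // Reject on errors from either stream: pipe() does not forward source errors.
    await new Promise<void>((resolve, reject) => {
      file
        .stream()
        .on('error', reject)
        .pipe(
          fs.createWriteStream(`/Users/aribaranian/Desktop/${targetFileName}`)
        )
        .on('error', reject)
        .on('finish', resolve);
    });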

This code works well for smaller zip files, but for this large one it throws the error below. I have included the console output for the file entry for reference.

    file {
      signature: 33639248,
      versionMadeBy: 788,
      versionsNeededToExtract: 45,
      flags: 8,
      compressionMethod: 8,
      lastModifiedTime: 16682,
      lastModifiedDate: 22735,
      crc32: 3622192553,
      compressedSize: 324262805,
      uncompressedSize: 352829440,
      fileNameLength: 36,
      extraFieldLength: 44,
      fileCommentLength: 0,
      diskNumber: 0,
      internalFileAttributes: 0,
      externalFileAttributes: 2176057344,
      offsetToLocalFileHeader: null,
      lastModifiedDateTime: 2024-06-15T08:09:20.000Z,
      pathBuffer: <Buffer 32 32 36 34 30 32 30 2d 31 30 30 2d 35 30 30 2d 4d 41 49 4e 2d 41 52 43 48 2d 47 48 5f 52 32 31 2e 72 76 74>,
      path: '2264020-100-500-MAIN-ARCH-GH_R21.rvt',
      isUnicode: false,
      extra: {
        signature: 1,
        partsize: 8,
        uncompressedSize: 4382034189,
        compressedSize: null,
        offset: null,
        disknum: null
      },
      comment: '',
      type: 'File',
      stream: [Function (anonymous)],
      buffer: [Function (anonymous)]
    }
    Found 2264020-100-500-MAIN-ARCH-GH_R21.rvt, extracting...
    Error parsing zip: TypeError [ERR_INVALID_ARG_TYPE]: The "start" argument must be of type number. Received null

The only unusual thing I see is offsetToLocalFileHeader: null, but I can't figure out why it is null. I have tried increasing the tailSize option, but that doesn't help. I also examined the file in a hex editor to find the EOCD signature, and it was ~22 bytes from the end.
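
One possible explanation, offered as an assumption and not confirmed against the unzipper source: the ZIP specification (APPNOTE, section 4.5.3) says the ZIP64 extended information field (header ID 0x0001) contains only those values whose 32-bit counterpart in the central directory record is 0xFFFFFFFF (0xFFFF for the disk number), in a fixed order. In the log above, the extra block has partsize: 8, so it carries a single 8-byte value, while both sizes in the record fit in 32 bits. If the raw offset in the record was 0xFFFFFFFF, that single value would be the local header offset, not the uncompressed size. The TypeScript sketch below shows what spec-ordered parsing could look like; resolveZip64, RawCentralDirFields, buildZip64Extra and the raw offset of 0xFFFFFFFF are hypothetical and not part of unzipper.

```typescript
// Raw values as stored in the central directory record, before ZIP64 substitution.
interface RawCentralDirFields {
  uncompressedSize: number;
  compressedSize: number;
  offsetToLocalFileHeader: number;
  diskNumber: number;
}

const MAX_U32 = 0xffffffff;
const MAX_U16 = 0xffff;
const ZIP64_HEADER_ID = 0x0001;

// Read an unsigned 64-bit little-endian integer as a number (exact below 2^53).
function readUInt64LE(view: DataView, pos: number): number {
  const lo = view.getUint32(pos, true);
  const hi = view.getUint32(pos + 4, true);
  return hi * 0x100000000 + lo;
}

// Hypothetical helper: substitute ZIP64 values for saturated 32-bit fields.
function resolveZip64(extraField: Uint8Array, raw: RawCentralDirFields): RawCentralDirFields {
  const view = new DataView(extraField.buffer, extraField.byteOffset, extraField.byteLength);
  const resolved: RawCentralDirFields = { ...raw };
  let pos = 0;
  while (pos + 4 <= view.byteLength) {
    const headerId = view.getUint16(pos, true);
    const dataEnd = pos + 4 + view.getUint16(pos + 2, true);
    if (dataEnd > view.byteLength) break; // truncated block, stop scanning
    if (headerId === ZIP64_HEADER_ID) {
      let p = pos + 4;
      // Fixed order, but each value is present only if its raw field is saturated.
      if (raw.uncompressedSize === MAX_U32 && p + 8 <= dataEnd) {
        resolved.uncompressedSize = readUInt64LE(view, p);
        p += 8;
      }
      if (raw.compressedSize === MAX_U32 && p + 8 <= dataEnd) {
        resolved.compressedSize = readUInt64LE(view, p);
        p += 8;
      }
      if (raw.offsetToLocalFileHeader === MAX_U32 && p + 8 <= dataEnd) {
        resolved.offsetToLocalFileHeader = readUInt64LE(view, p);
        p += 8;
      }
      if (raw.diskNumber === MAX_U16 && p + 4 <= dataEnd) {
        resolved.diskNumber = view.getUint32(p, true);
      }
      break;
    }
    pos = dataEnd; // skip unrelated extra blocks
  }
  return resolved;
}

// Example input: a ZIP64 block holding one 8-byte value, as in the log (partsize: 8).
function buildZip64Extra(value: number): Uint8Array {
  const bytes = new Uint8Array(12);
  const view = new DataView(bytes.buffer);
  view.setUint16(0, ZIP64_HEADER_ID, true);
  view.setUint16(2, 8, true);
  view.setUint32(4, value % 0x100000000, true);
  view.setUint32(8, Math.floor(value / 0x100000000), true);
  return bytes;
}

const resolved = resolveZip64(buildZip64Extra(4382034189), {
  uncompressedSize: 352829440,
  compressedSize: 324262805,
  offsetToLocalFileHeader: MAX_U32, // assumed raw value, not shown in the log
  diskNumber: 0,
});
console.log(resolved.offsetToLocalFileHeader); // 4382034189
```

Under that assumption, the 8-byte value from the log resolves to a local header offset of about 4.38 GB, which is plausible for an entry in a 12 GB archive, and both sizes stay at their 32-bit values.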

Any help would be greatly appreciated.
