ZIP/TAR entries with leading path separator are missing #791

@jpsnyder

I've encountered a zip file in which an entry's filename includes a leading path separator (/). Such entries are missing when you iterate through the sub_file_entries of the root file entry.
It looks like dfvfs explicitly assumes that sub file entries do not start with a path separator, which explains why they are missing: https://github.com/log2timeline/dfvfs/blob/main/dfvfs/vfs/zip_directory.py#L23

However, while it might be formally incorrect, nothing stops someone from creating a zip file with leading path separators (like using arcname in zipfile). dfvfs should handle filenames both with and without the leading path separator.

import zipfile
from dfvfs.resolver import resolver
from dfvfs.path import zip_path_spec, os_path_spec

input_path = "test.zip"
with zipfile.ZipFile(input_path, "w") as zf:
    zf.writestr("/missing.txt", b"data")
    zf.writestr("not_missing.txt", b"data")

path_spec = zip_path_spec.ZipPathSpec(location="/", parent=os_path_spec.OSPathSpec(location=input_path))
file_entry = resolver.Resolver().OpenFileEntry(path_spec)
for entry in file_entry.sub_file_entries:
    print(entry.path_spec.comparable)  # only prints not_missing.txt

EDIT: The same issue also occurs with TAR archives.
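
A comparable TAR archive can be produced with the standard library for testing. This is only a sketch: the helper `add_bytes` is hypothetical, and the archive is built in memory rather than written to disk. It assumes that `tarfile.TarInfo` keeps the member name as given, including the leading separator.

```python
import io
import tarfile


def add_bytes(archive, name, data):
    """Hypothetical helper: add an in-memory file to an open TarFile."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


# Write a TAR archive with one member name that starts with "/".
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tf:
    add_bytes(tf, "/missing.txt", b"data")
    add_bytes(tf, "not_missing.txt", b"data")

# Read it back and list the member names as stored.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tf:
    tar_names = tf.getnames()

print(tar_names)
```

Writing the buffer to a file and opening it through a TAR path specification should then reproduce the missing entry in the same way as the zip case.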
