I've encountered a zip file whose member `filename` values include a leading path separator (`/`). These entries are missing when you iterate through the sub_file_entries of the root file entry.
It looks like dfvfs explicitly assumes sub file entries do not include a path separator, which explains why they are missing: https://github.com/log2timeline/dfvfs/blob/main/dfvfs/vfs/zip_directory.py#L23
However, while it might be formally incorrect, nothing stops someone from creating a zip file with leading path separators (for example via `arcname` in `zipfile`). dfVFS should handle filenames both with and without the leading path separator.
import zipfile

from dfvfs.resolver import resolver
from dfvfs.path import zip_path_spec, os_path_spec

input_path = "test.zip"

# Create a zip with one member that has a leading "/" and one that does not.
with zipfile.ZipFile(input_path, "w") as zf:
    zf.writestr("/missing.txt", b"data")
    zf.writestr("not_missing.txt", b"data")

path_spec = zip_path_spec.ZipPathSpec(
    location="/", parent=os_path_spec.OSPathSpec(location=input_path))
file_entry = resolver.Resolver().OpenFileEntry(path_spec)

for entry in file_entry.sub_file_entries:
    print(entry.path_spec.comparable)  # only prints not_missing.txt
EDIT: The same issue also happens with TAR archives.
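For reference, here is a minimal stdlib-only sketch (no dfVFS involved) that builds both archive types in memory and lists the stored member names. As far as I can tell, `zipfile.ZipFile.writestr` and `tarfile.TarFile.addfile` keep the leading `/` as given. The `normalize_member_name` helper is hypothetical and only illustrates one way both name styles could be mapped to the same entry; it is not how dfVFS does or should implement this.

```python
import io
import tarfile
import zipfile

NAMES = ("/missing.txt", "not_missing.txt")


def normalize_member_name(name):
    """Hypothetical helper: drop leading path separators from a member name."""
    return name.lstrip("/")


def list_zip_names():
    """Writes a zip into memory and returns the member names as stored."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in NAMES:
            zf.writestr(name, b"data")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as zf:
        return zf.namelist()


def list_tar_names():
    """Writes a tar into memory and returns the member names as stored."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tf:
        for name in NAMES:
            info = tarfile.TarInfo(name=name)
            info.size = 4  # length of b"data"
            tf.addfile(info, io.BytesIO(b"data"))
    buffer.seek(0)
    with tarfile.open(fileobj=buffer, mode="r") as tf:
        return tf.getnames()


if __name__ == "__main__":
    for label, names in (("zip", list_zip_names()), ("tar", list_tar_names())):
        print(label, "stored:", names)
        print(label, "normalized:", [normalize_member_name(n) for n in names])
```

Comparing the stored names against what `sub_file_entries` yields for the same archive should make it easy to confirm which members are being dropped.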