Difference in access time for shorter vs longer kerchunk files #538

@kthyng

Hi! I use kerchunk all the time and love it, so thank you! I have a question: I'm seeing noticeably different access times, and they get longer as the time span covered by the kerchunk file gets longer.

1-year kerchunk file:

Image

24-year kerchunk file (but with the same number of times and values accessed as with the 1-year kerchunk file):

Image

(Each plot compares different subchunking settings, which I am trying out because these files are uncompressed netCDF4 files.)

Is this expected behavior? Is there anything I can do to counteract this effect? For example, would it be better if I used 24 1-year kerchunk files? I've read through a bunch of issues here and I wonder if there is a flag that would help with this.

Thank you for any help!

Edited to add: These are parquet kerchunk files.
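
A minimal sketch of how a comparison like this could be timed, separating the cost of opening the references from the cost of reading data. The paths, the variable and dimension names, and the helpers `time_call` and `open_kerchunk` are placeholders for illustration, not the code behind the plots above, and the open call may need adjusting depending on the installed fsspec, xarray, and zarr versions.

```python
import time
from statistics import median


def time_call(fn, repeats=3):
    """Return the median wall-clock seconds over `repeats` calls of fn()."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return median(durations)


def open_kerchunk(ref_path, remote_protocol="file"):
    """Open a kerchunk reference set (JSON file or parquet directory) with xarray.

    Hypothetical helper. Imports are local so that time_call can be used
    even where fsspec and xarray are not installed.
    """
    import fsspec
    import xarray as xr

    # fsspec's "reference" filesystem reads kerchunk references given via `fo`.
    fs = fsspec.filesystem(
        "reference", fo=ref_path, remote_protocol=remote_protocol
    )
    return xr.open_dataset(
        fs.get_mapper(""),
        engine="zarr",
        backend_kwargs={"consolidated": False},
    )


# Hypothetical usage; "refs_1yr.parq", "temp", and "ocean_time" are placeholders:
# t_open = time_call(lambda: open_kerchunk("refs_1yr.parq"))
# ds = open_kerchunk("refs_1yr.parq")
# t_read = time_call(lambda: ds["temp"].isel(ocean_time=slice(0, 10)).load())
# print(f"open: {t_open:.2f} s, read: {t_read:.2f} s")
```

Running the same two measurements against the 1-year and 24-year reference sets would show whether the extra time is spent when opening the references or when reading the data.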
