Describe the bug
I am the author of PureHDF, a C# library to read and write HDF5 files without a dependency on the HDF5 C library. Recently I received a merge request in which a user of the library reports that files written by PureHDF can no longer be opened in e.g. HDFView and all other tools that rely on the C library.
The merge request proposes to use the minimum number of bytes required to encode the chunk dimensions instead of using a hardcoded value of 8 bytes.
For now I rejected this proposal because the spec describes the Dimension Size Encoded Length field of the Chunked Storage Property Description as "This is the size in bytes used to encode Dimension Size." The spec does not state that this has to be the minimal size required to encode the dimension lengths, so in PureHDF I opted to always use 8 bytes.
The problem now is that a commit from 6 months ago introduced a check that requires the minimal number of bytes to be used and throws an error otherwise: https://github.com/HDFGroup/hdf5/blame/develop/src/H5Dchunk.c#L858
As a result, all files written with the fixed length of 8 bytes can no longer be read.
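To illustrate the difference between the two encodings, here is a minimal Python sketch. It is neither PureHDF nor HDF5 library code; the function names are hypothetical, and it assumes the dimension sizes are stored as little-endian unsigned integers of the declared length. It shows that the same chunk dimensions decode identically whether they are written with the minimal length or with a fixed 8 bytes.

```python
def min_encoded_length(chunk_dims):
    """Smallest number of bytes that can hold the largest chunk dimension."""
    largest = max(chunk_dims)
    return max(1, (largest.bit_length() + 7) // 8)


def encode_dims(chunk_dims, encoded_length):
    """Encode every dimension with the same number of bytes (assumed little-endian)."""
    return b"".join(d.to_bytes(encoded_length, "little") for d in chunk_dims)


def decode_dims(buf, encoded_length, rank):
    """Decode `rank` dimensions, each stored in `encoded_length` bytes."""
    return [
        int.from_bytes(buf[i * encoded_length:(i + 1) * encoded_length], "little")
        for i in range(rank)
    ]


dims = [100, 200]
minimal = min_encoded_length(dims)      # what the merge request proposes
fixed = 8                               # what PureHDF currently writes

# Both encodings carry the same information once the declared length is honored.
assert decode_dims(encode_dims(dims, minimal), minimal, len(dims)) == dims
assert decode_dims(encode_dims(dims, fixed), fixed, len(dims)) == dims
```

The only difference between the two files is the value of the length field and the number of padding bytes, which is why I read the spec as allowing either.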
If you confirm that my interpretation of the spec is correct, i.e. that implementations are free to choose the number of bytes used to encode the chunk dimension sizes, then it would be great if you could remove the recently introduced check linked above for file reading operations.
Expected behavior
To comply with the spec, the HDF5 library should, when reading, accept the actual value of the Dimension Size Encoded Length field stored in the file instead of requiring it to be the minimal length needed to encode the dimensions.
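The behavior I have in mind can be sketched as follows. This is a hypothetical Python illustration, not the code in H5Dchunk.c: the `strict` branch reflects my understanding of what the new check enforces, and the upper bound of 8 bytes is an assumption based on 64-bit dimension sizes.

```python
def is_acceptable_encoded_length(chunk_dims, encoded_length, strict=False):
    """Return True if a reader should accept the declared encoded length.

    strict=True  : only the minimal length is accepted (my reading of the new check).
    strict=False : any length that can hold the largest dimension is accepted.
    """
    largest = max(chunk_dims)
    minimal = max(1, (largest.bit_length() + 7) // 8)
    if strict:
        return encoded_length == minimal
    # Assumption: dimension sizes never need more than 8 bytes.
    return minimal <= encoded_length <= 8


dims = [100, 200]
# A file written with a fixed length of 8 bytes passes the lenient check
# but is rejected by the strict one.
assert is_acceptable_encoded_length(dims, 8)
assert not is_acceptable_encoded_length(dims, 8, strict=True)
```

A lenient reader still rejects a length that is too small to hold the dimensions, so it would not accept files that are actually corrupt in this field.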
Additional context