Add custom H5 resolver#125
Conversation
|
@axelboc @loichuder Mind to have a look? |
|
I think adding an abstraction layer is a good idea if we want to replace It is not only the file access that relies on |
|
You shall see that it converts a nominal path to an IO object, which is fed to h5py, the h5py.File is still returned and read by other utilities, not replacing anything. |
Ah yes, got it. Sorry, missed this. |
loichuder
left a comment
There was a problem hiding this comment.
As I said, I think the resolver approach is sensible. I think we can improve the current implementation if you agree.
I'll suggest to instead pass the resolver as an arg of open_file_with_error_fallback (and of get_content_from_file upstream), with a default fallback to the H5pyResolver. Then, it will be up to the consumer app to handle the lifecycle of the resolver.
resolver could replace h5py_options since the options of h5py can then be passed via the resolver.extra_options instead. That would also make #80 easier since apps will have full control over the resolver and its options.
What do you think?
@contextmanager
def open_file_with_error_fallback(
filepath: str | Path,
create_error: Callable[[int, str], Exception],
resolver: H5FileResolver = H5FileResolver(),
) -> Iterator[h5py.File]: ...If I understand correctly, you are suggesting the above interface. @router.get("/data")
def get_data(
file: str = Depends(add_base_path),
path: str = "/",
dtype: str = "origin",
format: str = "json",
flatten: bool = False,
selection=None,
):
"""`/data` endpoint handler"""
with get_content_from_file(file, path, create_error) as content:
if not isinstance(content, DatasetContent):
raise TypeError(f"{content.path} is not a dataset")
data = content.data(selection, flatten, dtype)
h5grove_response = encode(data, format)
return Response(
content=h5grove_response.content, headers=h5grove_response.headers
)I think the interface can be changed to the one you proposed, but the injection mechanism should be retained such that users can inject custom resolver. So If downstream users have two options to provide custom behavior if they are using the utilities directly. |
|
I see, you want to inject it at the I'd prefer the injection to be more explicit. The I'll think about it and get back to you. |
|
Yes, we want to inject custom behaviour at app level. |
|
Extra options from #80 may likely be global as well. |
After reconsidering, I do not think relying on the native S3 support in h5py/HDF5 is a practical approach for several reasons:
ros3driver requires the underlying HDF5 library to be compiled with that feature enabled. Since h5py does not provide this by default, a custom build would almost always be necessary, which adds significant setup complexity.A more practical solution may be to introduce an additional abstraction layer responsible for converting file paths into file objects.
This approach: