feat: get_loaded_kernels()#428
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| backend=backend, | ||
| ) | ||
| if not reload: | ||
| for loaded_kernel in get_loaded_kernels(): |
There was a problem hiding this comment.
Doing the lookup here makes the cached loading only work in get_kernel and not all other paths that import it. Maybe it makes more sense to do this in _import_from_path for now? We could always add a second cache keyed on the repository information if we'd like to skip network access as well in a future PR.
There was a problem hiding this comment.
Ok as you prefer. When profiling get_kernel I noticed 95% of the time is spent in install_kernel that's why I thought it was the thing we wanted to cache. I only tried with flash-attn-3 though. Do some kernels take a long time to (re-)load (the actual loading part in _import_from_path when called a second time)?
There was a problem hiding this comment.
I haven't done any benchmarking, so I'm not sure.
| for so_path in file_path.parent.iterdir(): | ||
| if so_path.is_file() and so_path.name.endswith('.so'): | ||
| op_namespace = so_path.name.split('.')[0] |
There was a problem hiding this comment.
This is fragile. E.g. it doesn't work for no-arch kernels and IIRC we also have a different dylib extension name on macOS/Windows.
If we want a kernel to reveal its ops name, I think we should have a standardized API for it in kernels themselves. I think externally probing these things without standardization will break with various kernels (like it does here with noarch). The interface of kernels themselves is how downstream code should interact with kernels. kernels is only a client to fetch + load kernels, not to expose/probe kernel internals (if we can avoid it).
There was a problem hiding this comment.
Ok I see. How would you envision such a standardized API for the kernels to expose their op namespace? (metadata.json could be a good candidate given that it's auto-generated, which I'm not sure about)
danieldk
left a comment
There was a problem hiding this comment.
Looks great!
I think we could add this new function to the docs in a follow-up PR, but we could do that after we possibly add the metadata as well.
New
get_loaded_kernels()APIEnables:
get_kernel(internal) caching based on repo infos (done in this PR)Example "kernels-included package" -> https://huggingface.co/cbensimon/FLUX.2-klein-4B-sm90-cu128-glibc235-r52/tree/main/package