net: support krunkit network.mode: shared #1560

Merged
abiosoft merged 2 commits into abiosoft:main from acelinkio:feat/sharedNetworkKrunkit
Apr 20, 2026

@acelinkio
Contributor

This change should help address #1559. It should allow L2 announcements to work with tools such as MetalLB and Cilium L2 announcements.

Both Lima and krunkit support socket_vmnet.

I verified that the templating adds a second interface:

🍎 ❯ grep -A10 "^networks:" ~/.colima/_lima/colima-mtest/lima.yaml
networks:
    - lima: user-v2
    - socket: /Users/ep/.colima/mtest/daemon/vmnet.sock
      interface: col0
      metric: 300
provision:
    - mode: system
      script: sysctl -w fs.inotify.max_user_watches=1048576
    - mode: system
      script: grep '127.0.0.1 colima-mtest' /etc/hosts || echo '127.0.0.1 colima-mtest' >> /etc/hosts
    - mode: system
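
For anyone who wants to script the same check, here is a minimal sketch in Python. It is a plain line scan, not a real YAML parser, and the helper name `socket_networks` is hypothetical. It assumes the `networks:` block is laid out the way the grep output above shows it.

```python
def socket_networks(lima_yaml_text):
    """Return the socket paths listed under the top-level `networks:` key.

    Assumption: entries are indented under `networks:` as in the output above.
    """
    sockets = []
    in_networks = False
    for line in lima_yaml_text.splitlines():
        if line and not line[0].isspace():
            # A new top-level key starts here; note whether it is `networks:`.
            in_networks = line.startswith("networks:")
            continue
        if in_networks:
            # Drop indentation and the leading list dash, if any.
            entry = line.strip().lstrip("- ").strip()
            if entry.startswith("socket:"):
                sockets.append(entry.split(":", 1)[1].strip())
    return sockets


# Sample text copied from the grep output above.
sample = """\
networks:
    - lima: user-v2
    - socket: /Users/ep/.colima/mtest/daemon/vmnet.sock
      interface: col0
      metric: 300
provision:
    - mode: system
"""
print(socket_networks(sample))
```

An empty list would suggest that only the default user-v2 network was templated and the shared-mode socket interface is missing.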

I will test tomorrow by finishing the Kubernetes cluster setup and checking whether L2 announcements work as expected.

One step closer to having GPU-accelerated pods running in Kubernetes!

Signed-off-by: Arlan Lloyd <arlanlloyd@gmail.com>
@acelinkio
Contributor Author

Confirmed that network.mode: shared works with this change. I was able to use L2 announcements in my Kubernetes cluster without issue.

Thanks again for this project! The next item to test is whether I can get a GPU-accelerated pod running in Kubernetes.

@acelinkio
Contributor Author

acelinkio commented Apr 19, 2026

I think this PR can be merged. I verified both the network functionality and GPU passthrough. I am going to record some of my progress here.

I was able to get GPU passthrough working by following the demo code at https://github.com/medyagh/ai-playground-minikube/tree/main/macos with one small change.

The squat/generic-device-plugin image tag was not specified, so a newer version is pulled to create that container. That version uses devic.es/dri: 1 instead of squat.ai/dri: 1 for devices.

🍎 ❯ kubectl exec pod/uvkc-gpu -it -- /uvkc/mad_throughput
2026-04-18T15:58:20+00:00
Running /uvkc/mad_throughput
Run on (4 X 48 MHz CPU s)
Load Average: 2.19, 0.78, 0.47
-----------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                               Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------------------------
Virtio-GPU Venus (Apple M5 Pro)/mad_throughput_f32/1048576/100000/manual_time      511694 us         2334 us            1 FLOps=4.09845T/s
Virtio-GPU Venus (Apple M5 Pro)/mad_throughput_f32/1048576/200000/manual_time     1022130 us         4107 us            1 FLOps=4.10349T/s
Virtio-GPU Venus (Apple M5 Pro)/mad_throughput_f16/1048576/100000/manual_time      280612 us         1820 us            3 FLOps=7.4735T/s
Virtio-GPU Venus (Apple M5 Pro)/mad_throughput_f16/1048576/200000/manual_time      558286 us         2800 us            1 FLOps=7.51282T/s
llvmpipe (LLVM 20.1.7, 128 bits)/mad_throughput_f32/1048576/100000/manual_time   22940306 us         27.6 us            1 FLOps=91.4178G/s
llvmpipe (LLVM 20.1.7, 128 bits)/mad_throughput_f32/1048576/200000/manual_time   45849904 us         23.7 us            1 FLOps=91.479G/s
llvmpipe (LLVM 20.1.7, 128 bits)/mad_throughput_f16/1048576/100000/manual_time   72468475 us         23.0 us            1 FLOps=28.9388G/s
llvmpipe (LLVM 20.1.7, 128 bits)/mad_throughput_f16/1048576/200000/manual_time  368624292 us         87.0 us            1 FLOps=11.3783G/s
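
To put the two backends side by side, here is a small arithmetic sketch in Python. The FLOps figures are copied from the 1048576/100000 rows above; the variable and function names are only illustrative.

```python
# FLOps/s figures copied from the mad_throughput output above
# (the 1048576/100000 rows for each device and precision).
venus_f32 = 4.09845e12     # Virtio-GPU Venus, f32
venus_f16 = 7.4735e12      # Virtio-GPU Venus, f16
llvmpipe_f32 = 91.4178e9   # llvmpipe (CPU rasterizer), f32
llvmpipe_f16 = 28.9388e9   # llvmpipe (CPU rasterizer), f16


def speedup(gpu_flops, cpu_flops):
    """Ratio of GPU throughput to CPU-fallback throughput."""
    return gpu_flops / cpu_flops


print(f"f32 speedup: {speedup(venus_f32, llvmpipe_f32):.1f}x")
print(f"f16 speedup: {speedup(venus_f16, llvmpipe_f16):.1f}x")
```

On these numbers the Venus device comes out at roughly 45x the llvmpipe throughput for f32 and roughly 258x for f16, which is a quick way to confirm the pod is actually using the passed-through GPU and not the software fallback.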

However, when I use the ghcr.io/ggml-org/llama.cpp:light-vulkan image to run a model, I get:

root@llama-server-7988bcff84-p9l4t:/app# ./llama-cli -ngl 99
load_backend: loaded CPU backend from /app/libggml-cpu-armv8.2_2.so
warning: no usable GPU found, --gpu-layers option will be ignored
warning: one possible reason is that llama.cpp was compiled without GPU support
warning: consult docs/build.md for compilation instructions
error: --model is required
root@llama-server-7988bcff84-p9l4t:/app# ./llama-cli --list-devices
load_backend: loaded CPU backend from /app/libggml-cpu-armv8.2_2.so
Available devices:
root@llama-server-7988bcff84-p9l4t:/app# apt-get update && apt-get install -y vulkan-tools
...
...
...
root@llama-server-7988bcff84-p9l4t:/app# vulkaninfo
ERROR at ./vulkaninfo/./vulkaninfo.h:613:vkCreateInstance failed with ERROR_OUT_OF_HOST_MEMORY

Still troubleshooting, but that is unrelated to this PR.

EDIT:
GPU passthrough is working with macOS -> colima -> krunkit -> k3s/containerd -> pod. The issue was with the container image; use quay.io/ramalama/ramalama:0.19.

@abiosoft
Owner

This was actually an oversight from me.
Thanks :)

abiosoft merged commit 156799f into abiosoft:main on Apr 20, 2026
11 of 12 checks passed