fix: decode pod device annotations with init containers#104
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: FouoF The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Could you please rebase this PR against latest main? Please keep the MLU decode call from #100: DecodeMLUContainerDevices(fmt.Sprintf("%s_%s", str, instance), nodeName)nodeName is already passed separately, so it should not be appended into the formatted string. |
HAMi encodes vgpu-devices-allocated entries for init containers and regular containers in order; align decoding and container mapping with that index layout so GPU usage shows correctly in the WebUI. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Jifei Wang <jifei.wang@dynamia.ai>
📝 WalkthroughWalkthroughFixes container device index misalignment for pods with InitContainer device alignment
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
server/internal/provider/util/util_test.go (1)
353-403: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd coverage for
fetchContainerInfotoo.This test locks down
DecodePodDevices, but the user-facing fix also depends onserver/internal/data/pod.goapplying the same init-container offset when it maps decoded slots back ontopod.Spec.Containers. A direct test forpodRepo.fetchContainerInfo(or the caller that populatesbiz.PodInfo.Ctrs) would catch regressions there that still pass this utility-level test.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/internal/provider/util/util_test.go` around lines 353 - 403, Add test coverage for the init-container offset in the higher-level pod mapping path, not just DecodePodDevices. Introduce a focused test for fetchContainerInfo (or the caller that builds biz.PodInfo.Ctrs) in server/internal/data/pod.go, using a pod with init containers and one main container, and assert the decoded device slot is mapped onto the main container after the offset is applied. Keep the existing util test, but ensure the new test exercises the container-index translation logic in fetchContainerInfo so regressions there are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@server/internal/provider/util/util_test.go`:
- Around line 353-403: Add test coverage for the init-container offset in the
higher-level pod mapping path, not just DecodePodDevices. Introduce a focused
test for fetchContainerInfo (or the caller that builds biz.PodInfo.Ctrs) in
server/internal/data/pod.go, using a pod with init containers and one main
container, and assert the decoded device slot is mapped onto the main container
after the offset is applied. Keep the existing util test, but ensure the new
test exercises the container-index translation logic in fetchContainerInfo so
regressions there are caught.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 2a4f6f80-f0af-41d1-a07f-ae181d300c6d
📒 Files selected for processing (3)
server/internal/data/pod.goserver/internal/provider/util/util.goserver/internal/provider/util/util_test.go
HAMi encodes vgpu-devices-allocated entries for init containers and regular containers in order; align decoding and container mapping with that index layout so GPU usage shows correctly in the WebUI.
fix Project-HAMi/HAMi#1932
Summary by CodeRabbit
Bug Fixes
Tests