| 1 | +# 2026-04-25 RunPod Cheapest Search for Kimi K2.6 GGUF Q2 |
| 2 | + |
| 3 | +## Goal |
| 4 | + |
| 5 | +Find the cheapest currently available RunPod topology that can plausibly run `unsloth/Kimi-K2.6-GGUF:UD-Q2_K_XL`, then start it and benchmark if the runtime becomes reachable. |
| 6 | + |
| 7 | +## Selection Rule |
| 8 | + |
| 9 | +Use only shapes with an even GPU count, and prefer the lowest-cost topology that clears the rough aggregate-VRAM floor for the `340 GB` GGUF artifact. |
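The selection rule can be sketched as a small check. This is a minimal illustration, not the script used for the search: the per-GPU memory figures come from public vendor spec sheets rather than from this log, and the function names are hypothetical.

```python
# Per-GPU memory in GB. These values are assumptions taken from public
# vendor spec sheets; this log does not state them.
GPU_MEMORY_GB = {
    "A40": 48,
    "RTX A6000": 48,
    "MI300X": 192,
    "H100 80GB": 80,
    "H100 NVL": 94,
    "H200": 141,
    "RTX PRO 6000": 96,
}

ARTIFACT_GB = 340  # GGUF artifact size quoted in the selection rule


def aggregate_vram_gb(gpu: str, count: int) -> int:
    """Total VRAM across `count` GPUs of type `gpu`."""
    return GPU_MEMORY_GB[gpu] * count


def passes_selection_rule(gpu: str, count: int, floor_gb: int = ARTIFACT_GB) -> bool:
    """True when the GPU count is even and aggregate VRAM meets the floor.

    The strict comparison is a simplification: it ignores KV cache and
    runtime overhead, and it does not model partial CPU offload.
    """
    return count % 2 == 0 and aggregate_vram_gb(gpu, count) >= floor_gb
```

For example, `passes_selection_rule("MI300X", 2)` compares 2 x 192 = 384 GB against the 340 GB floor. Under this strict reading, `4x H100 80GB` totals 320 GB and falls short, which may be why the rule describes the floor as rough.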
| 10 | + |
| 11 | +## Live Cheap-First Search |
| 12 | + |
| 13 | +### 1. 8x A40 |
| 14 | + |
| 15 | +- Status in inventory: available, low stock |
| 16 | +- Result: `HTTP 500 There are no instances currently available` |
| 17 | +- Outcome: could not allocate |
| 18 | + |
| 19 | +### 2. 8x RTX A6000 |
| 20 | + |
| 21 | +- Status in inventory: advertised in several regions, but without a strong stock signal |
| 22 | +- Result: `HTTP 500 There are no instances currently available` |
| 23 | +- Outcome: could not allocate |
| 24 | + |
| 25 | +### 3. 2x MI300X |
| 26 | + |
| 27 | +| Field | Value | |
| 28 | +| --- | --- | |
| 29 | +| Pod ID | `k9p5qwst0txevv` | |
| 30 | +| Cost | `$3.98/hr` | |
| 31 | +| Machine ID | `j03rnq2tcsxu` | |
| 32 | +| Result | allocated | |
| 33 | + |
| 34 | +Observed behavior: |
| 35 | + |
| 36 | +- `desiredStatus: RUNNING` |
| 37 | +- `uptimeSeconds: 0` |
| 38 | +- no public routing |
| 39 | +- SSH never reached `ready` |
| 40 | + |
| 41 | +Outcome: cheapest allocatable option, but dead host. |
| 42 | + |
| 43 | +### 4. 4x H100 80GB |
| 44 | + |
| 45 | +- Status in inventory: high-level stock existed, but not in the REST-allowed regions that were tested |
| 46 | +- Result: `HTTP 500 There are no instances currently available` |
| 47 | +- Outcome: could not allocate |
| 48 | + |
| 49 | +### 5. 4x H100 NVL |
| 50 | + |
| 51 | +| Field | Value | |
| 52 | +| --- | --- | |
| 53 | +| Pod ID | `j6q4iu80tj922e` | |
| 54 | +| Cost | `$10.36/hr` | |
| 55 | +| Machine ID | `o7h99o28jtin` | |
| 56 | +| Result | allocated | |
| 57 | + |
| 58 | +Observed behavior: |
| 59 | + |
| 60 | +- new machine ID, unlike the recycled RTX PRO 6000 community host |
| 61 | +- SSH metadata appeared at `38.143.35.131:12908` |
| 62 | +- direct SSH still returned `Connection refused` |
| 63 | +- `uptimeSeconds` remained `0` |
| 64 | + |
| 65 | +Outcome: allocates, but still dead before runtime. |
| 66 | + |
| 67 | +### 6. 4x H200 |
| 68 | + |
| 69 | +- Status in inventory: low stock in some regions |
| 70 | +- Result: `HTTP 500 There are no instances currently available` |
| 71 | +- Outcome: could not allocate |
| 72 | + |
| 73 | +### 7. 4x RTX PRO 6000 Secure |
| 74 | + |
| 75 | +| Field | Value | |
| 76 | +| --- | --- | |
| 77 | +| Pod ID | `c3j21r1pd9wpa2` | |
| 78 | +| Cost | `$7.56/hr` | |
| 79 | +| Machine ID | `67fbuhb2qnz1` | |
| 80 | +| Datacenter | `EUR-IS-1` | |
| 81 | +| Result | allocated | |
| 82 | + |
| 83 | +Observed behavior: |
| 84 | + |
| 85 | +- first Secure RTX PRO 6000 host tested, so this was a different pool than the recycled dead community host |
| 86 | +- SSH metadata appeared at `157.157.221.30:52123` |
| 87 | +- direct SSH still returned `Connection refused` |
| 88 | +- `uptimeSeconds` remained `0` |
| 89 | + |
| 90 | +Outcome: cheapest allocatable NVIDIA path found today, but still dead before runtime. |
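The readiness signals recorded in the attempts above (`desiredStatus`, `uptimeSeconds`, and whether the advertised SSH endpoint accepts a connection) can be combined into one check. The sketch below is an assumption about how such a probe might look, not the tooling used here: `tcp_port_open` and `classify_pod` are hypothetical helpers, and the pod is passed in as a plain dict so no RunPod API call is implied.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def classify_pod(pod: dict, ssh_open: bool) -> str:
    """Classify a pod from its status fields and an SSH port probe.

    `pod` holds the `desiredStatus` and `uptimeSeconds` fields quoted in
    this log; how the dict is fetched is out of scope for this sketch.
    """
    if pod.get("desiredStatus") != "RUNNING":
        return "not-running"
    uptime = pod.get("uptimeSeconds") or 0
    if uptime > 0 and ssh_open:
        return "ready"
    return "dead-before-runtime"
```

A pod reporting `desiredStatus: RUNNING` with `uptimeSeconds: 0` and a refused SSH connection, as in the three allocated attempts, is classified as `dead-before-runtime`.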
| 91 | + |
| 92 | +## Practical Conclusion |
| 93 | + |
| 94 | +As of 2026-04-25, the cheapest allocatable RunPod topology found for this model was: |
| 95 | + |
| 96 | +- `2x MI300X` at `$3.98/hr` |
| 97 | + |
| 98 | +The cheapest allocatable NVIDIA topology found was: |
| 99 | + |
| 100 | +- `4x RTX PRO 6000 Secure` at `$7.56/hr` |
| 101 | + |
| 102 | +Neither pod became reachable, so no inference could be run and the cheap-first search produced no benchmark. |
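As a cross-check, the two "cheapest" claims can be derived from the attempt results recorded in this log. The sketch below encodes those results by hand. The `Attempt` type and `cheapest_allocated` helper are hypothetical, and the vendor labels are added for illustration. Prices are left as `None` where the log records none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attempt:
    topology: str
    vendor: str
    allocated: bool
    cost_per_hr: Optional[float]  # None where this log records no price


# Results transcribed from the seven attempts in this log.
ATTEMPTS = [
    Attempt("8x A40", "NVIDIA", False, None),
    Attempt("8x RTX A6000", "NVIDIA", False, None),
    Attempt("2x MI300X", "AMD", True, 3.98),
    Attempt("4x H100 80GB", "NVIDIA", False, None),
    Attempt("4x H100 NVL", "NVIDIA", True, 10.36),
    Attempt("4x H200", "NVIDIA", False, None),
    Attempt("4x RTX PRO 6000 Secure", "NVIDIA", True, 7.56),
]


def cheapest_allocated(attempts, vendor: Optional[str] = None) -> Optional[Attempt]:
    """Return the lowest-cost allocated attempt, optionally for one vendor."""
    candidates = [
        a for a in attempts
        if a.allocated and a.cost_per_hr is not None
        and (vendor is None or a.vendor == vendor)
    ]
    return min(candidates, key=lambda a: a.cost_per_hr, default=None)
```

Calling `cheapest_allocated(ATTEMPTS)` selects `2x MI300X`, and `cheapest_allocated(ATTEMPTS, vendor="NVIDIA")` selects `4x RTX PRO 6000 Secure`, matching the conclusion above.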
| 103 | + |
| 104 | +## Cleanup |
| 105 | + |
| 106 | +Every failed pod from this search was deleted after its readiness window. |