|
| 1 | +- name: virtualization.vi.state |
| 2 | + rules: |
| 3 | + - alert: D8VirtualizationClusterVirtualImageStuckInPendingPhase |
| 4 | + expr: d8_virtualization_clustervirtualimage_status_phase{phase="Pending"} == 1 |
| 5 | + labels: |
| 6 | + severity_level: "9" |
| 7 | + tier: cluster |
| 8 | + for: 60m |
| 9 | + annotations: |
| 10 | + plk_protocol_version: "1" |
| 11 | + plk_markup_format: "markdown" |
| 12 | + plk_create_group_if_not_exists__d8_virtualization_clustervirtualimage_stuck_in_pending_phase: "D8VirtualizationClusterVirtualImageStuckInPendingPhase,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 13 | + plk_grouped_by__d8_virtualization_clustervirtualimage_stuck_in_pending_phase: "D8VirtualizationClusterVirtualImageStuckInPendingPhase,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 14 | + summary: ClusterVirtualImage is stuck in the `Pending` phase for a long time. |
| 15 | + description: | |
| 16 | + The cluster virtual image `{{ $labels.name }}` has been stuck in the `Pending` phase for more than 60 minutes. |
| 17 | +
|
| 18 | + ### Common Causes |
| 19 | +
|
| 20 | + - Missing or not ready source VirtualImage, ClusterVirtualImage, VirtualDisk, or VirtualDiskSnapshot |
| 21 | + - Scheduling issues on the node |
| 22 | + - Cluster resource shortage (CPU, memory) |
| 23 | + - Exhausted quotas (e.g., CPU, memory limits) |
| 24 | +
|
| 25 | + ### Recommended Actions |
| 26 | +
|
| 27 | + 1. Check virtual image status: |
| 28 | + ```bash |
| 29 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status}" | jq |
| 30 | + ``` |
| 31 | +
|
| 32 | + 2. Inspect conditions for details: |
| 33 | + ```bash |
| 34 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status.conditions}" | jq |
| 35 | + ``` |
| 36 | +
|
| 37 | + 3. Check related events: |
| 38 | + ```bash |
| 39 | + d8 k get events --field-selector involvedObject.name={{ $labels.name }} |
| 40 | + ``` |
| 41 | +
|
| 42 | + 4. Check whether the source VirtualImage, ClusterVirtualImage, VirtualDisk, or VirtualDiskSnapshot exists and is Ready: |
| 43 | + ```bash |
| 44 | + d8 k get vd,vi,cvi,vis -A |
| 45 | + ``` |
| 46 | +
|
| 47 | +
|
| 48 | + - alert: D8VirtualizationClusterVirtualImageStuckInWaitForUserUploadPhase |
| 49 | + expr: d8_virtualization_clustervirtualimage_status_phase{phase="WaitForUserUpload"} == 1 |
| 50 | + labels: |
| 51 | + severity_level: "9" |
| 52 | + tier: cluster |
| 53 | + for: 60m |
| 54 | + annotations: |
| 55 | + plk_protocol_version: "1" |
| 56 | + plk_markup_format: "markdown" |
| 57 | + plk_create_group_if_not_exists__d8_virtualization_clustervirtualimage_stuck_in_waitforuserupload_phase: "D8VirtualizationClusterVirtualImageStuckInWaitForUserUploadPhase,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 58 | + plk_grouped_by__d8_virtualization_clustervirtualimage_stuck_in_waitforuserupload_phase: "D8VirtualizationClusterVirtualImageStuckInWaitForUserUploadPhase,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 59 | + summary: ClusterVirtualImage is stuck in the `WaitForUserUpload` phase for a long time. |
| 60 | + description: | |
| 61 | + The cluster virtual image `{{ $labels.name }}` has been waiting for a user image upload for more than 60 minutes. |
| 62 | +
|
| 63 | + This means that no image was uploaded to provision the cluster virtual image. |
| 64 | +
|
| 65 | + ### What You Need to Do |
| 66 | +
|
| 67 | + Upload the required image using one of the provided URLs: |
| 68 | +
|
| 69 | + - From outside the cluster: |
| 70 | + ```bash |
| 71 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status.imageUploadURLs.external}" |
| 72 | + ``` |
| 73 | +
|
| 74 | + - From inside the cluster (node): |
| 75 | + ```bash |
| 76 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status.imageUploadURLs.inCluster}" |
| 77 | + ``` |
| 78 | +
|
| 79 | + - Use `curl`, `wget`, or any HTTP client with the `PUT` method and the `application/octet-stream` content type to upload the image. |
| 80 | +
|
| 81 | + Example: |
| 82 | + ```bash |
| 83 | + curl -X PUT --data-binary @image.qcow2 \ |
| 84 | + -H "Content-Type: application/octet-stream" \ |
| 85 | + $(d8 k get cvi {{ $labels.name }} -o jsonpath="{.status.imageUploadURLs.external}") |
| 86 | + ``` |
| 87 | +
|
| 88 | +
|
| 89 | + - alert: D8VirtualizationClusterVirtualImageFailed |
| 90 | + expr: d8_virtualization_clustervirtualimage_status_phase{phase="Failed"} == 1 |
| 91 | + labels: |
| 92 | + severity_level: "6" |
| 93 | + tier: cluster |
| 94 | + for: 0m |
| 95 | + annotations: |
| 96 | + plk_protocol_version: "1" |
| 97 | + plk_markup_format: "markdown" |
| 98 | + plk_create_group_if_not_exists__d8_virtualization_clustervirtualimage__failed: "D8VirtualizationClusterVirtualImageFailed,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 99 | + plk_grouped_by__d8_virtualization_clustervirtualimage__failed: "D8VirtualizationClusterVirtualImageFailed,tier=~tier,prometheus=deckhouse,kubernetes=~kubernetes" |
| 100 | + summary: ClusterVirtualImage is in the `Failed` phase. |
| 101 | + description: | |
| 102 | + The cluster virtual image `{{ $labels.name }}` is in the `Failed` phase. |
| 103 | +
|
| 104 | + This may indicate one or more of the following issues: |
| 105 | +
|
| 106 | + - Wrong image URL |
| 107 | + - Wrong container image |
| 108 | + - Network issues |
| 109 | + - Storage issues |
| 110 | +
|
| 111 | + ### Recommended Actions |
| 112 | +
|
| 113 | + 1. Check the full status of the cluster virtual image: |
| 114 | + ```bash |
| 115 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status}" | jq |
| 116 | + ``` |
| 117 | +
|
| 118 | + 2. Inspect the conditions for details: |
| 119 | + ```bash |
| 120 | + d8 k get cvi {{ $labels.name }} -o jsonpath="{.status.conditions}" | jq |
| 121 | + ``` |
| 122 | +
|
| 123 | + 3. Review events related to this cluster virtual image: |
| 124 | + ```bash |
| 125 | + d8 k get events --field-selector involvedObject.name={{ $labels.name }} |
| 126 | + ``` |
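
A note on the `WaitForUserUpload` example in this diff: the `curl` command substitutes the upload URL inline, so if `.status.imageUploadURLs.external` is empty (for instance, once the image has left the `WaitForUserUpload` phase), `curl` is called with no URL at all. A minimal sketch of a guard is below. `check_upload_args`, `CVI_NAME`, and `IMAGE_FILE` are hypothetical names used for illustration only; they are not part of `d8` or the virtualization module.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate the upload URL and the image file before calling curl.
# Returns 1 if the URL is empty, 2 if the file is missing or unreadable, 0 otherwise.
check_upload_args() {
  local url="$1" file="$2"
  if [ -z "$url" ]; then
    echo "upload URL is empty; check the image phase first" >&2
    return 1
  fi
  if [ ! -r "$file" ]; then
    echo "image file '$file' is missing or unreadable" >&2
    return 2
  fi
  return 0
}

# Usage sketch (CVI_NAME and IMAGE_FILE are placeholders, not values from this PR):
#   url="$(d8 k get cvi "$CVI_NAME" -o jsonpath='{.status.imageUploadURLs.external}')"
#   check_upload_args "$url" "$IMAGE_FILE" && \
#     curl --fail -X PUT --data-binary "@$IMAGE_FILE" \
#       -H "Content-Type: application/octet-stream" "$url"
```

Whether this belongs in the alert description or only in operator tooling is a judgment call; the alert text may be better kept short.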