|
| 1 | +--- |
| 2 | +title: Hypervisor |
| 3 | +--- |
| 4 | + |
| 5 | +# Hypervisor |
| 6 | +The hypervisor is the foundation of the CobaltCore architecture, providing the virtualization layer that allows multiple virtual machines to run on a single physical server. |
| 7 | + |
| 8 | +[[toc]] |
| 9 | + |
| 10 | +## Components of the Hypervisor |
| 11 | + |
| 12 | +```mermaid |
| 13 | +block-beta |
| 14 | + columns 1 |
| 15 | + block:vms |
| 16 | + columns 3 |
| 17 | + vm ["Virtual Machine 1"] |
| 18 | + vm2 ["Virtual Machine 2"] |
| 19 | + end |
| 20 | + libvirt ["LibVirt"] |
| 21 | + block:containers |
| 22 | + columns 3 |
| 23 | + kvm ["Node Agent"] |
| 24 | + nova ["Nova Agent"] |
| 25 | + neutron ["Neutron Agent"] |
| 26 | + ha ["HA Agent"] |
| 27 | + end |
| 28 | + block:os |
| 29 | + gl ["GardenLinux"] |
| 30 | + end |
| 31 | + hv ["Hypervisor Baremetal Node"] |
| 32 | +``` |
| 33 | + |
| 34 | +Components of the hypervisor include: |
| 35 | + |
| 36 | +- [**KVM Node Agent**](#kvm-node-agent): Responsible for managing the node lifecycle and integration with the Kubernetes cluster. |
| 37 | +- **Nova Agent**: Handles the compute services, including scheduling and resource allocation for virtual machines. |
| 38 | +- **Neutron Agent**: Manages networking services, providing connectivity between virtual machines and external networks. |
| 39 | +- **HA Agent**: Ensures high availability of critical workloads by monitoring and managing failover processes. |
| 40 | +- **GardenLinux**: The Linux-based operating system that runs on the hypervisor, providing a lightweight and secure environment for virtual machines. |
| 41 | + |
| 42 | +## Interactions and Dependencies |
| 43 | + |
| 44 | +CobaltCore's hypervisor components interact with each other and with the underlying system. |
| 45 | +They communicate through APIs and services, usually over Unix domain sockets or TCP-based protocols. |
| 46 | +Key system components include: |
| 47 | + |
| 48 | +- **LibVirt**: The virtualization API that allows the Node Agent and Nova Agent to manage virtual machines. |
| 49 | +- **Linux Networking**: Manages network traffic and security rules for virtual machines, and provides connectivity between virtual machines and external networks. |
| 50 | +- **os_vif**: A library that provides virtual interface management for OpenStack, allowing the Nova Agent to plug interfaces into the networking stack. |
| 51 | +- **systemd**: The init system that manages services and processes on the hypervisor node. |
| 52 | +- **Journald**: The logging system that collects and manages logs from various components of the hypervisor. Part of the systemd suite. |
| 53 | + |
| 54 | +```mermaid |
| 55 | +graph TD |
| 56 | + subgraph "Hypervisor Baremetal Node" |
| 57 | + systemd["systemd"] |
| 58 | + libvirt["LibVirt"] |
| 59 | + network["Networking (iptables)"] |
| 60 | + os_vif["os_vif"] |
| 61 | + journal["Journald"] |
| 62 | + subgraph "Virtual Machines" |
| 63 | + vm1["Virtual Machine 1"] |
| 64 | + vm2["Virtual Machine 2"] |
| 65 | + end |
| 66 | + subgraph "Containerized Agents" |
| 67 | + kvm_agent["KVM Node Agent"] |
| 68 | + nova_agent["Nova Agent"] |
| 69 | + neutron_agent["Neutron Agent"] |
| 70 | + ha_agent["HA Agent"] |
| 71 | + logs_collector["Logs Collector"] |
| 72 | + end |
| 73 | + end |
| 74 | + |
| 75 | + libvirt --> vm1 |
| 76 | + libvirt --> vm2 |
| 77 | + kvm_agent -- accesses --> libvirt |
| 78 | + kvm_agent -- accesses --> systemd |
| 79 | + nova_agent -- manages compute --> libvirt |
| 80 | + nova_agent -- plugs interfaces --> os_vif |
| 81 | + os_vif -- configures --> network |
| 82 | + neutron_agent -- manages networking --> network |
| 83 | + ha_agent -- ensures reliability --> libvirt |
| 84 | + logs_collector -- collects logs --> journal |
| 85 | +``` |
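
As an illustration of the Unix domain socket communication mentioned above, here is a minimal Python sketch of one request and one reply over a local socket. The socket name `agent.sock` and the `ack:` reply format are hypothetical and are not taken from any CobaltCore component.

```python
import os
import socket
import tempfile
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one connection and send the request back with an 'ack:' prefix."""
    conn, _ = server.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"ack:" + request)


def round_trip(message: bytes) -> bytes:
    """Send message over a temporary Unix domain socket and return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "agent.sock")  # hypothetical socket name
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            # The server side runs in a thread so the client can connect
            # from the same process.
            worker = threading.Thread(target=serve_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(message)
                reply = client.recv(1024)
            worker.join()
            return reply
```

Calling `round_trip(b"ping")` exchanges a single short message. A real agent would normally connect to a socket path owned by the service it talks to rather than create its own.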
| 86 | + |
| 87 | +## KVM HA Agent |
| 88 | +::: tip Source Code |
| 89 | +[github.com/cobaltcore-dev/kvm-ha-agent](https://github.com/cobaltcore-dev/kvm-ha-agent) |
| 90 | +::: |
| 91 | + |
| 92 | +The **KVM High Availability Agent (kvm-ha-agent)** is a lightweight Go-based application designed to monitor and report the state of KVM hypervisors and their virtual machines. It integrates with libvirt to capture events, instances and system uptime, sending telemetry data to a central high-availability [service](https://github.com/cobaltcore-dev/kvm-ha-service) for further processing. |
| 93 | + |
| 94 | +### Features |
| 95 | + |
| 96 | +- **Libvirt Event Monitoring**: |
| 97 | +  - Subscribes to various libvirt domain events: |
| 98 | +    - Lifecycle changes |
| 99 | +    - Reboots |
| 100 | +    - Watchdog triggers |
| 101 | +    - I/O errors |
| 102 | +    - Control errors |
| 103 | +    - Agent lifecycle |
| 104 | +    - Memory failures |
| 105 | +  - Monitoring and reporting can be configured via `ConfigMap`. |
| 106 | +- **Uptime Reporting**: Periodically reports the system uptime of the host. |
| 107 | +- **Instances Reporting**: Periodically reports the instances that exist on the hypervisor. |
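
The uptime report described above could be produced along these lines. This is a minimal Python sketch, not the agent's Go implementation: reading `/proc/uptime` is an assumption about the data source, and the payload shape returned by `build_uptime_report` is hypothetical.

```python
from datetime import timedelta


def parse_proc_uptime(text: str) -> timedelta:
    """Parse the contents of /proc/uptime into a timedelta.

    The file holds two numbers: seconds since boot and cumulative idle
    seconds. Only the first one is needed for an uptime report.
    """
    fields = text.split()
    if not fields:
        raise ValueError("empty uptime data")
    return timedelta(seconds=float(fields[0]))


def build_uptime_report(hostname: str, text: str) -> dict:
    """Build a report payload (hypothetical shape) from raw uptime data."""
    uptime = parse_proc_uptime(text)
    return {"host": hostname, "uptime_seconds": uptime.total_seconds()}
```

On a Linux host, the second argument would be the text read from `/proc/uptime`. Keeping the parsing separate from the file access makes the logic easy to test without a running hypervisor.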
| 108 | + |
| 109 | +### Overview |
| 110 | + |
| 111 | +```mermaid |
| 112 | +graph TB; |
| 113 | + subgraph controlplane [Controlplane] |
| 114 | + kvm_ha_service(KVM-HA-Service) |
| 115 | + end |
| 116 | + |
| 117 | + subgraph host [Host] |
| 118 | + subgraph kvm_ha_agent [KVM-HA-Agent] |
| 119 | + libvirt_events(Libvirt events); |
| 120 | + libvirt_events ---> |reports to|kvm_ha_service; |
| 121 | + libvirt_instances(Libvirt instances); |
| 122 | + libvirt_instances ---> |reports to|kvm_ha_service; |
| 123 | + uptime(Uptime); |
| 124 | + uptime ---> |reports to|kvm_ha_service; |
| 125 | + end |
| 126 | + end |
| 127 | +``` |
| 128 | + |
| 129 | +## KVM Node Agent |
| 130 | + |
| 131 | +::: tip Source Code |
| 132 | +[github.com/cobaltcore-dev/kvm-node-agent](https://github.com/cobaltcore-dev/kvm-node-agent) |
| 133 | +::: |
| 134 | + |
| 135 | +The **KVM Node Agent** is a lightweight Go-based application that runs on each hypervisor node. |
| 136 | +It is responsible for managing the lifecycle of the node and its integration with the Kubernetes cluster. |
| 137 | +The agent ensures that the node is properly configured and ready for use, |
| 138 | +and exposes information about the libvirt hypervisor to the Kubernetes API. |
| 139 | + |
| 140 | +It provides the following Custom Resource Definitions (CRDs): |
| 141 | + |
| 142 | +- **hypervisors.kvm.cloud.sap**: |
| 143 | +  - Represents a hypervisor node in the Kubernetes cluster. |
| 144 | +  - Contains metadata about the node, such as its name, status, and configuration. |
| 145 | +  - Reports the GardenLinux version and kernel version. |
| 146 | +  - Reports the hardware model and CPU and memory information. |
| 147 | +  - Lists the virtual machines running on the hypervisor. |
| 148 | +  - Reports the status of hypervisor-related systemd services. |
| 149 | + |
| 150 | +Example resources: |
| 151 | + |
| 152 | +```shell |
| 153 | +$ kubectl get hypervisors.kvm.cloud.sap |
| 154 | +NAME            NODE            VERSION               INSTANCES   HARDWARE               KERNEL          AGE |
| 155 | +node001-bb234   node001-bb234   Garden Linux 1933.0   11          PowerEdge R860         6.12.38-amd64   4d8h |
| 156 | +node006-bb123   node006-bb123   Garden Linux 1933.0   0           ProLiant DL560 Gen11   6.12.38-amd64   27h |
| 157 | +``` |
| 158 | + |
| 159 | +- **migrations.kvm.cloud.sap**: |
| 160 | +  - Represents a migration operation for a virtual machine. |
| 161 | +  - Contains metadata about the migration, such as the source and destination hypervisors, status, and progress. |
| 162 | +  - Used to track the progress of migrations and ensure that they are completed successfully. |
| 163 | + |
| 164 | +Example resources: |
| 165 | +```shell |
| 166 | +$ kubectl get migrations.kvm.cloud.sap |
| 167 | +NAME                                   ORIGIN          DESTINATION     TYPE        OPERATION       STARTED   ELAPSED   DATA TOTAL   DATA PROCESSED   DATA REMAINING   MEMORY TX     MEMORY DIRTY RATE   MEMORY ITERATION |
| 168 | +12e479eb-6bef-4fdb-bfdc-0388df68bed9   node002-bb086   node008-bb086   completed   migration_in    74d       2.755s    4.0 GiB      10.4 MiB         0 B              16.5 MiB/s    0/s                 3 |
| 169 | +13679335-291c-4405-9e08-5911032599dd   node007-bb086   node009-bb086   completed   migration_out   78d       2.766s    4.0 GiB      398.0 MiB        718.8 MiB        588.9 MiB/s   0/s                 1 |
| 170 | +1552e60a-bdba-4850-84da-07dd635bce2c   node006-bb086   node003-bb087   completed   migration_out   22d       35.078s   64.0 GiB     58.4 GiB         0 B              1.8 GiB/s     1250/s              4 |
| 170 | +1552e60a-bdba-4850-84da-07dd635bce2c node006-bb086 node003-bb087 completed migration_out 22d 35.078s 64.0 GiB 58.4 GiB 0 B 1.8 GiB/s 1250/s 4 |
| 171 | +``` |
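
The resources above can also be read programmatically. The Python sketch below assumes `kubectl` is on the `PATH` and configured for the cluster. It relies only on the standard `items[].metadata.name` layout of `kubectl get -o json` output, because the CRD-specific fields are not documented here. The function names are illustrative.

```python
import json
import subprocess


def list_resource_names(kubectl_json: str) -> list[str]:
    """Return the metadata.name of every item in `kubectl get -o json` output."""
    doc = json.loads(kubectl_json)
    return [item["metadata"]["name"] for item in doc.get("items", [])]


def fetch_resource_names(resource: str) -> list[str]:
    """Run kubectl for the given resource type and return the item names.

    Assumes kubectl is installed and has access to the cluster.
    """
    result = subprocess.run(
        ["kubectl", "get", resource, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return list_resource_names(result.stdout)
```

For example, `fetch_resource_names("hypervisors.kvm.cloud.sap")` would return the hypervisor names shown in the `NAME` column, and the same call with `migrations.kvm.cloud.sap` would return the migration identifiers.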