Commit c2e8e9a

update
1 parent 788410f commit c2e8e9a

20 files changed

Lines changed: 1374 additions & 288 deletions

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,5 @@
 node_modules
 docs/.vitepress/dist
 docs/.vitepress/cache
-bin
+bin
+build

REUSE.toml

Lines changed: 2 additions & 1 deletion
@@ -13,7 +13,8 @@ path = [
     ".dockerignore",
     ".editorconfig",
     "docs/.vitepress/**",
-    "docs/**.md"
+    "docs/**.md",
+    "hack/**"
 ]
 SPDX-FileCopyrightText = "SAP SE or an SAP affiliate company"
 SPDX-License-Identifier = "Apache-2.0"

docs/.vitepress/config.mts

Lines changed: 24 additions & 14 deletions
@@ -1,9 +1,10 @@
 import { defineConfig } from 'vitepress'
 import { withMermaid } from "vitepress-plugin-mermaid";
+import { generateSidebar } from 'vitepress-sidebar';
 
 
 // https://vitepress.dev/reference/site-config
-export default withMermaid(defineConfig({
+export default defineConfig(withMermaid({
   title: "CobaltCore",
   description: "Opinionated OpenStack distribution that builds upon IronCore’s foundation to support non-cloud-native workloads",
   base: "/docs/",
@@ -12,11 +13,21 @@ export default withMermaid(defineConfig({
     // https://vitepress.dev/reference/default-theme-config
     nav: [
       { text: 'Home', link: '/' },
-      { text: 'Architecture', link: '/architecture' }
+      {
+        text: 'Projects',
+        items: [
+          { text: 'ApeiroRA', link: 'https://apeirora.eu/' },
+          { text: 'Gardener', link: 'https://gardener.cloud/' },
+          { text: 'GardenLinux', link: 'https://github.com/gardenlinux/gardenlinux' },
+          { text: 'IronCore', link: 'https://ironcore.dev/' },
+          { text: 'IronCore Metal Operator', link: 'https://ironcore.dev/metal-operator/' },
+          { text: 'OpenStack', link: 'https://www.openstack.org/' },
+        ]
+      },
     ],
 
     editLink: {
-      pattern: 'https://github.com/cobaltcore-dev/.github/blob/main/docs/:path',
+      pattern: 'https://github.com/cobaltcore-dev/docs/edit/main/docs/:path',
       text: 'Edit this page on GitHub'
     },
 
@@ -30,18 +41,17 @@ export default withMermaid(defineConfig({
       provider: 'local'
     },
 
-    sidebar: [
-      {
-        text: 'Contents',
-        items: [
-          { text: 'Architecture', link: '/architecture' },
-          { text: 'API Reference', link: '/api' }
-        ]
-      }
-    ],
+    sidebar: generateSidebar({
+      documentRootPath: '/docs/',
+      capitalizeFirst: true,
+      useTitleFromFileHeading: false,
+      useTitleFromFrontmatter: true,
+      useFolderLinkFromIndexFile: true,
+      useFolderTitleFromIndexFile: true,
+    }),
 
     socialLinks: [
       { icon: 'github', link: 'https://github.com/cobaltcore-dev' }
-    ]
-  }
+    ],
+  },
 }))
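
Note on the sidebar change: the hand-written `sidebar` array is replaced by one derived from the files under the docs root, with titles taken from frontmatter and folders linked through their `index.md`. The sketch below illustrates that idea only. It is not the `vitepress-sidebar` implementation; `buildSidebar`, `SidebarItem`, and the `titles` map (standing in for frontmatter titles) are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch of a file-driven sidebar; NOT the vitepress-sidebar implementation.
interface SidebarItem {
  text: string;
  link?: string;
  items?: SidebarItem[];
}

// Upper-case the first character, mirroring the intent of `capitalizeFirst: true`.
function capitalizeFirst(s: string): string {
  return s.charAt(0).toUpperCase() + s.slice(1);
}

// Build a nested sidebar from Markdown paths relative to the docs root.
// `titles` maps a path to its frontmatter title (an assumption for this sketch).
function buildSidebar(
  paths: string[],
  titles: { [path: string]: string | undefined } = {},
): SidebarItem[] {
  const root: SidebarItem[] = [];
  const folders = new Map<string, SidebarItem>();
  for (const path of [...paths].sort()) {
    const parts = path.replace(/\.md$/, "").split("/");
    const file = parts.pop() as string;
    let level = root;
    let folder: SidebarItem | undefined;
    let prefix = "";
    // Walk (and create on demand) one folder entry per directory segment.
    for (const dir of parts) {
      prefix += "/" + dir;
      let next = folders.get(prefix);
      if (next === undefined) {
        next = { text: capitalizeFirst(dir), items: [] };
        folders.set(prefix, next);
        level.push(next);
      }
      folder = next;
      level = next.items as SidebarItem[];
    }
    if (file === "index") {
      // An index file gives its folder a link and, if present, a title.
      if (folder !== undefined) {
        folder.link = prefix + "/";
        folder.text = titles[path] ?? folder.text;
      }
      continue;
    }
    level.push({ text: titles[path] ?? capitalizeFirst(file), link: `${prefix}/${file}` });
  }
  return root;
}

// Usage with the three pages touched by this commit.
const sidebar = buildSidebar(
  ["api/index.md", "architecture/cluster.md", "architecture/hypervisor.md"],
  { "api/index.md": "API Reference", "architecture/cluster.md": "Kubernetes Cluster" },
);
console.log(JSON.stringify(sidebar, null, 2));
```

With this approach, adding a page under `docs/` no longer requires a matching edit to `config.mts`, which is presumably why the commit also adds `title` frontmatter to each page.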

docs/.vitepress/theme/custom.css

Lines changed: 14 additions & 0 deletions
@@ -34,4 +34,18 @@
 img[src='/search.png'] {
   width: 100%;
   aspect-ratio: 1 / 1;
+}
+
+img[title="apeirora_logo"] {
+  width: 400px;
+  display: block;
+  margin-top: 2em;
+  margin-left: auto;
+  margin-right: auto;
+}
+
+html.dark img[title="apeirora_logo"] {
+  box-shadow: inset 0 0 0.5px 1px hsla(0, 0%, 100%, 0.075),
+    0 3.5px 6px hsla(0, 0%, 100%, 0.44);
+  border-radius: 12px;
 }

docs/api/index.md

Lines changed: 5 additions & 1 deletion
@@ -1,5 +1,9 @@
+---
+title: API Reference
+---
+
 # API Reference
 
 ::: warning
-API Reference is not available yet.
+Under construction
 :::

docs/architecture/cluster.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+---
+title: Kubernetes Cluster
+---
+
+# Kubernetes Cluster
+
+The CobaltCore cluster is a Kubernetes-based environment designed to manage hypervisor nodes and their associated workloads.
+It provides a robust framework for deploying, scaling, and maintaining virtual machines across multiple hypervisor nodes.
+
+The cluster is provisioned using [IronCore](https://ironcore.dev/), which automates the discovery, provisioning, and evacuation of hypervisor nodes.
+
+Cluster components that do not need to run on every hypervisor node are deployed as Kubernetes Deployments.
+
+## Hypervisor Operator
+
+::: tip Source Code
+[github.com/cobaltcore-dev/openstack-hypervisor-operator](https://github.com/cobaltcore-dev/openstack-hypervisor-operator)
+:::
+
+The Kubernetes operator that manages the lifecycle of hypervisor nodes.
+It ensures a newly discovered node is properly configured and integrated into the cluster.
+After the initial onboarding, the operator runs a final check to ensure the node is ready for use.
+The operator also handles the evacuation of nodes in case of failures or maintenance.
+
+## HA Service
+
+::: tip Source Code
+[github.com/cobaltcore-dev/kvm-ha-service](https://github.com/cobaltcore-dev/kvm-ha-service)
+:::
+
+The **KVM High Availability Service** is a central component that monitors the health and status of hypervisor nodes and their virtual machines.
+It collects telemetry data from the KVM HA Agent, processes it, and provides insights into the state of the hypervisors and their workloads.
+It is responsible for ensuring that critical workloads remain operational even in the event of failures.
+
+```mermaid
+graph LR;
+    subgraph application [Application]
+        source(Sources tasks);
+        monitoring(Monitoring tasks);
+        hypervisors(Hypervisors task);
+        config("Configuration (YAML)")
+    end
+
+    monitoring --> |evacuate| nova;
+
+    endpoints("http(s) endpoints") ---|pull metrics| source;
+    senders("http(s) senders") ---|push telemetry| source;
+
+    subgraph database [Database]
+        sqlite
+    end
+
+    source ---> |add telemetry| database;
+    monitoring <--> |check telemetry| database;
+
+    hypervisors ---> database;
+
+    hypervisors ---|refresh hypervisors| nova;
+
+    subgraph hypervisor [Hypervisors]
+        Hypervisor1(Hypervisor 1);
+        HypervisorN(Hypervisor n);
+    end
+
+    subgraph openstack [OpenStack]
+        nova --- Hypervisor1;
+        nova --- HypervisorN;
+    end
+```
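
Note on the HA Service diagram: the "check telemetry" and "evacuate" edges suggest that the monitoring task inspects stored telemetry and decides which hypervisors need evacuation. The commit does not show how that decision is made. The sketch below assumes one plausible rule, a staleness timeout on the most recent telemetry per hypervisor. `TelemetryRecord`, `findStaleHypervisors`, and the `maxAgeMs` threshold are hypothetical and do not come from kvm-ha-service.

```typescript
// Hypothetical sketch; the actual kvm-ha-service decision logic is not part of this commit.
interface TelemetryRecord {
  hypervisor: string; // hypervisor name, e.g. "node001-bb234"
  receivedAt: number; // Unix time in milliseconds when the telemetry arrived
}

// Return the hypervisors whose most recent telemetry is older than `maxAgeMs`.
// In the diagram's terms these would be candidates for the "evacuate" call to Nova.
function findStaleHypervisors(
  records: TelemetryRecord[],
  now: number,
  maxAgeMs: number,
): string[] {
  // Keep only the newest timestamp seen for each hypervisor.
  const latest = new Map<string, number>();
  for (const record of records) {
    const seen = latest.get(record.hypervisor);
    if (seen === undefined || record.receivedAt > seen) {
      latest.set(record.hypervisor, record.receivedAt);
    }
  }
  const stale: string[] = [];
  latest.forEach((timestamp, name) => {
    if (now - timestamp > maxAgeMs) {
      stale.push(name);
    }
  });
  return stale.sort();
}

// Usage: node006 last reported 8 s ago and exceeds the assumed 5 s limit.
const staleNodes = findStaleHypervisors(
  [
    { hypervisor: "node001-bb234", receivedAt: 1_000 },
    { hypervisor: "node001-bb234", receivedAt: 9_000 },
    { hypervisor: "node006-bb123", receivedAt: 2_000 },
  ],
  10_000,
  5_000,
);
console.log(staleNodes);
```

A production service would likely combine several signals (libvirt events, uptime, instance lists) before evacuating; a single timeout is used here only to keep the example short.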

docs/architecture/hypervisor.md

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
+---
+title: Hypervisor
+---
+
+# Hypervisor
+The hypervisor is the foundation of the CobaltCore architecture, providing the virtualization layer that allows multiple virtual machines to run on a single physical server.
+
+[[toc]]
+
+## Components of the Hypervisor
+
+```mermaid
+block-beta
+columns 1
+  block:vms
+    columns 3
+    vm ["Virtual Machine 1"]
+    vm2 ["Virtual Machine 2"]
+  end
+  libvirt ["LibVirt"]
+  block:containers
+    columns 3
+    kvm ["Node Agent"]
+    nova ["Nova Agent"]
+    neutron ["Neutron Agent"]
+    ha ["HA Agent"]
+  end
+  block:os
+    gl ["GardenLinux"]
+  end
+  hv ["Hypervisor Baremetal Node"]
+```
+
+Components of the hypervisor include:
+
+- [**KVM Node Agent**](#kvm-node-agent): Responsible for managing the node lifecycle and integration with the Kubernetes cluster.
+- **Nova Agent**: Handles the compute services, including scheduling and resource allocation for virtual machines.
+- **Neutron Agent**: Manages networking services, providing connectivity between virtual machines and external networks.
+- **HA Agent**: Ensures high availability of critical workloads by monitoring and managing failover processes.
+- **GardenLinux**: The Linux-based operating system that runs on the hypervisor, providing a lightweight and secure environment for virtual machines.
+
+## Interactions and Dependencies
+
+CobaltCore's hypervisor components interact with each other and with the underlying system.
+Communication is facilitated through various APIs and services, usually via Unix domain sockets or TCP-based protocols.
+Key system components include:
+
+- **LibVirt**: The virtualization API that allows the Node Agent and Nova Agent to manage virtual machines.
+- **Linux Networking**: Manages network traffic and security rules for virtual machines. Also provides connectivity between virtual machines and external networks.
+- **os_vif**: A library that provides virtual interface management for OpenStack, allowing the Nova Agent to plug interfaces into the networking stack.
+- **systemd**: The init system that manages services and processes on the hypervisor node.
+- **Journald**: The logging system that collects and manages logs from various components of the hypervisor. Part of the systemd suite.
+
+```mermaid
+graph TD
+    subgraph "Hypervisor Baremetal Node"
+        systemd["systemd"]
+        libvirt["LibVirt"]
+        network["Networking (iptables)"]
+        os_vif["os_vif"]
+        journal["Journald"]
+        subgraph "Virtual Machines"
+            vm1["Virtual Machine 1"]
+            vm2["Virtual Machine 2"]
+        end
+        subgraph "Containerized Agents"
+            kvm_agent["KVM Node Agent"]
+            nova_agent["Nova Agent"]
+            neutron_agent["Neutron Agent"]
+            ha_agent["HA Agent"]
+            logs_collector["Logs Collector"]
+        end
+    end
+
+    libvirt --> vm1
+    libvirt --> vm2
+    kvm_agent -- accesses --> libvirt
+    kvm_agent -- accesses --> systemd
+    nova_agent -- manages compute --> libvirt
+    nova_agent -- plugs interfaces --> os_vif
+    os_vif -- configures --> network
+    neutron_agent -- manages networking --> network
+    ha_agent -- ensures reliability --> libvirt
+    logs_collector -- collects logs --> journal
+```
+
+## KVM HA Agent
+::: tip Source Code
+[github.com/cobaltcore-dev/kvm-ha-agent](https://github.com/cobaltcore-dev/kvm-ha-agent)
+:::
+
+The **KVM High Availability Agent (kvm-ha-agent)** is a lightweight Go-based application designed to monitor and report the state of KVM hypervisors and their virtual machines. It integrates with libvirt to capture events, instances and system uptime, sending telemetry data to a central high-availability [service](https://github.com/cobaltcore-dev/kvm-ha-service) for further processing.
+
+### Features
+
+- **Libvirt Event Monitoring**:
+  - Subscribes to various libvirt domain events:
+    - Lifecycle changes
+    - Reboots
+    - Watchdog triggers
+    - I/O errors
+    - Control Errors
+    - Agent lifecycle
+    - Memory failures
+  - Monitoring and reporting can be configured via `ConfigMap`.
+- **Uptime Reporting**: Periodically reports the system uptime of the host.
+- **Instances Reporting**: Periodically reports the instances that exist on the hypervisor.
+
+### Overview
+
+```mermaid
+graph TB;
+    subgraph controlplane [Controlplane]
+        kvm_ha_service(KVM-HA-Service)
+    end
+
+    subgraph host [Host]
+        subgraph kvm_ha_agent [KVM-HA-Agent]
+            libvirt_events(Libvirt events);
+            libvirt_events ---> |reports to|kvm_ha_service;
+            libvirt_instances(Libvirt instances);
+            libvirt_instances ---> |reports to|kvm_ha_service;
+            uptime(Uptime);
+            uptime ---> |reports to|kvm_ha_service;
+        end
+    end
+```
+
+## KVM Node Agent
+
+::: tip Source Code
+[github.com/cobaltcore-dev/kvm-node-agent](https://github.com/cobaltcore-dev/kvm-node-agent)
+:::
+
+The **KVM Node Agent** is a lightweight Go-based application that runs on each hypervisor node.
+It is responsible for managing the lifecycle of the node and its integration with the Kubernetes cluster.
+The agent ensures that the node is properly configured and ready for use,
+and exposes information about the libvirt hypervisor to the Kubernetes API.
+
+It provides the following Custom Resource Definitions (CRDs):
+
+- **hypervisors.kvm.cloud.sap**:
+  - Represents a hypervisor node in the Kubernetes cluster.
+  - Contains metadata about the node, such as its name, status, and configuration.
+  - GardenLinux version, kernel version, and other relevant details.
+  - Hardware model, CPU and memory information, and other relevant details.
+  - List of running virtual machines on the hypervisor.
+  - Status of hypervisor-related systemd services.
+
+Example resources:
+
+```shell
+$ kubectl get hypervisors.kvm.cloud.sap
+NAME            NODE            VERSION               INSTANCES   HARDWARE               KERNEL          AGE
+node001-bb234   node001-bb234   Garden Linux 1933.0   11          PowerEdge R860         6.12.38-amd64   4d8h
+node006-bb123   node006-bb123   Garden Linux 1933.0   0           ProLiant DL560 Gen11   6.12.38-amd64   27h
+```
+
+- **migrations.kvm.cloud.sap**:
+  - Represents a migration operation for a virtual machine.
+  - Contains metadata about the migration, such as the source and destination hypervisors, status, and progress.
+  - Used to track the progress of migrations and ensure that they are completed successfully.
+
+Example resources:
+```shell
+$ kubectl get migrations.kvm.cloud.sap
+NAME                                   ORIGIN          DESTINATION     TYPE        OPERATION       STARTED   ELAPSED   DATA TOTAL   DATA PROCESSED   DATA REMAINING   MEMORY TX     MEMORY DIRTY RATE   MEMORY ITERATION
+12e479eb-6bef-4fdb-bfdc-0388df68bed9   node002-bb086   node008-bb086   completed   migration_in    74d       2.755s    4.0 GiB      10.4 MiB         0 B              16.5 MiB/s    0/s                 3
+13679335-291c-4405-9e08-5911032599dd   node007-bb086   node009-bb086   completed   migration_out   78d       2.766s    4.0 GiB      398.0 MiB        718.8 MiB        588.9 MiB/s   0/s                 1
+1552e60a-bdba-4850-84da-07dd635bce2c   node006-bb086   node003-bb087   completed   migration_out   22d       35.078s   64.0 GiB     58.4 GiB         0 B              1.8 GiB/s     1250/s              4
+```
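
Note on reading the migration output: the `DATA PROCESSED` and `ELAPSED` columns can be combined into a rough average transfer rate. The sketch below parses the human-readable values shown above and divides one by the other. The helper names (`parseSize`, `parseSeconds`, `averageRateGiBPerSecond`) are hypothetical, binary units (1 GiB = 1024³ bytes) are assumed, and the result is a derived figure. It is not claimed to match how the agent computes `MEMORY TX`.

```typescript
// Hypothetical helpers for the printed columns; not part of kvm-node-agent.
const BYTES_PER_UNIT: { [unit: string]: number | undefined } = {
  B: 1,
  KiB: 1024,
  MiB: 1024 * 1024,
  GiB: 1024 * 1024 * 1024,
};

// Parse values such as "58.4 GiB" or "0 B" into bytes.
function parseSize(text: string): number {
  const [value, unit] = text.trim().split(/\s+/);
  const factor = unit === undefined ? undefined : BYTES_PER_UNIT[unit];
  if (value === undefined || value === "" || factor === undefined || Number.isNaN(Number(value))) {
    throw new Error(`unrecognized size: ${text}`);
  }
  return Number(value) * factor;
}

// Parse values such as "35.078s" into seconds.
function parseSeconds(text: string): number {
  const match = /^(\d+(?:\.\d+)?)s$/.exec(text.trim());
  if (match === null) {
    throw new Error(`unrecognized duration: ${text}`);
  }
  return Number(match[1]);
}

// Average rate in GiB/s over the whole migration.
function averageRateGiBPerSecond(processed: string, elapsed: string): number {
  const gib = parseSize(processed) / (1024 * 1024 * 1024);
  return gib / parseSeconds(elapsed);
}

// Usage with the third row of the table (58.4 GiB processed in 35.078 s).
const averageRate = averageRateGiBPerSecond("58.4 GiB", "35.078s");
console.log(averageRate.toFixed(2) + " GiB/s");
```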
