Commit 413e4cc

WIP: configure alloy in control-plane
1 parent 4180218 commit 413e4cc

12 files changed

Lines changed: 374 additions & 563 deletions

common/roles/defaults/defaults/main.yaml

Lines changed: 4 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -161,8 +161,10 @@ metal_stack_release:
161161
metal_helm_chart_tag: "helm-charts.metal-stack.metal-control-plane.tag"
162162
logging_chart_version: "helm-charts.logging.loki.version"
163163
logging_chart_repo: "helm-charts.logging.loki.repository"
164-
logging_alloy_version: "helm-charts.logging.alloy.version"
165-
logging_alloy_repo: "helm-charts.logging.alloy.repository"
164+
logging_alloy_chart_version: "helm-charts.logging.alloy.version"
165+
logging_alloy_chart_repo: "helm-charts.logging.alloy.repository"
166+
gardener_logging_alloy_chart_version: "helm-charts.logging.alloy.version"
167+
gardener_logging_alloy_chart_repo: "helm-charts.logging.alloy.repository"
166168
logging_promtail_chart_version: "helm-charts.logging.promtail.version"
167169
logging_promtail_chart_repo: "helm-charts.logging.promtail.repository"
168170
gardener_logging_promtail_chart_version: "helm-charts.logging.promtail.version"

control-plane/roles/gardener-logging/README.md

Lines changed: 31 additions & 1 deletion
@@ -1,6 +1,12 @@
11
# gardener-logging
22

3-
This role deploys a promtail into a Gardener shooted seed. It is expected that the [logging role](../logging/) was deployed into the metal-stack control plane before executing this role.
3+
Deploys Alloy (replacing Promtail) into Gardener shooted seeds and optionally into the garden cluster itself. Alloy collects pod logs via the Kubernetes API and forwards them to the Loki instance in the metal-stack control plane.
4+
5+
Expects the [logging role](../logging/) to have been deployed first.
6+
7+
## Configuration
8+
9+
The Alloy River config is generated from structured variables at deploy time. Override individual variables to customize behavior, or bypass the template entirely with `gardener_logging_alloy_config_raw`.
410

511
## Variables
612

@@ -20,3 +26,27 @@ The following variables can be set to configure the role:
2026
| gardener_logging_ingress_loki_basic_auth_user | | The basic auth user for the external loki ingress |
2127
| gardener_logging_deploy_to_garden_cluster | | Deploys promtail also into the garden cluster |
2228
| gardener_logging_shooted_seeds | | Shooted seed names on which to deploy promtails that log to loki |
29+
30+
### Alloy
31+
32+
| Name | Mandatory | Description |
33+
| --- | --- | --- |
34+
| gardener_logging_alloy_chart_version | yes | Helm chart version for alloy (release vector) |
35+
| gardener_logging_alloy_chart_repo | yes | Repository for alloy (release vector) |
36+
| gardener_logging_alloy_port | | Alloy listen port (default: `12345`) |
37+
| gardener_logging_alloy_loki_write_endpoints | | List of Loki push endpoints. Each entry: `{url, basic_auth?: {username, password}}` (default: HTTPS to `gardener_logging_ingress_dns`) |
38+
| gardener_logging_alloy_cluster_label | | Value for the `cluster=` external label on all log streams (default: `gardener_logging_shooted_seed.name`) |
39+
| gardener_logging_alloy_meta_monitoring_enabled | | Create a `ServiceMonitor` for alloy metrics and forward alloy's own logs to Loki. Requires kube-prometheus-stack in the seed cluster first (default: `false`) |
40+
| gardener_logging_alloy_config_raw | | Full Alloy River config string override. When set, bypasses all structured vars above. |
41+
42+
## Migration from Promtail
43+
44+
Alloy replaces Promtail as the log collector. Promtail releases are still deployed in parallel during the transition period (`# TODO remove promtail` markers in the task files).
45+
46+
Key differences from the Promtail helm chart:
47+
48+
| Promtail | Alloy |
49+
| --- | --- |
50+
| `config.clients[].url` + `basic_auth` | `gardener_logging_alloy_loki_write_endpoints[].url` + `basic_auth` |
51+
| `-client.external-labels=cluster=…` extraArg | `gardener_logging_alloy_cluster_label` var → `external_labels` in River config |
52+
| `pipelineStages: [cri, docker]` | Not needed — `loki.source.kubernetes` uses the Kubernetes API |
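
The mapping in the table above can be sketched as an inventory override. This is only an illustration, assuming the variables behave as the README describes: the URLs, the label value, and the `vault_loki_password` variable are hypothetical placeholders, not values from this commit.

```yaml
# Hypothetical group_vars override for the gardener-logging role.
gardener_logging_alloy_loki_write_endpoints:
  # Endpoint with basic auth (replaces Promtail's config.clients[].url + basic_auth).
  - url: "https://loki.example.com/loki/api/v1/push"   # placeholder URL
    basic_auth:
      username: "promtail"
      password: "{{ vault_loki_password }}"            # hypothetical vault variable
  # Endpoint without basic auth: the template only emits a basic_auth
  # block when the key is defined on the entry.
  - url: "http://loki.example.internal:3100/loki/api/v1/push"  # placeholder URL

# Replaces the -client.external-labels=cluster=... extraArg.
gardener_logging_alloy_cluster_label: "my-shooted-seed"       # placeholder value
```

Leaving `gardener_logging_alloy_loki_write_endpoints` unset keeps the default, a single HTTPS endpoint built from `gardener_logging_ingress_dns` with the ingress basic auth credentials.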

control-plane/roles/gardener-logging/defaults/main.yaml

Lines changed: 18 additions & 0 deletions
@@ -7,6 +7,24 @@ gardener_logging_garden_name: "{{ gardener_defaults_garden_name }}"
77
gardener_logging_ingress_loki_basic_auth_user: promtail
88
gardener_logging_ingress_loki_basic_auth_password:
99

10+
# Loki push endpoints (same format as the partition alloy role)
11+
gardener_logging_alloy_loki_write_endpoints:
12+
- url: "https://{{ gardener_logging_ingress_dns }}/loki/api/v1/push"
13+
basic_auth:
14+
username: "{{ gardener_logging_ingress_loki_basic_auth_user }}"
15+
password: "{{ gardener_logging_ingress_loki_basic_auth_password }}"
16+
17+
# Value for the cluster= external label attached to all log streams
18+
gardener_logging_alloy_cluster_label: "{{ gardener_logging_shooted_seed.name }}"
19+
20+
# Enable meta-monitoring: expose alloy metrics via ServiceMonitor.
21+
# Requires kube-prometheus-stack to be deployed in the shooted seed cluster first.
22+
gardener_logging_alloy_meta_monitoring_enabled: false
23+
gardener_logging_alloy_port: 12345
24+
25+
# Full Alloy River config override. When set, bypasses the seed-alloy-config.alloy.j2 template.
26+
# gardener_logging_alloy_config_raw: |
27+
1028
gardener_logging_deploy_to_garden_cluster: true
1129
gardener_logging_shooted_seeds: []
1230
# - name: my-shooted-seed
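
A minimal sketch of how these defaults might be combined in an inventory, assuming the role deploys once per entry of `gardener_logging_shooted_seeds`. The seed names are hypothetical placeholders.

```yaml
# Hypothetical inventory values for the gardener-logging role.
gardener_logging_deploy_to_garden_cluster: true
gardener_logging_shooted_seeds:
  - name: my-shooted-seed        # placeholder seed name
  - name: my-other-seed          # placeholder seed name

# With the default
#   gardener_logging_alloy_cluster_label: "{{ gardener_logging_shooted_seed.name }}"
# each seed's log streams would carry cluster=<seed name>. For the garden
# cluster, the tasks set gardener_logging_shooted_seed.name to
# gardener_logging_garden_name, so that value would become the label there.
```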

control-plane/roles/gardener-logging/tasks/gardener-shooted-seed.yaml

Lines changed: 15 additions & 0 deletions
@@ -7,6 +7,21 @@
77
set_fact:
88
_shoot_kubeconfig: "{{ virtual_garden_kubeconfig | string | shoot_admin_kubeconfig('garden', gardener_logging_shooted_seed.name) | from_yaml }}"
99

10+
- name: Build alloy config
11+
set_fact:
12+
gardener_logging_alloy_config: "{{ lookup('template', 'seed-alloy-config.alloy.j2') if (gardener_logging_alloy_config_raw | default('') | length == 0) else gardener_logging_alloy_config_raw }}"
13+
14+
- name: Deploy alloy
15+
kubernetes.core.helm:
16+
name: alloy
17+
chart_repo_url: "{{ gardener_logging_alloy_chart_repo }}"
18+
chart_version: "{{ gardener_logging_alloy_chart_version }}"
19+
chart_ref: alloy
20+
namespace: "{{ gardener_logging_namespace }}"
21+
values: "{{ lookup('template', 'seed-alloy-values.yaml') | from_yaml }}"
22+
kubeconfig: "{{ _shoot_kubeconfig }}"
23+
create_namespace: true
24+
1025
- name: Deploy Promtail
1126
kubernetes.core.helm:
1227
name: promtail

control-plane/roles/gardener-logging/tasks/main.yaml

Lines changed: 24 additions & 0 deletions
@@ -9,6 +9,30 @@
99
that:
1010
- gardener_logging_promtail_chart_repo is defined
1111
- gardener_logging_promtail_chart_version is defined
12+
- gardener_logging_alloy_chart_repo is defined
13+
- gardener_logging_alloy_chart_version is defined
14+
15+
- name: Build alloy config for garden cluster
16+
set_fact:
17+
gardener_logging_alloy_config: "{{ lookup('template', 'seed-alloy-config.alloy.j2') if (gardener_logging_alloy_config_raw | default('') | length == 0) else gardener_logging_alloy_config_raw }}"
18+
when: gardener_logging_deploy_to_garden_cluster
19+
vars:
20+
gardener_logging_shooted_seed:
21+
name: "{{ gardener_logging_garden_name }}"
22+
23+
- name: Deploy alloy to garden cluster
24+
kubernetes.core.helm:
25+
name: alloy
26+
chart_repo_url: "{{ gardener_logging_alloy_chart_repo }}"
27+
chart_version: "{{ gardener_logging_alloy_chart_version }}"
28+
chart_ref: alloy
29+
namespace: "{{ gardener_logging_namespace }}"
30+
values: "{{ lookup('template', 'seed-alloy-values.yaml') | from_yaml }}"
31+
create_namespace: true
32+
when: gardener_logging_deploy_to_garden_cluster
33+
vars:
34+
gardener_logging_shooted_seed:
35+
name: "{{ gardener_logging_garden_name }}"
1236

1337
- name: Deploy Promtail
1438
kubernetes.core.helm:

seed-alloy-config.alloy.j2 (gardener-logging role template referenced by the tasks above; full path not shown in this view)

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
logging {
2+
level = "info"
3+
format = "logfmt"
4+
{% if gardener_logging_alloy_meta_monitoring_enabled %}
5+
write_to = [loki.relabel.alloy_self.receiver]
6+
{% endif %}
7+
}
8+
9+
{% if gardener_logging_alloy_meta_monitoring_enabled %}
10+
loki.relabel "alloy_self" {
11+
forward_to = [loki.write.default.receiver]
12+
13+
rule {
14+
target_label = "job"
15+
replacement = "alloy"
16+
}
17+
}
18+
{% endif %}
19+
20+
discovery.kubernetes "pods" {
21+
role = "pod"
22+
}
23+
24+
discovery.relabel "pods" {
25+
targets = discovery.kubernetes.pods.targets
26+
27+
rule {
28+
source_labels = ["__meta_kubernetes_namespace"]
29+
target_label = "namespace"
30+
}
31+
32+
rule {
33+
source_labels = ["__meta_kubernetes_pod_name"]
34+
target_label = "pod"
35+
}
36+
37+
rule {
38+
source_labels = ["__meta_kubernetes_pod_container_name"]
39+
target_label = "container"
40+
}
41+
42+
rule {
43+
source_labels = ["__meta_kubernetes_pod_label_app"]
44+
target_label = "app"
45+
}
46+
}
47+
48+
loki.source.kubernetes "pods" {
49+
targets = discovery.relabel.pods.output
50+
forward_to = [loki.write.default.receiver]
51+
}
52+
53+
loki.write "default" {
54+
{% for endpoint in gardener_logging_alloy_loki_write_endpoints %}
55+
endpoint {
56+
url = "{{ endpoint.url }}"
57+
{% if endpoint.basic_auth is defined %}
58+
basic_auth {
59+
username = "{{ endpoint.basic_auth.username }}"
60+
password = "{{ endpoint.basic_auth.password }}"
61+
}
62+
{% endif %}
63+
}
64+
{% endfor %}
65+
external_labels = {
66+
cluster = "{{ gardener_logging_alloy_cluster_label }}",
67+
}
68+
}

seed-alloy-values.yaml (gardener-logging role Helm values template referenced by the tasks above; full path not shown in this view)

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
# Source with all the defaults: https://raw.githubusercontent.com/grafana/alloy/main/operations/helm/charts/alloy/values.yaml
2+
alloy:
3+
configMap:
4+
# -- Create a new ConfigMap for the config file.
5+
create: true
6+
# -- Content to assign to the new ConfigMap. This is passed into `tpl` allowing for templating from values.
7+
content: |-
8+
{{ gardener_logging_alloy_config | indent(6) }}
9+
10+
# -- Port to listen for traffic on.
11+
listenPort: {{ gardener_logging_alloy_port }}
12+
13+
# -- Enables sending Grafana Labs anonymous usage stats to help improve Grafana
14+
# Alloy.
15+
enableReporting: false
16+
17+
controller:
18+
# -- Tolerations to apply to Grafana Alloy pods.
19+
tolerations:
20+
- key: node-role.kubernetes.io/master
21+
operator: Exists
22+
effect: NoSchedule
23+
- key: node-role.kubernetes.io/control-plane
24+
operator: Exists
25+
effect: NoSchedule
26+
27+
serviceMonitor:
28+
enabled: {{ gardener_logging_alloy_meta_monitoring_enabled | lower }}

control-plane/roles/logging/README.md

Lines changed: 46 additions & 6 deletions
@@ -1,11 +1,16 @@
11
# logging
22

3-
This role is designed to set up logging using Ansible.
4-
The role includes tasks to install and configure the following logging tools:
3+
Deploys the control-plane logging stack into the Kubernetes control-plane cluster.
54

6-
- Loki
7-
- Logging ingress for Loki
8-
- Promtail for monitoring the control plane cluster
5+
Components:
6+
7+
- **Loki** — log storage and query backend
8+
- **Alloy** — log collector (DaemonSet), replaces Promtail. Collects pod logs via the Kubernetes API (`loki.source.kubernetes`) and forwards them to Loki.
9+
- Loki ingress with optional TLS and basic auth
10+
11+
## Configuration
12+
13+
The Alloy River config is generated from structured variables at deploy time. Override individual variables to customize behavior, or bypass the template entirely with `logging_alloy_config_raw`.
914

1015
## Variables
1116

@@ -29,4 +34,39 @@ The following variables can be set to configure the role:
2934
| logging_ingress_loki_basic_auth_password_salt | | The basic auth password salt used for stable password hashes |
3035
| logging_ingress_loki_basic_auth_password | | The basic auth password for the external loki ingress |
3136
| logging_ingress_loki_basic_auth_user | | The basic auth user for the external loki ingress |
32-
| logging_alloy_config | | The config to use for alloy |
37+
38+
### Alloy
39+
40+
| Name | Mandatory | Description |
41+
| --- | --- | --- |
42+
| logging_alloy_chart_version | yes | Helm chart version for alloy (release vector) |
43+
| logging_alloy_chart_repo | yes | Repository for alloy (release vector) |
44+
| logging_alloy_port | | Alloy listen port (default: `12345`) |
45+
| logging_alloy_loki_write_endpoints | | List of Loki push endpoints. Each entry: `{url, basic_auth?: {username, password}}` (default: `http://loki:3100/loki/api/v1/push`) |
46+
| logging_alloy_cluster_label | | Value for the `cluster=` external label on all log streams (default: `{{ metal_control_plane_stage_name }}`) |
47+
| logging_alloy_eventrouter_enabled | | Include the eventrouter `stage.match` pipeline (default: `true`) |
48+
| logging_alloy_meta_monitoring_enabled | | Forward alloy's own logs to Loki and create a `ServiceMonitor` for metrics. Requires kube-prometheus-stack to be deployed first (default: `false`) |
49+
| logging_alloy_config_raw | | Full Alloy River config string override. When set, bypasses all structured vars above. |
50+
51+
## Meta-monitoring
52+
53+
When `logging_alloy_meta_monitoring_enabled: true`:
54+
55+
- Alloy's own internal logs are forwarded to Loki with `job=alloy`
56+
- The alloy chart creates a `ServiceMonitor` — kube-prometheus-stack picks it up automatically (no label selector required since `serviceMonitorSelectorNilUsesHelmValues: false`)
57+
58+
Deploy the monitoring role first, then set this to `true` and reapply.
59+
60+
## Migration from Promtail
61+
62+
Alloy replaces Promtail as the log collector. The `promtail` Helm release is still deployed in parallel during the transition period.
63+
64+
Key differences from the Promtail helm chart:
65+
66+
| Promtail | Alloy |
67+
| --- | --- |
68+
| `config.clients[].url` | `logging_alloy_loki_write_endpoints[].url` |
69+
| `-client.external-labels=cluster=…` extraArg | `logging_alloy_cluster_label` var → `external_labels` in River config |
70+
| `pipelineStages: [cri, docker]` | Not needed — `loki.source.kubernetes` uses the Kubernetes API, CRI framing is already stripped |
71+
| `pipelineStages: [match(eventrouter)]` | `logging_alloy_eventrouter_enabled: true` |
72+
| `serviceMonitor.enabled` commented out (monitoring dependency) | `logging_alloy_meta_monitoring_enabled: false` by default, same reason |
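
The two configuration paths described in this README can be sketched as follows. This is an illustration under stated assumptions, not the role's canonical usage: the `staging` label is a placeholder, and the raw River config is a trimmed-down variant assembled from the component blocks shown elsewhere in this commit.

```yaml
# Option A (hypothetical values): structured variables, rendered through the template.
logging_alloy_loki_write_endpoints:
  - url: "http://loki:3100/loki/api/v1/push"   # same as the role default
logging_alloy_cluster_label: "staging"          # placeholder label value
logging_alloy_eventrouter_enabled: true
# Only enable after the monitoring role (kube-prometheus-stack) is deployed.
logging_alloy_meta_monitoring_enabled: true
---
# Option B (alternative): raw override. When set, the template is bypassed.
logging_alloy_config_raw: |
  logging {
    level  = "info"
    format = "logfmt"
  }

  discovery.kubernetes "pods" {
    role = "pod"
  }

  loki.source.kubernetes "pods" {
    targets    = discovery.kubernetes.pods.targets
    forward_to = [loki.write.default.receiver]
  }

  loki.write "default" {
    endpoint {
      url = "http://loki:3100/loki/api/v1/push"
    }
    external_labels = {
      cluster = "staging",
    }
  }
```

Option B trades convenience for control: settings such as the eventrouter pipeline or the `cluster` label would then have to be written into the River config by hand, since the structured variables no longer feed into it.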

control-plane/roles/logging/defaults/main.yaml

Lines changed: 21 additions & 5 deletions
@@ -1,11 +1,27 @@
11
---
22
logging_namespace: monitoring
33
logging_alloy_port: 12345
4-
logging_alloy_config: |
5-
logging {
6-
level = "info"
7-
format = "logfmt"
8-
}
4+
5+
# Loki push endpoints (same format as the partition alloy role)
6+
logging_alloy_loki_write_endpoints:
7+
- url: "http://loki:3100/loki/api/v1/push"
8+
# basic_auth:
9+
# username: "promtail"
10+
# password: "secret"
11+
12+
# Value for the cluster= external label attached to all log streams
13+
logging_alloy_cluster_label: "{{ metal_control_plane_stage_name }}"
14+
15+
# Whether to include the eventrouter pipeline stage (parses eventrouter JSON and promotes namespace label)
16+
logging_alloy_eventrouter_enabled: true
17+
18+
# Enable meta-monitoring: forward alloy's own logs to Loki and expose metrics via ServiceMonitor.
19+
# Requires kube-prometheus-stack (monitoring role) to be deployed first.
20+
logging_alloy_meta_monitoring_enabled: false
21+
22+
# Full Alloy River config override. When set, bypasses all structured vars above.
23+
# logging_alloy_config_raw: |
24+
925
logging_ingress_dns: "loki.{{ metal_control_plane_ingress_dns }}"
1026
logging_ingress_loki_tls: yes
1127
logging_ingress_loki_basic_auth_user: promtail

control-plane/roles/logging/tasks/main.yaml

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,10 @@
3535
helm_chart_version: "{{ logging_chart_version }}"
3636
helm_value_file_template: "loki-values.yaml"
3737

38+
- name: Build alloy config
39+
set_fact:
40+
logging_alloy_config: "{{ lookup('template', 'alloy-config.alloy.j2') if (logging_alloy_config_raw | default('') | length == 0) else logging_alloy_config_raw }}"
41+
3842
- name: Deploy alloy
3943
include_role:
4044
name: ansible-common/roles/helm-chart
