
Commit c5d4723

Merge branch 'master' into holmes-upgrade-bugfix
2 parents 5becffc + bcc7c1a commit c5d4723

7 files changed

Lines changed: 290 additions & 56 deletions

File tree

docs/configuration/alertmanager-integration/_testing_integration.rst

Lines changed: 33 additions & 2 deletions
@@ -3,12 +3,43 @@ Verify it Works

Send a dummy alert to AlertManager:

.. tab-set::

    .. tab-item:: Robusta CLI

        If you have the Robusta CLI installed, you can send a test alert using the following command:

        .. code-block:: bash

            robusta demo-alert

    .. tab-item:: Robusta UI

        In the Robusta UI, go to the "Clusters" tab, choose the right cluster and click "Simulate Alert".

        .. image:: /images/robusta-ui-simulate-alert-1.png
            :alt: Choose the cluster
            :width: 900
            :align: center

        Then:

        1. Check **Send alert with no resource**.
        2. Provide a name for the alert in the **Alert name (identifier)** field (e.g., "Testing Prod AlertManager").
        3. Select **Alert Manager** under the "Send alert to" section.
        4. Click the **Simulate Alert** button to send the test alert.

        .. image:: /images/robusta-ui-simulate-alert-2.png
            :alt: Send Test Alert
            :width: 600
            :align: center

If everything is set up properly, this alert will reach Robusta. It will show up in the Robusta UI, Slack, and other configured sinks.

.. note::

    It might take a few minutes for the alert to arrive due to AlertManager's ``group_wait`` and ``group_interval`` settings. More info `here <https://prometheus.io/docs/alerting/latest/configuration/#:~:text=How%20long%20to%20wait%20before%20sending%20a%20notification%20about%20new%20alerts%20that%0A%23%20are%20added%20to%20a%20group%20of%20alerts%20for%20which%20an%20initial%20notification%20has%0A%23%20already%20been%20sent>`_.

.. details:: I configured AlertManager, but I'm not receiving alerts?
    :class: warning
docs/configuration/alertmanager-integration/victoria-metrics.rst

Lines changed: 30 additions & 8 deletions
@@ -8,26 +8,48 @@ You will need to configure two integrations: one to send alerts to Robusta and a

Send Alerts to Robusta
============================

Add the following to your Victoria Metrics Alertmanager configuration (e.g., Helm values file or VMAlertmanagerConfig CRD):

.. code-block:: yaml

    receivers:
      - name: 'robusta'
        webhook_configs:
          - url: 'http://<ROBUSTA-HELM-RELEASE-NAME>-runner.<NAMESPACE>.svc.cluster.local/api/alerts'
            send_resolved: true # (3)

    route: # (1)
      routes:
        - receiver: 'robusta'
          group_by: [ '...' ]
          group_wait: 1s
          group_interval: 1s
          matchers:
            - severity =~ ".*"
          repeat_interval: 4h
          continue: true # (2)

.. code-annotations::
    1. Put Robusta's route as the first route, to guarantee it receives alerts. If you can't do so, you must ensure that all preceding routes have ``continue: true`` set.
    2. Keep sending alerts to receivers defined after Robusta.
    3. Important, so Robusta knows when alerts are resolved.


.. include:: ./_testing_integration.rst


Configure Metrics Querying
====================================

Robusta can query metrics and create silences using Victoria Metrics. If both are in the same Kubernetes cluster, Robusta can auto-detect the Victoria Metrics service. To verify, go to the "Apps" tab in Robusta, select an application, and check for usage graphs.

If auto-detection fails, add the ``prometheus_url`` parameter and :ref:`update Robusta <Simple Upgrade>`:

.. code-block:: yaml

    globalConfig: # this line should already exist
        # add the lines below
        alertmanager_url: "http://<VM_ALERT_MANAGER_SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:9093" # Example: "http://vmalertmanager-victoria-metrics-vm.default.svc.cluster.local:9093/"
        prometheus_url: "http://<VM_METRICS_SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:8429" # Example: "http://vmsingle-vmks-victoria-metrics-k8s-stack.default.svc.cluster.local:8429"
        # Add any labels that are relevant to the specific cluster (optional)
        # prometheus_additional_labels:
        #   cluster: 'CLUSTER_NAME_HERE'
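For context on what the ``webhook_configs`` receiver above actually delivers: AlertManager posts a JSON notification (webhook payload version "4") to the configured URL, and with ``send_resolved: true`` it posts again when the alert clears. The sketch below is a minimal illustration of that payload shape with made-up label values, not Robusta's internal schema.

```python
# Sketch of the JSON body AlertManager POSTs to a webhook receiver.
# Values such as "HighCPU" are illustrative placeholders.
import json


def make_webhook_payload(status: str) -> dict:
    """Illustrative AlertManager webhook notification (payload version '4')."""
    return {
        "version": "4",
        "status": status,  # "firing" or "resolved" for the whole group
        "receiver": "robusta",
        "groupLabels": {"alertname": "HighCPU"},
        "alerts": [
            {
                "status": status,
                "labels": {"alertname": "HighCPU", "severity": "warning"},
                "annotations": {"summary": "CPU usage above threshold"},
                "startsAt": "2024-01-01T00:00:00Z",
            }
        ],
    }


firing = make_webhook_payload("firing")
resolved = make_webhook_payload("resolved")  # only sent when send_resolved is true
print(json.dumps(firing, indent=2))
```

This is why annotation (3) matters: without ``send_resolved``, the "resolved" notification is never sent and Robusta cannot mark the alert as cleared.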

docs/configuration/holmesgpt/toolsets/grafanaloki.rst

Lines changed: 101 additions & 17 deletions
@@ -1,10 +1,10 @@

.. _toolset_grafana_loki:

Loki
====

By enabling this toolset, HolmesGPT will fetch node and pod logs from `Loki <https://grafana.com/oss/loki/>`_.
Loki can be accessed directly or by proxying through a `Grafana <https://grafana.com/oss/grafana/>`_ instance.

You **should** enable this toolset to replace the default :ref:`kubernetes/logs <toolset_kubernetes_logs>`
toolset if all your kubernetes/pod logs are consolidated inside Loki. It will make it easier for HolmesGPT
@@ -13,8 +13,14 @@ to fetch incident logs, including the ability to precisely consult past logs.

.. include:: ./_toolsets_that_provide_logging.inc.rst


Proxying through Grafana
^^^^^^^^^^^^^^^^^^^^^^^^

This is the recommended approach because we intend to add more capabilities to the toolset that are only available with Grafana.

Prerequisites
-------------

A `Grafana service account token <https://grafana.com/docs/grafana/latest/administration/service-accounts/>`_
with the following permissions:
@@ -24,8 +30,7 @@ with the following permissions:

Check out this `video <https://www.loom.com/share/f969ab3af509444693802254ab040791?sid=aa8b3c65-2696-4f69-ae47-bb96e8e03c47>`_ on creating a Grafana service account token.

**Getting Grafana URL**

You can find the Grafana URL required for Loki in your Grafana cloud account settings.
@@ -34,8 +39,7 @@ You can find the Grafana URL required for Loki in your Grafana cloud account set

    :align: center


**Obtaining the datasource UID**

You may have multiple Loki data sources set up in Grafana. HolmesGPT uses a single Loki datasource to
fetch the logs and it needs to know the UID of this datasource.
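Grafana exposes its configured datasources (including their UIDs) through its HTTP API at ``GET /api/datasources``, authenticated with the service account token as a Bearer token. The sketch below filters such a response for the Loki entry; the Grafana URL, token, and sample data are placeholders for illustration.

```python
# Sketch: locate the Loki datasource UID in Grafana's GET /api/datasources response.
# The URL/token are placeholders; the sample list mimics the response shape.
import urllib.request
from typing import Optional


def datasources_request(grafana_url: str, api_key: str) -> urllib.request.Request:
    """Authorized request for Grafana's datasource list (Bearer token auth)."""
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def find_loki_uid(datasources: list) -> Optional[str]:
    """Return the UID of the first datasource with type 'loki', if any."""
    for ds in datasources:
        if ds.get("type") == "loki":
            return ds.get("uid")
    return None


# Trimmed example of what the API returns (illustrative values):
sample = [
    {"name": "Prometheus", "type": "prometheus", "uid": "prom-uid"},
    {"name": "Loki", "type": "loki", "uid": "klja8hsa-8a9c-4b35-1230-7baab22b02ee"},
]
print(find_loki_uid(sample))  # klja8hsa-8a9c-4b35-1230-7baab22b02ee
```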
@@ -74,8 +78,8 @@ A simple way to get the datasource UID is to access the Grafana API by running t

    # with UID "klja8hsa-8a9c-4b35-1230-7baab22b02ee"


Configuration (grafana proxy)
-----------------------------


.. md-tab-set::
@@ -92,9 +96,6 @@ Configuration

                            api_key: <your grafana API key>
                            url: https://xxxxxxx.grafana.net # Your Grafana cloud account URL
                            grafana_datasource_uid: <the UID of the loki data source in Grafana>

                    kubernetes/logs:
                        enabled: false # Disable HolmesGPT's default logging mechanism
@@ -115,19 +116,102 @@ Configuration

                            api_key: <your grafana API key>
                            url: https://xxxxxxx.grafana.net # Your Grafana cloud account URL
                            grafana_datasource_uid: <the UID of the loki data source in Grafana>

                    kubernetes/logs:
                        enabled: false # Disable HolmesGPT's default logging mechanism


Direct connection
^^^^^^^^^^^^^^^^^

The toolset can connect directly to a Loki instance without proxying through a Grafana instance.
To do so, leave the ``grafana_datasource_uid`` field unset; when it is absent, HolmesGPT
assumes it is connecting directly to Loki.

Configuration (direct connection)
---------------------------------

.. md-tab-set::

    .. md-tab-item:: Robusta Helm Chart

        .. code-block:: yaml

            holmes:
                toolsets:
                    grafana/loki:
                        enabled: true
                        config:
                            url: http://loki.logging
                            headers:
                                X-Scope-OrgID: "<tenant id>" # Set the X-Scope-OrgID if Loki multitenancy is enabled

                    kubernetes/logs:
                        enabled: false # Disable HolmesGPT's default logging mechanism

        .. include:: ./_toolset_configuration.inc.rst

    .. md-tab-item:: Holmes CLI

        Add the following to **~/.holmes/config.yaml**, creating the file if it doesn't exist:

        .. code-block:: yaml

            toolsets:
                grafana/loki:
                    enabled: true
                    config:
                        url: http://loki.logging
                        headers:
                            X-Scope-OrgID: "<tenant id>" # Set the X-Scope-OrgID if Loki multitenancy is enabled

                kubernetes/logs:
                    enabled: false # Disable HolmesGPT's default logging mechanism


Advanced configuration
^^^^^^^^^^^^^^^^^^^^^^

**Search labels**

You can tweak the labels used by the toolset to identify kubernetes resources. This is only needed if your
Loki labels for ``pod`` and ``namespace`` differ from the defaults in the examples below.

.. md-tab-set::

    .. md-tab-item:: Robusta Helm Chart

        .. code-block:: yaml

            holmes:
                toolsets:
                    grafana/loki:
                        enabled: true
                        config:
                            url: ...
                            labels:
                                pod: "pod"
                                namespace: "namespace"

        .. include:: ./_toolset_configuration.inc.rst

    .. md-tab-item:: Holmes CLI

        Add the following to **~/.holmes/config.yaml**, creating the file if it doesn't exist:

        .. code-block:: yaml

            toolsets:
                grafana/loki:
                    enabled: true
                    config:
                        url: ...
                        labels:
                            pod: "pod"
                            namespace: "namespace"


Use the following commands to list Loki's labels and determine which ones to use:

.. code-block:: bash