Commit daafc7c

docs
1 parent 21018ed commit daafc7c

4 files changed

Lines changed: 248 additions & 675 deletions

docs/configuration/alertmanager-integration/grafana-cloud-mimir.rst

Lines changed: 10 additions & 35 deletions
@@ -35,7 +35,8 @@ You'll need credentials for Grafana API Access (used by both Robusta Runner and
 Find Your Cluster Name
 ^^^^^^^^^^^^^^^^^^^^^^
 
-The cluster name is used to identify your specific cluster in Prometheus queries:
+If your Grafana setup covers multiple clusters, the cluster name is required and is used to
+identify your specific cluster in Prometheus queries:
 
 1. Go to Grafana → Explore
 2. Run query: ``up{cluster!=""}``
@@ -52,10 +53,7 @@ Using the Grafana API, list your datasources:
     curl -H "Authorization: Bearer YOUR_GLSA_TOKEN" \
       "https://YOUR-INSTANCE.grafana.net/api/datasources" | jq
 
-Note the UIDs for:
-
-* Prometheus datasource UID (typically ``grafanacloud-prom``)
-* AlertManager datasource UID (typically ``grafanacloud-ngalertmanager``)
+Note the UID of the Prometheus datasource (typically ``grafanacloud-prom``).
 
 Step 2: Configure Robusta Runner
 =================================
@@ -114,39 +112,16 @@ Step 3: Configure Holmes Prometheus Toolset
 
 Holmes requires additional configuration to work with Grafana Cloud's Mimir backend.
 
-Update Holmes Configuration
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Add to your ``generated_values.yaml`` under the ``holmes`` section:
-
-.. code-block:: yaml
-
-    holmes:
-      enableHolmesGPT: true
-      additionalEnvVars:
-        - name: MODEL
-          value: YOUR_LLM_MODEL  # e.g., gpt-4o, azure/gpt-4o
-
-      # Holmes-specific toolsets configuration
-      toolsets:
-        prometheus/metrics:
-          enabled: true
-          config:
-            prometheus_url: https://YOUR-INSTANCE.grafana.net/api/datasources/proxy/uid/PROMETHEUS_DATASOURCE_UID
-            fetch_labels_with_labels_api: false  # Important for Mimir
-            fetch_metadata_with_series_api: true  # Important for Mimir
-            headers:
-              Authorization: Bearer YOUR_GLSA_TOKEN
-              # X-Scope-Org-Id is usually not needed when using proxy endpoint
-
-.. note::
+For detailed instructions on configuring Holmes with Grafana Cloud, see the **Grafana Cloud (Mimir) Configuration** section in :doc:`/configuration/holmesgpt/toolsets/prometheus`.
 
-    The ``fetch_labels_with_labels_api: false`` and ``fetch_metadata_with_series_api: true`` settings are important for compatibility with Mimir's API implementation.
+The key configuration points for Grafana Cloud are:
 
-Apply Holmes Configuration
-^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Use the proxy endpoint URL format: ``https://YOUR-INSTANCE.grafana.net/api/datasources/proxy/uid/PROMETHEUS_DATASOURCE_UID``
+* Set ``fetch_labels_with_labels_api: false`` (important for Mimir compatibility)
+* Set ``fetch_metadata_with_series_api: true`` (important for Mimir compatibility)
+* Use Bearer authentication with your service account token
 
-Apply the changes and restart Holmes:
+After updating your ``generated_values.yaml`` with the Holmes configuration, apply the changes:
 
 .. code-block:: bash
 
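The key configuration points in this file's diff can be sketched in Python. This is a minimal illustration, not the project's own tooling: it assumes each entry returned by ``/api/datasources`` carries ``type`` and ``uid`` fields, the helper names ``find_prometheus_uid`` and ``build_holmes_prometheus_config`` are hypothetical, and the ``headers`` key is taken from the YAML block removed in this commit.

```python
import json


def find_prometheus_uid(datasources):
    """Return the UID of the first datasource whose type is 'prometheus'.

    Assumes each entry has 'type' and 'uid' keys (an assumption about the
    shape of the /api/datasources response).
    """
    for ds in datasources:
        if ds.get("type") == "prometheus":
            return ds["uid"]
    raise LookupError("no Prometheus datasource found")


def build_holmes_prometheus_config(instance, uid, token):
    """Build the prometheus/metrics toolset config described in the diff."""
    return {
        # Proxy endpoint URL format from the key configuration points
        "prometheus_url": (
            f"https://{instance}.grafana.net/api/datasources/proxy/uid/{uid}"
        ),
        # Both flags are called out as important for Mimir compatibility
        "fetch_labels_with_labels_api": False,
        "fetch_metadata_with_series_api": True,
        # Bearer authentication with the service account token
        "headers": {"Authorization": f"Bearer {token}"},
    }


# Made-up, trimmed sample of a datasource listing (illustrative only)
sample_datasources = [
    {"name": "grafanacloud-logs", "type": "loki", "uid": "grafanacloud-logs"},
    {"name": "grafanacloud-prom", "type": "prometheus", "uid": "grafanacloud-prom"},
]

config = build_holmes_prometheus_config(
    "YOUR-INSTANCE", find_prometheus_uid(sample_datasources), "YOUR_GLSA_TOKEN"
)
print(json.dumps(config, indent=2))
```

The printed dictionary mirrors the ``config`` mapping that would sit under ``prometheus/metrics`` in ``generated_values.yaml``; the linked Holmes toolset page remains the authoritative reference for the exact keys.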

docs/configuration/resource-recommender.rst

Lines changed: 87 additions & 110 deletions
@@ -172,7 +172,7 @@ Click ``New API Key``. Choose a name for your key, and check the ``KRR Read`` ca
 .. image:: /images/krr-api-key.png
    :width: 500px
 
-GET /api/krr-recommendations
+GET /api/query/krr/recommendations
 ----------------------------------------------------
 
 Retrieves KRR resource recommendations for a specific cluster and namespace.
@@ -187,17 +187,17 @@ Retrieves KRR resource recommendations for a specific cluster and namespace.
      - Type
      - Description
      - Required
-   * - ``cluster``
+   * - ``account_id``
      - STRING
-     - The cluster name to get recommendations for
+     - The account ID associated with the API key
      - Yes
-   * - ``namespace``
+   * - ``cluster_id``
      - STRING
-     - The namespace to get recommendations for (use "*" for all namespaces)
+     - The cluster ID to get recommendations for
      - Yes
-   * - ``limit``
-     - INTEGER
-     - Maximum number of recommendations to return (default: 100)
+   * - ``namespace``
+     - STRING
+     - The namespace to filter recommendations (optional)
      - No
 
 **Request Headers**
@@ -217,7 +217,7 @@ Retrieves KRR resource recommendations for a specific cluster and namespace.
 
 .. code-block:: bash
 
-    curl -X GET "https://api.robusta.dev/api/krr-recommendations?cluster=my-cluster&namespace=default&limit=50" \
+    curl -X GET "https://api.robusta.dev/api/query/krr/recommendations?account_id=YOUR_ACCOUNT_ID&cluster_id=my-cluster&namespace=default" \
         -H "Authorization: Bearer YOUR_API_KEY" \
         -H "Content-Type: application/json"
 
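For readers calling the endpoint from code rather than ``curl``, the updated request can be assembled as follows. This is a sketch using only the Python standard library; the helper names ``build_krr_url`` and ``build_headers`` are hypothetical, and it assumes ``namespace`` may simply be omitted when not filtering, as the parameter table marks it optional.

```python
from urllib.parse import urlencode

# Endpoint path as introduced by this commit
BASE_URL = "https://api.robusta.dev/api/query/krr/recommendations"


def build_krr_url(account_id, cluster_id, namespace=None):
    """Build the request URL; namespace is only added when provided."""
    params = {"account_id": account_id, "cluster_id": cluster_id}
    if namespace is not None:
        params["namespace"] = namespace
    # urlencode percent-escapes values that are not URL-safe
    return f"{BASE_URL}?{urlencode(params)}"


def build_headers(api_key):
    """Request headers matching the curl example in the diff."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


url = build_krr_url("YOUR_ACCOUNT_ID", "my-cluster", "default")
headers = build_headers("YOUR_API_KEY")
print(url)
```

The resulting ``url`` and ``headers`` can be passed to any HTTP client; no request is sent by the sketch itself.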
@@ -226,125 +226,100 @@ Retrieves KRR resource recommendations for a specific cluster and namespace.
 .. code-block:: json
 
     {
-      "scans": [
+      "cluster_id": "my-cluster",
+      "scan_id": "12345-67890-abcde",
+      "scan_date": "2024-01-07T12:00:00Z",
+      "scan_state": "success",
+      "results": [
         {
-          "object": {
-            "cluster": "my-cluster",
-            "name": "nginx-deployment",
-            "container": "nginx",
-            "namespace": "default",
-            "kind": "Deployment",
-            "allocations": {
-              "requests": {
-                "cpu": 0.1,
-                "memory": 128
-              },
-              "limits": {
-                "cpu": 0.5,
-                "memory": 512
-              }
-            },
-            "warnings": [],
-            "current_pod_count": 3
-          },
-          "recommended": {
-            "requests": {
-              "cpu": {
-                "value": 0.05,
-                "severity": "WARNING"
-              },
-              "memory": {
-                "value": 64,
-                "severity": "OK"
-              }
-            },
-            "limits": {
-              "cpu": {
-                "value": 0.2,
-                "severity": "WARNING"
-              },
-              "memory": {
-                "value": 256,
-                "severity": "OK"
-              }
-            },
-            "info": {
-              "cpu": "CPU usage is consistently low",
-              "memory": "Memory usage is within acceptable range"
-            }
-          },
-          "severity": "WARNING",
-          "metrics": {
-            "cpu": {
-              "query": "avg(container_cpu_usage_seconds_total)",
-              "start_time": "2024-01-01T00:00:00Z",
-              "end_time": "2024-01-07T00:00:00Z",
-              "step": "1h"
-            },
-            "memory": {
-              "query": "avg(container_memory_usage_bytes)",
-              "start_time": "2024-01-01T00:00:00Z",
-              "end_time": "2024-01-07T00:00:00Z",
-              "step": "1h"
-            }
-          }
+          "cluster_id": "my-cluster",
+          "namespace": "default",
+          "name": "nginx-deployment",
+          "kind": "Deployment",
+          "container": "nginx",
+          "priority": "MEDIUM",
+          "current_cpu_request": 100,
+          "recommended_cpu_request": 50,
+          "current_cpu_limit": 500,
+          "recommended_cpu_limit": 200,
+          "current_memory_request": 134217728,
+          "recommended_memory_request": 67108864,
+          "current_memory_limit": 536870912,
+          "recommended_memory_limit": 268435456,
+          "pods_count": 3
         }
-      ],
-      "score": 75,
-      "resources": ["cpu", "memory"],
-      "description": "Resource recommendations based on 7-day usage analysis",
-      "strategy": {
-        "name": "simple",
-        "settings": {
-          "history_duration": 336,
-          "cpu_percentile": 99,
-          "memory_buffer_percentage": 15
-        }
-      }
+      ]
     }
 
 **Response Fields**
 
 .. list-table::
-   :widths: 25 15 60
+   :widths: 30 15 55
    :header-rows: 1
 
    * - Field
      - Type
      - Description
-   * - ``scans``
+   * - ``cluster_id``
+     - STRING
+     - The cluster ID for which recommendations were generated
+   * - ``scan_id``
+     - STRING
+     - Unique identifier for this KRR scan
+   * - ``scan_date``
+     - STRING
+     - Timestamp when the scan was completed (ISO 8601 format)
+   * - ``scan_state``
+     - STRING
+     - State of the scan (e.g., "success")
+   * - ``results``
      - ARRAY
-     - Array of KRR scan results for workloads
-   * - ``scans[].object.name``
+     - Array of KRR recommendations for workloads
+   * - ``results[].cluster_id``
+     - STRING
+     - Cluster ID of the workload
+   * - ``results[].namespace``
+     - STRING
+     - Namespace of the workload
+   * - ``results[].name``
      - STRING
      - Name of the Kubernetes workload
-   * - ``scans[].object.kind``
+   * - ``results[].kind``
      - STRING
      - Type of Kubernetes resource (Deployment, StatefulSet, etc.)
-   * - ``scans[].object.namespace``
-     - STRING
-     - Namespace of the workload
-   * - ``scans[].object.container``
+   * - ``results[].container``
      - STRING
      - Container name within the workload
-   * - ``scans[].object.allocations``
-     - OBJECT
-     - Current CPU/memory requests and limits
-   * - ``scans[].recommended.requests``
-     - OBJECT
-     - Recommended CPU/memory requests with severity
-   * - ``scans[].recommended.limits``
-     - OBJECT
-     - Recommended CPU/memory limits with severity
-   * - ``scans[].severity``
+   * - ``results[].priority``
      - STRING
-     - Overall severity: CRITICAL, WARNING, OK, GOOD, UNKNOWN
-   * - ``score``
+     - Priority level of the recommendation
+   * - ``results[].current_cpu_request``
+     - NUMBER
+     - Current CPU request in millicores
+   * - ``results[].recommended_cpu_request``
+     - NUMBER
+     - Recommended CPU request in millicores
+   * - ``results[].current_cpu_limit``
+     - NUMBER
+     - Current CPU limit in millicores
+   * - ``results[].recommended_cpu_limit``
+     - NUMBER
+     - Recommended CPU limit in millicores
+   * - ``results[].current_memory_request``
+     - NUMBER
+     - Current memory request in bytes
+   * - ``results[].recommended_memory_request``
+     - NUMBER
+     - Recommended memory request in bytes
+   * - ``results[].current_memory_limit``
+     - NUMBER
+     - Current memory limit in bytes
+   * - ``results[].recommended_memory_limit``
+     - NUMBER
+     - Recommended memory limit in bytes
+   * - ``results[].pods_count``
      - INTEGER
-     - Overall efficiency score (0-100)
-   * - ``strategy``
-     - OBJECT
-     - KRR strategy and settings used for recommendations
+     - Number of pods for this workload
 
 **Error Responses**
 
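The flattened response shape documented in this hunk (CPU in millicores, memory in bytes) lends itself to simple arithmetic. The sketch below parses a trimmed copy of the sample response and reports how far each recommended request sits below the current one. The function name ``summarize_result`` is hypothetical, and the code assumes every numeric field is present and non-null, which the documentation does not state explicitly.

```python
import json

# Trimmed copy of the sample response from the documentation
SAMPLE_RESPONSE = """
{
  "cluster_id": "my-cluster",
  "scan_state": "success",
  "results": [
    {
      "namespace": "default",
      "name": "nginx-deployment",
      "container": "nginx",
      "current_cpu_request": 100,
      "recommended_cpu_request": 50,
      "current_memory_request": 134217728,
      "recommended_memory_request": 67108864,
      "pods_count": 3
    }
  ]
}
"""

BYTES_PER_MIB = 1024 * 1024


def summarize_result(result):
    """Return the gap between current and recommended requests.

    CPU values are millicores and memory values are bytes, per the
    Response Fields table; memory is converted to MiB for readability.
    """
    cpu_delta = result["current_cpu_request"] - result["recommended_cpu_request"]
    mem_delta = (
        result["current_memory_request"] - result["recommended_memory_request"]
    )
    return {
        "workload": f'{result["namespace"]}/{result["name"]}/{result["container"]}',
        "cpu_request_reduction_millicores": cpu_delta,
        "memory_request_reduction_mib": mem_delta / BYTES_PER_MIB,
    }


response = json.loads(SAMPLE_RESPONSE)
summaries = [summarize_result(r) for r in response["results"]]
for summary in summaries:
    print(summary)
```

A positive reduction means the recommendation is lower than the current request; a negative value would indicate that KRR suggests raising it.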

@@ -361,15 +336,17 @@ Retrieves KRR resource recommendations for a specific cluster and namespace.
 .. code-block:: json
 
     {
-      "error": "Missing required parameter: cluster"
+      "msg": "Bad query parameters [error details]",
+      "error_code": "BAD_PARAMETER"
     }
 
-**404 Not Found**
+**500 Internal Server Error**
 
 .. code-block:: json
 
     {
-      "error": "Cluster 'my-cluster' not found or no data available"
+      "msg": "Failed to query KRR recommendations",
+      "error_code": "UNEXPECTED_ERROR"
     }
 
 Reference
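The new error bodies carry a ``msg`` and an ``error_code`` instead of a single ``error`` string, so client code that inspected the old key would need updating. One possible way to surface the new shape is sketched below; ``KrrApiError`` and ``parse_krr_body`` are hypothetical names, and treating every status of 400 or above as an error is an assumption rather than documented behavior.

```python
import json


class KrrApiError(Exception):
    """Raised when the API returns an error body with msg and error_code."""

    def __init__(self, status, error_code, msg):
        super().__init__(f"{status} {error_code}: {msg}")
        self.status = status
        self.error_code = error_code
        self.msg = msg


def parse_krr_body(status, body):
    """Decode a response body, raising KrrApiError for error statuses."""
    payload = json.loads(body)
    if status >= 400:
        raise KrrApiError(
            status,
            payload.get("error_code", "UNKNOWN"),
            payload.get("msg", ""),
        )
    return payload


# Error body copied from the 400 example in the diff
bad_request = '{"msg": "Bad query parameters [error details]", "error_code": "BAD_PARAMETER"}'

try:
    parse_krr_body(400, bad_request)
except KrrApiError as exc:
    print(f"request failed with {exc.error_code}: {exc.msg}")
```

Callers can branch on ``exc.error_code`` (for example ``BAD_PARAMETER`` versus ``UNEXPECTED_ERROR``) to decide whether a retry makes sense.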
