diff --git a/docs/configuration/holmesgpt/builtin_toolsets.rst b/docs/configuration/holmesgpt/builtin_toolsets.rst index dd8b131a6..d58bc9bb1 100644 --- a/docs/configuration/holmesgpt/builtin_toolsets.rst +++ b/docs/configuration/holmesgpt/builtin_toolsets.rst @@ -19,7 +19,8 @@ Builtin Toolsets toolsets/kafka toolsets/kubernetes toolsets/notion - toolsets/opensearch + toolsets/opensearch_logs + toolsets/opensearch_status toolsets/prometheus toolsets/rabbitmq toolsets/robusta @@ -105,9 +106,14 @@ by the user by providing credentials or API keys to external systems. :link: toolsets/notion :link-type: doc - .. grid-item-card:: :octicon:`cpu;1em;` OpenSearch + .. grid-item-card:: :octicon:`cpu;1em;` OpenSearch logs :class-card: sd-bg-light sd-bg-text-light - :link: toolsets/opensearch + :link: toolsets/opensearch_logs + :link-type: doc + + .. grid-item-card:: :octicon:`cpu;1em;` OpenSearch status + :class-card: sd-bg-light sd-bg-text-light + :link: toolsets/opensearch_status :link-type: doc .. grid-item-card:: :octicon:`cpu;1em;` Prometheus diff --git a/docs/configuration/holmesgpt/index.rst b/docs/configuration/holmesgpt/index.rst index cd848498e..337245d92 100644 --- a/docs/configuration/holmesgpt/index.rst +++ b/docs/configuration/holmesgpt/index.rst @@ -365,7 +365,7 @@ Builtin toolsets Built-in toolsets cover essential areas like pod status inspection, node health analysis, application diagnostics, and resource utilization monitoring. These toolsets include access to -Kubernetes events and logs, AWS, Grafana, Opensearch, etc. See the full list :doc:`here `. +Kubernetes events and logs, AWS, Grafana, OpenSearch, etc. See the full list :doc:`here `. 
Custom toolsets ---------------- diff --git a/docs/configuration/holmesgpt/toolsets/_disable_default_logging_toolset.inc.rst b/docs/configuration/holmesgpt/toolsets/_disable_default_logging_toolset.inc.rst new file mode 100644 index 000000000..3f976c8df --- /dev/null +++ b/docs/configuration/holmesgpt/toolsets/_disable_default_logging_toolset.inc.rst @@ -0,0 +1,31 @@ + +Disabling the Default Logging Toolset +***************************************** + +The default HolmesGPT logging tool **must** be disabled if you use a different datasource for logs. +HolmesGPT may still use kubectl to fetch logs and never call your datasource if ``kubernetes/logs`` is not disabled. +To disable the default logging toolset, add the following to your holmes configuration: + +.. md-tab-set:: + + .. md-tab-item:: Robusta Helm Chart + + .. code-block:: yaml + + holmes: + toolsets: + kubernetes/logs: + enabled: false + + + .. include:: ./_toolset_configuration.inc.rst + + .. md-tab-item:: Holmes CLI + + Add the following to **~/.holmes/config.yaml**, creating the file if it doesn't exist: + + .. code-block:: yaml + + toolsets: + kubernetes/logs: + enabled: false \ No newline at end of file diff --git a/docs/configuration/holmesgpt/toolsets/_toolsets_that_provide_logging.inc.rst b/docs/configuration/holmesgpt/toolsets/_toolsets_that_provide_logging.inc.rst index 2f5866198..95ff64816 100644 --- a/docs/configuration/holmesgpt/toolsets/_toolsets_that_provide_logging.inc.rst +++ b/docs/configuration/holmesgpt/toolsets/_toolsets_that_provide_logging.inc.rst @@ -1,5 +1,6 @@ HolmesGPT provides several out-of-the-box alternatives for log access. You can select from these options: -* :ref:`kubernetes/logs `: Access logs with ``kubectl logs`` commands. **This is the default toolset.** +* :ref:`kubernetes/logs `: Access logs directly through Kubernetes. **This is the default toolset.** * :ref:`coralogix/logs `: Access logs through Coralogix. 
* :ref:`grafana/loki `: Access Loki logs by proxying through a Grafana instance. +* :ref:`opensearch/logs `: Access logs through OpenSearch. diff --git a/docs/configuration/holmesgpt/toolsets/coralogix_logs.rst b/docs/configuration/holmesgpt/toolsets/coralogix_logs.rst index 651a7be3c..531f9f55c 100644 --- a/docs/configuration/holmesgpt/toolsets/coralogix_logs.rst +++ b/docs/configuration/holmesgpt/toolsets/coralogix_logs.rst @@ -3,10 +3,10 @@ Coralogix logs ============== -By enabling this toolset, HolmesGPT will fetch node and pods logs from `Coralogix `_. +By enabling this toolset, HolmesGPT will fetch pod logs from `Coralogix `_. You **should** enable this toolset to replace the default :ref:`kubernetes/logs ` -toolset if all your kubernetes/pod logs are consolidated inside Coralogix. It will make it easier for HolmesGPT +toolset if all your kubernetes pod logs are consolidated inside Coralogix. It will make it easier for HolmesGPT to fetch incident logs, including the ability to precisely consult past logs. @@ -43,7 +43,7 @@ Configuration team_hostname: my-team # Your team's hostname in coralogix, without the domain part kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled .. include:: ./_toolset_configuration.inc.rst @@ -63,13 +63,13 @@ Configuration team_hostname: my-team # Your team's hostname in coralogix kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled Advanced Configuration ^^^^^^^^^^^^^^^^^^^^^^ Frequent logs and archive -**************************** +************************* By default, holmes fetched the logs from the `Frequent search `_ tier and only fetch logs from the `Archive` tier if the frequent search returned no result. 
@@ -100,10 +100,10 @@ Here is a description of each possible log retrieval methodology: - **BOTH_FREQUENT_SEARCH_AND_ARCHIVE** Always use both the frequent search and the archive to fetch logs. The result contains merged data which is deduplicated and sorted by timestamp. Search labels -*************** +************* You can tweak the labels used by the toolset to identify kubernetes resources. This is **optional** and only needed if your -logs settings for ``pod``, ``namespace``, ``application`` and ``subsystem`` differ from the defaults in the example below. +logs settings for ``pod`` and ``namespace`` differ from the defaults in the example below. .. code-block:: yaml @@ -114,8 +114,6 @@ logs settings for ``pod``, ``namespace``, ``application`` and ``subsystem`` diff labels: # OPTIONAL: tweak the filters used by HolmesGPT if your coralogix configuration is non standard namespace: "kubernetes.namespace_name" pod: "kubernetes.pod_name" - application: "coralogix.metadata.applicationName" - subsystem: "coralogix.metadata.subsystemName" ... @@ -126,19 +124,8 @@ You can verify what labels to use by attempting to run a query in the coralogix :align: center -Disabling the default toolset -********************************* -If Coralogix is your primary datasource for logs, it is **advised** to disable the default HolmesGPT logging -tool by disabling the ``kubernetes/logs`` toolset. Without this. HolmesGPT may still use kubectl to -fetch logs instead of Coralogix. - -.. code-block:: yaml - - holmes: - toolsets: - kubernetes/logs: - enabled: false +.. 
include:: ./_disable_default_logging_toolset.inc.rst Capabilities @@ -152,5 +139,5 @@ Capabilities * - Tool Name - Description - * - fetch_coralogix_logs_for_resource + * - fetch_pod_logs - Retrieve logs using coralogix diff --git a/docs/configuration/holmesgpt/toolsets/grafanaloki.rst b/docs/configuration/holmesgpt/toolsets/grafanaloki.rst index fb7c9b5c6..7b421a7be 100644 --- a/docs/configuration/holmesgpt/toolsets/grafanaloki.rst +++ b/docs/configuration/holmesgpt/toolsets/grafanaloki.rst @@ -3,7 +3,7 @@ Loki ==== -By enabling this toolset, HolmesGPT will fetch node and pods logs from `Loki `_. +By enabling this toolset, HolmesGPT will fetch pod logs from `Loki `_. Loki can be accessed directly or by proxying through a `Grafana `_ instance. You **should** enable this toolset to replace the default :ref:`kubernetes/logs ` @@ -20,7 +20,7 @@ Proxying through Grafana This is the recommended approach because we intend to add more capabilities to the toolset that are only available with Grafana. Prerequisites -------------- +************* A `Grafana service account token `_ with the following permissions: @@ -79,7 +79,7 @@ A simple way to get the datasource UID is to access the Grafana API by running t Configuration (grafana proxy) ------------------------------ +***************************** .. md-tab-set:: @@ -98,7 +98,7 @@ Configuration (grafana proxy) grafana_datasource_uid: kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled .. include:: ./_toolset_configuration.inc.rst @@ -118,7 +118,7 @@ Configuration (grafana proxy) grafana_datasource_uid: kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled Direct connection ^^^^^^^^^^^^^^^^^ @@ -128,7 +128,7 @@ This is done by not setting the ``grafana_datasource_uid`` field. 
Not setting th assume that it is directly connecting to Loki. Configuration (direct connection) ---------------------------------- +********************************* .. md-tab-set:: @@ -146,7 +146,7 @@ Configuration (direct connection) X-Scope-OrgID: "" # Set the X-Scope-OrgID if loki multitenancy is enabled kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled .. include:: ./_toolset_configuration.inc.rst @@ -166,13 +166,14 @@ Configuration (direct connection) X-Scope-OrgID: "" # Set the X-Scope-OrgID if loki multitenancy is enabled kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism + enabled: false # HolmesGPT's default logging mechanism MUST be disabled Advanced configuration ^^^^^^^^^^^^^^^^^^^^^^ -**Search labels** +Search labels +************* You can tweak the labels used by the toolset to identify kubernetes resources. This is only needed if your Loki logs settings for ``pod``, and ``namespace`` differ from the defaults in the example above. @@ -223,19 +224,7 @@ Use the following commands to list Loki's labels and determine which ones to use curl http://localhost:3100/loki/api/v1/labels -**Disabling the default toolset** - -If Loki is your primary datasource for logs, it is **advised** to disable the default HolmesGPT logging -tool by disabling the ``kubernetes/logs`` toolset. Without this. HolmesGPT may still use kubectl to -fetch logs instead of Loki. - -.. code-block:: yaml - - holmes: - toolsets: - kubernetes/logs: - enabled: false - +.. 
include:: ./_disable_default_logging_toolset.inc.rst Capabilities ^^^^^^^^^^^^ @@ -248,7 +237,5 @@ Capabilities * - Tool Name - Description - * - fetch_loki_logs_for_resource - - Fetches the Loki logs for a given kubernetes resource - * - fetch_loki_logs - - Fetches Loki logs from any query + * - fetch_pod_logs + - Fetches pod logs \ No newline at end of file diff --git a/docs/configuration/holmesgpt/toolsets/grafanatempo.rst b/docs/configuration/holmesgpt/toolsets/grafanatempo.rst index 96bb9a1bc..d736906b4 100644 --- a/docs/configuration/holmesgpt/toolsets/grafanatempo.rst +++ b/docs/configuration/holmesgpt/toolsets/grafanatempo.rst @@ -135,9 +135,6 @@ Configuration (direct connection) headers: X-Scope-OrgID: "" # Set the X-Scope-OrgID if tempo multitenancy is enabled - kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism - .. include:: ./_toolset_configuration.inc.rst @@ -155,9 +152,6 @@ Configuration (direct connection) headers: X-Scope-OrgID: "" # Set the X-Scope-OrgID if tempo multitenancy is enabled - kubernetes/logs: - enabled: false # Disable HolmesGPT's default logging mechanism - Advanced configuration ^^^^^^^^^^^^^^^^^^^^^^ diff --git a/docs/configuration/holmesgpt/toolsets/kubernetes.rst b/docs/configuration/holmesgpt/toolsets/kubernetes.rst index b2cee3f82..dbe2f39ff 100644 --- a/docs/configuration/holmesgpt/toolsets/kubernetes.rst +++ b/docs/configuration/holmesgpt/toolsets/kubernetes.rst @@ -6,7 +6,7 @@ Core :checkmark:`_` -------------------- .. include:: ./_toolset_enabled_by_default.inc.rst -By enabling this toolset, HolmesGPT will be able to describe and find kubernetes resources like +By enabling this toolset, HolmesGPT will be able to describe and find Kubernetes resources like nodes, deployments, pods, etc. Configuration @@ -90,22 +90,8 @@ Capabilities * - Tool Name - Description - * - kubectl_previous_logs - - Run `kubectl logs --previous` on a single Kubernetes pod. 
Used to fetch logs for a pod that crashed and see logs from before the crash. Never give a deployment name or a resource that is not a pod. - * - kubectl_previous_logs_all_containers - - Run `kubectl logs --previous` on a single Kubernetes pod. Used to fetch logs for a pod that crashed and see logs from before the crash. - * - kubectl_container_previous_logs - - Run `kubectl logs --previous` on a single container of a Kubernetes pod. Used to fetch logs for a pod that crashed and see logs from before the crash. - * - kubectl_logs - - Run `kubectl logs` on a single Kubernetes pod. Never give a deployment name or a resource that is not a pod. - * - kubectl_logs_all_containers - - Run `kubectl logs` on all containers within a single Kubernetes pod. - * - kubectl_container_logs - - Run `kubectl logs` on a single container within a Kubernetes pod. This is to get the logs of a specific container in a multi-container pod. - * - kubectl_logs_grep - - Search for a specific term in the logs of a single Kubernetes pod. Only provide a pod name, not a deployment or other resource. - * - kubectl_logs_all_containers_grep - - kubectl logs {{pod_name}} -n {{ namespace }} --all-containers | grep {{ search_term }} + * - fetch_pod_logs + - Fetches logs from a Kubernetes pod Live metrics diff --git a/docs/configuration/holmesgpt/toolsets/opensearch_logs.rst b/docs/configuration/holmesgpt/toolsets/opensearch_logs.rst new file mode 100644 index 000000000..1f4a612e8 --- /dev/null +++ b/docs/configuration/holmesgpt/toolsets/opensearch_logs.rst @@ -0,0 +1,134 @@ +.. _toolset_opensearch_logs: + +OpenSearch logs +=============== + +By enabling this toolset, HolmesGPT will fetch pod logs from `OpenSearch `_. + +You **should** enable this toolset to replace the default :ref:`kubernetes/logs ` +toolset if all your Kubernetes pod logs are consolidated inside OpenSearch/Elastic. It will make it easier for HolmesGPT +to fetch incident logs, including the ability to precisely consult past logs. 
+ + +.. include:: ./_toolsets_that_provide_logging.inc.rst + +Configuration +^^^^^^^^^^^^^ + +.. md-tab-set:: + + .. md-tab-item:: Robusta Helm Chart + + .. code-block:: yaml + + holmes: + toolsets: + opensearch/logs: + enabled: true + config: + opensearch_url: https://skdjasid.europe-west1.gcp.cloud.es.io:443 # The URL to your opensearch cluster. + index_pattern: fluentd-* # The pattern matching the indexes containing the logs. Supports wildcards + opensearch_auth_header: "ApiKey b0ZlwQWEsdwAkv047bafirkallDFWJIWDWdwlQQ==" # An optional header value set to the `Authorization` header for every request to opensearch. + labels: # set the labels according to how values are mapped in your opensearch cluster + pod: "kubernetes.pod_name" + namespace: "kubernetes.namespace_name" + timestamp: "@timestamp" + message: "message" + + kubernetes/logs: + enabled: false # HolmesGPT's default logging mechanism MUST be disabled + + + .. include:: ./_toolset_configuration.inc.rst + + .. md-tab-item:: Holmes CLI + + Add the following to **~/.holmes/config.yaml**, creating the file if it doesn't exist: + + .. code-block:: yaml + + toolsets: + opensearch/logs: + enabled: true + config: + opensearch_url: + index_pattern: # The pattern matching the indexes containing the logs. Supports wildcards. For example `fluentd-*` + opensearch_auth_header: "ApiKey <...>" # An optional header value set to the `Authorization` header for every request to opensearch + labels: # set the labels according to how values are mapped in your opensearch cluster + pod: "kubernetes.pod_name" + namespace: "kubernetes.namespace_name" + timestamp: "@timestamp" + message: "message" + + kubernetes/logs: + enabled: false # HolmesGPT's default logging mechanism MUST be disabled + +Configuring index_pattern and labels +************************************ + +You can tweak the labels used by the toolset to identify kubernetes resources. 
This is **optional** and only needed if your +logs settings differ from the defaults in the example below. + +.. code-block:: yaml + + toolsets: + opensearch/logs: + enabled: true + config: + index_pattern: fluentd-* + labels: + pod: "kubernetes.pod_name" + namespace: "kubernetes.namespace_name" + timestamp: "@timestamp" + message: "message" + + +The screenshot below shows a query run in the Elastic dev tools to find the values to use for ``index_pattern`` and the labels. + +.. image:: /images/opensearch_toolset_labels_example.png + :width: 600 + :align: center + +In the image above, the following values and labels are identified by a yellow rectangle: + + +.. list-table:: + :header-rows: 1 + :widths: 20 20 60 + + * - Configuration field + - Value + - Description + * - index_pattern + - fluentd-* + - The OpenSearch indexes to fetch logs from + * - pod + - kubernetes.pod_name + - The Kubernetes pod name + * - namespace + - kubernetes.namespace_name + - The Kubernetes namespace + * - timestamp + - @timestamp + - The timestamp used to search logs by time range + * - message + - message + - The content of the log message + + +.. include:: ./_disable_default_logging_toolset.inc.rst + + +Capabilities +^^^^^^^^^^^^ + +.. include:: ./_toolset_capabilities.inc.rst + +.. 
list-table:: + :header-rows: 1 + :widths: 30 70 + + * - Tool Name + - Description + * - fetch_pod_logs + - Retrieve logs using opensearch diff --git a/docs/configuration/holmesgpt/toolsets/opensearch.rst b/docs/configuration/holmesgpt/toolsets/opensearch_status.rst similarity index 92% rename from docs/configuration/holmesgpt/toolsets/opensearch.rst rename to docs/configuration/holmesgpt/toolsets/opensearch_status.rst index a0d5e4859..cc0a5ac05 100644 --- a/docs/configuration/holmesgpt/toolsets/opensearch.rst +++ b/docs/configuration/holmesgpt/toolsets/opensearch_status.rst @@ -1,5 +1,5 @@ -Opensearch -========== +OpenSearch status +================= By enabling this toolset, HolmesGPT will be able to access cluster metadata information like health, shards, and settings. This allows HolmesGPT to better troubleshoot problems @@ -8,11 +8,11 @@ with one or more opensearch clusters. Configuration ------------- -The configuration for Opensearch is passed through to the underlying +The configuration for OpenSearch is passed through to the underlying `opensearch-py library `_. Consult this library's `user guide `_ or `reference documentation `_ -for configuring the connection to Opensearch, including how to authenticate this toolset to an opensearch cluster. +for configuring the connection to OpenSearch, including how to authenticate this toolset to an opensearch cluster. .. code-block:: yaml @@ -35,7 +35,7 @@ for configuring the connection to Opensearch, including how to authenticate this username: password: -Here is an example of an insecure Opensearch configuration for local development using a bearer token: +Here is an example of an insecure OpenSearch configuration for local development using a bearer token: .. 
code-block:: yaml diff --git a/docs/images/opensearch_toolset_labels_example.png b/docs/images/opensearch_toolset_labels_example.png new file mode 100644 index 000000000..4f5e75024 Binary files /dev/null and b/docs/images/opensearch_toolset_labels_example.png differ
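
Illustration only, not part of the patch: the ``labels`` mapping documented in ``opensearch_logs.rst`` above names the document fields that hold the pod, namespace, timestamp and message. The sketch below shows how such a mapping could be turned into an OpenSearch query DSL body that filters by pod, namespace and time range. The helper ``build_pod_log_query`` and its parameters are hypothetical; nothing in this diff shows how HolmesGPT actually builds its queries, so treat this as a way to reason about the configuration rather than as the toolset's implementation.

```python
from datetime import datetime, timedelta, timezone

# Label mapping copied from the documented defaults in opensearch_logs.rst.
LABELS = {
    "pod": "kubernetes.pod_name",
    "namespace": "kubernetes.namespace_name",
    "timestamp": "@timestamp",
    "message": "message",
}


def build_pod_log_query(labels, pod, namespace, start, end, limit=100):
    """Hypothetical helper: build a query DSL body for one pod's logs.

    Assumes the pod and namespace fields are indexed so that exact-match
    ``term`` filters work (for example as keyword fields).
    """
    ts_field = labels["timestamp"]
    return {
        "size": limit,
        # Oldest entries first, ordered by the configured timestamp field.
        "sort": [{ts_field: {"order": "asc"}}],
        # Only return the fields needed to render a log line.
        "_source": [ts_field, labels["message"]],
        "query": {
            "bool": {
                "filter": [
                    {"term": {labels["pod"]: pod}},
                    {"term": {labels["namespace"]: namespace}},
                    {
                        "range": {
                            ts_field: {
                                "gte": start.isoformat(),
                                "lte": end.isoformat(),
                            }
                        }
                    },
                ]
            }
        },
    }


if __name__ == "__main__":
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=1)
    print(build_pod_log_query(LABELS, "my-pod", "default", start, end))
```

If the field names in your cluster differ, changing the values in ``LABELS`` is the only adjustment this sketch needs, which mirrors the intent of the ``labels`` block in the toolset configuration.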