In this tutorial you will learn about Loki, which is a log aggregation system inspired by Prometheus.
Loki uses Promtail to fetch logs from all Pods running in your cluster. Then, logs are aggregated and compressed, and sent to the configured storage. Next, you can connect Loki data source to Grafana and view the logs.
Main reasons for using Loki:
Horizontally scalable.High availabilty.Multi-tenantmode available.Storesandindexesdata in a veryefficientway.Costeffective.Easyto operate.LogQLDSL (offers the same functionality asPromQLfromPrometheus, which is a plus if you're already familiar with it).
In the steps to follow, you will learn to:
- Install and configure
Lokifor yourDOKScluster. - Configure
GrafanawithLokidata source to query logs, usingLogQL. - Configure
Lokito usepersistentstorage for logs, viaDO Spaces.
After finishing all the steps, you will have a production ready Loki setup, that will help you troubleshoot any issues you might have with running applications. Just log in to Grafana, and make use of LogQL syntax to search, filter, and perform all kind of queries on application logs, to fetch useful information.
- Introduction
- Prerequisites
- Step 1 - Installing LOKI
- Step 2 - Configure Grafana with Loki
- Step 3 - Using LogQL
- Step 4 - Understanding Promtail
- Step 5 - Setting Persistent Storage for Loki
- Step 6 - Setting Loki Storage Retention
- Conclusion
To complete this tutorial, you will need:
- A DO Spaces bucket for
Lokistorage. Please follow the officialDigitalOceantutorial to create one. Make sure that it is set torestrict file listingfor security reasons. - Access keys for DO Spaces (S3-compatible object storage). Please follow the official
DigitalOceantutorial, to learn how to manage access keys. Keep theaccessandsecretkeys somewhere safe for later use. - A Git client, for cloning the
Starter Kitrepository. - Helm, for installing the
Lokistack chart. - Kubectl, for
Kubernetesinteraction. - A text editor with
YAMLlint support, for example: Visual Studio Code. - Emojivoto Sample App deployed in the cluster. Please follow the steps in its repository README.
In this step, you will learn how to deploy Loki to your DOKS cluster, using Helm.
First, clone the Starter Kit repository, and then change directory to your local copy:
git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
cd Kubernetes-Starter-Kit-DevelopersNext, add the Grafana Helm repository and list the available charts:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update grafana
helm search repo grafanaThe output looks similar to the following:
NAME CHART VERSION APP VERSION DESCRIPTION
grafana/grafana 6.20.5 8.3.4 The leading tool for querying and visualizing t...
grafana/enterprise-metrics 1.7.2 v1.6.1 Grafana Enterprise Metrics
grafana/fluent-bit 2.3.0 v2.1.0 Uses fluent-bit Loki go plugin for gathering lo...
grafana/loki-stack 2.6.4 v2.4.2 Loki: like Prometheus, but for logs.
...
Notes:
- The chart of interest is
grafana/loki-stack, which will install standaloneLokion the cluster. Please visit the loki-stack page for more details about this chart. - It's good practice in general, to use a specific version for the
Helmchart. This way, you canversionit usingGit, and target if for a specificrelease. In this tutorial, the Helm chart version2.6.4is picked forloki-stack, which maps to application version2.4.2.
For your convenience, there's a ready to use sample values file provided in the Starter Kit Git repository (loki-stack-values-v2.6.4.yaml). Please use your favorite text editor (preferably with YAML lint support), for inspection. You can use VS Code, for example:
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yamlNote:
The above values file, enables Loki and Promtail for you, so no other input is required. Prometheus and Grafana installation is disabled, because Set up Prometheus Stack took care of it already. Fluent Bit is not used, so it is disabled by default as well.
Next, install the stack using Helm. The following command installs version 2.6.4 of grafana/loki-stack in your cluster, using the Starter Kit repository values file (also creates the loki-stack namespace, if it doesn't exist):
HELM_CHART_VERSION="2.6.4"
helm install loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
--create-namespace \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"Finally, check Helm release status:
helm ls -n loki-stackThe output looks similar to (STATUS column should display 'deployed'):
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
loki loki-stack 1 2022-06-08 10:00:06.838665 +0300 EEST deployed loki-stack-2.6.4 v2.4.2
Next, inspect all the Kubernetes resources created for Loki:
kubectl get all -n loki-stackYou should have resources deployed for Loki itself (loki-0) and Promtail (loki-promtail). The output looks similar to:
NAME READY STATUS RESTARTS AGE
pod/loki-0 1/1 Running 0 44m
pod/loki-promtail-8dskn 1/1 Running 0 44m
pod/loki-promtail-mgb25 1/1 Running 0 44m
pod/loki-promtail-s7cp6 1/1 Running 0 44m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/loki ClusterIP 10.245.195.248 <none> 3100/TCP 44m
service/loki-headless ClusterIP None <none> 3100/TCP 44m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/loki-promtail 3 3 3 3 3 <none> 44m
NAME READY AGE
statefulset.apps/loki 1/1 44m
In the next step, you will configure Grafana to use the Loki datasource, and view application logs.
In this step, you will add the Loki data source to Grafana. First, you need to expose the Grafana web interface on your local machine (default credentials: admin/prom-operator):
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80Note:
You should NOT expose Grafana to public network (eg. create an ingress mapping or LB service) with default login/password.
Next, open a web browser on localhost:3000, and follow below steps:
- Click the
Configurationgear from the left panel. - Select
Data sources. - Click the
Add data sourceblue button. - Select
Lokifrom the list and addLokiurlhttp://loki.loki-stack:3100. - Save and test.
If everything goes well, a green label message will appear, saying Data source connected and labels found.
Now, you can access logs from the Explore tab of Grafana. Make sure to select Loki as the data source. Use the Help button for log search cheat sheet.
In the next step, you'll be introduced to LogQL, which is PromQL's brother, but for logs. Some basic features of LogQL will be presented as well.
In this step, you will learn how to use LogQL for querying application logs, and make use of the available features to ease your work.
Loki comes with its very own language for querying logs, called LogQL. LogQL can be considered a distributed grep with labels for filtering.
A basic LogQL query consists of two parts: the log stream selector and a filter expression. Due to Loki’s design, all LogQL queries are required to contain a log stream selector.
The log stream selector will reduce the number of log streams to a manageable volume. Depending on how many labels you use to filter down the log streams, it will affect the relative performance of the query’s execution. The filter expression is then used to do a distributed grep over the retrieved log streams.
Next, a practical example is presented, for querying the Ambassador Edge Stack logs.
First, you need to expose the Grafana web console on your local machine (default credentials: admin/prom-operator):
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80Next, point your web browser to localhost:3000, and navigate to the Explore tab from the left panel. Select Loki from the data source menu, and run this query:
{container="vote-bot", namespace="emojivoto"}The output looks similar to the following:
Perform another query, but this time filter the results to include only the Error message:
{container="web-svc",namespace="emojivoto"} |= "Error"The output looks similar to (notice how the Error word is being highlighted in the query results panel):
As you can see in the above examples, each query is composed of:
- A
log streamselector{container="web-svc", namespace="emojivoto"}, which targets theweb-svc containerfrom theambassador namespace. - A
filter- e.g.:|= "Error", which shows only the lines containing thewarningword.
More complex queries can be created using aggregation operators. For more details on the topic, and other advanced features, please visit the official LogQL page.
Another feature of Loki worth mentioning is Labels. Labels allows you to organize streams. In other words, labels add metadata to a log stream, so that the system can distinguish it later. Essentially, they are key-value pairs that can be anything you want, as long as they have a meaning for the data being tagged.
Loki indexes data based on labels, allowing more efficient storage. Below picture highlights this feature, in the Log labels panel:
In the next step, you will discover Promtail, which is the agent responsible with logs fetching, and transforming the data (labeling, adding new fields, dropping, etc).
In this step, you will learn what Promtail is, as well as how it works. It is deployed as a DaemonSet, and it's an important part of your Loki stack installation, responsible with fetching all Pods logs running in your Kubernetes cluster.
What Promtail is essentially doing, is to:
Discovertargets thatemitlogs.Attach labelstolog streams.Push log streamsto theLokiinstance.
Before Promtail can ship any data from log files to Loki, it needs to find out information about its environment. Specifically, this means discovering applications emitting log lines to files that need to be monitored.
Promtail borrows the same service discovery mechanism from Prometheus, although it currently only supports Static and Kubernetes service discovery. This limitation is due to the fact that Promtail is deployed as a daemon to every local machine and, as such, does not discover label from other machines. Kubernetes service discovery fetches required labels from the Kubernetes API server while Static usually covers all other use cases.
As with every monitoring agent, you need to have a way for it to be up all the time. The Loki stack Helm deployment already makes this possible via a DaemonSet, as seen below:
kubectl get ds -n loki-stackThe output looks similar to the following (notice the loki-promtail line):
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-prom-stack-prometheus-node-exporter 2 2 2 2 2 <none> 7d4h
loki-promtail 2 2 2 2 2 <none> 5h6m
This is great, but how does Promtail discover Kubernetes pods and assigns labels ?
The scrape_configs section from the Promtail main configuration will show you the details. You can use kubectl for inspection (notice that the application configuration is stored using a Kubernetes ConfigMap):
kubectl get cm loki-stack -n loki-stack -o yaml > loki-promtail-config.yamlNext, please open the loki-promtail-config.yaml file using a text editor of your choice (preferably with YAML support). For example you can use VS Code:
code loki-promtail-config.yamlThen, look for the scrape_configs section. The output should be similar to:
...
scrape_configs:
- job_name: kubernetes-pods-name
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
...Promtail knows how to scrape for logs by using scrape_configs. Each scrape configuration tells Promtail how to discover logs and extract labels. Next, each scrape configuration contains one or more entries (called jobs) which are executed for each discovered target. Then, each job may contain a pipeline comprised of multiple stages. The main purpose of each stage is to transform your application logs, but it can also drop (filter) unwanted logs data, if needed. Jobs can contain relabel_configs stanzas as well, used to transform labels.
Explanations for the above configuration:
job_name: Defines a newjob, and its associated name.pipeline_stages: Can be used toaddorupdatelabels,correctthe timestamp, orre-writelog lines entirely. It usesstagesto accomplish the previously mentioned tasks. The configuration snippet presented above, is using a docker stage, which can extract data based on the standardDockerformat. You can refer to the official documentation to read more about Stages and Pipelines.kubernetes_sd_config: TellsPromtailhow todiscover logscoming fromPodsviaKubernetes service discovery.relabel_configs: Defines alistof operations totransformthelabelsfromdiscoveryinto another form. The configuration snippet presented above, is renaming the__meta_kubernetes_namespacesource label provided by the Kubernetes service discovery mechanism, to a more human friendly form -namespace.
In most of the cases you may not want to fetch logs from all namespaces (and pods, implicitly). This is very useful to avoid high traffic inside your Kubernetes cluster caused by Promtail, as well as to reduce the amount of data ingested by Loki. What Promtail allows you to do in this case, is to drop logs from unwanted namespaces and keep the rest. Another advantage of this approach, is that the volume of data that needs to be indexed by Loki is reduced, meaning less storage used and number of objects, if using a DO Spaces bucket for example.
Promtail allows you to filter logs on a namespace basis, via the drop stage. You can use Helm to configure Promtail for namespace filtering.
First, open the 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yaml file provided in the Starter Kit repository, using a text editor of your choice (preferably with YAML lint support). For example you can use VS Code (please make sure to change directory where the Starter Kit repository was cloned first):
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yamlNext, please remove the comments surrounding the pipelineStages section. In the following example, you will configure Promtail to drop all logs coming from all namespaces prefixed with kube-, meaning: kube-node-lease, kube-public, kube-system. The output looks similar to:
promtail:
enabled: true
#
# Enable Promtail service monitoring
# serviceMonitor:
# enabled: true
#
# User defined pipeline stages
pipelineStages:
- docker: {}
- drop:
source: namespace
expression: "kube-.*"Explanations for the above configuration:
pipelineStages: TellsHelmto insert user defined pipeline stages in each job that it creates. By default, theLokistackHelmchart is configuringPromtailto fetchallthe logs coming from everynamespaceandpod(you can inspect the ConfigMap template for details).docker: TellsPromtailto use a Docker stage. Helps withDockerlogsformatting.drop: TellsPromtailto use a Drop stage. Then, you make use of thesourcefield, to drop logs based on anamespace. Finally,expressionis the regex selector for thesourcefield.
Finally, save the values file and apply changes using Helm upgrade:
HELM_CHART_VERSION="2.6.4"
helm upgrade loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"If the upgrade succeeded and no errors were reported, you can check via LogQL if logs are still pushed to Loki from the kube-system namespace after a while. Please wait a minute or so, and then run below queries. Make sure to adjust the time window in Grafana as well to match the last minute(s) interval (you will want to fetch the most recent data only).
Next (if not already), please create a port forward for the Grafana web console on your local machine (default credentials: admin/prom-operator):
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80Then, point your web browser to localhost:3000, and navigate to the Explore tab from the left panel. Select Loki from the data source menu, and run below queries:
{namespace="kube-system"}{namespace="kube-public"}{namespace="kube-node-leases"}The output window should not return any data for any of the above queries.
Hints:
- You can have as many drop stages as you want by using different regex expressions, or combine them together under a single stage if desired, via the regex
ORoperator (|). - You can also leverage the power of
ServiceMonitorsand enable Promtailmetricscollection, as learned in Step 2 - Configure Prometheus and Grafana from thePrometheustutorial:-
First, make sure to enable both the Loki ServiceMonitor and Promtail ServiceMonitor from
Prometheus Stack, followed by aHelm upgrade. -
Next, enable Promtail Service from
Loki Stack, followed by aHelm upgrade. -
Then, you can install the Loki&Promtail Grafana dashboard provided by the community, and inspect the
volumeof dataingestedbyLoki. After a while, you can get a visual feedback of what it means to filter unnecessary application logs, as seen in the picture below (beforefitlering - on theleftside, andafter- on therightside):
-
For more features and in depth explanations, please visit the Promtail official documentation page.
In the next step, you will learn how to set up persistent storage for Loki, using DO Spaces. By default, Helm deploys Loki with ephemeral storage, meaning all your indexed log data will be lost if it restarts, or if the DOKS cluster is recreated.
In this step, you will learn how to enable persistent storage for Loki. You're going to use the DO Spaces bucket created in the Prerequisites section of the tutorial.
By default, Helm deploys Loki with ephemeral storage, via the emptyDir volume. To preserve indexed log data across Pod restarts, Loki can be set up to use DO Spaces instead.
Why use DO Spaces? Because it scales very well and it's cheaper than PVs, which rely on Block Storage. On the other hand, you don't have to worry anymore about running out of disk space, as it happens when relying on the PV implementation. More than that, you don't have to worry about PV sizing and doing the extra math.
In terms of performance, S3 is not that good compared to Block Storage, but it doesn't matter very much in this case because Loki does a good job in general.
Next, you're going to tell Helm how to configure Loki persistent storage, via DO Spaces, as well as setting the correct schema.
First, change directory where the Starter Kit repository was cloned:
cd Kubernetes-Starter-Kit-DevelopersNext, open the loki-stack-values-v2.6.4.yaml file provided in the Starter Kit repository, using a text editor of your choice (preferably with YAML lint support). You can use VS Code, for example:
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yamlThen, please remove the comments surrounding the schema_config and storage_config keys. The final Loki storage setup configuration looks similar to (please replace the <> placeholders accordingly):
loki:
enabled: true
config:
schema_config:
configs:
- from: "2020-10-24"
store: boltdb-shipper
object_store: aws
schema: v11
index:
prefix: index_
period: 24h
storage_config:
boltdb_shipper:
active_index_directory: /data/loki/boltdb-shipper-active
cache_location: /data/loki/boltdb-shipper-cache
cache_ttl: 24h
shared_store: aws
aws:
bucketnames: <YOUR_DO_SPACES_BUCKET_NAME_HERE>
endpoint: <YOUR_DO_SPACES_BUCKET_ENDPOINT_HERE> # in the following format: <region>.digitaloceanspaces.com
region: <YOUR_DO_SPACES_BUCKET_REGION_HERE> # short region name (e.g.: fra1)
access_key_id: <YOUR_DO_SPACES_ACCESS_KEY_HERE>
secret_access_key: <YOURDO_SPACES_SECRET_KEY_HERE>
s3forcepathstyle: trueExplanation for the above configuration:
schema_config- defines a storagetypeand a schemaversionto facilitate migrations (schemas candifferbetweenLokiinstallations, soconsistencyis they key here). In this case,BoltDB Shipperis specified as the storage implementation and av11schema version. The24hperiod for the index is the default and preferred value, so please don't change it. Please visit Schema Configs for more details.storage_config- tellsLokiaboutstorage configurationdetails, like settingBoltDB Shipperparameters. It also informsLokiabout theawscompatibleS3storage parameters (bucketname,credentials,region, etc).
Apply settings, using Helm:
HELM_CHART_VERSION="2.6.4"
helm upgrade loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"Now, check if the main Loki application pod is up and running (it may take up to 1 minute or so to start, so please be patient):
kubectl get pods -n loki-stack -l app=lokiThe output looks similar to:
NAME READY STATUS RESTARTS AGE
loki-0 1/1 Running 0 13m
Hints:
-
The main application
Podis calledloki-0. You can check the configuration file, using the following command (please note that it containssensitiveinformation):kubectl exec -it loki-0 -n loki-stack -- /bin/cat /etc/loki/loki.yaml -
You can also check the logs while waiting. It's also good practice in general to check the application logs, and see if something goes bad or not.
kubectl logs -n loki-stack -l app=loki
If everything goes well, you should see the DO Spaces bucket containing the index and chunks folders (the chunks folder is called fake, which is a strange name - this is by design, when not running in multi-tenant mode).
For more advanced options, and fine tuning the storage for Loki, please visit the Loki Storage official documentation page.
Next, you will learn how to set storage retention policies for Loki.
In this step, you will learn how to set DO Spaces retention policies. Because you configured DO Spaces as the default storage backend for Loki, the same rules apply, as for every S3 compatible storage type.
Retention is a very important aspect when configuring storage backends in general, because storage is finite. While S3 storage is not expensive, and is somewhat infinte (it makes you think like that), it is good practice to have a retention policy set.
Although S3 is very scalable, and you don't have to worry about disk space issues, it's OK to have a retention policy in place. This way, really old data can be deleted if not needed.
S3 compatible storage has its own set of policies and rules for retention. In the S3 terminology, it is called object lifecycle. You can learn more about the DO Spaces bucket lifecycle options, on the official documentation page.
Hint:
S3CMD is a really good utility to have in order to inspect how many objects are present, as well as the size of the DO Spaces bucket used for Loki retention. S3CMD also helps you to see if the retention policies set so far are working or not. Please follow the DigitalOcean guide for installing and setting up s3cmd.
Setting the lifecycle for Loki storage bucket is achieved via the s3cmd utility.
Next, you're going to use the loki_do_spaces_lifecycle.xml configuration file provided in the Starter Kit Git repository, to configure retention for Loki bucket. The policy file contents look similar to:
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>Expire old fake data</ID>
<Prefix>fake/</Prefix>
<Status>Enabled</Status>
<Expiration>
<Days>10</Days>
</Expiration>
</Rule>
<Rule>
<ID>Expire old index data</ID>
<Prefix>index/</Prefix>
<Status>Enabled</Status>
<Expiration>
<Days>10</Days>
</Expiration>
</Rule>
</LifecycleConfiguration>The above lifecycle configuration, will automatically delete after 10 days, all the objects from the fake/ and index/ paths in the Loki storage bucket. A 10 days lifespan is chosen in this example, because it's usually enough for development purposes. For production or other critical systems, a period of 30 days and even more is recommended.
How to configure the Loki bucket lifecycle, using s3cmd:
-
First, change directory where the
Starter Kitrepository was cloned.cd Kubernetes-Starter-Kit-Developers -
Next, open and inspect the
04-setup-observability/assets/manifests/loki_do_spaces_lifecycle.xmlfile from theStarter Kitrepository, using a text editor of your choice (preferably withXMLlint support), and adjust according to your needs. -
Then, set the
lifecyclepolicy (please replace the<>placeholders accordingly):s3cmd setlifecycle 04-setup-observability/assets/manifests/loki_do_spaces_lifecycle.xml s3://<LOKI_STORAGE_BUCKET_NAME>
-
Finally, check that the
policywas set (please replace the<>placeholders accordingly):s3cmd getlifecycle s3://<LOKI_STORAGE_BUCKET_NAME>
After finishing the above steps, you can inspect the bucket size, and number of objects via the du subcommand of s3cmd (the name is borrowed from the Linux Disk Usage utility). Please replace the <> placeholders accordingly:
s3cmd du -H s3://<LOKI_DO_SPACES_BUCKET_NAME>The output looks similar to the following (notice that it prints the bucket size - 19M, and number of objects present - 2799):
19M 2799 objects s3://loki-storage-test/
Next, the DO Spaces backend implementation will clean the objects for you automatically, based on the expiration date. You can always go back and edit the policy if needed later on, by uploading a new one.
In this tutorial, you learned how to install Loki for logs monitoring in your DOKS cluster. Then, you configured Grafana to use Loki as a data source. You also learned about LogQL for querying logs, and how to setup persistent storage and retention for Loki.
For more details about Loki and other available features, please visit the Loki official documentation page.
Next, go to Exporting Kuberentes Events.





