Commit 2cacca0

Update prometheus metrics list (#1057)
1 parent 56e997c commit 2cacca0

2 files changed

Lines changed: 210 additions & 52 deletions

crowdsec-docs/docs/observability/prometheus.md

Lines changed: 105 additions & 26 deletions
@@ -17,87 +17,166 @@ The goal of this endpoint, besides the usual resources consumption monitoring, a
1717

1818
All the counters are "since CrowdSec start".
1919

20+
### Metrics levels
21+
22+
The [prometheus configuration](/configuration/crowdsec_configuration.md#prometheus) accepts a `level` parameter that controls the verbosity of exposed metrics. The possible values are:
23+
24+
- `none` : no metrics are registered
25+
- `aggregated` : only aggregated metrics are registered (per-machine and per-bouncer LAPI metrics, per-node parser metrics, decision/alert gauges, and LAPI response time are not available)
26+
- `full` (default) : all metrics are registered
27+
28+
Acquisition metrics are registered per datasource — they only appear when the corresponding datasource is configured.
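
The level rules above can be sketched as a small lookup. This is only an illustration, not CrowdSec's implementation: the `FULL_ONLY` set is an assumed mapping from the categories named in the `aggregated` bullet to metric names listed on this page, and `is_registered` is a hypothetical helper.

```python
# Assumption: these are the metrics covered by the categories excluded at
# the `aggregated` level (per-machine/per-bouncer LAPI, per-node parser,
# decision/alert gauges, LAPI response time). The mapping is a reading of
# the text, not taken from CrowdSec's source code.
FULL_ONLY = frozenset({
    "cs_lapi_machine_requests_total",
    "cs_lapi_bouncer_requests_total",
    "cs_node_hits_total",
    "cs_node_hits_ok_total",
    "cs_node_hits_ko_total",
    "cs_active_decisions",
    "cs_alerts",
    "cs_lapi_request_duration_seconds",
})


def is_registered(metric: str, level: str = "full") -> bool:
    """Return True if `metric` is expected to be exposed at `level`."""
    if level == "none":
        return False
    if level == "aggregated":
        return metric not in FULL_ONLY
    if level == "full":
        return True
    raise ValueError(f"unknown metrics level: {level!r}")
```

Whether the whitelist node metrics (`cs_node_wl_hits_total`, `cs_node_wl_hits_ok_total`) also count as per-node parser metrics is not stated here, so the sketch leaves them out of `FULL_ONLY`.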
29+
2030
### Metrics details
2131

2232
#### Scenarios
2333

24-
- `cs_buckets` : number of scenario that currently exist
25-
- `cs_bucket_created_total` : total number of instantiation of each scenario
26-
- `cs_bucket_overflowed_total` : total number of overflow of each scenario
27-
- `cs_bucket_underflowed_total` : total number of underflow of each scenario (bucket was created but expired because of lack of events)
28-
- `cs_bucket_poured_total` : total number of event poured to each scenario with source as complementary key
34+
- `cs_buckets` : number of buckets that currently exist (Gauge, labels: `name`)
35+
- `cs_bucket_instantiation_total` : total number of instantiation of each scenario (Counter, labels: `name`)
36+
- `cs_bucket_overflowed_total` : total number of overflow of each scenario (Counter, labels: `name`)
37+
- `cs_bucket_underflowed_total` : total number of underflow of each scenario — bucket was created but expired because of lack of events (Counter, labels: `name`)
38+
- `cs_bucket_canceled_total` : total number of canceled buckets (Counter, labels: `name`)
39+
- `cs_bucket_poured_total` : total number of events poured to each scenario (Counter, labels: `source`, `type`, `name`)
2940

3041
<details>
3142
<summary>example</summary>
3243

3344
```
3445
#2030 lines from `/var/log/nginx/access.log` were poured to `crowdsecurity/http-scan-uniques_404` scenario
35-
cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",source="/var/log/nginx/access.log"} 2030
46+
cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",source="/var/log/nginx/access.log",type="nginx"} 2030
3647
```
3748

3849
</details>
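
To read such a sample programmatically, a minimal Python sketch is shown below. It handles only the simple shape used in the example above (metric name, quoted labels, numeric value) and ignores optional timestamps and escape sequences; `parse_sample` is a hypothetical helper, not part of CrowdSec or of any Prometheus client library.

```python
import re

# Matches `name{labels} value`; the label block is optional.
LINE_RE = re.compile(
    r'^(?P<metric>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# Matches one `key="value"` pair inside the label block.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Split one exposition line into (metric, labels, value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("metric"), labels, float(match.group("value"))


sample = (
    'cs_bucket_poured_total{name="crowdsecurity/http-scan-uniques_404",'
    'source="/var/log/nginx/access.log",type="nginx"} 2030'
)
metric, labels, value = parse_sample(sample)
print(metric, labels["source"], value)
```

For anything beyond quick inspection, a full exposition-format parser is a safer choice than this regular expression.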
3950

4051
#### Parsers
4152

42-
- `cs_node_hits_total` : how many times an event from a specific source was processed by a parser node :
53+
- `cs_node_hits_total` : how many times an event from a specific source was processed by a parser node (Counter, labels: `source`, `type`, `name`, `stage`, `acquis_type`)
4354

4455
<details>
4556
<summary>example</summary>
4657

4758
```
4859
# 235 lines from `auth.log` were processed by the `crowdsecurity/dateparse-enrich` parser
49-
cs_node_hits_total{name="crowdsecurity/dateparse-enrich",source="/var/log/auth.log"} 235
60+
cs_node_hits_total{name="crowdsecurity/dateparse-enrich",source="/var/log/auth.log",type="syslog",stage="s01-parse",acquis_type="file"} 235
5061
```
5162

5263
</details>
5364

54-
- `cs_node_hits_ko_total` : how many times an event from a specific was unsuccessfully parsed by a specific parser
65+
- `cs_node_hits_ko_total` : how many times an event from a specific source was unsuccessfully parsed by a specific parser (Counter, labels: `source`, `type`, `name`, `stage`, `acquis_type`)
5566

5667
<details>
5768
<summary>example</summary>
5869

5970
```
6071
# 2112 lines from `error.log` failed to be parsed by `crowdsecurity/http-logs`
61-
cs_node_hits_ko_total{name="crowdsecurity/http-logs",source="/var/log/nginx/error.log"} 2112
72+
cs_node_hits_ko_total{name="crowdsecurity/http-logs",source="/var/log/nginx/error.log",type="nginx",stage="s01-parse",acquis_type="file"} 2112
6273
```
6374

6475
</details>
6576

66-
- `cs_node_hits_ok_total` : how many times an event from a specific source was successfully parsed by a specific parser
77+
- `cs_node_hits_ok_total` : how many times an event from a specific source was successfully parsed by a specific parser (Counter, labels: `source`, `type`, `name`, `stage`, `acquis_type`)
78+
79+
- `cs_node_wl_hits_total` : how many times an event was processed by a whitelist node (Counter, labels: `source`, `type`, `name`, `reason`, `stage`, `acquis_type`)
80+
- `cs_node_wl_hits_ok_total` : how many times an event was successfully whitelisted by a node (Counter, labels: `source`, `type`, `name`, `reason`, `stage`, `acquis_type`)
81+
82+
- `cs_parser_hits_total` : how many times an event from a source has hit the parser (Counter, labels: `source`, `type`)
83+
- `cs_parser_hits_ok_total` : how many times an event from a source was successfully parsed (Counter, labels: `source`, `type`, `acquis_type`)
84+
- `cs_parser_hits_ko_total` : how many times an event from a source was unsuccessfully parsed (Counter, labels: `source`, `type`, `acquis_type`)
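
One way to use the three `cs_parser_hits*` counters together is a per-source parse success ratio. The sketch below assumes the counter values have already been collected into plain dictionaries; the numbers are invented for illustration and `parse_success_ratio` is a hypothetical helper. Because `cs_parser_hits_ok_total` carries an extra `acquis_type` label, its values are summed per (`source`, `type`) before dividing.

```python
from collections import defaultdict


def parse_success_ratio(hits_total, hits_ok):
    """Return ok/total per (source, type).

    hits_total: {(source, type): count} from cs_parser_hits_total
    hits_ok: {(source, type, acquis_type): count} from cs_parser_hits_ok_total
    """
    ok_by_source = defaultdict(float)
    for (source, type_, _acquis_type), count in hits_ok.items():
        ok_by_source[(source, type_)] += count
    return {
        key: ok_by_source[key] / total
        for key, total in hits_total.items()
        if total > 0  # skip sources that have not produced any event yet
    }


# Illustrative values only, not taken from a real deployment.
hits_total = {("/var/log/nginx/access.log", "nginx"): 1000}
hits_ok = {("/var/log/nginx/access.log", "nginx", "file"): 900}
ratios = parse_success_ratio(hits_total, hits_ok)
print(ratios)
```

Since all counters are "since CrowdSec start", the ratio describes the whole uptime rather than a recent window.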
85+
86+
#### Processing
87+
88+
- `cs_parsing_time_seconds` : time spent parsing a line (Histogram, labels: `type`, `source`)
89+
- `cs_bucket_pour_seconds` : time spent pouring an event to buckets (Histogram, labels: `type`, `source`)
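
Prometheus histograms are exposed as `_bucket`, `_sum` and `_count` series, so an average duration can be derived from the last two. The sketch below assumes the `_sum` and `_count` values for `cs_parsing_time_seconds` have already been scraped; the figures are invented and `mean_duration` is a hypothetical helper.

```python
def mean_duration(sum_seconds: float, count: float) -> float:
    """Average observation of a histogram, from its _sum and _count series."""
    if count <= 0:
        raise ValueError("no observations recorded yet")
    return sum_seconds / count


# Illustrative values only: 300 parsed lines taking 1.5 seconds in total.
avg = mean_duration(sum_seconds=1.5, count=300)
print(f"average parsing time: {avg * 1000:.1f} ms")
```

The same calculation applies to `cs_bucket_pour_seconds`.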
6790

68-
- `cs_parser_hits_total` : how many times an event from a source has hit the parser
69-
- `cs_parser_hits_ok_total` : how many times an event from a source was successfully parsed
70-
- `cs_parser_hits_ko_total` : how many times an event from a source was unsuccessfully parsed
91+
#### Decisions & Alerts
92+
93+
- `cs_active_decisions` : number of active decisions (Gauge, labels: `reason`, `origin`, `action`)
94+
- `cs_alerts` : number of alerts, excluding CAPI (Gauge, labels: `reason`)
95+
96+
#### Application Security Engine
97+
98+
- `cs_appsec_reqs_total` : total events processed by the Application Security Engine (Counter, labels: `source`, `appsec_engine`)
99+
- `cs_appsec_block_total` : total events blocked by the Application Security Engine (Counter, labels: `source`, `appsec_engine`)
100+
- `cs_appsec_rule_hits` : count of triggered rules (Counter, labels: `rule_name`, `type`, `appsec_engine`, `source`)
101+
- `cs_appsec_parsing_time_seconds` : time spent processing a request by the Application Security Engine (Histogram, labels: `source`, `appsec_engine`)
102+
- `cs_appsec_inband_parsing_time_seconds` : time spent processing a request by the inband Application Security Engine (Histogram, labels: `source`, `appsec_engine`)
103+
- `cs_appsec_outband_parsing_time_seconds` : time spent processing a request by the outband Application Security Engine (Histogram, labels: `source`, `appsec_engine`)
71104

72105
#### Acquisition
73106

74-
Acquisition metrics are split by datasource. The following metrics are available :
107+
Acquisition metrics are split by datasource. They only appear when the corresponding datasource is configured. The following metrics are available :
75108

76109
##### Cloudwatch
77110

78-
- `cs_cloudwatch_openstreams_total` : number of opened stream within group (by group)
79-
- `cs_cloudwatch_stream_hits_total` : number of event read from stream (by group and by stream)
111+
- `cs_cloudwatch_openstreams_total` : number of opened streams within group (Gauge, labels: `group`)
112+
- `cs_cloudwatch_stream_hits_total` : number of events read from stream (Counter, labels: `group`, `stream`)
113+
114+
##### Docker
115+
116+
- `cs_dockersource_hits_total` : total lines that were read (Counter, labels: `source`)
80117

81118
##### Files
82119

83-
- `cs_filesource_hits_total` : Total lines that were read (by source file)
120+
- `cs_filesource_hits_total` : total lines that were read (Counter, labels: `source`)
121+
122+
##### HTTP
123+
124+
- `cs_httpsource_hits_total` : total lines that were read from HTTP source (Counter, labels: `path`, `src`)
84125

85126
##### Journald
86127

87-
- `cs_journalctlsource_hits_total` : Total lines that were read (by source filter)
128+
- `cs_journalctlsource_hits_total` : total lines that were read (Counter, labels: `source`)
129+
130+
##### Kafka
131+
132+
- `cs_kafkasource_hits_total` : total lines that were read from topic (Counter, labels: `topic`)
133+
134+
##### Kinesis
135+
136+
- `cs_kinesis_stream_hits_total` : number of events read per stream (Counter, labels: `stream`)
137+
- `cs_kinesis_shards_hits_total` : number of events read per shard (Counter, labels: `stream`, `shard`)
138+
139+
##### Kubernetes Audit
140+
141+
- `cs_k8sauditsource_hits_total` : total number of events received by k8s-audit source (Counter, labels: `source`)
142+
- `cs_k8sauditsource_requests_total` : total number of requests received (Counter, labels: `source`)
143+
144+
##### Loki
145+
146+
- `cs_lokisource_hits_total` : total lines that were read (Counter, labels: `source`)
147+
148+
##### S3
149+
150+
- `cs_s3_hits_total` : number of events read per bucket (Counter, labels: `bucket`)
151+
- `cs_s3_objects_total` : number of objects read per bucket (Counter, labels: `bucket`)
152+
- `cs_s3_sqs_messages_total` : number of SQS messages received per queue (Counter, labels: `queue`)
88153

89154
##### Syslog
90155

91-
- `cs_syslogsource_hits_total` : Total lines that were received (by the syslog server)
92-
- `cs_syslogsource_parsed_total` : Total lines that were successfully parsed by the syslog server
156+
- `cs_syslogsource_hits_total` : total lines that were received (Counter, labels: `source`)
157+
- `cs_syslogsource_parsed_total` : total lines that were successfully parsed by the syslog server (Counter, labels: `source`, `type`)
158+
159+
##### VictoriaLogs
160+
161+
- `cs_victorialogssource_hits_total` : total lines that were read (Counter, labels: `source`)
162+
163+
##### Windows EventLog
164+
165+
- `cs_winevtlogsource_hits_total` : total events that were read (Counter, labels: `source`)
93166

94167
#### Local API
95168

96-
- `cs_lapi_route_requests_total` : number of calls to each route per method
97-
- `cs_lapi_machine_requests_total` : number of calls to each route per method grouped by machines
98-
- `cs_lapi_bouncer_requests_total` : number of calls to each route per method grouped by bouncers
99-
- `cs_lapi_decisions_ko_total` : number of unsuccessfully responses when bouncers ask for an IP.
100-
- `cs_lapi_decisions_ok_total` : number of successfully responses when bouncers ask for an IP.
169+
- `cs_lapi_route_requests_total` : number of calls to each route per method (Counter, labels: `route`, `method`)
170+
- `cs_lapi_machine_requests_total` : number of calls to each route per method grouped by machines (Counter, labels: `machine`, `route`, `method`)
171+
- `cs_lapi_bouncer_requests_total` : number of calls to each route per method grouped by bouncers (Counter, labels: `bouncer`, `route`, `method`)
172+
- `cs_lapi_decisions_ko_total` : number of calls to /decisions that returned nil result (Counter, labels: `bouncer`)
173+
- `cs_lapi_decisions_ok_total` : number of calls to /decisions that returned non-nil result (Counter, labels: `bouncer`)
174+
- `cs_lapi_request_duration_seconds` : response time of LAPI (Histogram, labels: `endpoint`, `method`)
175+
176+
#### Cache
177+
178+
- `cs_cache_size` : entries per cache (Gauge, labels: `name`, `type`)
179+
- `cs_regexp_cache_size` : entries per regexp cache (Gauge, labels: `name`)
101180

102181
#### Info
103182
