Commit e60b48d

feat: Docker & cAdvisor. Fixing CPU usage on MacOS
1 parent 9684b20 commit e60b48d

42 files changed

Lines changed: 3014 additions & 252 deletions

.github/workflows/ci.yml

Lines changed: 10 additions & 1 deletion
@@ -310,10 +310,19 @@ jobs:
         run: make deps
 
       - name: Run Gosec Security Scanner
-        uses: securego/gosec@master
+        uses: securego/gosec@v2.21.4
         with:
           args: '-no-fail -fmt sarif -out gosec-results.sarif ./...'
 
+      - name: Fix SARIF relationships
+        if: always()
+        run: |
+          if [ -f gosec-results.sarif ]; then
+            # Remove invalid relationships arrays using jq
+            jq 'walk(if type == "object" and has("relationships") then .relationships |= map(select(type == "object")) else . end)' gosec-results.sarif > gosec-results-fixed.sarif
+            mv gosec-results-fixed.sarif gosec-results.sarif
+          fi
+
       - name: Upload SARIF file
         uses: github/codeql-action/upload-sarif@v4
         if: always()
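
The `Fix SARIF relationships` step above relies on jq's `walk` to visit every object in the report and keep only the object-typed entries of each `relationships` array. For readers less familiar with jq, the same transformation can be sketched in Go. This is an illustration only, not code from the commit: the function names and the sample document are hypothetical, and it assumes `relationships` is always an array when present.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cleanRelationships walks a decoded JSON value and, for every object that
// has a "relationships" array, keeps only the array entries that are objects.
// Children are processed first, mirroring jq's bottom-up walk.
func cleanRelationships(v any) any {
	switch node := v.(type) {
	case map[string]any:
		for key, child := range node {
			node[key] = cleanRelationships(child)
		}
		if rels, ok := node["relationships"].([]any); ok {
			kept := make([]any, 0, len(rels))
			for _, rel := range rels {
				if _, isObject := rel.(map[string]any); isObject {
					kept = append(kept, rel)
				}
			}
			node["relationships"] = kept
		}
		return node
	case []any:
		for i, child := range node {
			node[i] = cleanRelationships(child)
		}
		return node
	default:
		// Strings, numbers, booleans, and null pass through unchanged.
		return v
	}
}

// cleanSARIF decodes a raw JSON document, cleans it, and re-encodes it.
func cleanSARIF(raw []byte) ([]byte, error) {
	var doc any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	return json.Marshal(cleanRelationships(doc))
}

func main() {
	// Hypothetical fragment: one valid relationship object and one stray string.
	in := []byte(`{"rules":[{"id":"G101","relationships":[{"kinds":["superset"]},"bad"]}]}`)
	out, err := cleanSARIF(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// {"rules":[{"id":"G101","relationships":[{"kinds":["superset"]}]}]}
}
```

Decoding into `any` turns every JSON number into a `float64`, so a production version would probably want `json.Decoder.UseNumber` to avoid rewriting large integers.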

CHANGELOG.md

Lines changed: 40 additions & 1 deletion
@@ -7,7 +7,7 @@
 
 <h3>TelemetryFlow Agent (OTEL Agent)</h3>
 
-[![Version](https://img.shields.io/badge/Version-1.1.4-orange.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/Version-1.1.5-orange.svg)](CHANGELOG.md)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 [![Go Version](https://img.shields.io/badge/Go-1.24+-00ADD8?logo=go)](https://golang.org/)
 [![OTEL SDK](https://img.shields.io/badge/OpenTelemetry_SDK-1.39.0-blueviolet)](https://opentelemetry.io/)
@@ -24,6 +24,44 @@ All notable changes to TelemetryFlow Agent will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.1/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.1.5] - 2026-02-19
+
+### Added
+
+- **Docker Container Metrics Collector**: Native Docker Engine API collector replacing cAdvisor dependency
+  - Uses Docker SDK (`github.com/docker/docker`) with `ContainerStatsOneShot` for per-container metrics
+  - **CPU**: `container.cpu.usage_percent`, `container.cpu.usage_total`, `container.cpu.user`, `container.cpu.kernel`, `container.cpu.online_cpus`, `container.cpu.throttled_periods`, `container.cpu.throttled_time`
+  - **Memory**: `container.memory.usage`, `container.memory.working_set` (usage - inactive_file), `container.memory.limit`, `container.memory.max_usage`, `container.memory.rss`, `container.memory.cache`, `container.memory.usage_percent`
+  - **Network**: Per-interface `container.network.{rx,tx}_{bytes,packets,errors,dropped}`
+  - **Disk I/O**: `container.diskio.{read,write}_{bytes,ops}`
+  - **PIDs**: `container.pids.current`
+  - **State Summary**: `container.state.{running,stopped,paused,restarting,total}`
+  - Container filtering with regex include/exclude patterns
+  - Labels per metric: `container_id`, `container_name`, `image`, `status`
+  - CPU delta tracking for accurate percentage calculation
+- **cAdvisor Prometheus Scraper Collector**: Scrapes container metrics from cAdvisor's `/metrics` endpoint
+  - Parses Prometheus text format using `prometheus/common/expfmt`
+  - Collects `container_*` and `machine_*` metric families by default
+  - Supports all Prometheus types: counter, gauge, histogram, summary, untyped
+  - Optional `metric_names` allowlist for selective collection
+  - Custom labels injection from config
+  - Configurable endpoint, metrics path, timeout, and interval
+- **Tags and Labels Propagation**: Agent tags and custom labels now included in heartbeat and OTLP exports
+  - `tags` and `labels` fields added to heartbeat payload
+  - Tags/labels exported as OTEL resource attributes
+
+### Fixed
+
+- **CPU Usage on macOS**: Removed `omitempty` from float64 fields in `SystemInfoPayload` that caused valid 0.0 values (CPU idle, iowait, steal, etc.) to be dropped from JSON serialization, resulting in "NaN %" display in dashboard
+
+### Changed
+
+- **Alphabetical Ordering**: All collectors in `config.go`, `agent.go`, and `tfo-agent.yaml` are now sorted alphabetically (cAdvisor → Docker → eBPF → Kubernetes → Logs → Node Exporter → Process → System)
+
+### Dependencies
+
+- Added `github.com/docker/docker v27.5.1+incompatible` for Docker Engine API
+
 ## [1.1.4] - 2026-02-11
 
 ### Added
@@ -316,6 +354,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 | Version | Date | OTEL SDK | Description |
 | ------- | ---------- | -------- | --------------------------------------------------------------------------------------------------------- |
+| 1.1.5 | 2026-02-19 | v1.39.0 | Docker container collector, cAdvisor scraper, CPU fix macOS, tags/labels propagation |
 | 1.1.4 | 2026-02-11 | v1.39.0 | eBPF collector (28 metrics), Cilium Hubble integration, 6 BPF programs, kernel-level observability |
 | 1.1.3 | 2026-02-04 | v1.39.0 | Network retransmit metrics, container name/image detection, page faults, IOPS, system calls |
 | 1.1.2 | 2026-01-03 | v1.39.0 | OSS observability (SigNoz, Coroot, HyperDX, OpenObserve, Netdata), APM (Dynatrace, Instana, ManageEngine) |
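
The "CPU Usage on macOS" fix in the 1.1.5 changelog entry comes down to how `encoding/json` treats `omitempty`: a `float64` equal to `0` counts as empty and its key is left out of the output entirely. The sketch below reproduces that behaviour with two hypothetical structs. The struct and field names are invented for illustration and are not the real `SystemInfoPayload` definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmitEmpty mimics the pre-fix tags: zero-valued floats are dropped.
// Hypothetical type, not the agent's actual payload struct.
type withOmitEmpty struct {
	CPUIdle float64 `json:"cpu_idle,omitempty"`
	CPUUser float64 `json:"cpu_user,omitempty"`
}

// withoutOmitEmpty mimics the post-fix tags: zero is serialized as 0.
type withoutOmitEmpty struct {
	CPUIdle float64 `json:"cpu_idle"`
	CPUUser float64 `json:"cpu_user"`
}

// encode marshals v and panics on error to keep the example short.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A legitimate reading of 0.0 disappears when omitempty is set.
	fmt.Println(encode(withOmitEmpty{CPUIdle: 0, CPUUser: 12.5}))
	// {"cpu_user":12.5}

	// Without omitempty the key is always present.
	fmt.Println(encode(withoutOmitEmpty{CPUIdle: 0, CPUUser: 12.5}))
	// {"cpu_idle":0,"cpu_user":12.5}
}
```

A consumer that reads a missing key as `undefined` and then does arithmetic on it would plausibly end up with `NaN`, which matches the "NaN %" symptom the changelog describes.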

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 <h3>TelemetryFlow Agent (OTEL Agent)</h3>
 
-[![Version](https://img.shields.io/badge/Version-1.1.3-orange.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/Version-1.1.5-orange.svg)](CHANGELOG.md)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 [![Go Version](https://img.shields.io/badge/Go-1.24+-00ADD8?logo=go)](https://golang.org/)
 [![OTEL SDK](https://img.shields.io/badge/OpenTelemetry_SDK-1.39.0-blueviolet)](https://opentelemetry.io/)

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@
 FROM golang:1.24-alpine AS builder
 
 # Build arguments
-ARG VERSION=1.1.3
+ARG VERSION=1.1.5
 ARG GIT_COMMIT=unknown
 ARG GIT_BRANCH=unknown
 ARG BUILD_TIME=unknown
@@ -69,7 +69,7 @@ FROM alpine:3.21
 # =============================================================================
 LABEL org.opencontainers.image.title="TelemetryFlow Agent" \
       org.opencontainers.image.description="Enterprise telemetry collection agent for metrics, logs, and traces - Community Enterprise Observability Platform (CEOP)" \
-      org.opencontainers.image.version="1.1.3" \
+      org.opencontainers.image.version="1.1.5" \
       org.opencontainers.image.vendor="TelemetryFlow" \
       org.opencontainers.image.authors="DevOpsCorner Indonesia <support@devopscorner.id>" \
       org.opencontainers.image.url="https://telemetryflow.id" \

Makefile

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 # =============================================================================
 PRODUCT_NAME := TelemetryFlow Agent
 BINARY_NAME := tfo-agent
-VERSION ?= 1.1.2
+VERSION ?= 1.1.5
 OTEL_SDK_VERSION := 1.39.0
 GIT_COMMIT := $(shell git rev-parse --short HEAD 2>/dev/null || echo "unknown")
 GIT_BRANCH := $(shell git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "unknown")

README.md

Lines changed: 32 additions & 9 deletions
@@ -7,7 +7,7 @@
 
 <h3>TelemetryFlow Agent (OTEL Agent)</h3>
 
-[![Version](https://img.shields.io/badge/Version-1.1.4-orange.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/Version-1.1.5-orange.svg)](CHANGELOG.md)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 [![Go Version](https://img.shields.io/badge/Go-1.24+-00ADD8?logo=go)](https://golang.org/)
 [![OTEL SDK](https://img.shields.io/badge/OpenTelemetry_SDK-1.39.0-blueviolet)](https://opentelemetry.io/)
@@ -32,7 +32,7 @@ TFO-Agent is fully aligned with the TelemetryFlow ecosystem, sharing the same Op
 
 ```mermaid
 graph LR
-subgraph "TelemetryFlow Ecosystem v1.1.4"
+subgraph "TelemetryFlow Ecosystem v1.1.5"
 subgraph "Instrumentation"
 SDK[TFO-Go-SDK<br/>OTEL SDK v1.39.0]
 end
@@ -59,7 +59,7 @@ graph LR
 
 | Component | Version | OTEL Base | Description |
 | ----------------- | ------- | ------------------ | --------------------------- |
-| **TFO-Agent** | v1.1.4 | SDK v1.39.0 | Telemetry collection agent |
+| **TFO-Agent** | v1.1.5 | SDK v1.39.0 | Telemetry collection agent |
 | **TFO-Go-SDK** | v1.1.3 | SDK v1.39.0 | Go instrumentation SDK |
 | **TFO-Collector** | v1.1.3 | Collector v0.142.0 | Central telemetry collector |
 
@@ -81,6 +81,8 @@ graph LR
 ### System Monitoring
 
 - **System Metrics Collection**: CPU, memory, disk, and network metrics
+- **Docker Container Monitoring**: Per-container CPU, memory, network, disk I/O, and PID metrics via Docker Engine API
+- **cAdvisor Metrics Scraping**: Prometheus endpoint scraper for cAdvisor container metrics
 - **Process Monitoring**: Track running processes
 - **Resource Detection**: Auto-detect host, OS, and container info
 
@@ -139,11 +141,11 @@ docker-compose down
 ```bash
 # Build image
 docker build \
-  --build-arg VERSION=1.1.4 \
+  --build-arg VERSION=1.1.5 \
   --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) \
   --build-arg GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD) \
   --build-arg BUILD_TIME=$(date -u '+%Y-%m-%dT%H:%M:%SZ') \
-  -t telemetryflow/telemetryflow-agent:1.1.4 .
+  -t telemetryflow/telemetryflow-agent:1.1.5 .
 
 # Run container
 docker run -d --name tfo-agent \
@@ -153,7 +155,7 @@ docker run -d --name tfo-agent \
   -p 13133:13133 \
   -v /path/to/config.yaml:/etc/tfo-agent/tfo-agent.yaml:ro \
   -v /var/lib/tfo-agent:/var/lib/tfo-agent \
-  telemetryflow/telemetryflow-agent:1.1.4
+  telemetryflow/telemetryflow-agent:1.1.5
 ```
 
 ### OTEL Collector Ports
@@ -198,7 +200,7 @@ POST http://localhost:4318/v1/logs
 Create configuration file at `/etc/tfo-agent/tfo-agent.yaml`:
 
 ```yaml
-# TelemetryFlow Platform Configuration (v1.1.4+)
+# TelemetryFlow Platform Configuration (v1.1.5+)
 telemetryflow:
   api_key_id: "${TELEMETRYFLOW_API_KEY_ID}"
   api_key_secret: "${TELEMETRYFLOW_API_KEY_SECRET}"
@@ -287,10 +289,12 @@ tfo-agent/
 │ ├── agent/ # Core agent lifecycle
 │ ├── buffer/ # Disk-backed retry buffer
 │ ├── collector/ # Metric collectors
-│ │ ├── system/ # System metrics collector
+│ │ ├── cadvisor/ # cAdvisor Prometheus scraper collector
+│ │ ├── docker/ # Docker container metrics collector
+│ │ ├── ebpf/ # eBPF kernel-level metrics collector
 │ │ ├── kubernetes/ # Kubernetes metrics collector
 │ │ ├── nodeexporter/ # Node Exporter metrics collector
-│ │ └── ebpf/ # eBPF kernel-level metrics collector
+│ │ └── system/ # System metrics collector
 │ ├── config/ # Configuration management
 │ ├── exporter/ # OTLP data exporters
 │ └── version/ # Version and banner info
@@ -351,6 +355,25 @@ p.Start()
 | `system.network.bytes_sent` | counter | Total bytes sent |
 | `system.network.bytes_recv` | counter | Total bytes received |
 
+### Docker Container Metrics
+
+The Docker collector provides 32 per-container metrics via Docker Engine API:
+
+- **CPU**: `container.cpu.{usage_percent,usage_total,user,kernel,online_cpus,throttled_periods,throttled_time}`
+- **Memory**: `container.memory.{usage,working_set,limit,max_usage,rss,cache,usage_percent}`
+- **Network**: `container.network.{rx,tx}_{bytes,packets,errors,dropped}` (per-interface)
+- **Disk I/O**: `container.diskio.{read,write}_{bytes,ops}`
+- **PIDs**: `container.pids.current`
+- **State**: `container.state.{running,stopped,paused,restarting,total}`
+
+### cAdvisor Metrics
+
+The cAdvisor collector scrapes Prometheus metrics from a running cAdvisor instance:
+
+- Collects `container_*` and `machine_*` metric families
+- Supports counter, gauge, histogram, summary, and untyped metric types
+- Optional metric name allowlist for selective collection
+
 ### eBPF Metrics (Linux-only)
 
 The eBPF collector provides 28 kernel-level metrics across 7 categories:
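
Two of the Docker metrics added to the README in this file are derived rather than read directly: `container.cpu.usage_percent`, which the changelog ties to CPU delta tracking, and `container.memory.working_set`, which the changelog defines as usage minus `inactive_file`. The sketch below shows one common way to compute both from two consecutive samples, following the formula the Docker CLI uses for its CPU column. It is an assumption that the collector works this way. The `cpuSample` type and both function names are hypothetical.

```go
package main

import "fmt"

// cpuSample is a hypothetical snapshot of the cumulative CPU counters that a
// single Docker stats reading exposes for one container.
type cpuSample struct {
	TotalUsage  uint64 // cumulative container CPU time, in nanoseconds
	SystemUsage uint64 // cumulative host CPU time across all cores, in nanoseconds
	OnlineCPUs  uint32 // number of CPUs available to the container
}

// cpuPercent derives a usage percentage from the delta between two samples.
// 100 means one full core. It returns 0 when the counters did not advance or
// went backwards (for example after a container restart).
func cpuPercent(prev, cur cpuSample) float64 {
	if cur.TotalUsage < prev.TotalUsage || cur.SystemUsage <= prev.SystemUsage {
		return 0
	}
	cpuDelta := float64(cur.TotalUsage - prev.TotalUsage)
	systemDelta := float64(cur.SystemUsage - prev.SystemUsage)
	return cpuDelta / systemDelta * float64(cur.OnlineCPUs) * 100
}

// workingSet subtracts the inactive file cache from total usage, clamping at
// zero so an inconsistent reading cannot underflow the unsigned result.
func workingSet(usage, inactiveFile uint64) uint64 {
	if inactiveFile > usage {
		return 0
	}
	return usage - inactiveFile
}

func main() {
	// One second apart on a 4-CPU host: the host counter advances by 4e9 ns,
	// the container counter by 0.5e9 ns, which is half of one core.
	prev := cpuSample{TotalUsage: 1_000_000_000, SystemUsage: 40_000_000_000, OnlineCPUs: 4}
	cur := cpuSample{TotalUsage: 1_500_000_000, SystemUsage: 44_000_000_000, OnlineCPUs: 4}
	fmt.Printf("%.1f\n", cpuPercent(prev, cur)) // 50.0

	fmt.Println(workingSet(800, 300)) // 500
}
```

Keeping the previous sample per container ID is what makes the percentage meaningful: a single set of cumulative counters says how much CPU time has been used in total, not how busy the container is right now.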

SECURITY.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 <h3>TelemetryFlow Agent (OTEL Agent)</h3>
 
-[![Version](https://img.shields.io/badge/Version-1.1.3-orange.svg)](CHANGELOG.md)
+[![Version](https://img.shields.io/badge/Version-1.1.5-orange.svg)](CHANGELOG.md)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 [![Go Version](https://img.shields.io/badge/Go-1.24+-00ADD8?logo=go)](https://golang.org/)
 [![OTEL SDK](https://img.shields.io/badge/OpenTelemetry_SDK-1.39.0-blueviolet)](https://opentelemetry.io/)

configs/tfo-agent.default.yaml

Lines changed: 3 additions & 3 deletions
@@ -42,9 +42,9 @@ agent:
   # Agent tags (for grouping and filtering in TFO Platform)
   tags:
     environment: "${TELEMETRYFLOW_ENVIRONMENT:-production}"
-    region: "${TELEMETRYFLOW_REGION:-us-east-1}"
-    cluster: "${TELEMETRYFLOW_CLUSTER:-main}"
-    datacenter: "${TELEMETRYFLOW_DATACENTER:-dc1}"
+    region: "${TELEMETRYFLOW_REGION:-ap-southeast-3}"
+    cluster: "${TELEMETRYFLOW_CLUSTER:-telemetryflow}"
+    datacenter: "${TELEMETRYFLOW_DATACENTER:-dc01}"
 
 # ─── Heartbeat Configuration ──────────────────────────────────────────────────
 heartbeat:
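
The tag values changed in this file use the shell-style `${NAME:-default}` form, so the new defaults (`ap-southeast-3`, `telemetryflow`, `dc01`) only apply when the corresponding environment variable is not provided. The sketch below shows how such placeholders could be expanded, assuming the loader follows the usual shell semantics of falling back when the variable is unset or empty. The `expand` function and its regular expression are hypothetical and are not taken from the agent's config package.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// placeholder matches ${NAME} and ${NAME:-default}.
// Group 1 is the variable name, group 2 the optional default.
var placeholder = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}`)

// expand replaces every placeholder in s using lookup. When the variable is
// unset or empty, the default is used (an empty string if none was given).
func expand(s string, lookup func(string) (string, bool)) string {
	return placeholder.ReplaceAllStringFunc(s, func(match string) string {
		groups := placeholder.FindStringSubmatch(match)
		if val, ok := lookup(groups[1]); ok && val != "" {
			return val
		}
		return groups[2]
	})
}

func main() {
	// Set one variable and clear the other so the output is deterministic.
	os.Setenv("TELEMETRYFLOW_REGION", "eu-west-1")
	os.Unsetenv("TELEMETRYFLOW_DATACENTER")

	fmt.Println(expand(`region: "${TELEMETRYFLOW_REGION:-ap-southeast-3}"`, os.LookupEnv))
	// region: "eu-west-1"
	fmt.Println(expand(`datacenter: "${TELEMETRYFLOW_DATACENTER:-dc01}"`, os.LookupEnv))
	// datacenter: "dc01"
}
```

Passing the lookup function as a parameter keeps the expansion easy to test without touching the real process environment.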
