
Commit c0aa57d

fix: Update K8S metrics & Logs synchronize
1 parent 7fc0876 commit c0aa57d

3 files changed

Lines changed: 392 additions & 40 deletions


CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -153,6 +153,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
153153
- `services_test.go` — Service + Endpoint collection with describe-level fields
154154
- **Container Build Script** (`run-container.sh`): New unified container build/run script (221 lines) replacing the previous `run-build-container.sh`
155155

156+
### Added (2026-03-28)
157+
158+
- **Cloud Instance Metadata (IMDS) Collection** (`internal/collector/system/host.go`): `detectCloudMetadata()` now queries each cloud provider's Instance Metadata Service to populate `instanceType`, `instanceID`, `region`, and `zone` — previously only the provider name was detected, leaving all other fields empty (`unknown` in the UI)
159+
- **AWS EC2**: IMDSv2 token-based auth with IMDSv1 fallback; queries `169.254.169.254/latest/meta-data/` for instance-id, instance-type, placement/availability-zone; derives region from zone
160+
- **GCP Compute Engine**: Queries `metadata.google.internal/computeMetadata/v1/instance/` with `Metadata-Flavor: Google` header; parses fully-qualified machine-type and zone paths (e.g. `projects/123/machineTypes/e2-medium` → `e2-medium`)
161+
- **Azure VM**: Queries `169.254.169.254/metadata/instance?api-version=2021-02-01` with `Metadata: true` header; extracts vmId, vmSize, location, zone from JSON response
162+
- **Alibaba Cloud ECS**: Queries unique IMDS IP `100.100.100.200/latest/meta-data/` for instance-id, instance-type, region-id, zone-id; env var fallback via `ALIBABA_CLOUD_REGION_ID` / `ALICLOUD_REGION`
163+
- **Huawei Cloud ECS**: Queries OpenStack-compatible `169.254.169.254/openstack/latest/meta_data.json`; extracts uuid, `meta.metering.instance_type`, availability_zone; derives region from zone; env var fallback via `HUAWEICLOUD_REGION`
164+
- **DigitalOcean Droplet**: Queries `169.254.169.254/metadata/v1/` for id, size (slug e.g. `s-2vcpu-4gb`), region; env var detection via `DIGITALOCEAN_TOKEN` / `DO_REGION`
165+
- All IMDS queries use a dedicated 2-second timeout HTTP client to avoid blocking on non-cloud nodes
166+
- Cloud detection via DMI filesystem markers (`/sys/class/dmi/id/product_name`, `sys_vendor`) with `TELEMETRYFLOW_HOST_ROOT` container mount prefix support
167+
- New helper functions: `imdsGet()`, `fetchAWSIMDS()`, `fetchGCPIMDS()`, `fetchAzureIMDS()`, `fetchAlibabaIMDS()`, `fetchHuaweiIMDS()`, `fetchDigitalOceanIMDS()`
168+
156169
### Fixed
157170

158171
- **Default endpoint config** (`configs/tfo-agent.yaml`): Corrected default `TELEMETRYFLOW_ENDPOINT` from `http://localhost:3000/api/v2/monitoring` to `http://localhost:3000/api/v2` — aligns with all Kubernetes Helm templates, Docker Compose, and platform config files which consistently use `/api/v2` as base URL. Agent API paths already include `/monitoring/` prefix, so the previous default caused double `/monitoring` when using the built-in fallback
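
The doubled prefix described in this entry is easy to reproduce in isolation. The sketch below assumes a hypothetical `joinEndpoint` helper and a hypothetical `/monitoring/heartbeat` API path; the agent's real path-building code is not part of this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// joinEndpoint is a hypothetical helper: it concatenates a base URL and an
// API path with exactly one slash between them.
func joinEndpoint(base, path string) string {
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(path, "/")
}

func main() {
	// Hypothetical agent API path that already carries the /monitoring/ prefix.
	const apiPath = "/monitoring/heartbeat"

	oldDefault := "http://localhost:3000/api/v2/monitoring"
	newDefault := "http://localhost:3000/api/v2"

	fmt.Println(joinEndpoint(oldDefault, apiPath)) // prefix appears twice
	fmt.Println(joinEndpoint(newDefault, apiPath)) // prefix appears once
}
```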

docs/SYSTEM-INFO.md

Lines changed: 60 additions & 14 deletions
@@ -1,6 +1,6 @@
11
# TelemetryFlow Agent - System Information Capabilities
22

3-
[![Version](https://img.shields.io/badge/Version-1.1.4-orange.svg)](../CHANGELOG.md)
3+
[![Version](https://img.shields.io/badge/Version-1.1.9-orange.svg)](../CHANGELOG.md)
44

55
This document describes the comprehensive system information collected by the TelemetryFlow Agent during heartbeat and telemetry operations.
66

@@ -337,21 +337,67 @@ type NetworkInterfaceInfo struct {
337337

338338
## Cloud Metadata Detection
339339

340-
| Field | Type | Description |
341-
| --------------------- | ------ | -------------------------------- |
342-
| `cloud_provider` | string | Cloud provider (aws, gcp, azure) |
343-
| `cloud_instance_id` | string | Instance ID |
344-
| `cloud_instance_type` | string | Instance type (t3.micro) |
345-
| `cloud_region` | string | Cloud region |
346-
| `cloud_zone` | string | Availability zone |
340+
| Field | Type | Description | Example |
341+
| --------------------- | ------ | ----------------------------- | --------------------------------------------------------------------------------------------- |
342+
| `cloud_provider` | string | Cloud provider identifier | `aws`, `gcp`, `azure`, `alibaba`, `huawei`, `digitalocean` |
343+
| `cloud_instance_id` | string | Instance/VM unique identifier | `i-0abc123def456`, `550e8400...` |
344+
| `cloud_instance_type` | string | Instance type / machine size | `t3.medium`, `e2-standard-4`, `Standard_D2s_v3`, `ecs.g6.large`, `s3.medium.2`, `s-2vcpu-4gb` |
345+
| `cloud_region` | string | Cloud region | `us-east-1`, `us-central1`, `eastus`, `cn-hangzhou`, `ap-southeast-1`, `nyc1` |
346+
| `cloud_zone` | string | Availability zone | `us-east-1a`, `us-central1-a`, `eastus-1`, `cn-hangzhou-b` |
347347

348-
**Detection Methods:**
348+
### Detection & IMDS Query Flow
349+
350+
```mermaid
351+
flowchart TD
352+
START[detectCloudMetadata] --> RANCHER{Rancher/k3s/RKE?}
353+
RANCHER -->|CATTLE_* env or /var/lib/rancher| RET_RANCHER[provider=rancher/k3s]
354+
RANCHER -->|No| AWS{AWS?}
355+
AWS -->|/sys/hypervisor/uuid=ec2 or AWS_REGION| IMDS_AWS[fetchAWSIMDS<br/>169.254.169.254<br/>IMDSv2 + v1 fallback]
356+
AWS -->|No| GCP{GCP?}
357+
GCP -->|DMI=google or GOOGLE_CLOUD_PROJECT| IMDS_GCP[fetchGCPIMDS<br/>metadata.google.internal]
358+
GCP -->|No| AZURE{Azure?}
359+
AZURE -->|DMI sys_vendor=microsoft| IMDS_AZURE[fetchAzureIMDS<br/>169.254.169.254/metadata]
360+
AZURE -->|No| ALIBABA{Alibaba?}
361+
ALIBABA -->|DMI=alibaba or ALICLOUD_REGION| IMDS_ALI[fetchAlibabaIMDS<br/>100.100.100.200]
362+
ALIBABA -->|No| HUAWEI{Huawei?}
363+
HUAWEI -->|DMI sys_vendor=huawei or HUAWEICLOUD_REGION| IMDS_HW[fetchHuaweiIMDS<br/>169.254.169.254/openstack]
364+
HUAWEI -->|No| DO{DigitalOcean?}
365+
DO -->|DMI sys_vendor=digitalocean or DO env| IMDS_DO[fetchDigitalOceanIMDS<br/>169.254.169.254/metadata/v1]
366+
DO -->|No| UNKNOWN[provider=empty]
367+
368+
IMDS_AWS --> RESULT[Return provider, instanceID,<br/>instanceType, region, zone]
369+
IMDS_GCP --> RESULT
370+
IMDS_AZURE --> RESULT
371+
IMDS_ALI --> RESULT
372+
IMDS_HW --> RESULT
373+
IMDS_DO --> RESULT
374+
```
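
The flowchart is a first-match-wins chain, which could be modelled as an ordered slice of checks. The `detector` type and `firstMatch` function below are hypothetical and the checks are stubbed; the sketch shows only the ordering logic, not how `detectCloudMetadata()` is actually written.

```go
package main

import "fmt"

// detector pairs a provider name with a cheap local check (hypothetical type).
type detector struct {
	provider string
	matches  func() bool
}

// firstMatch walks the detectors in order and returns the first provider
// whose check succeeds, or "" when none match.
func firstMatch(chain []detector) string {
	for _, d := range chain {
		if d.matches() {
			return d.provider
		}
	}
	return ""
}

func main() {
	no := func() bool { return false }
	yes := func() bool { return true }

	// Order mirrors the flowchart; the checks are stubbed for the demo.
	chain := []detector{
		{"rancher", no}, {"aws", no}, {"gcp", yes},
		{"azure", no}, {"alibaba", yes}, {"huawei", no}, {"digitalocean", no},
	}
	fmt.Println(firstMatch(chain)) // gcp: earlier entries win over later ones
}
```

Because evaluation stops at the first hit, a node that matches more than one marker is reported under whichever provider comes first in the chain.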
349375

350-
| Provider | Detection Method |
351-
| -------- | -------------------------------------------- |
352-
| AWS | `/sys/hypervisor/uuid`, `AWS_REGION` env |
353-
| GCP | DMI product name, `GOOGLE_CLOUD_PROJECT` env |
354-
| Azure | DMI sys_vendor |
376+
### Provider Detection Methods
377+
378+
| Provider | DMI / Filesystem Detection | Env Var Fallback | IMDS Endpoint |
379+
| ---------------- | ------------------------------------------------------ | -------------------------------------------- | ---------------------------------------------------------- |
380+
| **AWS** | `/sys/hypervisor/uuid` starts with `ec2` | `AWS_REGION` | `169.254.169.254/latest/meta-data/` (IMDSv2 token + v1) |
381+
| **GCP** | `/sys/class/dmi/id/product_name` contains `google` | `GOOGLE_CLOUD_PROJECT` | `metadata.google.internal/computeMetadata/v1/instance/` |
382+
| **Azure** | `/sys/class/dmi/id/sys_vendor` contains `microsoft` | None | `169.254.169.254/metadata/instance?api-version=2021-02-01` |
383+
| **Alibaba** | `/sys/class/dmi/id/product_name` contains `alibaba` | `ALIBABA_CLOUD_REGION_ID`, `ALICLOUD_REGION` | `100.100.100.200/latest/meta-data/` |
384+
| **Huawei** | `/sys/class/dmi/id/sys_vendor` contains `huawei` | `HUAWEICLOUD_REGION` | `169.254.169.254/openstack/latest/meta_data.json` |
385+
| **DigitalOcean** | `/sys/class/dmi/id/sys_vendor` contains `digitalocean` | `DIGITALOCEAN_TOKEN`, `DO_REGION` | `169.254.169.254/metadata/v1/` |
386+
| **Rancher** | `CATTLE_*` env or `/var/lib/rancher/` | None | None (no IMDS) |
387+
| **k3s** | `/var/lib/rancher/k3s` exists | None | None (no IMDS) |
388+
389+
### IMDS Response Parsing Details
390+
391+
| Provider | Instance Type Source | Region Derivation | Example Value |
392+
| ---------------- | --------------------------------------- | --------------------------------------------------------- | ----------------- |
393+
| **AWS** | `/meta-data/instance-type` | Zone minus trailing letter (`us-east-2a` → `us-east-2`) | `t3.medium` |
394+
| **GCP** | `/instance/machine-type` (last segment) | Zone minus last `-X` (`us-central1-a` → `us-central1`) | `e2-standard-4` |
395+
| **Azure** | JSON `.compute.vmSize` | JSON `.compute.location` | `Standard_D2s_v3` |
396+
| **Alibaba** | `/meta-data/instance/instance-type` | `/meta-data/region-id` | `ecs.g6.large` |
397+
| **Huawei** | JSON `.meta.metering.instance_type` | Zone minus trailing letter (`cn-north-4a` → `cn-north-4`) | `s3.medium.2` |
398+
| **DigitalOcean** | `/metadata/v1/size` | `/metadata/v1/region` | `s-2vcpu-4gb` |
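
The string derivations in this table are small enough to sketch directly. The three function names are hypothetical; the inputs are the examples given in the table and in the changelog entry.

```go
package main

import (
	"fmt"
	"strings"
)

// lastSegment returns the text after the final "/" (the whole string if none).
func lastSegment(p string) string {
	return p[strings.LastIndex(p, "/")+1:]
}

// trimZoneLetter drops one trailing lowercase letter, as in AWS-style zones.
func trimZoneLetter(zone string) string {
	if n := len(zone); n > 0 && zone[n-1] >= 'a' && zone[n-1] <= 'z' {
		return zone[:n-1]
	}
	return zone
}

// trimZoneSuffix drops the final "-X" component, as in GCP-style zones.
func trimZoneSuffix(zone string) string {
	if i := strings.LastIndex(zone, "-"); i > 0 {
		return zone[:i]
	}
	return zone
}

func main() {
	fmt.Println(lastSegment("projects/123/machineTypes/e2-medium")) // e2-medium
	fmt.Println(trimZoneLetter("us-east-2a"))                       // us-east-2
	fmt.Println(trimZoneSuffix("us-central1-a"))                    // us-central1
	fmt.Println(trimZoneLetter("cn-north-4a"))                      // cn-north-4
}
```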
399+
400+
> **Note:** All IMDS queries use a dedicated HTTP client with a 2-second timeout. Non-cloud nodes return immediately without blocking. DMI filesystem paths are checked with `TELEMETRYFLOW_HOST_ROOT` prefix for containerized deployments.
355401
356402
---
357403
