metric_monitor/REMOTE_WRITE_WITH_THANOS.md: 15 additions & 11 deletions
@@ -45,7 +45,11 @@ As shown in the new architecture, the monitoring system consists of the followin
 ### Step 1: Set up TRON and Prometheus services
 Run the below command to start a java-tron FullNode, node exporter and Prometheus services:
 ```sh
-docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d # Start all
+
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d tron-node # Start tron-node only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d node-exporter # Start node-exporter only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d prometheus # Start prometheus only
 ```
 
 You can verify the Prometheus service status and monitor targets by accessing `http://[host_IP]:9090/` in your browser. Alternatively, use `docker logs -f prometheus` to view the Prometheus service logs.
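The browser check described above can also be scripted. The following is a minimal sketch, assuming the Prometheus HTTP API endpoint `/api/v1/targets` is reachable at the same address as the web UI. The function names `fetch_targets` and `unhealthy_targets`, the sample payload, and the scrape URLs in it are hypothetical and only illustrate the shape of the response.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url):
    """Fetch the target list from the Prometheus HTTP API (requires a running server)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def unhealthy_targets(payload):
    """Return the scrape URLs of active targets whose health is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [t.get("scrapeUrl", "?") for t in active if t.get("health") != "up"]


# Hypothetical payload shaped like a /api/v1/targets response; the hosts and
# ports are placeholders, not values taken from the compose file.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://tron-node:9527/metrics", "health": "up"},
            {"scrapeUrl": "http://node-exporter:9100/metrics", "health": "down"},
        ]
    },
}

print(unhealthy_targets(sample))
```

Against a live deployment, `unhealthy_targets(fetch_targets("http://[host_IP]:9090"))` would return an empty list when every target is being scraped successfully.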
-- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 1-second intervals to enable real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request triggered.
+- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 3-second intervals to enable near-real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request is triggered.
 - The `targets` field specifies the java-tron services or other monitoring targets via their IP addresses and ports. Prometheus actively scrapes metrics from these defined endpoints.
 - The `labels` section contains key-value pairs that uniquely identify each target within Prometheus. These labels enable powerful filtering capabilities in Grafana dashboards - for example, you can filter metrics using expressions like `{group="group-tron"}`.
@@ -119,7 +123,7 @@ remote_write:
 ##### 2. Storage configurations
 - The volumes command `../prometheus_data:/prometheus` mounts a local directory used by Prometheus to store metrics data.
 - Even when using Prometheus with remote-write, metrics data is still temporarily stored locally.
-- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron(v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 1 second and TSDB compression, **a single java-tron FullNode service requires about 2GB of Prometheus storage with 7 days of retention**.
+- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron (v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 3 seconds and TSDB compression, **a single java-tron FullNode service requires about 700MB of Prometheus storage with 7 days of retention**.
 - The `--storage.tsdb.max-block-duration=30m` flag defines the maximum duration for generating TSDB blocks locally. With this setting, Prometheus will create new TSDB blocks at intervals no longer than 30 minutes, ensuring regular data persistence and efficient storage management.
 - Other storage flags can be found in the [official documentation](https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects). For a quick start, you could use the default values.
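As a rough sanity check on the storage figures in this hunk, the raw scrape volume can be computed from the quoted 9KB payload, the scrape interval, and the retention window. This is a back-of-the-envelope sketch, assuming decimal units (1 MB = 1000 KB) and that every scrape returns the full 9KB. The compression ratio it prints is only what the quoted figures imply, not a measured value, and the function name `raw_scrape_mb` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def raw_scrape_mb(payload_kb, scrape_interval_s, retention_days):
    """Uncompressed scrape volume in MB over the retention window."""
    scrapes = retention_days * SECONDS_PER_DAY / scrape_interval_s
    return scrapes * payload_kb / 1000  # KB -> MB (decimal units, an assumption)


# New setting from this change: 3-second interval, 7-day retention, ~700MB on disk.
raw_3s = raw_scrape_mb(payload_kb=9, scrape_interval_s=3, retention_days=7)
implied_ratio_3s = raw_3s / 700

# Previous setting: 1-second interval, 7-day retention, ~2GB on disk.
raw_1s = raw_scrape_mb(payload_kb=9, scrape_interval_s=1, retention_days=7)
implied_ratio_1s = raw_1s / 2000

print(f"3s interval: {raw_3s:.1f} MB raw, implied compression ~{implied_ratio_3s:.1f}x")
print(f"1s interval: {raw_1s:.1f} MB raw, implied compression ~{implied_ratio_1s:.1f}x")
```

Both settings imply a similar compression factor, which suggests the revised 700MB figure is consistent with the earlier 2GB estimate after tripling the interval.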
@@ -146,7 +150,7 @@ docker-compose -f ./docker-compose/thanos-receive.yml up -d
 As Prometheus has already been configured to send metric metadata to Thanos Receive, check the logs to ensure that Thanos Receive is running properly.
 `../receive-data:/receive/data` maps the host directory for metric TSDB storage.
-- Retention Policy: `--tsdb.retention=30d`auto-purges data older than 30 days. As tested, it takes about **6GB of disk space per month for one java-tron(v4.7.6+) FullNode**.
+- Retention Policy: The `--tsdb.retention=30d` flag automatically purges data older than 30 days. Based on testing with a java-tron (v4.7.6+) FullNode using a 3-second metric scrape interval, storage consumption averages approximately **3GB of disk space per month**.
 
 - External Storage:
-`../conf:/receive`mounts configuration files. The `--objstore.config-file` flag enables long-term storage in MinIO/S3-compatible buckets. In this case, it is [bucket_storage_bucket.yml](conf/bucket_storage_bucket.yml).
+`../conf:/receive` mounts configuration files. The `--objstore.config-file` flag enables long-term storage in MinIO/S3-compatible buckets. In this case, it is [bucket_storage.yml](conf/bucket_storage.yml).
 - Thanos Receive uploads TSDB blocks to an object storage bucket every 2 hours by default.
 - Fallback Behavior: Omitting this flag keeps data local-only.
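For capacity planning, the per-node figure quoted in the retention bullet can be scaled to other retention windows or node counts. The sketch below assumes disk usage grows linearly with retention and with the number of FullNodes writing to the same Thanos Receive instance, which is an approximation. The function name `receive_disk_gb` and the five-node, 90-day scenario are hypothetical.

```python
GB_PER_NODE_PER_30_DAYS = 3.0  # figure quoted above for a 3-second scrape interval


def receive_disk_gb(nodes, retention_days, gb_per_30_days=GB_PER_NODE_PER_30_DAYS):
    """Estimate local Thanos Receive disk usage, assuming linear growth."""
    return nodes * gb_per_30_days * retention_days / 30


one_node_30d = receive_disk_gb(nodes=1, retention_days=30)
five_nodes_90d = receive_disk_gb(nodes=5, retention_days=90)  # hypothetical scenario

print(f"1 node, 30 days:  ~{one_node_30d:.0f} GB")
print(f"5 nodes, 90 days: ~{five_nodes_90d:.0f} GB")
```

Treat the result as a lower bound when sizing the `../receive-data` volume, since real usage depends on series churn and compaction.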
@@ -229,12 +233,12 @@ Core configuration in [thanos-store.yml](./docker-compose/thanos-store.yml):
 ts=2025-04-03T05:37:53.676313173Z caller=tls_config.go:277 level=info service=http/server component=query msg="TLS is disabled." http2=false address=[::]:9091
 ts=2025-04-03T05:37:53.676380298Z caller=grpc.go:131 level=info service=gRPC/server component=query msg="listening for serving gRPC" address=0.0.0.0:10901
-ts=2025-04-03T05:37:58.685901342Z caller=endpointset.go:425 level=info component=endpointset msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive-0:10907 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
+ts=2025-04-03T05:37:58.685901342Z caller=endpointset.go:425 level=info component=endpointset msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive:10907 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
 ts=2025-04-03T05:37:58.685969217Z caller=endpointset.go:425 level=info component=endpointset msg="adding new store with [storeEndpoints]" address=thanos-store:10912 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
 ...
 ```
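To confirm from these logs that Thanos Query has discovered both the Receive and Store endpoints, the `adding new ...` lines can be extracted programmatically. This is a minimal sketch that relies only on the log format shown above. The regular expression and the function name `discovered_endpoints` are assumptions for illustration, not part of Thanos.

```python
import re

# Matches lines such as:
#   msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive:10907
ENDPOINT_RE = re.compile(
    r'msg="adding new (?P<kind>\w+) with [^"]*" address=(?P<address>\S+)'
)


def discovered_endpoints(log_text):
    """Map each discovered endpoint address to its component kind."""
    return {m["address"]: m["kind"] for m in ENDPOINT_RE.finditer(log_text)}


# The two endpoint lines from the log excerpt above (post-change version).
logs = r'''
ts=2025-04-03T05:37:58.685901342Z caller=endpointset.go:425 level=info component=endpointset msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive:10907 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
ts=2025-04-03T05:37:58.685969217Z caller=endpointset.go:425 level=info component=endpointset msg="adding new store with [storeEndpoints]" address=thanos-store:10912 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
'''

for address, kind in discovered_endpoints(logs).items():
    print(f"{kind}: {address}")
```

If either the `receive` or the `store` entry is missing from the result, the corresponding endpoint address in the Thanos Query configuration is the first thing to check.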
@@ -275,7 +279,7 @@ Below are the core configurations for the Thanos Query service:
 - --endpoint.info-timeout=30s
 - --http-address=0.0.0.0:9091
 - --query.replica-label=receive_replica # Deduplication is turned on for metrics that share the same replica label
-- --endpoint=thanos-receive-0:10907 # The grpc-address of the Thanos Receive service,if Receive run remotely replace container name "thanos-receive" with the real ip, could add multiple receive services
+- --endpoint=thanos-receive:10907 # The gRPC address of the Thanos Receive service; if Receive runs remotely, replace the container name "thanos-receive" with its real IP. Multiple Receive endpoints can be added
 - --store=thanos-store:10907 # for historical data query