Commit 3e95e25

Merge branch 'develop' into develop
2 parents dc3df26 + 8418932 commit 3e95e25

22 files changed

Lines changed: 2828 additions & 273 deletions

.gitignore

Lines changed: 2 additions & 2 deletions
@@ -25,6 +25,7 @@ go
 gopath
 go1.*
 *FullNode_output-directory*
+*.pub

 /single_node/output-directory/
 /metric_monitor/logs/
@@ -38,5 +39,4 @@ go1.*
 /metric_monitor/helm-charts/thanos-receive/charts/
 /metric_monitor/helm-charts/thanos-receive/Chart.lock
 /metric_monitor/store-data/
-
-*.pub
+/metric_monitor/grafana_data/

README.md

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ We also provide tools to facilitate the CI and testing process:
 - `root`: compute merkle root for tiny db. NOTE: large db may GC overhead
 limit exceeded.
 - `fork`: Modify the database of java-tron for shadow fork testing.
+- `query`: Query the latest vote and reward information from the database.
 - **Stress Test**: Execute the stress test and evaluate the performance of the `java-tron` fullnode.

conf/private_net_layout.toml

Lines changed: 8 additions & 11 deletions
@@ -1,10 +1,8 @@
-# [database]
-# database_tar = "/path/to/database"
-
 # [[nodes]]
 # node_ip = "192.168.1.1" # Remote node's IP
-# node_direcotry = "/path/to/direcotry" # Remote node's working direcotry for node
+# node_directory = "/path/to/direcotry" # Remote node's working direcotry for node
 # config_file = "/path/to/config" # Config file for remote node
+# docker_compose_file =”/path/to/config“ # Config docker-compose file for remote node
 # node_type = "fullnode/sr" # Fullnode or SR node
 # ssh_port = 22
 # ssh_user = "user1"
@@ -13,22 +11,21 @@

 # [[nodes]]
 # node_ip = "192.168.1.2" # Changed IP to demonstrate different nodes
-# node_direcotry = "/path/to/direcotry"
+# node_directory = "/path/to/directory"
 # config_file = "/path/to/config"
+# docker_compose_file =”/path/to/config“ # Config docker-compose file for remote node
 # node_type = "fullnode/sr"
 # ssh_port = 2222 # Custom SCP port for this node
 # ssh_user = "user2"
 # # No password or key; assumes SSH agent or pre-configured key

-[database]
-database_tar = "/Users/barbatos/Downloads/hello.tgz"

 [[nodes]]
 node_ip = "ec2-3-25-116-244.ap-southeast-2.compute.amazonaws.com"
-node_direcotry = "/home/ubuntu/mytest"
-config_file = "/Users/barbatos/Documents/code/md/mydev/tron-docker/conf/private_net_config_others.conf"
-node_type = "fullnode"
+node_directory = "/home/ubuntu/mytest"
+config_file = "/Users/ubuntu/conf/private_net_config_others.conf"
+docker_compose_file = "/Users/ubuntu/docker-compose.yml"
 ssh_port = 22
 ssh_user = "ubuntu"
 # ssh_password = "password1"
-ssh_key = "/Users/barbatos/Downloads/test-ci.pem" # Optional; uncomment if using key auth
+ssh_key = "/Users/ubuntu/Downloads/test-ci.pem" # Optional; uncomment if using key auth

metric_monitor/REMOTE_WRITE_WITH_THANOS.md

Lines changed: 15 additions & 11 deletions
@@ -45,7 +45,11 @@ As shown in the new architecture, the monitoring system consists of the followin
 ### Step 1: Set up TRON and Prometheus services
 Run the below command to start a java-tron FullNode, node exporter and Prometheus services:
 ```sh
-docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d # Start all
+
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d tron-node # Start tron-node only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d node-exporter # Start node-exporter only
+docker-compose -f ./docker-compose/docker-compose-target-node.yml up -d prometheus # Start prometheus only
 ```

 You can verify the Prometheus service status and monitor targets by accessing `http://[host_IP]:9090/` in your browser. Alternatively, use `docker logs -f prometheus` to view the Prometheus service logs.
@@ -108,7 +112,7 @@ remote_write:
 <img src="../images/metric_push_external_label.png" alt="Alt Text" width="680" >

 - For `scrape_configs`:
-- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 1-second intervals to enable real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request triggered.
+- The `scrape_interval` defines the frequency at which Prometheus collects metrics. While configured for 3-second intervals to enable real-time monitoring, this setting can be customized according to your specific monitoring needs. Keep in mind that decreasing the interval will increase the service load, as metrics are collected each time the HTTP request triggered.
 - The `targets` field specifies the java-tron services or other monitoring targets via their IP addresses and ports. Prometheus actively scrapes metrics from these defined endpoints.
 - The `labels` section contains key-value pairs that uniquely identify each target within Prometheus. These labels enable powerful filtering capabilities in Grafana dashboards - for example, you can filter metrics using expressions like `{group="group-tron"}`.
@@ -119,7 +123,7 @@ remote_write:
 ##### 2. Storage configurations
 - The volumes command `../prometheus_data:/prometheus` mounts a local directory used by Prometheus to store metrics data.
 - Even when using Prometheus with remote-write, metrics data is still temporarily stored locally.
-- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron(v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 1 second and TSDB compression, **a single java-tron FullNode service requires about 2GB of Prometheus storage with 7 days of retention**.
+- The `--storage.tsdb.retention.time=7d` flag defines how long metrics data is retained. In this case, Prometheus automatically purges data older than 7 days. For a java-tron(v4.7.6+) FullNode, each metric request returns approximately 9KB of raw data. With a `scrape_interval` of 3 second and TSDB compression, **a single java-tron FullNode service requires about 700MB of Prometheus storage with 7 days of retention**.
 - The `--storage.tsdb.max-block-duration=30m` flag defines the maximum duration for generating TSDB blocks locally. With this setting, Prometheus will create new TSDB blocks at intervals no longer than 30 minutes, ensuring regular data persistence and efficient storage management.
 - Other storage flags can be found in the [official documentation](https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects). For a quick start, you could use the default values.
@@ -146,7 +150,7 @@ docker-compose -f ./docker-compose/thanos-receive.yml up -d
 As Promethus has already been configured to send metric metadata to Thanos Receive, check the logs to ensure the Thanos Receive is running properly.

 ```sh
-docker logs -f thanos-receive-0
+docker logs -f thanos-receive

 ...
 ts=2025-04-03T03:13:49.395927626Z caller=intrumentation.go:56 level=info component=receive msg="changing probe status" status=ready
@@ -176,16 +180,16 @@ Core configuration for Thanos Receive in [thanos-receive.yml](./docker-compose/t
 - "--remote-write.address=0.0.0.0:10908"
 - "--label=receive_replica=\"0\""
 - "--label=receive_cluster=\"java-tron-mainnet\""
-- "--objstore.config-file=/receive/bucket_storage_bucket.yml"
+- "--objstore.config-file=/receive/bucket_storage.yml"
 ```
 #### Key configuration elements:
 ##### 1. Storage configuration
 - Local Storage:
 `../receive-data:/receive/data` maps the host directory for metric TSDB storage.
-- Retention Policy: `--tsdb.retention=30d` auto-purges data older than 30 days. As tested, it takes about **6GB of disk space per month for one java-tron(v4.7.6+) FullNode**.
+- Retention Policy: The `--tsdb.retention=30d` flag automatically purges data older than 30 days. Based on testing with a java-tron(v4.7.6+) FullNode using a 3-second metric scrape interval, storage consumption averages approximately **3GB of disk space per month**.

 - External Storage:
-`../conf:/receive` mounts configuration files. The `--objstore.config-file` flag enables long-term storage in MinIO/S3-compatible buckets. In this case, it is [bucket_storage_bucket.yml](conf/bucket_storage_bucket.yml).
+`../conf:/receive` mounts configuration files. The `--objstore.config-file` flag enables long-term storage in MinIO/S3-compatible buckets. In this case, it is [bucket_storage.yml](conf/bucket_storage.yml).
 - Thanos Receive uploads TSDB blocks to an object storage bucket every 2 hours by default.
 - Fallback Behavior: Omitting this flag keeps data local-only.
@@ -229,12 +233,12 @@ Core configuration in [thanos-store.yml](./docker-compose/thanos-store.yml):
 thanos_store:
 command:
 - "store"
-- "--objstore.config-file=/etc/thanos/bucket_storage_bucket.yml"
+- "--objstore.config-file=/etc/thanos/bucket_storage.yml"
 - "--grpc-address=0.0.0.0:10912"
 ```
 The Store gateway:

-- Connects to our Minio bucket via `bucket_storage_bucket.yml`, the same configuration file as Thanos Receive.
+- Connects to our Minio bucket via `bucket_storage.yml`, the same configuration file as Thanos Receive.
 - Exposes gRPC endpoint for Thanos Query to access historical data
 - Indexes object storage blocks for fast lookups
@@ -259,7 +263,7 @@ ts=2025-04-03T05:37:53.676072548Z caller=intrumentation.go:56 level=info msg="ch
 ts=2025-04-03T05:37:53.676288048Z caller=tls_config.go:274 level=info service=http/server component=query msg="Listening on" address=[::]:9091
 ts=2025-04-03T05:37:53.676313173Z caller=tls_config.go:277 level=info service=http/server component=query msg="TLS is disabled." http2=false address=[::]:9091
 ts=2025-04-03T05:37:53.676380298Z caller=grpc.go:131 level=info service=gRPC/server component=query msg="listening for serving gRPC" address=0.0.0.0:10901
-ts=2025-04-03T05:37:58.685901342Z caller=endpointset.go:425 level=info component=endpointset msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive-0:10907 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
+ts=2025-04-03T05:37:58.685901342Z caller=endpointset.go:425 level=info component=endpointset msg="adding new receive with [storeEndpoints exemplarsAPI]" address=thanos-receive:10907 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
 ts=2025-04-03T05:37:58.685969217Z caller=endpointset.go:425 level=info component=endpointset msg="adding new store with [storeEndpoints]" address=thanos-store:10912 extLset="{receive_cluster=\"java-tron-mainnet\", receive_replica=\"0\", tenant_id=\"default-tenant\"}"
 ...
 ```
@@ -275,7 +279,7 @@ Below are the core configurations for the Thanos Query service:
 - --endpoint.info-timeout=30s
 - --http-address=0.0.0.0:9091
 - --query.replica-label=receive_replica # Deduplication turned on for metric with the same replica_label
-- --endpoint=thanos-receive-0:10907 # The grpc-address of the Thanos Receive service,if Receive run remotely replace container name "thanos-receive" with the real ip, could add multiple receive services
+- --endpoint=thanos-receive:10907 # The grpc-address of the Thanos Receive service,if Receive run remotely replace container name "thanos-receive" with the real ip, could add multiple receive services
 - --store=thanos-store:10907 # for historical data query
 ```
 It will set up the Thanos Query service
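Review note: the storage figures changed in this file (about 2GB at a 1-second interval, about 700MB at a 3-second interval, both for 7 days of retention) can be sanity-checked with simple arithmetic. The sketch below is a rough estimate only: it assumes decimal units (1 KB = 1000 bytes), takes the guide's figure of roughly 9KB per scrape at face value, and uses an illustrative function name that does not exist in the repository. The printed ratio is an implied reduction from raw scrape volume to reported disk usage, not a measured compression ratio.

```python
SECONDS_PER_DAY = 86_400


def raw_scrape_bytes(payload_kb: float, scrape_interval_s: float, retention_days: float) -> float:
    """Total bytes of raw scrape responses collected over the retention window."""
    scrapes = retention_days * SECONDS_PER_DAY / scrape_interval_s
    return scrapes * payload_kb * 1000  # assumption: 1 KB = 1000 bytes


# (scrape interval in seconds, on-disk size in GB as reported in the guide)
for interval_s, reported_gb in ((1, 2.0), (3, 0.7)):
    raw_gb = raw_scrape_bytes(9, interval_s, 7) / 1e9
    print(
        f"interval={interval_s}s raw={raw_gb:.2f} GB "
        f"reported={reported_gb} GB implied_ratio={raw_gb / reported_gb:.2f}"
    )
```

Under these assumptions the raw volume is about 5.4 GB at 1 second and about 1.8 GB at 3 seconds, and both reported sizes sit at roughly 2.6 to 2.7 times smaller, so the move from 2GB to 700MB is consistent with scraping one third as often.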
File renamed without changes.

metric_monitor/conf/prometheus-remote-write.yml

Lines changed: 10 additions & 11 deletions
@@ -9,8 +9,8 @@ global:
 scrape_configs:
 - job_name: java-tron
 honor_timestamps: true
-scrape_interval: 1s
-scrape_timeout: 1s
+scrape_interval: 3s
+scrape_timeout: 3s
 metrics_path: /metrics
 scheme: http
 follow_redirects: true
@@ -29,20 +29,19 @@ scrape_configs:
 # The remote_write configuration tells Prometheus to continuously write metrics to the Thanos Receive service.
 # Refer https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
 remote_write:
-- url: http://thanos-receive-0:10908/api/v1/receive # if Thanos Receive service run on the same host with Prometheus
+- url: http://thanos-receive:10908/api/v1/receive # if Thanos Receive service run on the same host with Prometheus
 headers:
-X-Auth-Token: "token"
 X-Service-Group: "tron-fullnode-group1"
-remote_timeout: 10s
+remote_timeout: 15s
 queue_config:
-capacity: 2500
+capacity: 50000
 max_shards: 200 # the maximum number of shards, or parallelism, Prometheus will use for each remote-write queue
 min_shards: 1
-max_samples_per_send: 500
-batch_send_deadline: 5s
-min_backoff: 30ms
+max_samples_per_send: 10000
+batch_send_deadline: 3s
+min_backoff: 200ms
 max_backoff: 5s
 metadata_config:
 send: true
-send_interval: 1s # How frequently metric metadata is sent to remote storage.
-max_samples_per_send: 500
+send_interval: 3s # How frequently metric metadata is sent to remote storage.
+max_samples_per_send: 50000
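Review note: the `queue_config` change raises `capacity` and `max_samples_per_send` by the same factor of 20. As I understand the Prometheus remote-write tuning guidance, `capacity` is usually kept at roughly 3 to 10 times `max_samples_per_send`, and shard memory grows in proportion to shards × (`capacity` + `max_samples_per_send`). The sketch below compares the old and new values from this diff under that reading. The `QueueConfig` class and its method names are illustrative, and the sample count is a worst-case bound with all `max_shards` active, not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueConfig:
    capacity: int              # samples buffered per shard
    max_samples_per_send: int  # samples per remote-write request
    max_shards: int            # upper bound on parallel shards

    def capacity_ratio(self) -> float:
        """capacity expressed as a multiple of max_samples_per_send."""
        return self.capacity / self.max_samples_per_send

    def max_buffered_samples(self) -> int:
        """Worst-case samples held in memory if every shard is active and full."""
        return self.max_shards * (self.capacity + self.max_samples_per_send)


# Values taken from the removed and added lines of the diff.
OLD = QueueConfig(capacity=2500, max_samples_per_send=500, max_shards=200)
NEW = QueueConfig(capacity=50000, max_samples_per_send=10000, max_shards=200)

for label, cfg in (("old", OLD), ("new", NEW)):
    print(label, cfg.capacity_ratio(), cfg.max_buffered_samples())
```

The ratio stays at 5 in both configurations, while the worst-case buffer grows from 600,000 to 12,000,000 samples, so memory headroom on the Prometheus host is worth checking if `max_shards` is ever reached.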

metric_monitor/docker-compose/grafana.yml

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ services:
 memory: 1g
 environment:
 - GF_METRICS_ENABLED=true # Enable Grafana metrics exposure
+- GF_SERVER_HTTP_PORT=3000 # Set Grafana's internal port to 3000
+volumes:
+- ../grafana_data:/var/lib/grafana # Grafana database and dashboards
 ports:
 - "3000:3000"
 restart: unless-stopped

metric_monitor/docker-compose/thanos-querier.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ services:
 - --endpoint.info-timeout=10s
 - --http-address=0.0.0.0:9091
 - --query.replica-label=receive_replica # Deduplication turned on for identical series except the replica label.
-- --endpoint=thanos-receive-0:10907
+- --endpoint=thanos-receive:10907
 #- --endpoint=thanos-receive-1:10907 # could add multiple receive services
 - --store=thanos-store:10912 # for historical data query
 restart: unless-stopped

metric_monitor/docker-compose/thanos-receive.yml

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@ services:
 thanos-receive:
 image: quay.io/thanos/thanos:v0.33.0
 user: root
-container_name: thanos-receive-0
+container_name: thanos-receive
 volumes:
 - ../receive-data-0:/receive/data
 - ../conf:/receive
@@ -20,5 +20,5 @@ services:
 - "--remote-write.address=0.0.0.0:10908"
 - "--label=receive_replica=\"0\""
 - "--label=receive_cluster=\"java-tron-mainnet\""
-- "--objstore.config-file=/receive/bucket_storage_bucket.yml"
+- "--objstore.config-file=/receive/bucket_storage.yml"
 restart: unless-stopped

metric_monitor/docker-compose/thanos-store.yml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ services:
 command:
 - "store"
 - "--data-dir=/var/thanos/store"
-- "--objstore.config-file=/etc/thanos/bucket_storage_bucket.yml"
+- "--objstore.config-file=/etc/thanos/bucket_storage.yml"
 - "--http-address=0.0.0.0:10911"
 - "--grpc-address=0.0.0.0:10912"
 restart: unless-stopped
