Commit 09896bf

Fix RMB tester (#205)
* Fix Value error and minor refactor
* Use zos.system.version instead of rmb.version to test live nodes
* Update documentation
* doc: fix typo
* Add a clear visual indicator for the outcome and show the percentage of failures
* Fix: ensure a single rmb-peer instance runs during the test_live_nodes.sh script and that the Redis database is clean
* fix: harden the script's error handling
* Show the final outcome in a different color
* style: convert double quotes (") to single quotes (') for static strings that don't need variable expansion
* fix: add back the f prefix to the string literal
* Improve logging and debug output in the script
* Add precheck for binary path
* Update README.md
1 parent 0ef17e7 commit 09896bf

3 files changed

Lines changed: 148 additions & 47 deletions

tools/rmb_tester/README.md

Lines changed: 35 additions & 6 deletions
@@ -2,30 +2,41 @@

 You can find here CLI tools and scripts that can be used for testing and benchmarking [RMB](https://github.com/threefoldtech/rmb-rs). You can use either RMB_Tester, RMB_echo, or both to quickly test the communications over RMB.

-## Installation:
+## Installation
+
 - clone the repo
 - create a new env
+
 ```py
 python3 -m venv venv
 ```
+
 - activate the new env
+
 ```py
 source ./venv/bin/activate
 ```
+
 - install dependencies
+
 ```py
 pip install -r requirements.txt
 ```

-## Usage:
+## Usage
+
 RMB tools comprise two Python programs that can be used independently or in conjunction with each other.

 ### RMB_Tester
+
 RMB_Tester is a CLI tool that serves as an RMB client to automate the process of crafting a specified number of test messages to be sent to one or more destinations. The number of messages, command, data, destination list, and other parameters can be configured through the command line. The tool will wait for the correct number of responses and report some statistics.

 Please ensure that there is a process running on the destination side that can handle this command and respond back or use RMB_echo for this purpose.

+Also, note that the rmb.version built-in command mentioned in this document is specific to the Rust rmb-peer implementation and is not guaranteed to be available in other RMB implementations. ZOS nodes no longer use the Rust rmb-peer. If you run this tool against a ZOS node, you must use a registered command, such as zos.system.version.
+
 example:
+
 ```sh
 # We sending to two destinations
 # The default test command will be used and can be handled by RMB_echo process
@@ -35,62 +46,79 @@ python3 ./rmb_tester.py --dest 41 55
 to just print the summary use `--short` option

 to override default command use the `--command`
+
 ```sh
 # The `rmb.version` command will be handled by RMB process itself
 python3 ./rmb_tester.py --dest 41 --command rmb.version
 ```

 for all optional args see
+
 ```sh
 python3 ./rmb_tester.py -h
 ```

 ### RMB_Echo (message handler)
+
 This tool will automate handling the messages coming to $queue and respond with same message back to the source and display the count of processed messages.

 example:
+
 ```sh
 python3 ./msg_handler.py
 ```

 or specify the redis queue (command) to handle the messages from
+
 ```sh
 python3 ./msg_handler.py --queue helloworld
 ```

 for all optional args see
+
 ```sh
 python3 ./msg_handler.py -h
 ```

-## Recipes:
-### Simple method for testing live nodes:
+## Recipes
+
+### Simple method for testing live nodes
+
 - For simplicity, you can install this tool's dependencies by running the ``install.sh` script:
+
 ```sh
-./install
+./install.sh
 ```

 you can start testing live nodes if it is reachable over rmb by running `test-live-nodes.sh` script. it takes only one argument, the network name (one of `dev`, `qa`, `test`, `main`) and required to pass set you mnemonic as env var `MNEMONIC`. for testing dev network nodes:
+
 ```sh
 MNEMONIC="[YOUR MNEMONIC]" ./test_live_nodes.sh dev
 ```
+
 optionally, set `TIMEOUT` and/or `RMB_BIN`.
 `TIMEOUT` : set message ttl and client timeout. default to 60 (for large number of destinations use appropriate value)
 `RMB_BIN` : set the path of the rmb_peer binary file. default to `../../target/x86_64-unknown-linux-musl/release/rmb-peer`

+Additionally, you can set `VERBOSE` to true (or any non-empty value) to display detailed response and error messages and/or `DEBUG` can be configured to enable debug output.
+
 ```sh
 MNEMONIC="[YOUR MNEMONIC]" TIMEOUT=500 ./test_live_nodes.sh main
 ```

-### More Customized method:
+### More Customized method
+
 - Test all dest twins to ensure that they are reachable over RMB
+
 ```sh
 # The nodes.sh script when used with `--likely-up` option will output the IDs of the online nodes in the network using the gridproxy API.
 python3 ./rmb_tester.py -d $(./scripts/twins.sh --likely-up main) -c "rmb.version" -t 600 -e 600
 ```
+
 Note: this tool is for testing purposes and not optimized for speed, for large number of destinations use appropriate expiration and timeout values.

 you can copy and paste all non responsive twins and run `./twinid_to_nodeid.sh` with the list of twins ids for easy lookup node id and verfiying the status (like know if node in standby mode).
+
 ```sh
 ./scripts/twinid_to_nodeid.sh main 2562 5666 2086 2092
 ```
@@ -99,6 +127,7 @@ First arg is network (one of `dev`, `qa`, `test`, `main`)
 Then you follow it with space separated list of twin ids

 the output would be like
+
 ```sh
 twin ID: 2562 node ID: 1419 status: up
 twin ID: 5666 node ID: 3568 status: up

tools/rmb_tester/rmb_tester.py

Lines changed: 38 additions & 18 deletions
@@ -74,24 +74,34 @@ def send_all(messages):
     responses_expected = 0
     return_queues = []
     with alive_bar(len(messages), title='Sending ..', title_length=12) as bar:
-        for msg in messages:
-            r.lpush("msgbus.system.local", msg.to_json())
-            responses_expected += len(msg.twin_dst)
-            return_queues += [msg.reply_to]
-            bar()
+        with r.pipeline() as pipe:
+            for msg in messages:
+                pipe.lpush("msgbus.system.local", msg.to_json())
+                responses_expected += len(msg.twin_dst)
+                return_queues += [msg.reply_to]
+                bar()
+            pipe.execute()  # Execute all commands in the pipeline at once
     return responses_expected, return_queues

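The `send_all` change above queues every `LPUSH` on a pipeline and submits them together with one `execute()` call. The following is a minimal sketch of that batching pattern, not the tester's actual code: `FakeRedis`, `FakePipeline`, and `FakeMessage` are hypothetical in-memory stand-ins so the example runs without a Redis server, and they mimic only the `pipeline()`, `lpush()`, and `execute()` calls used in the diff.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for the tester's Message class."""
    twin_dst: list
    reply_to: str

    def to_json(self):
        return json.dumps({"dst": self.twin_dst, "ret": self.reply_to})


class FakePipeline:
    """Hypothetical stand-in for the small part of a redis-py pipeline used here."""

    def __init__(self, store):
        self.store = store
        self.buffer = []

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.buffer.clear()
        return False

    def lpush(self, key, value):
        # Commands are only buffered; nothing reaches the store yet.
        self.buffer.append((key, value))
        return self

    def execute(self):
        # Apply all buffered commands in order and return one result per command.
        results = []
        for key, value in self.buffer:
            self.store.setdefault(key, []).insert(0, value)
            results.append(len(self.store[key]))
        self.buffer.clear()
        return results


class FakeRedis:
    """Hypothetical in-memory client exposing only pipeline()."""

    def __init__(self):
        self.store = {}

    def pipeline(self):
        return FakePipeline(self.store)


def send_all(client, messages, queue="msgbus.system.local"):
    """Queue one LPUSH per message, then submit them all in a single batch."""
    responses_expected = 0
    return_queues = []
    with client.pipeline() as pipe:
        for msg in messages:
            pipe.lpush(queue, msg.to_json())
            responses_expected += len(msg.twin_dst)
            return_queues.append(msg.reply_to)
        pipe.execute()
    return responses_expected, return_queues


client = FakeRedis()
msgs = [FakeMessage([41, 55], "reply-a"), FakeMessage([7], "reply-b")]
expected, queues = send_all(client, msgs)
```

One consequence of this pattern is that the progress bar in the diff advances as messages are queued, while the actual write appears to happen only at `pipe.execute()`.
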
 def wait_all(responses_expected, return_queues, timeout):
-    responses = []
-    err_count = 0
-    success_count = 0
-    with alive_bar(responses_expected, title='Waiting ..', title_length=12) as bar:
-        for _ in range(responses_expected):
-            start = timer()
-            result = r.blpop(return_queues, timeout=timeout)
-            if not result:
-                break
-            timeout = timeout - round(timer() - start, 3)
+    responses = []
+    err_count = 0
+    success_count = 0
+    start_time = timer()
+    timedout = False
+
+    with alive_bar(responses_expected, title='Waiting ..', title_length=12) as bar:
+        while responses_expected > 0:
+            elapsed_time = timer() - start_time
+            remaining_time = timeout - elapsed_time
+
+            if remaining_time <= 0:
+                timedout = True
+                break
+
+            # Use the remaining time for the blpop timeout
+            result = r.blpop(return_queues, timeout=remaining_time)
+            if result:
                 response = Message.from_json(result[1])
                 responses.append(response)
                 if response.err is not None:
@@ -101,7 +111,11 @@ def wait_all(responses_expected, return_queues, timeout):
                     success_count += 1
                     bar.text(f'received a response from twin {response.twin_src} ✅')
                 bar()
-    return responses, err_count, success_count
+                responses_expected -= 1
+    if timedout:
+        print("Timeout reached, stopping waiting for responses.")
+
+    return responses, err_count, success_count

 def main():
     global r
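
The reworked `wait_all` measures one overall time budget and passes only the remaining time to each blocking pop. The following is a minimal sketch of that deadline loop, under stated assumptions: a `queue.Queue` stands in for the Redis reply queues, and `make_pop` is a hypothetical helper that plays the role of `r.blpop`.

```python
import queue
from time import monotonic


def wait_all(pop, responses_expected, timeout):
    """Collect up to `responses_expected` items within one shared deadline."""
    responses = []
    deadline = monotonic() + timeout
    timed_out = False
    while len(responses) < responses_expected:
        remaining = deadline - monotonic()
        if remaining <= 0:
            # Check before blocking so a non-positive timeout is never passed on.
            timed_out = True
            break
        item = pop(remaining)
        if item is not None:
            responses.append(item)
    return responses, timed_out


def make_pop(q):
    """Hypothetical adapter: returns None on timeout, similar to a blocking pop."""
    def pop(timeout):
        try:
            return q.get(timeout=timeout)
        except queue.Empty:
            return None
    return pop


replies = queue.Queue()
replies.put("reply from twin 41")
replies.put("reply from twin 55")

# Three replies are expected but only two arrive, so the deadline expires.
got, timed_out = wait_all(make_pop(replies), responses_expected=3, timeout=0.2)
```

Checking the remaining time before each call matters with Redis in particular, because a `BLPOP` timeout of 0 means "block indefinitely".
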
@@ -135,16 +149,22 @@ def main():
     print(f"received_success: {success_count}")
     print(f"received_errors: {err_count}")
     print(f"no response errors (client give up): {no_responses}")
-    responding = {int(response.twin_src) for response in responses}
+    responding = {int(response.twin_src) for response in responses if response.twin_src != "" }
     not_responding = set(args.dest) - responding
     print(f"twins not responding (twin IDs): {' '.join(map(str, not_responding))}")
     print(f"elapsed time: {elapsed_time}")
+    if responses_expected == success_count:
+        print("\033[92m🎉 All responses received successfully! 🎉\033[0m")
+    else:
+        missing_responses = (no_responses / responses_expected) * 100
+        print(f"\033[93m⚠️ Warning: {missing_responses:.2f}% of responses are missing! ⚠️\033[0m")
+
     print("=======================")
     if not args.short:
         print("Responses:")
         print("=======================")
         for response in responses:
-            print(response)
+            print({k: v for k, v in response.__dict__.items() if v})
         print("=======================")
         print("Errors:")
         print("=======================")
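
The outcome indicator added in `main()` divides by `responses_expected` and reports only the share of missing replies. The following is a hedged sketch of one way to summarize the same counters. `summarize` is a hypothetical helper, and it assumes `no_responses` equals expected replies minus successes minus errors, a computation the diff does not show.

```python
def summarize(responses_expected, success_count, err_count):
    """Return a one-line outcome string from the three counters."""
    # Assumption: a reply that is neither a success nor an error never arrived.
    no_responses = responses_expected - success_count - err_count
    if responses_expected == 0:
        # Guard against dividing by zero when nothing was sent.
        return "no responses were expected"
    if success_count == responses_expected:
        return "all responses received successfully"
    missing_pct = no_responses / responses_expected * 100
    failed_pct = (no_responses + err_count) / responses_expected * 100
    return f"{missing_pct:.2f}% missing, {failed_pct:.2f}% failed overall"


line = summarize(responses_expected=4, success_count=2, err_count=1)
```

Under that assumption, error replies do not count toward the missing percentage, so the warning in the diff could read `0.00%` when every twin answers with an error. Reporting both figures is one way to make that case visible.
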
Lines changed: 75 additions & 23 deletions
@@ -1,56 +1,108 @@
 #!/usr/bin/env bash

 case $1 in
-    main|dev|qa|test ) # Ok
-        ;;
-    *)
-        # The wrong first argument.
-        echo 'Expected "dev", "qa", "test", or "main" as second arg' >&2
-        exit 1
+    main|dev|qa|test ) # Ok
+        ;;
+    *)
+        # The wrong first argument.
+        echo 'Expected "dev", "qa", "test", or "main" as second arg' >&2
+        exit 1
 esac

+if [ -z "$MNEMONIC" ]; then
+    echo 'MNEMONIC is not set'
+    echo 'Please set the MNEMONIC environment variable'
+    echo 'Example: MNEMONIC="..." ./test_live_nodes.sh <NETWORK-ALIAS>'
+    exit 1
+fi

-if [[ "$1" == "main" ]]; then
-    SUBSTRATE_URL="wss://tfchain.grid.tf:443"
-    RELAY_URL="wss://relay.grid.tf"
+if [[ "$1" == 'main' ]]; then
+    SUBSTRATE_URL='wss://tfchain.grid.tf:443'
+    RELAY_URL='wss://relay.grid.tf'
 else
     SUBSTRATE_URL="wss://tfchain.$1.grid.tf:443"
     RELAY_URL="wss://relay.$1.grid.tf"
 fi
-RMB_LOG_FILE="./rmb-peer.log"
+RMB_LOG_FILE='./rmb-peer.log'
 TIMEOUT="${TIMEOUT:-60}"
 RMB_BIN="${RMB_BIN:-../../target/x86_64-unknown-linux-musl/release/rmb-peer}"
+VERBOSE="${VERBOSE:-false}"
+DEBUG="${DEBUG:-false}"
+
+if [ -f "$RMB_BIN" ]; then
+    binary_version_output=$( "$RMB_BIN" --version )
+else
+    echo "rmb-peer binary not found at $RMB_BIN"
+    exit 1
+fi

 cleanup() {
-    echo "stop all bash managed jobs"
-    jlist=$(jobs -p)
-    plist=$(ps --ppid $$ | awk '/[0-9]/{print $1}')
-
-    kill ${jlist:-$plist}
+    set +e
+    debug 'cleaning up initiated'
+    if [ -n "$VIRTUAL_ENV" ]; then
+        debug 'deactivating virtual environment'
+        deactivate
+    fi
+    # close redis-server
+    debug 'closing redis-server ...'
+    redis-cli -p 6379 shutdown
+    jlist=$(jobs -pr)
+    plist=$(ps --ppid $$ | awk '/[0-9]/{print $1}' | grep -v -E "^$$|^$(pgrep -f 'ps')|^$(pgrep -f 'awk')|^$(pgrep -f 'grep')$")
+    pids=${jlist:-$plist}
+    if [ -n "$pids" ]; then
+        debug "stop rmb-peer and all bash managed jobs"
+        kill $pids
+    else
+        debug "All jobs in this bash session have completed or stoped, so there are none left to clean up."
+    fi
 }

-trap cleanup SIGHUP SIGINT SIGQUIT SIGABRT SIGTERM
+debug() {
+    if [[ "$DEBUG" == "true" ]]; then
+        echo "$@"
+    fi
+}

+trap cleanup SIGHUP SIGINT SIGQUIT SIGABRT SIGTERM

+echo 'starting live nodes rmb test script ...'
+echo "network: $1net"
+debug "script version: $(git describe --tags)"
+debug "rmb-peer version: $binary_version_output"
 # start redis in backgroud and skip errors in case alreday running
 set +e
-echo "redis-server starting .."
+debug 'redis-server starting ...'

-redis-server --port 6379 2>&1 > /dev/null&
+redis-server --port 6379 > /dev/null 2>&1 &
 sleep 3
+# clear all databases
+debug 'Removes all keys in Redis'
+redis-cli -p 6379 FLUSHALL > /dev/null 2>&1 &
 set -e

+# ensure that RMB is not already running
+if pgrep -x $(basename "$RMB_BIN") > /dev/null; then
+    echo 'Another instance of rmb-peer is already running. Killing...'
+    pkill -x $(basename "$RMB_BIN")
+fi
+
+# ensure the MNEMONIC has no leading or trailing spaces
+MNEMONIC="${MNEMONIC#"${MNEMONIC%%[![:space:]]*}"}"; MNEMONIC="${MNEMONIC%"${MNEMONIC##*[![:space:]]}"}"
+
 # start rmb in background
-echo "rmb-peer starting .."
+debug "rmb-peer starting ($1net).."
 $RMB_BIN -m "$MNEMONIC" --substrate "$SUBSTRATE_URL" --relay "$RELAY_URL" --redis "redis://localhost:6379" --debug &> $RMB_LOG_FILE &

 # wait till peer establish connection to a relay
-timeout --preserve-status 10 tail -f -n0 $RMB_LOG_FILE | grep -qe 'now connected' || (echo "rmb-peer taking too much time to start! check the log at $RMB_LOG_FILE for more info." && cleanup)
+if ! timeout --preserve-status 20 tail -f -n0 $RMB_LOG_FILE | grep -qe 'now connected'; then
+    echo "rmb-peer taking too much time to start! check the log at $RMB_LOG_FILE for more info."
+    cleanup
+    exit 1
+fi

 # start rmb_tester
 source venv/bin/activate
-echo "rmb_tester starting .."
-python3 ./rmb_tester.py -d $(./scripts/twins.sh --likely-up $1) -c "rmb.version" -t $TIMEOUT -e $TIMEOUT --short
-deactivate
+debug "rmb_tester starting .."
+python3 ./rmb_tester.py -d $(./scripts/twins.sh --likely-up $1) -c "zos.system.version" -t "$TIMEOUT" -e "$TIMEOUT" $(if [[ "$VERBOSE" == "false" ]]; then echo "--short"; fi)

 cleanup