This is a lightweight system monitoring tool for cluster machines.
- Clone this repository as the `$HOME/.r_lmon` directory:

$ git clone https://github.com/SofDevs-Do/R-LMON $HOME/.r_lmon
- Configure/update the `machinefile` located at `$HOME/.r_lmon/machinefile`. For each machine, add a new line with a unique machine ID, `username@IP`, a unique room ID, a unique rack ID, and the position of the machine in the rack. You may use `#` for single-line comments; spaces and tabs are allowed in the file for your convenience. However, leading and trailing spaces and tabs will not be ignored.
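To illustrate, a `machinefile` entry might look like the following. The field order follows the description above, but the separators, IDs, and IP shown here are hypothetical — check the `machinefile` shipped with the repository for the exact format.

```
# machine-ID  username@IP        room-ID  rack-ID  rack-position
m01           monitor@10.0.0.21  R1       RK3      2
```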
- Set up a cron job to collect data regularly from all of the machines listed in `$HOME/.r_lmon/machinefile`.
- Take a backup of your existing cron jobs:

$ crontab -l > $HOME/user-cron-backup.cron

You may set up the r_lmon cron job by running the following command:

$ crontab $HOME/.r_lmon/core_backend/scripts/main.cron
- Collection, storage, and serving of data for viewing are done as shown below:
- Configure the `serverfile.yaml` file:
  - `db_url` - IP:Port of the MongoDB server
  - `core_backend_url` - IP:Port of the core-backend server
- Start the MongoDB server and add its IP and port to the `serverfile.yaml` file.
- Start the core-backend server:
$ cd $HOME/.r_lmon/core_backend/web_server
$ gunicorn main:app --bind 127.0.0.1:8001
Add the core-backend server's IP and port to the `serverfile.yaml` file. You may change the above IP from 127.0.0.1 to any other IP so that the server can be accessed over the LAN.
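For reference, a filled-in `serverfile.yaml` might look like the following. The key names `db_url` and `core_backend_url` come from the description above; the addresses are placeholders (27017 is MongoDB's default port):

```yaml
db_url: 127.0.0.1:27017            # IP:Port of the MongoDB server
core_backend_url: 127.0.0.1:8001   # IP:Port of the core-backend server
```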
- Start the web-application front-end
$ cd $HOME/.r_lmon/web_app
$ gunicorn main:app --bind 127.0.0.1:8000
Change the above IP from 127.0.0.1 to any other IP so that the front end can be accessed over the LAN.
- Visit the web-application front-end IP (here `127.0.0.1:8000`).
- Install dependencies
$ sudo apt install openssh-server sysstat
- Check if `sar` can collect system activity information:
  - Change the `ENABLED` variable in the `/etc/default/sysstat` file to `true`.
  - Restart sysstat by running the following command:

$ sudo service sysstat restart
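One quick way to check `sar` (a generic sysstat check, not specific to this tool) is to request a single one-second CPU sample; if data collection is disabled, `sar` prints an error instead of a utilization report:

```shell
# Request one CPU utilization sample over one second.
# A table of %user/%system/%idle values means sar is working.
sar -u 1 1
```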
- Enable remote `ssh` logins without a password by appending the monitoring machine's ssh public key to the `~/.ssh/authorized_keys` file on the cluster machine.
- Generate an ssh public key on the monitoring machine if it does not exist already; you need to generate it only ONCE. Follow the instructions given here. Leave the passphrase empty, as the tool has not been tested with a passphrase.
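If you have never generated a key, a typical one-time invocation is sketched below. The RSA key type and the default `~/.ssh/id_rsa` path are assumptions; `-N ""` gives the empty passphrase recommended above:

```shell
# Generate an RSA key pair with an empty passphrase,
# only if one does not already exist.
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
```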
- Copy the monitoring machine's public key to the `root` user on each monitored machine.

Example:

On the monitoring machine:

$ cat ~/.ssh/id_rsa.pub | ssh <monitored-machine-1-user>@<monitored-machine1-IP> "cat >> /tmp/id_rsa.pub"

On the monitored machine (monitored-machine-1), log into monitored-machine-1:

$ ssh <monitored-machine-1-user>@<monitored-machine1-IP>

Append the contents of the `/tmp/id_rsa.pub` file to the `/root/.ssh/authorized_keys` file:

$ su
# cd ~
# mkdir -p .ssh
# cat /tmp/id_rsa.pub >> .ssh/authorized_keys
# chmod 600 .ssh/authorized_keys

NOTE: There is a security risk in the `cat` operation above: anyone with access to any user on the machine can easily replace the `/tmp/id_rsa.pub` file. Please verify the contents of the file before appending it to the `/root/.ssh/authorized_keys` file. If the `root` user is not used on the monitored machines, then functions like `shutdown` and `reboot` will not work from the tool; all other functions will continue to work.
Back on the monitoring machine:

- Add the details of <monitored-machine-1> to the `$HOME/.r_lmon/machinefile` file present on the monitoring machine.
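If password-based `ssh` login for `root` is temporarily available on the monitored machine, the manual copy above can alternatively be done in one step with OpenSSH's `ssh-copy-id`, which appends the key and fixes permissions for you (this is an alternative sketch, not the method described above):

```shell
# Append the monitoring machine's public key to
# /root/.ssh/authorized_keys on the monitored machine.
ssh-copy-id -i ~/.ssh/id_rsa.pub root@<monitored-machine1-IP>
```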