A realtime visualization of memory and CPU load for CAS servers:
- rosalindf
- alice
- tdobz
This is run in two pieces:
- The monitoring and database side is run on `ibss-central` using `docker-compose` and `docker`.
- The dash-based web URL is run on `ibss-central` and reset via crontab on that machine.
To start the monitoring process, run as user `admin`, which has the keys set up correctly. Execute `launch-monitor.sh build` to rebuild the Docker image; run it without `build` if you've just made code changes.
To start the Dash graphs portion, execute `run.sh`.
A `processts.tsv` file has been provided as test data. In `dash_graph.py`, set `use_tsv=True` where the `Analyze` class is instantiated.
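As a rough, self-contained sketch of what the `use_tsv` toggle implies (only `use_tsv`, `Analyze`, and `processts.tsv` come from this README; everything else is an assumption, and the real class in `dash_graph.py` may be structured differently):

```python
import pandas as pd

class Analyze:
    """Hypothetical stand-in for the Analyze class in dash_graph.py."""

    def __init__(self, use_tsv: bool = False, tsv_path: str = "processts.tsv"):
        self.use_tsv = use_tsv
        self.tsv_path = tsv_path

    def load(self) -> pd.DataFrame:
        if self.use_tsv:
            # Test mode: read the bundled snapshot of process stats.
            return pd.read_csv(self.tsv_path, sep="\t")
        # Live mode: query the monitoring database instead (omitted in this sketch).
        raise NotImplementedError("database access not shown in this sketch")

if __name__ == "__main__":
    print(Analyze(use_tsv=True).load().head())
```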
Run locally from the command line using:
python3 dash_graph.py
To run with docker-compose, create the data directory, then build and start the containers:
mkdir data
docker-compose build
docker-compose up
- Edit the docker-compose IP mapping.
- Edit the list of servers in `monitor.py`, e.g.:
  HOSTS = ['rosalindf', 'alice', 'tdobz', 'flor']
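For orientation, a hedged sketch of the kind of polling loop `monitor.py` runs over these hosts (only the `HOSTS` list comes from this README; the ssh invocation, `ps` fields, output path, and interval are illustrative assumptions):

```python
#!/usr/bin/env python3
"""Hedged sketch of a per-host polling loop; the real monitor.py may differ."""
import csv
import subprocess
import time
from datetime import datetime, timezone

HOSTS = ['rosalindf', 'alice', 'tdobz', 'flor']
OUTPUT = "data/processts.tsv"  # assumed output location
PS_CMD = "ps -eo user,pid,pcpu,pmem,comm --no-headers"

def poll(host):
    """Run ps on one host over ssh; return one row per process."""
    out = subprocess.run(["ssh", host, PS_CMD],
                         capture_output=True, text=True, timeout=30)
    ts = datetime.now(timezone.utc).isoformat()
    return [[ts, host] + line.split(None, 4)
            for line in out.stdout.splitlines() if line.strip()]

def main(interval=60):
    while True:
        with open(OUTPUT, "a", newline="") as fh:
            writer = csv.writer(fh, delimiter="\t")
            for host in HOSTS:
                try:
                    writer.writerows(poll(host))
                except (subprocess.TimeoutExpired, OSError):
                    pass  # skip hosts that are unreachable this cycle
        time.sleep(interval)

if __name__ == "__main__":
    main()
```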
Deployed in a docker container on `ibss-crontab` with only the dash app. Monitoring is in docker on `ibss-central`. Why did I do it that way?!
- TODO: Make hosts configurable
- TODO: Make db password clean
- TODO: Should we pull any other stats? disk IO/ops?
- TODO: Generate a per-user report: your core hours this month, top N processes, and a pie chart showing your percentage of RAM and CPU (see the sketch after this list).
- TODO: Generate real time alerts for the top N when the system is at or near capacity.
- TODO: track GPU usage
- TODO: Extract a single process and trace its CPU/memory usage over the lifetime of the run
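For the per-user report TODO, a rough sketch of how core-hours, top processes, and RAM/CPU shares could be aggregated from the collected TSV (the column names and sample interval are assumptions, not the actual schema):

```python
import pandas as pd

# Hypothetical column layout; the real TSV schema may differ.
COLUMNS = ["timestamp", "host", "user", "pid", "pcpu", "pmem", "command"]
SAMPLE_INTERVAL_HOURS = 60 / 3600  # assumed: one sample per minute

def per_user_report(tsv_path: str, user: str, top_n: int = 10) -> None:
    df = pd.read_csv(tsv_path, sep="\t", names=COLUMNS)
    mine = df[df["user"] == user]

    # Core-hours: 100% CPU for one sample interval == interval hours of one core.
    core_hours = (mine["pcpu"] / 100 * SAMPLE_INTERVAL_HOURS).sum()

    # Top-N processes by total CPU consumed over the period.
    top = (mine.groupby("command")["pcpu"].sum()
               .sort_values(ascending=False).head(top_n))

    # Share of total CPU/RAM across all users in the same period.
    cpu_share = mine["pcpu"].sum() / df["pcpu"].sum() * 100
    ram_share = mine["pmem"].sum() / df["pmem"].sum() * 100

    print(f"{user}: {core_hours:.1f} core-hours this period")
    print(f"CPU share: {cpu_share:.1f}%, RAM share: {ram_share:.1f}%")
    print("Top processes:\n", top.to_string())

if __name__ == "__main__":
    per_user_report("processts.tsv", user="alice")
```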