-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathrun-stat-tuple
executable file
·82 lines (75 loc) · 2.34 KB
/
run-stat-tuple
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
#
# Synopsis:
# Summarize stats and duration in run/{flowd,bio4d}.{pid,gyr} files.
# Description:
# Flowd|bio4d processes can be in one of five states:
#
# UP => running as expected.
# DOWN => is not running but is expected to be online (OK)
# ZOMBI => process is supposed be dead but appears alive
# YETI => process is supposed to be alive but no updates
# OFFL => process is intentionally off line (cleanly)
#
# The stats accumulated are since boot and the most recent sample.
# Typically samples are every 10 seconds. Green/Yellow/Red exist
# for both the boot and recent samples.
#
# This script folds the above values into a single, tab-separated tuple.
# suitable for easy parsing by shell scripts and what-not.
#
# [bio4d|flowd|flowd>remote] \
# [UP|DOWN|ZOMBI|YETI|OFFL]] \
# <boot-epoch> \
# <boot-green> <boot-yellow> <boot-red> \
# <recent-epoch> \
# <recent-green> <recent-yellow> <recent-red> \
#
# Here is an example tuple for the flowd process
#
# flowd\tUP\1630547570\t1080\t0\t01630884910t275\t0\t0
#
# <green> is the count of correct requests, <yellow> the count
# of worrisome requests which may impact service and <red> is the
# count of requests which do impact service.
#
# See the individual processes for the meaning of green,yellow,red.
#
# sbin/gyr-flowd-tuple
# sbin/gyr-bio4-tuple
#
# Usage:
# export BLOBIO_ROOT=$HOME/opt/blobio
# run-stat-tuple | ... sort | ... column
#
# while sleep 60; do
# run-stat | put-to-db
# done
# See:
# run-stat-tuple
# Note:
# triple check that all run-stat-*-tuple do not return empty fields.
# From time to time, i see errors that appear to be due to an empty
# string, especially after a reboot.
#
# Really, really, really need to move the files in log/*.{qdr,xdr,fdr}
# to directory data/ or spool/!
#
STALE_AFTER=61 # need to derive heartbeat values in etc/profile
die()
{
echo "$(basename $0): ERROR: $@" >&2
exit 1
}
test $# = 0 || die "wrong number cli arguments: got $#, expected 0"
test -n "$BLOBIO_ROOT" || die "env not defined: BLOBIO_ROOT"
cd $BLOBIO_ROOT || die "cd $BLOBIO_ROOT: exit status=$?"
run-stat-bio4d-tuple
run-stat-flowd-tuple flowd
# check sync flowd process
find sync/remote \
-maxdepth 2 \
-name run \
\( -type d -o -type l \) |
while read RUN_DIR; do
run-stat-flowd-tuple "flowd>$(basename $(dirname $RUN_DIR))"
done