civiccc/cassandra-nagios
The repo:
  This repo houses the work we did to monitor Cassandra using Nagios.

How the repo is structured:
  src        - a patch to apply to Nagios so that it does not truncate performance data
  examples   - example configuration
    nagios   - example Nagios configuration
    jolokia  - example Jolokia configuration
  plugins    - the check_cassandra.pl plugin and the Jolokia.pm wrapper

Why:
  Jolokia is a really awesome tool that makes it easy to get at JMX without having to use JMX or Java. If you haven't already, you should check out jmx4perl, a complete implementation of the Jolokia API, and another very good script, check_jmx4perl.

  We decided not to use jmx4perl because it had *too many* features (and dependencies). Instead we wrote a minimalist Jolokia.pm module that implements a very small subset of the Jolokia API (list, read). We also wanted to minimize the number of Nagios checks we make while still exporting all of the performance data.

What:
  The script, check_cassandra.pl, queries the Jolokia API running inside the Cassandra JVM and checks an arbitrary number of metrics. It returns all of the metrics, not just the checked ones, as performance data. The examples directory has an example Nagios configuration*.

  The script is meant to run under ePN (embedded Perl Nagios) and can check multiple metrics in each run, so it should be relatively efficient.

  The most common check is a regex check (-C regex -r '.*'), which checks all metrics matching the regular expression.

  The script can also check the length of a list. The StorageService mbean exposes lists of nodes in certain states, and this is the only place this type of check is used.

  The script can also check for the presence of a string. This is useful, e.g., to check that the cluster is NORMAL and the nodes are UP.

  * A disclaimer: we have not actually started using Cassandra in production, so the thresholds are probably wrong, and we may be missing things we should monitor or monitoring things we shouldn't.
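The regex check and the "export everything" behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the code in check_cassandra.pl: the function names, the metric names, and the single warn/crit threshold pair are all assumptions made for the example. Only the exit codes and the "text | perfdata" output shape follow the usual Nagios plugin conventions.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_regex(metrics, pattern, warn, crit):
    """Check every metric whose name matches `pattern` against the thresholds.

    `warn` and `crit` are hypothetical "value >= threshold" limits.
    Returns (exit_code, names_of_offending_metrics).
    """
    regex = re.compile(pattern)
    worst = OK
    offenders = []
    for name, value in sorted(metrics.items()):
        if not regex.search(name):
            continue  # not selected by this check, but still exported as perfdata
        if value >= crit:
            level = CRITICAL
        elif value >= warn:
            level = WARNING
        else:
            level = OK
        if level != OK:
            offenders.append(name)
        worst = max(worst, level)
    return worst, offenders


def format_output(code, offenders, metrics):
    """Build one plugin output line: status text, then '|', then perfdata."""
    text = STATUS_NAMES[code]
    if offenders:
        text += " - " + ", ".join(offenders)
    # Perfdata covers all metrics, not only the ones that were checked.
    perfdata = " ".join(
        f"'{name}'={value}" for name, value in sorted(metrics.items())
    )
    return f"{text} | {perfdata}"
```

With made-up values such as {"PendingTasks": 12, "ActiveCount": 3}, a check on the pattern "Pending" with warn=10 and crit=50 reports a warning for PendingTasks only, while both metrics still appear after the "|" as performance data. One run of the plugin can therefore feed graphs for every metric, even when only a few are thresholded.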
How:
  First, you may want to apply the patch in the src directory to Nagios. Nagios truncates plugin output at an arbitrary buffer size, so without the patch it will truncate the performance data. This step is only required if you are interested in collecting all of the performance output returned by the plugin.

Configure Cassandra:
  Cassandra needs to be configured with the jolokia-agent[1]. The agent should be on the classpath. The easiest way is to drop the Jolokia JVM agent jar into /usr/share/cassandra/lib.

  Add the following to cassandra-env.sh (adjust the jar name to match the version you installed):

    # Load the jolokia agent
    JVM_OPTS="$JVM_OPTS -javaagent:/usr/share/cassandra/lib/jolokia-jvm-1.1.1-agent.jar=\
    config=/etc/cassandra/jolokia/jolokia.properties"

  Then copy examples/jolokia/ to /etc/cassandra/jolokia/.

Configure Nagios:
  Install the required Perl modules on the Nagios machine.

  On RHEL/CentOS:
    sudo yum install perl-libwww-perl perl-JSON

  On Debian/Ubuntu:
    sudo apt-get install libwww-perl libjson-perl

  Copy the plugins directory into the Nagios plugins directory.

  Define Nagios check commands similar to those in the examples directory.

Collect performance data:
  Use graphios. Seriously, use graphios. It *just works*.

[1] http://jolokia.org/agent/jvm.html
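Once the agent is loaded, a single Jolokia "read" request is a quick way to confirm that it answers. The sketch below is in Python rather than Perl and is not part of this repo. The request and response shapes follow the Jolokia JSON protocol, but the helper names are hypothetical, and the URL, mbean, and attribute in the usage note are assumptions that depend on your jolokia.properties and Cassandra version.

```python
import json
import urllib.request


def build_read_request(mbean, attribute):
    """Return the JSON-serializable body of a Jolokia 'read' request."""
    return {"type": "read", "mbean": mbean, "attribute": attribute}


def parse_response(body):
    """Extract 'value' from a Jolokia response, raising on a non-200 status."""
    reply = json.loads(body)
    if reply.get("status") != 200:
        raise RuntimeError(
            f"Jolokia error {reply.get('status')}: {reply.get('error')}"
        )
    return reply["value"]


def read_attribute(base_url, mbean, attribute, timeout=5):
    """POST a read request to the agent and return the attribute value."""
    data = json.dumps(build_read_request(mbean, attribute)).encode("utf-8")
    request = urllib.request.Request(
        base_url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read().decode("utf-8"))
```

For example, assuming the agent listens on its default port 8778, read_attribute("http://localhost:8778/jolokia/", "org.apache.cassandra.db:type=StorageService", "OperationMode") should return the node's operation mode as a string. If that call fails, the Nagios check will fail for the same reason, so it is worth trying before wiring up the check commands.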
A script to monitor cassandra