Tuesday, December 10, 2013

Creating sensu alerts based on graphite data

We had the need to create Sensu alerts based on some of the metrics we send to Graphite. Googling around, I found this nice post by @ulfmansson talking about the way they did it at Recorded Future. Ulf recommends using Sean Porter's check-data.rb Sensu plugin for alerting based on Graphite data. It wasn't clear how to call the plugin, so we experimented a bit and came up with something along these lines (note that check-data.rb requires the sensu-plugin gem):

$ ruby check-data.rb -s graphite.example.com -t "movingAverage(stats.lb1.prod.assets-backend.session_current,10)" -w 100 -c 200

This run the check-data.rb script against the server graphite.example.com (-s option) requesting the value or the target metric movingAverage(stats.lb1.prod.assets-backend.session_current,10) (-t option) and setting a warning threshold of 100 for this value (-w option), and a critical threshold of 200 (-c option).  The target can be any function supported by Graphite. In this example, it is a 10-minute moving average for the number of sessions for the "assets" haproxy backend. By default check-data.rb looks at the last 10 minutes of Graphite data (this can be changed by specifying something like -f "-5mins").

To call the check in the context of sensu, you need to deploy it to the client which will run it, and configure the check on the Sensu server in a json file in /etc/sensu/conf.d/checks:

"command": "/etc/sensu/plugins/check-data.rb -s graphite.example.com -t \"movingAverage(stats.lb1.prod.assets-backend.session_current,10)\" -w 100 -c 200"

No comments:

Modifying EC2 security groups via AWS Lambda functions

One task that comes up again and again is adding, removing or updating source CIDR blocks in various security groups in an EC2 infrastructur...