Vigil: homebrewed network monitoring solution
With the server migration over, I turned my attention yesterday to finding a network monitoring solution. Back when I had a dedicated server from a managed hosting provider, I left the monitoring entirely in their hands. Now that I’m in the cloud, it’s up to me.
So I started by installing RRDtool to generate pretty graphs, and today I’ve been working on a simple Sinatra front-end to display them in the browser. Every project needs a name, even mini-projects like this one, so I’ve christened this one Vigil.
Here’s what the DSL for defining hosts looks like right now:
# 60 seconds is actually the default but you can override it like this
vigil.defaults[:interval] = 60.seconds

host 'wincent.dev' do |host|
  # the :every option can be used to override the checking interval
  # for a specific monitoring task
  host.ping :every => 60.seconds do
    alert.when :packet_loss => equals(100.percent) do
      # optional custom code to be run here
    end
    alert.when :packet_loss => greater_than(50.percent),
               :for => more_than(5.minutes),
               :severity => :medium
    alert.when :avg_rtt => greater_than(1000),
               :for => more_than(2.minutes)
    alert.when :avg_rtt => greater_than(500),
               :severity => :medium
    ok.when :packet_loss => less_than_or_equal(10.percent),
            :for => more_than(5.minutes)
  end

  host.https '/heartbeat/ping' do
    alert.when :status => not_equal_to(200)
    alert.unless :response_body => matches(/I am alive!/)
    # if no "ok" block defined, an implicit "ok" is derived when none of
    # the other conditions matches
  end

  host.http '/a/products/' do
    alert.when :status => not_equal_to(200)
    alert.unless :response_body => matches(/Install is easily configured/)
  end
end
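For the curious, here is a rough sketch of how helpers like 60.seconds, 50.percent and greater_than(...) could be backed by plain Ruby. None of this is Vigil's actual code: the Matchers module and the condition_met? method are hypothetical names, units are assumed to be plain numbers (seconds, and percentages from 0 to 100), and the :for and :severity options are left out entirely.

```ruby
# Hypothetical helpers, not taken from Vigil itself.
class Numeric
  def seconds; self; end        # assumption: durations are plain seconds
  def minutes; self * 60; end
  def percent; self; end        # assumption: percentages are plain numbers (0..100)
end

# Each matcher returns a lambda that tests one observed value.
module Matchers
  def equals(expected);          ->(actual) { actual == expected }; end
  def not_equal_to(expected);    ->(actual) { actual != expected }; end
  def greater_than(limit);       ->(actual) { actual > limit }; end
  def less_than_or_equal(limit); ->(actual) { actual <= limit }; end
  def matches(pattern);          ->(actual) { pattern.match?(actual.to_s) }; end
end

include Matchers

# A condition is a hash of metric name => matcher. It holds when every
# matcher accepts the corresponding value in the sample.
def condition_met?(conditions, sample)
  conditions.all? { |metric, matcher| matcher.call(sample[metric]) }
end

# Usage: check one ping sample against one alert condition.
sample = { :packet_loss => 75, :avg_rtt => 120 }
lossy  = condition_met?({ :packet_loss => greater_than(50.percent) }, sample)
```

Returning lambdas keeps the hash syntax from the DSL intact while deferring the comparison until a sample actually arrives.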
Next step: a simple queuing system to run those monitoring tasks in a background thread.
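One possible shape for that, sketched here as a polling scheduler on a single background thread rather than a true queue. The Scheduler class, its every/start/stop methods and the 10 ms poll interval are all assumptions for illustration, not something Vigil has yet.

```ruby
# Hypothetical sketch: runs registered jobs at fixed intervals on one thread.
class Scheduler
  Task = Struct.new(:name, :interval, :next_run, :job)

  def initialize
    @tasks = []
    @lock = Mutex.new
    @running = false
  end

  # Register a job to run every `interval` seconds, starting immediately.
  def every(interval, name, &job)
    @lock.synchronize { @tasks << Task.new(name, interval, now, job) }
  end

  def start
    @running = true
    @thread = Thread.new { run_due_tasks while @running }
  end

  def stop
    @running = false
    @thread.join if @thread
  end

  private

  # Monotonic clock, so wall-clock adjustments don't disturb the schedule.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  def run_due_tasks
    due = @lock.synchronize { @tasks.select { |task| task.next_run <= now } }
    due.each do |task|
      begin
        task.job.call
      rescue StandardError => e
        # One failing check must not take down the whole monitoring thread.
        warn "#{task.name} failed: #{e.message}"
      end
      task.next_run = now + task.interval
    end
    sleep 0.01 # assumption: a 10 ms poll is fine-grained enough
  end
end

# Usage: register a check, then call start; stop joins the thread.
scheduler = Scheduler.new
scheduler.every(60, "ping wincent.dev") { puts "ping!" }
```

Because jobs run one after another on the same thread, a slow check delays the others; a worker pool would be the obvious next refinement if that becomes a problem.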