Health Checks

 
This page describes the Health Checks, including details of each one supplied with Wolfpack. In general, a HealthCheck returns a Result of "true" or "false" to indicate whether the test succeeded or failed. It also provides a numeric (double) value in the ResultCount property, whose meaning depends on the specific test. For example, the MSMQ Queue Info HealthCheck sets Result according to whether the queue exists and, if it does, sets ResultCount to the number of items in the queue. Please consult the documentation for each HealthCheck below for details of the value placed in ResultCount; this can help if you wish to display it in a Geckoboard dashboard widget.

A HealthCheck contains these data properties, and each check sets them as required:
  • Result (bool)
  • ResultCount (double)
  • Tags (string) - comma delimited set of keywords to help identify this result
  • Properties (name/value string pair collection) - allows you to bundle ANY additional result information with this result.
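
To make the shape of a result concrete, here is a minimal Python sketch of the four properties listed above. Wolfpack itself is a .NET framework, so this is only an illustrative model and not its actual API: the class name `HealthCheckResult`, the helper `queue_info_result`, the tag values and the use of "no value" for ResultCount when the queue is missing are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HealthCheckResult:
    """Illustrative model of the result properties (not Wolfpack's real type)."""
    result: Optional[bool]                  # True = success, False = failure
    result_count: Optional[float] = None    # check-specific numeric value
    tags: str = ""                          # comma-delimited keywords
    properties: Dict[str, str] = field(default_factory=dict)  # extra name/value data


def queue_info_result(queue_name: str, queue_exists: bool, item_count: int) -> HealthCheckResult:
    """Hypothetical helper mirroring the MSMQ Queue Info example."""
    return HealthCheckResult(
        result=queue_exists,
        # Assumption: no count is reported when the queue does not exist.
        result_count=float(item_count) if queue_exists else None,
        tags="msmq,queue",
        properties={"Queue": queue_name},
    )
```

A dashboard widget would typically read `result_count`, while `tags` and `properties` carry whatever extra context the check wants to attach.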

Exception Handling

A Health Check is effectively "sandboxed" by its scheduler. Put simply, the framework wraps a Health Check's "Execute()" method in a try/catch block. If the method throws an exception, the exception is caught and its details are published; the result will be "null" because we don't know whether the check actually passed or failed. Detailed information about the exception is available in the Result.Check.CriticalFailureDetails property, and the Result.Check.CriticalFailure property is set to true to clearly indicate that a critical failure has occurred. The result message also contains a GUID. Wolfpack uses this GUID when it logs the error in the Wolfpack.log file, which lets you correlate the failing Result message with the exception details and stack trace in the error log.

The Health Check will continue to execute in case this is just a transient problem.
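
The sandboxing behaviour described in this section can be modelled roughly as follows. This is a Python sketch of the idea only, not Wolfpack's .NET implementation: `run_sandboxed`, `CheckOutcome`, the `incident_id` field and the logger name are hypothetical stand-ins for the framework's wrapper, result message, GUID and log file.

```python
import logging
import traceback
import uuid
from dataclasses import dataclass
from typing import Callable, Optional

log = logging.getLogger("wolfpack_sketch")  # stand-in for Wolfpack.log


@dataclass
class CheckOutcome:
    result: Optional[bool]             # None = unknown, the check threw
    critical_failure: bool = False
    critical_failure_details: str = ""
    incident_id: str = ""              # correlates the message with the log entry


def run_sandboxed(execute: Callable[[], bool]) -> CheckOutcome:
    """Run one check and never let its exception escape to the scheduler."""
    try:
        return CheckOutcome(result=execute())
    except Exception as exc:
        incident_id = str(uuid.uuid4())
        # Log the full stack trace under the same id that gets published.
        log.error("Incident %s\n%s", incident_id, traceback.format_exc())
        return CheckOutcome(
            result=None,
            critical_failure=True,
            critical_failure_details=f"{type(exc).__name__}: {exc}",
            incident_id=incident_id,
        )
```

Because the wrapper always returns normally, a scheduler built this way can simply call it again at the next interval, which is how a transient problem gets the chance to clear.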

So no explicit code is required to deal with exceptions; the Wolfpack infrastructure takes care of this for you. However, if you need to trap an exception yourself but still want Wolfpack to report and handle it, provide your own try/catch in your custom HealthCheck and rethrow the exception once you have done your processing.
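
The catch, process and rethrow pattern looks roughly like this. It is sketched in Python for brevity; the function `execute_with_cleanup` and its `notes` parameter are hypothetical and only stand in for whatever local processing your own check needs.

```python
from typing import Callable, List


def execute_with_cleanup(do_check: Callable[[], bool], notes: List[str]) -> bool:
    """Hypothetical custom check body: handle locally, then re-raise."""
    try:
        return do_check()
    except Exception as exc:
        # Do whatever local processing is needed (cleanup, extra context)...
        notes.append(f"check failed locally: {exc}")
        # ...then re-raise unchanged so the framework still sees the failure.
        raise
```

In a C# HealthCheck the equivalent is a bare `throw;` inside the catch block, which preserves the original stack trace for the error log.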

Current Catalogue

This is the current set of health checks provided by Wolfpack. It provides comprehensive coverage for monitoring your system and applications; however, if you have a monitoring requirement that is not covered, or would like to add a new feature to an existing check, please start a new discussion. Finally, it's easy to write your own check component, and this guide provides a walk-through for creating a simple "hello world" check that you can use as a base to extend.
  • CPU % - checks the overall processor load and can optionally provide alerts based on utilisation level %
  • Disk Space Used - checks the amount of disk space used on the specified drive and can optionally provide alerts based on space used %
  • MSMQ Queue Info - checks a specific queue exists and returns additional information about it.
  • MSMQ Queue Not Empty - checks a specific queue exists and whether it is empty (success) or contains items (failure).
  • WMI Process Not Running - counts the number of instances of the specified process and raises a failure if there are zero running.
  • SqlServer Query - allows you to build health check results from custom SqlServer queries.
  • LogParser Query - allows you to build health check results based on LogParser results.
  • Windows Service State - checks the service is in the state expected.
  • Windows Service Startup - checks the service has the expected Startup Type set (eg: Auto, Manual, Disabled).
  • Host Ping - (new in v2.3) sends an ICMP ping to a server, with an optional round-trip response time alert threshold and an option to alert only on a slow or missing response.
  • Url Ping - pings a set of URLs and checks that each responds (HTTP 200 OK), optionally within a given response time.
  • File Info - checks a specific file exists and returns additional information about it.
  • Folder Info - checks a specific folder exists and returns additional information about it.
  • Owl Energy Monitor - if you want to monitor the energy usage of your home, office or property, Wolfpack can access the OWL data and display it on Geckoboard for you.
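
As an illustration of how a threshold-based check such as Disk Space Used might map onto Result and ResultCount, here is a small Python sketch. It is not the Wolfpack implementation: the function name, the `alert_above_percent` parameter and the rule that a check without a threshold always succeeds are assumptions made for the example.

```python
import shutil
from typing import Optional, Tuple


def disk_space_used(path: str, alert_above_percent: Optional[float] = None) -> Tuple[bool, float]:
    """Return (result, result_count); result_count is the used space in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    # Assumption: with no threshold the check succeeds and only reports the value.
    if alert_above_percent is None:
        return True, used_percent
    # Fail (alert) once usage climbs above the configured percentage.
    return used_percent <= alert_above_percent, used_percent
```

Calling `disk_space_used("C:\\", 90.0)` would, under these assumptions, fail once the drive is more than 90% full while still reporting the exact percentage for a dashboard.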

Contrib

That's not the end of the story as far as plug-ins go... a separate Wolfpack Contrib project established by @RobGibbens provides additional healthchecks including an email publisher - check it out for more plug-in goodness!
  • Fakes - this provides two checks, one that provides a failure result and one a successful result. These are useful for testing your publishers out.
  • NuGet Release - this checks for new releases of a NuGet package. Supports private feeds and multiple packages.
  • MongoDb - this provides checks for monitoring MongoDb databases.
  • SSL Certificate Expiry - this provides a check for monitoring SSL certificate expiry.

Last edited Oct 2, 2012 at 11:17 PM by jimbobdog, version 17