Project Description

Wolfpack aims to be the "Swiss Army knife" of monitoring. To borrow a line from Etsy: "if it moves we monitor it, even if it doesn't move we'll monitor it just in case it makes a break for it!"

Wolfpack is an extensible framework, built on a .Net Windows service, for running jobs that monitor your software and systems. The data collected can be sent directly to Growl clients, WCF, SQL Server, SQLite and NServiceBus, and it exposes a native Geckoboard REST/JSON data feed so you can create instant dashboards for your system.

It comes preloaded with some tasks but it's simple to implement your own! A contrib project augments this with email, console and MongoDB outputs, and there is even a plugin that alerts you when a new version of a NuGet package is released! Wolfpack uses another project of mine called Sidewinder to provide a simple update mechanism based on NuGet - upgrading existing plugins or downloading and installing new ones is super easy and built straight into Wolfpack as a simple command line switch!

Wolfpack was previously known as "MonitorWang" and has been renamed to allow it to be taken slightly more seriously...monitor & wang raises a few eyebrows Stateside! The MonitorWang CodePlex project is still available as it hosts the last .Net 3.5 build (v1.0.8) but it is no longer active - all new development will be released in this project and there are no plans to keep MonitorWang up to date.

Quick Start

The Documentation page has links to installing, configuring and creating your own custom plug-ins. You will also find a Roadmap link and details of how to stay up to date with project news. For more information about the idea behind the project and a high level view of how it works read on...

Background

In my day job our team often has to monitor and report on the state of our software - error logs, queues etc. Rather than solve this directly I saw an opening for an open source system/software monitoring framework - something relatively simple to extend and implement your own custom monitoring activities. Our specific requirement is to monitor the length of MSMQ queues (usually an "error" queue!) for our NServiceBus installations so I started this open source project, previously known as "MonitorWang".

What is it?

Wolfpack is a "system monitor" - however its focus is aimed squarely at monitoring the touch points your application has with its infrastructure. From my experience over the last 9 years working with large scale eCommerce and business systems it is remarkable how many critical logs, queues and database tables accumulate or record error or abnormal conditions, only for them to go unnoticed until something blows up and becomes a real problem. Early detection using a monitoring system lets you spot these issues before they cause trouble and could save you a lot of headaches.

Wolfpack can also be used to monitor custom business activity - your KPIs, not just errors. The easy to use "Health Checks" it ships with allow you to quickly set up passive queries to detect these activities and data scenarios, while the AppStats feature allows you to actively pump KPI information directly into Wolfpack - combined with a Geckoboard account you can create a powerful business dashboard to visualise your system. Need to see how many high value orders have been placed in the last 3 hours or the number of new users? Simple, Wolfpack can do this "out-of-the-box" against your SQL Server (& other supported) databases!

Wolfpack aims to provide a simple, extensible system you can easily adapt to monitor your mission critical applications, platforms and systems. It's designed so that you can easily create new plugins to detect and monitor scenarios and situations unique to your systems and business. However, the plugins provided should cover many of the common "touch points" including...
  • IIS logs
  • Firewall logs
  • Event logs (including queries that join to those on remote machines)
  • Many other textual logging formats such as CSV, XML (including making an http call to retrieve the data, eg: RSS, Webservice)
  • FileSystem
  • Sql Server data (write queries to detect any sort of data condition eg: monitor for orders > £value or not despatched after N days)
  • MSMQ
  • Windows Services
  • Web service/site Ping
  • System (CPU, Disk) utilisation
  • Build/CI (TeamCity at the moment) and the extraction of stats from common build tools like NCover, SpecFlow, StoryQ
  • Deployment - Wolfpack can automatically deploy NuGet packages - these can contain your entire website/application or just some unit tests. Wolfpack will detect, download and unpack the package, and can even run the unit tests and send a notification containing the results!
  • SSL Certificate expiry - monitor for certificates getting close to expiry and stop embarrassing incidents like the one SagePay just experienced.

Finally, it uses best of breed components to keep you notified of what your system is doing. Growl and Geckoboard provide excellent notification and visualisation mechanisms - Wolfpack provides support for these "out of the box" so you'll never miss a problem again! For those that don't know these components, Growl is a desktop system tray toaster style notification app that can even forward notifications to your smartphone, and Geckoboard provides a browser-based dashboard experience with many custom visualisation widgets alongside native support for many enterprise systems like Basecamp, Pingdom, Google Analytics and Zendesk. Plus...it's extensible - the contrib project provides a "HipChat" notification component and you can create your own plugins very easily - there is even a SignalR activity in development to allow you to broadcast notifications to any SignalR connected client!

It's designed so that you can run Wolfpack instances ("Agents") across many servers, collecting data and publishing it to another "Server" instance where you would republish the data to a database and/or Growl - this is the Distributed System Monitoring role it was primarily aimed at.

[Image: distributed-monitorwang.png]

BTW: It's written in C# with Visual Studio 2010 and targets v4.0 of the .Net framework. It can be run on both x86 and x64 Windows operating systems.

The core is a .Net Windows Service (based on the Topshelf framework) that has a plug-in architecture for performing any custom activities ("Health Checks") you wish on a set schedule. The data from each Health Check is then "published" via another set of extensible plugins ("Publishers"). You can also create general background activity plug-ins ("Activities") - these just start and stop with the service; a good example is the activity that creates a WCF ServiceHost to run a self-hosted WCF service. Finally it supports the ability to create plugins for when a Health Check executes ("Schedulers" eg: at a fixed time/schedule, triggered by an external event etc).
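
To make that flow concrete, here is a minimal sketch of a single scheduling pass: a Scheduler decides whether a Health Check is due, the check produces a result, and the result is handed to every enabled Publisher. It is an illustration in Python, not Wolfpack's actual C# interfaces - HealthCheckResult, Publisher, ScheduledCheck and run_due_checks are all hypothetical names and the real plugin contracts may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HealthCheckResult:
    """Hypothetical result shape; Wolfpack's real result type may differ."""
    check: str
    success: bool
    count: Optional[float] = None


@dataclass
class Publisher:
    """Hypothetical publisher: a name, a sink callable and an Enabled flag."""
    name: str
    publish: Callable[[HealthCheckResult], None]
    enabled: bool = True


@dataclass
class ScheduledCheck:
    """Pairs a health check with the scheduler that decides when it runs."""
    is_due: Callable[[float], bool]           # scheduler: "should I run now?"
    execute: Callable[[], HealthCheckResult]  # health check: the actual test


def run_due_checks(checks: List[ScheduledCheck],
                   publishers: List[Publisher],
                   now: float) -> int:
    """Run every check that is due and pass each result to all enabled publishers.

    Returns the number of checks that were executed.
    """
    executed = 0
    for check in checks:
        if not check.is_due(now):
            continue
        result = check.execute()
        executed += 1
        for publisher in publishers:
            if publisher.enabled:
                publisher.publish(result)
    return executed


# Example wiring: one check that is always due, two publishers, one disabled.
received: List[HealthCheckResult] = []
demo_publishers = [
    Publisher("memory", received.append),
    Publisher("switched-off", received.append, enabled=False),
]
demo_checks = [
    ScheduledCheck(is_due=lambda now: True,
                   execute=lambda: HealthCheckResult("QueueLength", True, 3)),
]
run_due_checks(demo_checks, demo_publishers, now=0.0)
```

In this sketch an Activity would simply be something started alongside this loop and stopped with it, which is why it does not appear in the pass itself.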

The full list of extensibility points is:
  • Startup Plugin - executes at the very beginning of an Agent starting up - use this for general initialisation or Agent configuration
  • Scheduler Plugin - defines when the associated Health Check plugin will execute
  • HealthCheck Plugin - the actual test/code to run on the schedule defined by its host Scheduler plugin
  • Publisher Plugin - these provide the means to communicate the result of a Health Check
  • Activity Plugin - these provide general service wide features and are usually messaging oriented - eg: create the NServiceBus handlers to receive NSB messages, create a WCF ServiceHost to allow results to be sent to Wolfpack, or serve data as in the case of the RESTful Geckoboard Data Service Activity.
  • RoleProfile Plugin - a "role profile" component forms the heart of Wolfpack. A default "Agent" profile is provided and this will load and execute all the components required by the Wolfpack service. You can provide a custom implementation for this core component by simply implementing an interface and passing the name (class name) on the command line switch at startup.
  • Publisher Filters - these provide a way to attach custom rules to whether a result is published. Filters can be global for either Publisher or Health Check or made specific to a single Publisher/HealthCheck combination.
  • Growl Notification Finalisers - these provide a way to attach custom logic to format the Growl notification (priority and message text). There are two built-in Finalisers that let you quickly set the priority, via simple XML configuration, based on either the success/failure of a HealthCheck or the "Count" that a HealthCheck returns. Being able to set the priority is useful as you can configure Growl to forward notifications based on the priority (eg: only forward "Emergency" priority notifications to your iPhone). For instance you could create a WMI Process Check to detect a mission critical process that should always be running - if it failed ("count" = 0) you could set the notification to "Emergency". Finalisers can be chained together (although the order they run in is undefined). The built-in Finalisers also serve as a great example of how to create your own - a base class to inherit from takes care of the heavy lifting so you can concentrate on just writing your logic.
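
As a rough illustration of the Finaliser idea from the last bullet, the sketch below sets a notification priority from either the success/failure of a result or its "Count". It is written in Python purely for illustration and does not reflect the real base class or XML configuration: Notification, Result, failure_priority, count_priority and finalise_all are hypothetical names, and the "High" and "Normal" priority labels are assumptions (only "Emergency" is mentioned above).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Result:
    """Hypothetical health check result."""
    check: str
    success: bool
    count: Optional[float] = None


@dataclass
class Notification:
    """Hypothetical notification: the two fields a finaliser may adjust."""
    message: str
    priority: str = "Normal"  # assumed default label


Finaliser = Callable[[Result, Notification], None]


def failure_priority(priority_on_failure: str) -> Finaliser:
    """Build a finaliser that changes the priority when the check failed."""
    def finalise(result: Result, notification: Notification) -> None:
        if not result.success:
            notification.priority = priority_on_failure
    return finalise


def count_priority(trigger_count: float, priority: str) -> Finaliser:
    """Build a finaliser that changes the priority when Count equals trigger_count."""
    def finalise(result: Result, notification: Notification) -> None:
        if result.count == trigger_count:
            notification.priority = priority
    return finalise


def finalise_all(result: Result, finalisers: Iterable[Finaliser]) -> Notification:
    """Apply each finaliser in turn to a fresh notification for this result."""
    notification = Notification(message=f"{result.check}: count={result.count}")
    for finaliser in finalisers:
        finaliser(result, notification)
    return notification


# The WMI process example: no matching process (count = 0) becomes an emergency.
process_missing = Result("CriticalProcess", success=True, count=0)
alert = finalise_all(process_missing, [count_priority(0, "Emergency")])
```

Because the execution order of chained Finalisers is undefined, finalisers written this way should not rely on what another finaliser has already set.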

HealthChecks

Plugins provided (a full list with instructions for each is here); see this walk-thru guide for creating a new Health Check. Additional HealthChecks are also available in the Wolfpack.Contrib Project.
  • Reporting MSMQ queue state (queue length, oldest message datetime)
  • Reporting MSMQ queue is not empty
  • Detect if a process is running (local/remote - WMI)
  • Windows Service State (local/remote) - checks the services you specify are in the correct state (eg: running)
  • Windows Service Startup (local/remote) - check the services you specify have the correct startup type (eg: auto/manual/disabled)
  • Url Ping - allows you to specify multiple urls to ping and optionally set a response time threshold. If the ping fails or breaches the threshold a failed result will be published. You can also keep track of response times - looks great graphing them with a Geckoboard widget!
  • Host (ICMP) Ping - allows you to specify multiple servers/machines to ping and optionally set a response time threshold. If the ping fails or breaches the threshold a failed result will be published. You can also keep track of response times - looks great graphing them with a Geckoboard widget!
  • CPU % utilisation - reports the current value (local/remote) and can optionally raise alerts when a configurable threshold is breached
  • Disk space % used - reports the used space of a drive (local/remote) and can optionally raise alerts when a configurable threshold is breached (you are running out of space)
  • File information - reports data about a specific file (exists, size etc)
  • Folder information - reports data about a specific folder (exists, number of files/sub folders)
  • LogParser based HealthCheck components (query the EventLog, IIS, XML, FileSystem, RSS/Atom feeds! etc)
  • SqlServer Scalar Query - write ad-hoc queries (a SELECT COUNT(*) FROM... statement) to detect custom data conditions in your SqlServer databases
  • Owl Energy Monitor HealthCheck - this allows you to monitor your home/office energy consumption and view the data in your Geckoboard.
  • TeamCity Build Monitor - part of the new v2.1 Build Analytics feature...this allows you to monitor a TeamCity build configuration and get notifications about the state and duration - hook this up to Growl for desktop (& smartphone) alerts and Geckoboard via the Geckoboard Data Service activity for an instant dashboard.
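
To show the shape of a scalar query check like the SqlServer one in the list above, here is a small self-contained sketch. It uses Python and an in-memory SQLite database as a stand-in for SQL Server so that it can run anywhere. The orders table, the scalar_query_check function and the fail_above threshold are all invented for illustration; the real HealthCheck is configured differently.

```python
import sqlite3
from typing import NamedTuple, Sequence


class ScalarResult(NamedTuple):
    """Hypothetical result: the check name, the count returned and pass/fail."""
    check: str
    count: int
    success: bool


def scalar_query_check(conn: sqlite3.Connection, name: str, sql: str,
                       params: Sequence = (), fail_above: int = 0) -> ScalarResult:
    """Run a SELECT COUNT(*) style query and fail if the count exceeds fail_above."""
    row = conn.execute(sql, params).fetchone()
    count = int(row[0])
    return ScalarResult(name, count, count <= fail_above)


def demo_connection() -> sqlite3.Connection:
    """Create a throwaway database with a made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, despatched INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (total, despatched) VALUES (?, ?)",
        [(25.0, 1), (1200.0, 0), (3100.0, 1)],
    )
    return conn


# Detect a data condition: how many orders are worth more than 1000?
conn = demo_connection()
high_value = scalar_query_check(
    conn, "HighValueOrders",
    "SELECT COUNT(*) FROM orders WHERE total > ?", (1000,),
)
```

Whether a non-zero count should be treated as a failure or simply reported as a KPI depends on the check, which is why the threshold is a parameter in this sketch.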

Publishers

  • MSSQL Database - save the monitoring data to MS SqlServer database.
  • SQLite Database - save your Wolfpack data in SQLite format.
  • WCF - transmit the monitoring data to a WCF service (also provided).
  • NServiceBus (NSB) - send the monitoring data as a NSB message (to a NSB Gateway/handler provided).
  • Growl - the awesome system tray notification app. Monitoring data can be sent to a Growl instance, from here you can forward it on to others (say your ops team) and even your iPhone via Prowl!
  • HipChat - send messages to one of your HipChat rooms! (This is a contrib plugin).

The WCF & NServiceBus (NSB) publishers are particularly powerful in that they can be used to transmit the data to another server. This allows Wolfpack to be installed on the server to be monitored ("Agent") and send the data to another server running Wolfpack enabled to capture this data ("Server") where it is republished to whatever publishers have been "Enabled"; it is possible to have multiple Wolfpack "Agents" publishing data via WCF/NSB to the same "Server". An "Agent" publishing to the Growl publisher is also capable of communicating with a Growl client on a remote server.

The SQLite & SqlServer publishers save the data into a simple table called "AgentData" - this allows you to capture the current "state" of the running HealthChecks. It would be possible to write a simple "viewer" app to query the db and display the data on a big flat panel screen sitting in the office so that everyone could see the "system health"; however, I recommend that you check out Geckoboard as Wolfpack has native support for this via the Geckoboard Data Service Activity. Wolfpack will publish to ALL "Enabled" publishers so you could set up the "Server" to republish to both a database and Growl.
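
For anyone tempted by the "viewer" app idea, the core of it is a query for the latest row per Agent and HealthCheck. The sketch below shows one way to do that in Python with SQLite. Only the table name "AgentData" comes from Wolfpack; every column name here (agent, check_name, success, result_count, recorded_at) is an assumption made up for the example, so check the real schema before borrowing the query.

```python
import sqlite3
from typing import List, Optional, Tuple

# Only the table name is real; the columns are assumed for illustration.
SCHEMA = """
CREATE TABLE AgentData (
    agent        TEXT NOT NULL,
    check_name   TEXT NOT NULL,
    success      INTEGER NOT NULL,
    result_count REAL,
    recorded_at  TEXT NOT NULL
)
"""

# Latest row for each agent/check pair. ISO 8601 timestamps stored as text
# sort chronologically, so MAX(recorded_at) picks the most recent one.
LATEST_PER_CHECK = """
SELECT d.agent, d.check_name, d.success, d.result_count
FROM AgentData AS d
JOIN (
    SELECT agent, check_name, MAX(recorded_at) AS latest
    FROM AgentData
    GROUP BY agent, check_name
) AS m
  ON m.agent = d.agent
 AND m.check_name = d.check_name
 AND m.latest = d.recorded_at
ORDER BY d.agent, d.check_name
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database containing the assumed AgentData table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, agent: str, check_name: str, success: bool,
           result_count: Optional[float], recorded_at: str) -> None:
    """Append one health check result, as a database publisher might."""
    conn.execute(
        "INSERT INTO AgentData VALUES (?, ?, ?, ?, ?)",
        (agent, check_name, int(success), result_count, recorded_at),
    )


def current_state(conn: sqlite3.Connection) -> List[Tuple]:
    """Return the most recent result for every agent/check combination."""
    return conn.execute(LATEST_PER_CHECK).fetchall()
```

A viewer would call current_state on a timer and render the rows; the same query shape should carry over to SQL Server with the real column names substituted.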

Remember, Growl is a system tray application - it only runs for the logged-in/interactive user!

Additional Publishers are also available in the Wolfpack.Contrib Project.

Activities

  • Geckoboard Data Service - A WCF REST Starter kit based JSON data feed that connects your Wolfpack data stored in a database to Geckoboard custom widgets.
  • WCF BasicHttp ServiceHost - provides self-hosting of the Wolfpack WCF service so that the publisher has somewhere to "publish" to!

Schedulers

  • 24/7 Scheduler - provides total control over when your health checks execute across a week. You can set as many times per day as you like and choose which days they apply to, including shorthand configuration for weekdays, weekends and every day of the week; these can all be combined to provide a completely custom schedule. Eg: you could configure a check to run at set times on weekdays, set additional times for Monday, Wednesday & Saturday and have no timings on Sunday.
  • Interval Scheduler - provides a simple interval based scheduler; executes the associated HealthCheck every N seconds.

Creating your own activities to run any .Net code you like only requires implementing a simple interface. Custom schedulers are also possible by implementing another interface.
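
The way shorthand and per-day timings combine in a weekly schedule can be sketched as follows. This is a Python illustration of the idea only: the configuration keys ("weekdays", "weekends", "everyday", day names), the "HH:MM" time format and the function names are assumptions, not the scheduler's real configuration syntax.

```python
from datetime import datetime
from typing import Dict, List, Set

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]

# Assumed shorthand keys, each expanding to a group of days.
SHORTHAND = {
    "weekdays": DAY_NAMES[:5],
    "weekends": DAY_NAMES[5:],
    "everyday": DAY_NAMES,
}


def expand_schedule(config: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Merge shorthand and per-day entries into one set of HH:MM times per day."""
    schedule: Dict[str, Set[str]] = {day: set() for day in DAY_NAMES}
    for key, times in config.items():
        days = SHORTHAND.get(key.lower(), [key.lower()])
        for day in days:
            if day not in schedule:
                raise ValueError(f"unknown day or shorthand: {key!r}")
            schedule[day].update(times)
    return schedule


def is_due(schedule: Dict[str, Set[str]], when: datetime) -> bool:
    """True if the check has a timing for this weekday at this hour and minute."""
    day = DAY_NAMES[when.weekday()]  # weekday() returns 0 for Monday
    return when.strftime("%H:%M") in schedule[day]


# Set times on weekdays, extra times on Monday, Wednesday and Saturday,
# and nothing at all on Sunday.
weekly = expand_schedule({
    "weekdays": ["09:00", "17:00"],
    "monday": ["12:00"],
    "wednesday": ["12:00"],
    "saturday": ["10:00"],
})
```

Here is_due(weekly, datetime(2012, 7, 9, 12, 0)) is true because 9 July 2012 was a Monday, while the same time on the Tuesday is not scheduled.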

Contrib

That's not the end of the story as far as plug-ins go... a separate Wolfpack Contrib project established by @RobGibbens provides additional publishers, healthchecks etc including an email publisher - check it out for more plug-in goodness!

Deployment

I've also worked on making the deployment options flexible. Each plugin has an "Enabled" configuration setting to allow you to quickly reconfigure what is running. Using this method we can quickly deploy the service using a simple xcopy deployment.

Rules & Thresholds

At the moment there are no "rules" or "thresholds" as such - only those baked directly into each HealthCheck (eg: only report something if its logic says so). I'm keen to keep HealthChecks dumb, have them provide a stream of monitoring data and run a separate Activity for the "rules" - this Activity would interpret the monitoring data and allow you to set thresholds for alerts. For instance you would have an "Agent" report an MSMQ queue length back to a "Server" which would save it to SQL; the "Rules" Activity would monitor the database and apply any alerting rules (eg: if Agent is X and Queue is Y and Queue Length > 5 then republish data to Growl as an emergency notification).
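
Since no such "Rules" Activity exists yet, the following is only a sketch of what evaluating a rule like "Agent is X and Queue is Y and Queue Length > 5" might look like. It is written in Python for illustration; Reading, Rule and apply_rules are hypothetical names and nothing here describes a planned design.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical monitoring data point as saved by the "Server"."""
    agent: str
    queue: str
    length: int


@dataclass(frozen=True)
class Rule:
    """Alert when a given agent's queue grows beyond a threshold."""
    agent: str
    queue: str
    max_length: int
    priority: str = "Emergency"

    def matches(self, reading: Reading) -> bool:
        return (reading.agent == self.agent
                and reading.queue == self.queue
                and reading.length > self.max_length)


def apply_rules(readings: Iterable[Reading],
                rules: Iterable[Rule]) -> List[Tuple[Reading, str]]:
    """Return (reading, priority) pairs for every reading that breaks a rule."""
    rules = list(rules)
    alerts = []
    for reading in readings:
        for rule in rules:
            if rule.matches(reading):
                alerts.append((reading, rule.priority))
    return alerts


# One rule, two readings: only the second breaches the threshold of 5.
rules = [Rule(agent="web01", queue="error", max_length=5)]
alerts = apply_rules(
    [Reading("web01", "error", 5), Reading("web01", "error", 6)],
    rules,
)
```

Each alert pair could then be handed to the Growl publisher with the given priority, which keeps the HealthChecks themselves free of threshold logic.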

However there is a feature called Result Publisher Filters - these allow you to intercept the call to each enabled publisher and decide whether to actually publish the result or abort it.
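
The scoping of these filters (global to a Publisher, global to a HealthCheck, or specific to one Publisher/HealthCheck combination) can be pictured with the sketch below. It is a Python illustration under assumed names (Result, FilterRegistry, should_publish) and does not show how filters are actually registered or configured in Wolfpack.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Result(NamedTuple):
    """Hypothetical health check result."""
    check: str
    success: bool


# A filter returns True to let the result through and False to abort publishing.
Filter = Callable[[Result], bool]


class FilterRegistry:
    """Holds filters scoped by publisher name and/or health check name."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Optional[str], Optional[str], Filter]] = []

    def add(self, rule: Filter, publisher: Optional[str] = None,
            check: Optional[str] = None) -> None:
        """Register a filter; None for publisher or check means "applies to all"."""
        self._entries.append((publisher, check, rule))

    def should_publish(self, publisher: str, result: Result) -> bool:
        """Publish only if every filter in scope for this pair agrees."""
        for wanted_publisher, wanted_check, rule in self._entries:
            if wanted_publisher is not None and wanted_publisher != publisher:
                continue
            if wanted_check is not None and wanted_check != result.check:
                continue
            if not rule(result):
                return False
        return True


# Only failures go to Growl; every other publisher still receives everything.
filters = FilterRegistry()
filters.add(lambda result: not result.success, publisher="Growl")
```

With this wiring a successful "UrlPing" result would still be saved by a database publisher but would not produce a Growl notification.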

Getting Started with Wolfpack

The documentation should provide you with everything you need.

This guide should help you install and configure Wolfpack...

This guide should help you write your own custom HealthCheck component and get it reporting data...

Last edited Jul 12, 2012 at 10:26 AM by jimbobdog, version 43