How do I get syscheck?

For downloads, use the following links:

What's syscheck?

I often find myself hopping from server to server to see whether jobs have crashed, to check that daemons are running, and whatnot. Lazy as I am, I'm always trying to make such systems "self-healing", in the sense that possible errors should be auto-detected and fixed. Yes, I admit - being called at night for support is something I avoid like the plague.

There are many tools to accomplish this task, from tiny to full-blown - e.g., cfengine is a truly great one. But in many situations I'll log onto a server where such tooling isn't available, and I find myself needing something small, portable, quick to set up, and just fit for the job.

In such cases I tend to use syscheck.

Well, it works for me. Feel free to grab it, try it out, modify it, and extend it. Syscheck is distributed under the GPLv3, which basically means that you're free to use it without cost or warranty, that you're free to redistribute it as long as you don't change the licensing, and that you're free to modify it as you see fit, but if you distribute a modified version you're obliged to make your changes available under the same license. If you modify syscheck, I'd appreciate hearing about it - just drop me a mail.

How to use syscheck?

  1. Install syscheck in any directory you like on your system. E.g., /usr/local/bin makes sense.
  2. Copy syscheck.conf.sample to the same directory, and rename it to syscheck.conf.
  3. For a quick overview, to see 'usage' information, just type
  4. Modify your syscheck.conf to suit your needs; the stock file syscheck.conf.sample is pretty self-explanatory (the configuration is described in more detail below).
  5. Test it out. Run
    syscheck test
    If you need more verbosity, run
    syscheck -v test
  6. For a real run which would verify that all your daemons are running (and which would restart them if necessary), run
    syscheck go
  7. Once you're satisfied, enter the following line in your crontab (assuming you want syscheck to run every 5 minutes):
    */5 * * * * /usr/local/bin/syscheck go
  8. If you aren't allowed to use cron, then type
    syscheck daemonize 300
    (where 300 is the sleep period in seconds between checks). When syscheck runs in daemon mode, sending it signal 1 (HUP) instructs it to reload the configuration.
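
Steps 1, 2 and 7 above can also be scripted. The following is only a sketch under a few assumptions: the helper names install_syscheck and with_cron_line are made up for this illustration, and the source directory is assumed to contain the files syscheck and syscheck.conf.sample. It never overwrites an existing syscheck.conf, and it appends the cron line only when that exact line is not already present.

```shell
#!/bin/sh
# Hypothetical helper (not part of syscheck): copy the script into DEST and
# seed syscheck.conf from the sample, keeping any existing configuration.
install_syscheck() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    cp "$src/syscheck" "$dest/syscheck" || return 1
    chmod 755 "$dest/syscheck" || return 1
    if [ ! -e "$dest/syscheck.conf" ]; then
        cp "$src/syscheck.conf.sample" "$dest/syscheck.conf" || return 1
    fi
}

# Hypothetical helper: read a crontab on stdin and print it on stdout,
# with LINE appended unless an identical line is already there.
with_cron_line() {
    line=$1
    crontab_text=$(cat)
    if [ -n "$crontab_text" ]; then
        printf '%s\n' "$crontab_text"
    fi
    # -F: fixed string, -x: whole line must match, -q: quiet
    if ! printf '%s\n' "$crontab_text" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Possible usage (left commented out, since it changes the system):
#   install_syscheck . /usr/local/bin
#   crontab -l 2>/dev/null \
#     | with_cron_line '*/5 * * * * /usr/local/bin/syscheck go' \
#     | crontab -
```

Because both helpers leave existing files and existing cron lines alone, running them a second time should be harmless.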

What's in the configuration?

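Syscheck mainly does three things for you: it populates named lists with the output of commands, it expects regular expressions to match in those lists, and it corrects the situation by running a command when a match fails. Additionally there are two more commands: eval, which injects Perl code, and system, which runs a command unconditionally. The formal syntax of all commands is: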
Syntax: populate LISTNAME COMMAND. The output of the command is stored in the named list.
Syntax: expect LISTNAME REGEX. The regular expression is searched for in the named list. Expect statements can be repeated to match multiple regular expressions. The outcome is "true" when all matches are made (so it's a logical 'and' match).
Syntax: correct COMMAND. The command is executed when one or more of the previous expects failed to match.
Syntax: eval PERLCODE. This is used in special cases to 'inject' Perl code.
Syntax: system COMMAND. The command is run unconditionally.
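
To make the populate/expect/correct logic concrete, here is a rough shell sketch of the same idea done by hand. It is not how syscheck is implemented: the function names expect_all and check_or_correct are made up for this illustration, the process list is a fixed sample so the outcome is predictable, and grep -E uses POSIX extended regular expressions rather than Perl's dialect.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when EVERY regex matches somewhere in
# the list text (the logical 'and' described for repeated expects).
expect_all() {
    list=$1
    shift
    for regex in "$@"; do
        printf '%s\n' "$list" | grep -Eq -- "$regex" || return 1
    done
    return 0
}

# Hypothetical helper: run the corrective command only when at least one
# of the expected regexes failed to match.
check_or_correct() {
    list=$1
    correct=$2
    shift 2
    if expect_all "$list" "$@"; then
        echo "ok"
    else
        echo "correcting"
        sh -c "$correct"
    fi
}

# 'populate': capture the output once and reuse it for several checks.
# A fixed sample is used here; in real use this would be: pslist=$(ps ax)
pslist='  101 ?  Ss  0:00 /usr/sbin/sshd
  202 ?  S   0:01 /usr/sbin/crond'

# httpd is missing from the sample, so the corrective command runs.
# It only echoes here; a real check would run apachectl start instead.
check_or_correct "$pslist" 'echo "would run: apachectl start"' 'httpd'
```

Capturing the list once and passing it around mirrors the way a populated list can be reused by several expect statements.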


The usage of the commands is best illustrated by examples. Below is a configuration that checks whether httpd is present in the process list. If not, apachectl start is executed. This of course checks that Apache is up and running.
  # Get the process list (output of ps ax)
  populate pslist ps ax

  # Check that httpd is in that list, if not run apachectl start
  expect pslist httpd
  correct apachectl start
Once a list is available, it can be re-used: subsequent expect/correct combos can refer to the same list pslist. Also, lists can be constructed from any output; be creative. The following example probes whether Apache is running by (a) examining the process list and searching for httpd, and (b) fetching the output of http://localhost/ and searching for the string <html>.
  # Get the process list
  populate pslist ps ax
  # Get http://localhost/
  populate httpoutput curl http://localhost/

  # Match httpd in the processes and <html> in the http output
  expect pslist httpd
  expect httpoutput <html>

  # If one or both are not found, correct the situation. Kill off
  # any misbehaving Apache processes first.
  correct killall -9 httpd; sleep 1; apachectl start
Below is an example of eval. Imagine a hypothetical program mydaemon which won't run unless the environment variable MYHOME is set. Since restarts of mydaemon would not work without that variable, you'd have basically two options: (a) set MYHOME before calling syscheck, or (b) use eval in the configuration. These options are equivalent. Below is an example of using eval in the configuration file:
  # Get the process list
  populate pslist ps ax

  # Set MYHOME
  eval $ENV{MYHOME} = '/opt/mydaemon/etc';

  # Scan for 'mydaemon' in the process list, if not found, start it
  expect pslist mydaemon
  correct /opt/mydaemon/bin/mydaemon
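
For comparison, option (a) could look like the shell sketch below. The wrapper name run_with_myhome is made up for this illustration, and the path is the one from the eval example. The wrapper sets MYHOME only for the command it runs, and that command's children inherit it.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command with MYHOME set in its environment.
run_with_myhome() {
    env MYHOME='/opt/mydaemon/etc' "$@"
}

# Demonstration with a harmless child process that prints the variable.
run_with_myhome sh -c 'echo "MYHOME is $MYHOME"'

# In real use this would wrap syscheck itself, for example:
#   run_with_myhome /usr/local/bin/syscheck go
```

In a crontab the same effect should be obtainable by prefixing the command, as in `*/5 * * * * MYHOME=/opt/mydaemon/etc /usr/local/bin/syscheck go`, since cron hands the command line to a shell.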