1. Introduction

Note: This has been written for two audiences: serious geeks and others who may not be terribly familiar with Unix text-handling capabilities. I'm not trying to talk over (or down to) anyone.

There are several good reasons to monitor your logs:

Make sure what should happen (backups, etc) actually does.
Find out if something that shouldn't happen (disk failure, etc) does.
Be warned about delinquents rattling your doorknob.

This setup works for Unix-like systems including Solaris, FreeBSD, and Linux. It can be used for any type of event log that can be put into text form.

2. Automate this

Everyone's got better things to do than check logfiles by hand, and your system can do a more thorough job than you can, anyway. There are several programs that'll handle this, and the one I use is called checksyslog.

2.1. How it works

The best description is in the original article. Unfortunately, this link seems to have disappeared. Here's a local copy.

Cliff-notes version: reduce the noise that you don't care about by weeding out benign log entries, leaving just the stuff you want to see.

2.2. Regular expressions

The checksyslog scanner relies on pattern files containing regular expressions (regexes) to decide what to ignore. A regex is a concise, structured description of a search you want to do. The scanner is written in Perl, which has a very powerful regex engine and has been bundled with just about every Unix/Linux system for the last decade.

Here's an example of some system maintenance entries I can ignore:

user.notice: Feb 26 23:05:01 fdb-driver: start
user.notice: Feb 26 23:05:01 fdb-driver: running fdbclean
user.notice: Feb 27 04:55:28 fdb-driver: compressing files in /var/fdb/2012/0325
user.notice: Feb 27 04:57:45 fdb-driver: done

Here's the regex matching those lines:

fdb-driver: (start|compressing files|running fdbclean|done)

This means match any line including fdb-driver: followed by a space and any one of start, compressing files, running fdbclean, or done.
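As a quick check, the pattern can be tried against entries like the ones above with grep -E, whose extended-regex syntax handles this kind of alternation the same way Perl does. This is only a sketch: checksyslog does its own matching in Perl, the function name drop_noise is made up, and the last sample line ("disk full") is invented to show something surviving the filter.

```shell
# Hypothetical helper: print only the lines that do NOT match the ignore pattern.
drop_noise () {
    grep -Ev 'fdb-driver: (start|compressing files|running fdbclean|done)'
}

# Three benign entries from the example, plus one invented line that
# should survive the filter.
sample='user.notice: Feb 26 23:05:01 fdb-driver: start
user.notice: Feb 26 23:05:01 fdb-driver: running fdbclean
user.notice: Feb 27 04:57:45 fdb-driver: done
user.notice: Feb 27 05:00:00 fdb-driver: disk full'

printf '%s\n' "$sample" | drop_noise
# prints: user.notice: Feb 27 05:00:00 fdb-driver: disk full
```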

3. Make it easy to scan your logs

A little preparation makes this a lot easier.

3.1. Separate your log output

If possible, split incoming syslog messages into several files.

Some message categories are more active than others (e.g., informational messages about system maintenance), while others are quieter but need more attention (e.g., your kernel or authentication logs). With separate files, you can tune your checking to the log you're looking at.
More specific logfiles mean fewer patterns to match against each time the log-checker runs, and therefore less time for a given scan.

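There's no reason to create logs by hand when a script can do it for you. This one prints the commands rather than running them, so you can review the output before running it as root from the log directory: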

#!/bin/ksh
#
# $Revision: 1.1 $ $Date: 2012-03-28 21:08:51-04 $
# $UUID: 9618d29a-2746-3be0-931d-8f37bb6bfce3 $
#
#<make-syslog: set up logfiles

logdir=/var/log

files=\
'root wheel 640 ./authlog
root wheel 600 ./cron
root wheel 640 ./ftplog
root wheel 640 ./kernlog
root wheel 644 ./lastlog
root wheel 640 ./local0log
root wheel 640 ./local1log
root wheel 640 ./local2log
root wheel 640 ./local3log
root wheel 640 ./local4log
root wheel 640 ./local5log
root wheel 640 ./local6log
root wheel 640 ./local7log
root wheel 640 ./lpdlog
root wheel 640 ./maillog
root wheel 640 ./newslog
root wheel 640 ./ntplog
root wheel 640 ./securelog
root wheel 640 ./syslog
root wheel 600 ./userlog
root wheel 640 ./uucplog'

cd $logdir

echo "$files" |
while read owner group mode file
do
    test -f $file || echo "cp /dev/null $file"
    echo "chown $owner $file"
    echo "chgrp $group $file"
    echo "chmod $mode $file"
done

exit 0


Here's a syslog.conf file for a production server:

# /etc/syslog.conf
#
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
kern.none;mail.none;authpriv.none;auth.none;cron.none    /var/log/syslog

# authlog has restricted access
auth.*                                          /var/log/authlog
authpriv.*                                      /var/log/authlog
cron.*                                          /var/log/cron
daemon.*                                        /var/log/syslog
kern.*                                          /var/log/kernlog
lpr.*                                           /var/log/lpdlog
user.*                                          /var/log/syslog

# Everybody gets emergency messages
*.emerg                                         *

# Local logs: local7 (boot messages) goes to bootlog
local0.*                                        /var/log/local0log
local1.*                                        /var/log/local1log
local2.*                                        /var/log/local2log
local3.*                                        /var/log/local3log
local4.*                                        /var/log/local4log
local5.*                                        /var/log/local5log
local6.*                                        /var/log/local6log
local7.*                                        /var/log/bootlog


3.2. Viewing multiple logfiles

This script lets me view the most recent log entries consistently. I make multiple links to one script so I don't have a dozen almost identical scripts to deal with.

#!/bin/ksh
#
# $Revision: 1.3 $ $Date: 2011-01-18 17:27:59-05 $
# $UUID: 7241bf1f-0b8b-33bf-917f-3a0b74768528 $
#
#<syslog: have a look at recent messages
#   
# To install:
#   list='auth boot cron kern local0 local1
#      local2 local3 local4 local5 local6 ntp rpm su'
#   for x in $list; do ln syslog ${x}log; done

tag=${0##*/}
sd=/var/log

case "$tag" in
    authlog)   file=$sd/$tag         ;;
    bootlog)   file=$sd/$tag         ;;
    cronlog)   file=$sd/cron         ;;
    faillog)   file=$sd/$tag         ;;
    kernlog)   file=$sd/$tag         ;;
    local0log) file=$sd/$tag         ;;
    local1log) file=$sd/$tag         ;;
    local2log) file=$sd/$tag         ;;
    local3log) file=$sd/$tag         ;;
    local4log) file=$sd/$tag         ;;
    local5log) file=$sd/$tag         ;;
    local6log) file=$sd/$tag         ;;
    syslog)    file=$sd/$tag         ;;
    ntplog)    file=$sd/$tag         ;;
    rpmlog)    file=$sd/rpmpkgs      ;;
    sulog)     file=$sd/sudo         ;;
    *)         echo "$tag: unknown log name" >&2; exit 1 ;;
esac

exec less +G $file
exit 1


Typing kernlog takes me immediately to the end of the current kernel logfile. Since entries are appended, I'm seeing the most recent stuff. You can always replace less with tail or whatever floats your boat.
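What makes the links work is the ${0##*/} expansion, which strips everything up to the last slash from the name the script was invoked as. Here's a small sketch of the same expansion, using a made-up function name (tag_of) and a path argument standing in for $0:

```shell
# Hypothetical helper: return the last path component, the way ${0##*/}
# does for the script's own name.
tag_of () {
    printf '%s\n' "${1##*/}"
}

tag_of /usr/local/bin/kernlog    # prints: kernlog
tag_of ./authlog                 # prints: authlog
tag_of syslog                    # prints: syslog (no slash, nothing to strip)
```

Because the result depends only on the link's name, adding another log to the viewer means one new link and one new case entry.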

4. Periodic checking

You shouldn't have to worry about reminding your system to check logfiles; set up cron to do it as often as you like and forget about it.

Use something like this for root's crontab file if you want to check logfiles every 5 minutes around the clock:

# Other environment variables set by cron:
#   USER=root   SHLVL=1   LOGNAME=root
#
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/libexec
MAILTO=root
HOME=/
#
# To test, uncomment this line:
## *    *     *   *   *    /bin/env > /tmp/env$$
#============================================================================
# Everything on a line is separated by blanks or tabs.
#
#+--------------------------- Minute (0-59)
#|   +----------------------- Hour   (0-23)
#|   |     +----------------- Day    (1-31)
#|   |     |   +------------- Month  (1-12)
#|   |     |   |   +--------- Day of week (0-6, 0=Sunday)
#|   |     |   |   |    +---- Command to be run
#|   |     |   |   |    |
#v   v     v   v   v    v
#============================================================================
# Hourly, daily stuff.
01   *     *   *   *    run-parts /etc/cron.hourly
02   2     *   *   *    run-parts /etc/cron.daily
22   4     *   *   0    run-parts /etc/cron.weekly
42   4     1   *   *    run-parts /etc/cron.monthly
#============================================================================
# LOCAL:
# Check logfiles every five min starting at 1 min past the hour.
1-57/5  *  *   *   *    /usr/local/cron/run-checksyslog
#----------------------------------------------------------------------------
# EOF

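The minute field 1-57/5 in the last entry means "every fifth minute, from 1 through 57". To see which minutes that covers, a loop like this one lists them. The name expand_range is made up, and the sketch only mimics how cron steps through a range.

```shell
# Hypothetical helper: list the values matched by a cron range START-END/STEP.
expand_range () {
    start=$1 end=$2 step=$3
    out=
    m=$start
    while [ "$m" -le "$end" ]
    do
        out="$out${out:+ }$m"     # add a separating blank after the first value
        m=$((m + step))
    done
    printf '%s\n' "$out"
}

expand_range 1 57 5
# prints: 1 6 11 16 21 26 31 36 41 46 51 56
```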

4.1. Don't nag

I don't need checksyslog whining about stuff I've already seen, so this script compares the output to the most recent run and only sends me a popup message if something's changed. Line numbers added for readability:

 1  #!/bin/ksh
 2  #
 3  # $Revision: 1.12 $ $Date: 2024-05-20 00:35:41-04 $
 4  # $UUID: e55cf839-1444-3aa9-b2c6-397da5b4286e $
 5  #
 6  # Driver for filter to check any syslog files for odd entries.
 7  
 8  PATH=/bin:/usr/bin:/usr/local/libexec
 9  export PATH
10  umask 022
11  
12  # Variables and functions.
13  cfgdir=/usr/local/lib/checksyslog   # directory holding rulesets
14  rundir=/var/checksyslog             # record of previous run
15  host=$(hostname)
16  
17  # should pop up on your desktop.
18  alert () {
19      echo "$*" | mailx admin-urgent
20  }
21  
22  die () {
23      alert "$*"
24      exit 1
25  }
26  
27  # Sanity checks.
28  test -d "$cfgdir" || die $cfgdir directory not found
29  cd $rundir        || die $rundir chdir failed
30  
31  # Run each set of rules, compare output to previous run.
32  
33  for rfile in $cfgdir/*
34  do
35      b=$(basename $rfile)
36      current="cur.$b"
37      new="new.$b"
38      logfile="/var/log/$b"
39  
40      test -f $current || touch $current
41      checksyslog --rules $rfile --log $logfile --today > $new
42      subject="$host: $logfile entries"
43  
44      if test -s $new
45      then
46          cmp -s $current $new
47          case "$?" in
48              0) ;;
49  
50              *) alert "$subject"
51                 comm -23 $new $current | mailx -s "$subject" syslog
52                 ;;
53          esac
54      fi
55  
56      mv $new $current
57  done
58  
59  exit 0


Lines 1-10 are boilerplate. 17-20 write a popup message to my desktop, and 22-25 make the script roll over and die if something's seriously wrong.

The interesting lines are 50-51. 50 sends a one-line popup telling me what host and logfile had a problem, and 51 finds the unique lines in the newest output compared to the previous output and mails them to me.
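To see the compare-and-report step from lines 46-51 in isolation, here is a small sketch run against two throwaway files. The function name report_new and the sample entries are invented. One thing worth knowing: comm expects both inputs to be sorted, so the sketch sorts copies of the files before comparing them.

```shell
# Hypothetical helper: print lines present in the new output but not the old.
# Prints nothing when the two files are identical.
report_new () {
    old=$1 new=$2
    cmp -s "$old" "$new" && return 0      # same as last run: stay quiet
    tmp=${TMPDIR:-/tmp}/report_new.$$
    sort "$old" > "$tmp.old"
    sort "$new" > "$tmp.new"
    comm -23 "$tmp.new" "$tmp.old"        # column 1 only: lines unique to new
    rm -f "$tmp.old" "$tmp.new"
}

# Invented sample data: the second run contains one extra entry.
dir=${TMPDIR:-/tmp}/nag-demo.$$
mkdir -p "$dir"
printf '%s\n' 'Mar  2 15:49:48 kernel: Denied UDP' > "$dir/cur.kernlog"
printf '%s\n' 'Mar  2 15:49:48 kernel: Denied UDP' \
              'Mar  2 15:56:10 kernel: Denied TCP' > "$dir/new.kernlog"

report_new "$dir/cur.kernlog" "$dir/new.kernlog"
# prints: Mar  2 15:56:10 kernel: Denied TCP

rm -rf "$dir"
```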

4.2. Where to put pattern files

This might make the script above a little easier to understand. The links under the "checksyslog" directory are my production filtering rules, with locally-sensitive stuff stripped out.

/usr
|   +--local
|   |   +--lib
|   |   |   +--checksyslog     [Pattern files]
|   |   |   |   +--authlog
|   |   |   |   +--cronlog
|   |   |   |   +--kernlog
|   |   |   |   +--syslog

/var
|   +--checksyslog             [Results from previous run]
|   |   +--cur.authlog
|   |   +--cur.cronlog
|   |   +--cur.kernlog
|   |   +--cur.syslog

/var
|   +--log                     [Actual logfiles]
|   |   +--authlog
|   |   +--cronlog
|   |   +--kernlog
|   |   +--syslog

Account creation, etc is written to /var/log/authlog
Filtering rules are in /usr/local/lib/checksyslog/authlog
Results from the most recent scan are in /var/checksyslog/cur.authlog

The script above doesn't care how you name your logfiles, as long as the last parts of the names are consistent.
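That naming convention is what lets one loop handle every log: given a pattern file, the other two names follow from its basename. The sketch below repeats that derivation on its own. The name paths_for is made up, and the directories are the ones used in this article.

```shell
# Hypothetical helper: show which logfile and result file go with a pattern file.
paths_for () {
    b=$(basename "$1")
    printf 'rules:   %s\n' "$1"
    printf 'logfile: /var/log/%s\n' "$b"
    printf 'results: /var/checksyslog/cur.%s\n' "$b"
}

paths_for /usr/local/lib/checksyslog/authlog
# prints:
#   rules:   /usr/local/lib/checksyslog/authlog
#   logfile: /var/log/authlog
#   results: /var/checksyslog/cur.authlog
```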

5. Alert messages

If you have several hosts to monitor, it's better to set up a mail address that will automatically send you a popup message or alert of some type if something nasty happens. Procmail will handle that very nicely, and it comes with most Linux boxes.

Popups can be incredibly annoying, so I don't use them unless there's something requiring immediate attention. If you run the X Window System, have a look at the xalarm package. If not, write will do the trick:

#!/bin/ksh
#
# $Revision: 1.3 $ $Date: 2011-09-25 19:51:09-04 $
# $UUID: 97362b8a-57af-3b67-b751-ce8712d62c27 $
#
#<popup: send a quick popup message.

export PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
export USER=yourname

# If the user isn't taking calls, exit.
test -f "$HOME/.nopopup" && exit 0

# If no message, exit.
case "$#" in
    0)  exit 0 ;;
    *)  str=${1+"$@"} ;;
esac

# If running under X use xalarm, else use write.
case "$DISPLAY" in
    "") set X $(who | grep pts/ | head -1)
        tty="$3"
        echo "$str" | write $USER $tty
        ;;

    *)  set X $(date)
        today="$4 $3 $5"
        msg=$(echo "$today @ $str" | tr '@' '\012')
        export DISPLAY
        xalarm -name xmemo -time +0 -geometry +20-40 -nowarn "$msg"
        ;;
esac

exit 0


6. Instant messages

You can also use email to send yourself SMS (instant) messages. Here's a list of email-to-SMS gateways as of 18 Dec 2011:

Alltel
    10-digit-phone-number@message.alltel.com
    Example: 1234567890@message.alltel.com

AT&T (formerly Cingular)
    10-digit-phone-number@txt.att.net
    10-digit-phone-number@cingularme.com
    Example: 1234567890@txt.att.net

Boost Mobile
    10-digit-phone-number@myboostmobile.com
    Example: 1234567890@myboostmobile.com

Nextel (now Sprint Nextel)
    10-digit-phone-number@messaging.nextel.com
    Example: 1234567890@messaging.nextel.com

Sprint PCS (now Sprint Nextel)
    10-digit-phone-number@messaging.sprintpcs.com
    Example: 1234567890@messaging.sprintpcs.com

T-Mobile
    10-digit-phone-number@tmomail.net
    Example: 1234567890@tmomail.net

US Cellular
    10-digit-phone-number@email.uscc.net
    Example: 1234567890@email.uscc.net

Verizon
    10-digit-phone-number@vtext.com
    Example: 1234567890@vtext.com

Virgin Mobile USA
    10-digit-phone-number@vmobl.com
    Example: 1234567890@vmobl.com

7. Network security

My server uses a packet filter called IPtables, which allows a system administrator to configure the tables and rules used by the Linux kernel firewall.

It's been around for about 14 years, and you can do all sorts of weird things with it, but all I need is basic stuff to allow one or two services, deny everything else, and tell me who's rattling the doorknob.

7.1. IPtables setup

I messed with the filter just long enough to get it working and saved the configuration. Here's a small example which allows SSH and HTTP connections but drops everything else. Line-numbers added for readability:

 1  # Generated by iptables-save v1.3.1 on Sun Apr 23 05:32:09 2006
 2  *filter
 3  :INPUT ACCEPT [0:0]
 4  :FORWARD ACCEPT [0:0]
 5  :LOGNDROP - [0:0]
 6  :OUTPUT ACCEPT [0:0]
 7  -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
 8  -A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT
 9  -A INPUT -i eth0 -p tcp -m tcp --dport 80 -j ACCEPT
10  -A INPUT -i lo -j ACCEPT
11  -A INPUT -j LOGNDROP
12  -A LOGNDROP -p tcp -m limit --limit 5/min -j LOG --log-prefix
            "Denied TCP: " --log-level notice
13  -A LOGNDROP -p udp -m limit --limit 5/min -j LOG --log-prefix
            "Denied UDP: " --log-level notice
14  -A LOGNDROP -p icmp -m limit --limit 5/min -j LOG --log-prefix
            "Denied ICMP: " --log-level notice
15  -A LOGNDROP -j DROP
16  COMMIT
17  # Completed on Sun Apr 23 05:32:09 2006

Lines 1-6 are mostly boilerplate, but line 5 creates a separate chain named LOGNDROP that handles logging; anything I want to deny gets sent there, so it's logged before it's dropped. Line 15 drops whatever reaches the end of that chain.

Line 7 says "keep track of what state a connection is in, and accept it if you've already seen traffic in both directions, or if it's related to another existing connection".

Lines 8-9 allow SSH (port 22) and HTTP (port 80) incoming connections. Line 10 allows anything on the loopback interface.

Line 11 hands any other traffic to LOGNDROP, and lines 12-14 log it by protocol, limited to 5 entries per minute each.

7.2. Sample log entries

The packet filter writes entries to the kernel log in text format, so what you see below is what the scanner sees (lines wrapped for readability). 5.6.7.8 = my server, and 10.0.0.1 = our router.

kern.notice: Mar 2 15:49:48 kernel: Denied UDP: IN=eth0 OUT=
    MAC=00:1a:64:a2:02:6a:00:23:05:73:54:00:08:00 SRC=10.0.0.1 DST=5.6.7.8
    LEN=76 TOS=0x00 PREC=0xC0 TTL=255 ID=0 PROTO=UDP SPT=123 DPT=123 LEN=56
kern.notice: Mar 2 15:56:10 kernel: Denied UDP: IN=eth0 OUT=
    MAC=00:1a:64:a2:02:6a:00:23:05:73:54:00:08:00 SRC=10.0.0.1 DST=5.6.7.8
    LEN=76 TOS=0x00 PREC=0xC0 TTL=255 ID=0 PROTO=UDP SPT=123 DPT=123 LEN=56

The router is sending NTP packets (port 123) and we're ignoring them.
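Entries like these are regular enough to summarize with a few lines of shell. The sketch below pulls the protocol, source address, and destination port out of each "Denied" line and counts the repeats. The name denied_summary is made up, and it assumes each entry arrives as a single line (the samples above are wrapped only for display).

```shell
# Hypothetical helper: summarize "Denied" firewall entries read from stdin
# as "count proto src dport", most frequent first.
denied_summary () {
    awk '/Denied (TCP|UDP|ICMP): / {
        src = proto = dpt = "-"               # "-" when a field is absent
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^SRC=/)   src   = substr($i, 5)
            if ($i ~ /^PROTO=/) proto = substr($i, 7)
            if ($i ~ /^DPT=/)   dpt   = substr($i, 5)
        }
        print proto, src, dpt
    }' | sort | uniq -c | sort -rn
}

# The two sample entries from above, each rebuilt as a single line.
rest='MAC=00:1a:64:a2:02:6a:00:23:05:73:54:00:08:00 SRC=10.0.0.1 DST=5.6.7.8'
rest="$rest LEN=76 TOS=0x00 PREC=0xC0 TTL=255 ID=0 PROTO=UDP SPT=123 DPT=123 LEN=56"

printf '%s\n' \
    "kern.notice: Mar 2 15:49:48 kernel: Denied UDP: IN=eth0 OUT= $rest" \
    "kern.notice: Mar 2 15:56:10 kernel: Denied UDP: IN=eth0 OUT= $rest" |
denied_summary
# prints one line: a count of 2, then UDP 10.0.0.1 123
```

A summary like this makes it easier to decide whether a noisy source deserves an ignore pattern or a closer look.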

7.3. Tweaking the filter

If you want to experiment, have a look at 25 most frequently-used Linux IPtables rules.

7.4. References

8. Checking high-volume logs

If you have a few hundred (or thousand) rapidly-growing logs to monitor, a program called since might help; it reads files and remembers where it left off, so you don't have to read the same useless stuff twice.

8.1. How it works

since works by reading the files you provide and using a simple key-value DB to record how far it got in each one. If the files in question have content being appended, since avoids unnecessary work by using the DB to skip to where it previously finished reading.
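The idea is simple enough to sketch in shell. The version below is not how since itself is written: it keeps one offset in a plain state file instead of a key-value DB, handles one logfile at a time, and uses made-up names (read_new and its state-file argument). It only shows the skip-to-where-you-left-off step.

```shell
# Hypothetical helper: print whatever was appended to LOGFILE since the last
# call, remembering the byte offset in STATEFILE between calls.
read_new () {
    logfile=$1 statefile=$2
    offset=0
    test -s "$statefile" && offset=$(cat "$statefile")
    size=$(wc -c < "$logfile")
    size=$((size + 0))                    # strip padding some wc versions add

    # A file smaller than the saved offset was probably rotated: start over.
    test "$size" -lt "$offset" && offset=0

    tail -c +"$((offset + 1))" "$logfile" # tail -c +N starts at byte N (1-based)
    echo "$size" > "$statefile"
}

dir=${TMPDIR:-/tmp}/since-demo.$$
mkdir -p "$dir"
echo 'first entry'  >  "$dir/log"
read_new "$dir/log" "$dir/state"          # prints: first entry
echo 'second entry' >> "$dir/log"
read_new "$dir/log" "$dir/state"          # prints: second entry
read_new "$dir/log" "$dir/state"          # prints nothing
rm -rf "$dir"
```

The sketch ignores a file that grows between the size check and the read, which a real tool has to handle.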

We use Samba to create Windows-compatible shares on several Unix servers. Each separate user connection has its own logfile, and my office has over 800 users. The average size of a day's worth of logs is 130-200 MB, so re-scanning them every few minutes would add up to a ton of wasted I/O.

8.2. Use for regular logfiles

You can also use since to take a quick, human-readable look at the type of logs we've been talking about. First, take a snapshot of your current logs:

me% since /var/log/Xorg.0.log /var/log/authlog /var/log/brlog \
    /var/log/ftplog /var/log/kernlog /var/log/local?log \
    /var/log/maillog /var/log/syslog /var/log/uucplog > /dev/null

The logs are quiet:

me% since -s /var/log/Xorg.0.log ... /var/log/uucplog
s         62045     62045     267389/9904/Xorg.0.log
s         0         0         267389/9939/authlog
s         27        27        267389/1930/brlog
s         0         0         267389/9346/ftplog
s         32142     32142     267389/9953/kernlog
s         15912168  15912168  267389/9824/local0log
s         0         0         267389/9349/local1log
s         316       316       267389/9351/local3log
s         356       356       267389/9352/local4log
s         1902      1902      267389/9353/local5log
s         72        72        267389/9354/local6log
s         120       120       267389/9355/local7log
s         159238    159238    267389/9956/maillog
s         244300    244300    267389/9948/syslog
s         0         0         267389/9362/uucplog

The leading s shows everything's the same. You can also use it to list changed vs. unchanged files without having to read them at all; the second and third columns show the previous end-of-file offset and current filesize.
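Because the offsets and sizes are plain columns, picking out the files that have grown takes one line of awk. This is a sketch based only on the layout shown above: it compares columns two and three rather than relying on the status letter, and changed_files is a made-up name.

```shell
# Hypothetical helper: read "since -s" style lines on stdin and print the
# name (4th column) of every file whose size differs from the saved offset.
changed_files () {
    awk 'NF >= 4 && $2 != $3 { print $4 }'
}

# Two lines copied from the listing above, plus one invented line (status
# letter and size are made up) where the file has grown past the saved offset.
printf '%s\n' \
    's         32142     32142     267389/9953/kernlog' \
    's         0         0         267389/9939/authlog' \
    'x         244300    244391    267389/9948/syslog' |
changed_files
# prints: 267389/9948/syslog
```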

Change one logfile by writing a dopey message, and display the changes in more readable form:

me% logger running pid $$

me% since -r /var/log/Xorg.0.log ... /var/log/uucplog
==> /var/log/Xorg.0.log [no changes] <==
==> /var/log/authlog [no changes] <==
[...]
==> /var/log/newslog [no changes] <==
==> /var/log/ntplog [no changes] <==
==> /var/log/securelog [no changes] <==

==> /var/log/syslog <==
Feb 27 21:17:09 myhost vogelke: running pid 24049

==> /var/log/uucplog [no changes] <==

since can also write any new entries to a temporary file that checksyslog can scan.

9. Windows event-logs

If you don't feel like installing ActiveState Perl on your Windows box, the easiest thing to do is use SSH or SSL to upload event logs to a Unix host and filter them there.

10. Code

11. Feedback

Feel free to send comments.

Generated from monitor.t2t by txt2tags
$Revision: 1.12 $

Monitoring system logs

Karl Vogel

Mon, 20 May 2024 00:35:41
