Monitoring system logs Karl Vogel %%mtime(%a, %d %b %Y %T) = Introduction = Note: This has been written for two audiences: serious geeks and others who may not be terribly familiar with Unix text-handling capabilities. I'm not trying to talk over (or down to) anyone. There are several good reasons to monitor your logs: - Make sure what **should** happen (backups, etc) actually does. - Find out if something that **shouldn't** happen (disk failure, etc) does. - Be warned about delinquents rattling your doorknob. This setup works for Unix-like systems including Solaris, FreeBSD, and Linux. It can be used for any type of event log that can be put into text form. = Automate this = Everyone's got better things to do than check logfiles by hand, and your system can do a more thorough job than you can, anyways. There are several programs that'll handle this, and the one I use is called **checksyslog**. == How it works == The best description is in [the original article http://www.jammed.com/~jwa/hacks/security/checksyslog/checksyslog-doc.html]. Cliff-notes version: reduce the noise that you don't care about by weeding out benign log entries, leaving just the stuff you want to see. == Regular expressions == The **checksyslog** scanner relies on pattern files containing regular expressions (regexes) to decide what to ignore. A regex is a concise, structured description of a search you want to do. The scanner is written in Perl, which has a very powerful regex engine and has been bundled with just about every Unix/Linux system for the last decade. Here's an example of some system maintenance entries I can ignore: ``` user.notice: Feb 26 23:05:01 fdb-driver: start user.notice: Feb 26 23:05:01 fdb-driver: running fdbclean user.notice: Feb 27 04:55:28 fdb-driver: compressing files in /var/fdb/2012/0325 user.notice: Feb 27 04:57:45 fdb-driver: done ``` Here's the regex matching those lines: ``` fdb-driver: (start|compressing files|running fdbclean|done) ``` This means //match any line including **fdb-driver:** followed by a space and any one of// //**start**, **compressing files**, **running fdbclean**, or **done**//. = Make it easy to scan your logs = A little preparation makes this a lot easier. == Separate your log output == If possible, split incoming syslog messages into several files. - Some message categories are more active than others (i.e., informational messages about system maintenance), and others have less activity but need more attention (i.e., your kernel or authentication logs). This way, you can tune your checking depending on the log you're looking at. - More specific logfile entries means fewer patterns to match against every time the log-checker is run, and therefore less time for a given scan. There's no reason to create logs by hand when the system can do it for you: %!include: ``make-syslog.txt`` [(Download this) make-syslog.txt] Here's a syslog.conf file for a production server: %!include: ``syslog.conf`` [(Download this) syslog.conf] == Viewing multiple logfiles == This script lets me view the most recent log entries consistently. I make multiple links to one script so I don't have a dozen //almost// identical scripts to deal with. %!include: ``view-logs.txt`` [(Download this) view-logs.txt] Typing //kernlog// takes me immediately to the end of the current kernel logfile. Since entries are appended, I'm seeing the most recent stuff. You can always replace //less// with //tail// or whatever floats your boat. = Periodic checking = You shouldn't have to worry about reminding your system to check logfiles; set up cron to do it as often as you like and forget about it. Use something like this for root's crontab file if you want to check logfiles every 5 minutes around the clock: %!include: ``root-cron.txt`` [(Download this) root-cron.txt] == Don't nag == I don't need **checksyslog** whining about stuff I've already seen, so this script compares the output to the most recent run and only sends me a popup message if something's changed. Line numbers added for readability: ``` 1 #!/bin/ksh 2 # 3 # $Revision: 1.11 $ $Date: 2012-03-30 19:41:51-04 $ 4 # $UUID: e55cf839-1444-3aa9-b2c6-397da5b4286e $ 5 # 6 # Driver for filter to check any syslog files for odd entries. 7 8 PATH=/bin:/usr/bin:/usr/local/libexec 9 export PATH 10 umask 022 11 12 # Variables and functions. 13 cfgdir=/usr/local/lib/checksyslog # directory holding rulesets 14 rundir=/var/checksyslog # record of previous run 15 host=$(hostname) 16 17 # should pop up on your desktop. 18 alert () { 19 echo "$*" | mailx admin-urgent 20 } 21 22 die () { 23 alert "$*" 24 exit 1 25 } 26 27 # Sanity checks. 28 test -d "$cfgdir" || die $cfgdir directory not found 29 cd $rundir || die $rundir chdir failed 30 31 # Run each set of rules, compare output to previous run. 32 33 for rfile in $cfgdir/* 34 do 35 b=$(basename $rfile) 36 current="cur.$b" 37 new="new.$b" 38 logfile="/var/log/$b" 39 40 test -f $current || touch $current 41 checksyslog --rules $rfile --log $logfile --today > $new 42 subject="$host: $logfile entries" 43 44 if test -s $new 45 then 46 cmp -s $current $new 47 case "$?" in 48 0) ;; 49 50 *) alert "$subject" 51 comm -23 $new $current | mailx -s "$subject" syslog 52 ;; 53 esac 54 fi 55 56 mv $new $current 57 done 58 59 exit 0 ``` [(Download this) run-checksyslog.txt] Lines 1-10 are boilerplate. 17-20 write a popup message to my desktop, and 22-25 make the script roll over and die if something's seriously wrong. The interesting lines are 50-51. 50 sends a one-line popup telling me what host and logfile had a problem, and 51 finds the unique lines in the newest output compared to the previous output and mails them to me. == Where to put pattern files == This might make the script above a little easier to understand. The links under the "checksyslog" directory are my production filtering rules, with locally-sensitive stuff stripped out. %!include: ''tree.htm'' - Account creation, etc is written to /var/log/authlog - Filtering rules are in /usr/local/lib/checksyslog/authlog - Results from the most recent scan are in /var/checksyslog/cur.authlog The script above doesn't care how you name your logfiles, as long as the last parts of the names are consistent. = Alert messages = If you have several hosts to monitor, it's better to set up a mail address that will automatically send you a popup message or alert of some type if something nasty happens. [Procmail http://www.procmail.org/] will handle that very nicely, and it comes with most Linux boxes. Popups can be incredibly annoying, so I don't use them unless there's something requiring immediate attention. If you use X-Windows, have a look at the [xalarm ftp://ftp.x.org/contrib/utilities] package. If not, **write** will do the trick: %!include: ``popup.txt`` [(Download this) popup.txt] = Instant messages = You can also use email to send yourself SMS (instant) messages. Here's a list of email-to-SMS gateways as of 18 Dec 2011: - Alltel{br} 10-digit-phone-number@message.alltel.com{br} Example: 1234567890@message.alltel.com - AT&T (formerly Cingular){br} 10-digit-phone-number@txt.att.net{br} 10-digit-phone-number@cingularme.com{br} Example: 1234567890@txt.att.net - Boost Mobile{br} 10-digit-phone-number@myboostmobile.com{br} Example: 1234567890@myboostmobile.com - Nextel (now Sprint Nextel){br} 10-digit-phone-number@messaging.nextel.com{br} Example: 1234567890@messaging.nextel.com - Sprint PCS (now Sprint Nextel){br} 10-digit-phone-number@messaging.sprintpcs.com{br} Example: 1234567890@messaging.sprintpcs.com - T-Mobile{br} 10-digit-phone-number@tmomail.net{br} Example: 1234567890@tmomail.net - US Cellular{br} 10-digit-phone-number@email.uscc.net{br} Example: 1234567890@email.uscc.net - Verizon{br} 10-digit-phone-number@vtext.com{br} Example: 1234567890@vtext.com - Virgin Mobile USA{br} 10-digit-phone-number@vmobl.com{br} Example: 1234567890@vmobl.com = Network security = My server uses a packet filter called [IPtables http://www.netfilter.org/], which allows a system administrator to configure the tables and rules used by the Linux kernel firewall. It's been around for about 14 years, and you can do all sorts of weird things with it, but all I need is basic stuff to allow one or two services, deny everything else, and tell me who's rattling the doorknob. == IPtables setup == I messed with the filter just long enough to get it working and saved the [configuration iptables-setup.txt]. Here's a small example which allows SSH and HTTP connections but drops everything else. Line-numbers added for readability: ``` 1 # Generated by iptables-save v1.3.1 on Sun Apr 23 05:32:09 2006 2 *filter 3 :INPUT ACCEPT [0:0] 4 :FORWARD ACCEPT [0:0] 5 :LOGNDROP - [0:0] 6 :OUTPUT ACCEPT [0:0] 7 -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT 8 -A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT 9 -A INPUT -i eth0 -p tcp -m tcp --dport 80 -j ACCEPT 10 -A INPUT -i lo -j ACCEPT 11 -A INPUT -j LOGNDROP 12 -A LOGNDROP -p tcp -m limit --limit 5/min -j LOG --log-prefix "Denied TCP: " --log-level notice 13 -A LOGNDROP -p udp -m limit --limit 5/min -j LOG --log-prefix "Denied UDP: " --log-level notice 14 -A LOGNDROP -p icmp -m limit --limit 5/min -j LOG --log-prefix "Denied ICMP: " --log-level notice 15 -A LOGNDROP -j DROP 16 COMMIT 17 # Completed on Sun Apr 23 05:32:09 2006 ``` Lines 1-6 are mostly boilerplate, but line 5 helps with logging; anything I want to deny also gets logged if LOGNDROP is present. Line 15 tells the system to drop anything marked as LOGNDROP. Line 7 says "keep track of what state a connection is in, and accept it if you've already seen traffic in both directions, or if it's related to another existing connection". Lines 8-9 allow SSH (port 22) and HTTP (port 80) incoming connections. Line 10 allows anything on the loopback interface. Lines 11-14 log and drop any other traffic. == Sample log entries == The packet filter writes entries to the kernel log in text format, so what you see below is what the scanner sees (lines wrapped for readability). 5.6.7.8 = my server, and 10.0.0.1 = our router. ``` kern.notice: Mar 2 15:49:48 kernel: Denied UDP: IN=eth0 OUT= MAC=00:1a:64:a2:02:6a:00:23:05:73:54:00:08:00 SRC=10.0.0.1 DST=5.6.7.8 LEN=76 TOS=0x00 PREC=0xC0 TTL=255 ID=0 PROTO=UDP SPT=123 DPT=123 LEN=56 kern.notice: Mar 2 15:56:10 kernel: Denied UDP: IN=eth0 OUT= MAC=00:1a:64:a2:02:6a:00:23:05:73:54:00:08:00 SRC=10.0.0.1 DST=5.6.7.8 LEN=76 TOS=0x00 PREC=0xC0 TTL=255 ID=0 PROTO=UDP SPT=123 DPT=123 LEN=56 ``` The router is sending NTP packets (port 123) and we're ignoring them. == Tweaking the filter == If you want to experiment, have a look at [25 most frequently-used Linux IPtables rules http://www.thegeekstuff.com/2011/06/iptables-rules-examples/]. == References == - [Standalone server http://bash.cyberciti.biz/firewall/linux-iptables-firewall-shell-script-for-standalone-server/] - [Basic rules from GitHub gist.github.txt] - [Firewall starting point start-fw.txt] - [How to clear all rules stop-fw.txt] - [Simple stateful firewall https://wiki.archlinux.org/index.php/Simple_stateful_firewall] = Checking high-volume logs = If you have a few hundred (or thousand) rapidly-growing logs to monitor, a program called **since** might help; it reads files and remembers where it left off, so you don't have to read the same useless stuff twice. == How it works == **since** works by reading the files you provide and using a simple key-value DB to record how far it got in each one. If the files in question have content being appended, **since** avoids unnecessary work by using the DB to skip to where it previously finished reading. We use [Samba http://www.samba.org/] to create Windows-compatible shares on several Unix servers. Each separate user connection has its own logfile, and my office has over 800 users. The average size of a day's worth of logs is 130-200 Mb, so re-scanning them every few minutes would add up to a ton of wasted I/O. == Use for regular logfiles == You can also use **since** to take a quick, human-readable look at the type of logs we've been talking about. First, take a snapshot of your current logs: ``` me% since /var/log/Xorg.0.log /var/log/authlog /var/log/brlog \ /var/log/ftplog /var/log/kernlog /var/log/local?log \ /var/log/maillog /var/log/syslog /var/log/uucplog > /dev/null ``` The logs are quiet: ``` me% since -s /var/log/Xorg.0.log ... /var/log/uucplog s 62045 62045 267389/9904/Xorg.0.log s 0 0 267389/9939/authlog s 27 27 267389/1930/brlog s 0 0 267389/9346/ftplog s 32142 32142 267389/9953/kernlog s 15912168 15912168 267389/9824/local0log s 0 0 267389/9349/local1log s 316 316 267389/9351/local3log s 356 356 267389/9352/local4log s 1902 1902 267389/9353/local5log s 72 72 267389/9354/local6log s 120 120 267389/9355/local7log s 159238 159238 267389/9956/maillog s 244300 244300 267389/9948/syslog s 0 0 267389/9362/uucplog ``` The leading **s** shows everything's the same. You can also use it to list changed vs. unchanged files without having to read them at all; the second and third columns show the previous end-of-file offset and current filesize. Change one logfile by writing a dopey message, and display the changes in more readable form: ``` me% logger running pid $$ me% since -r /var/log/Xorg.0.log ... /var/log/uucplog ==> /var/log/Xorg.0.log [no changes] <== ==> /var/log/authlog [no changes] <== [...] ==> /var/log/newslog [no changes] <== ==> /var/log/ntplog [no changes] <== ==> /var/log/securelog [no changes] <== ==> /var/log/syslog <== Feb 27 21:17:09 myhost vogelke: running pid 24049 ==> /var/log/uucplog [no changes] <== ``` **since** can also write any new entries to a temporary file that **checksyslog** can scan. = Windows event-logs = If you don't feel like installing [ActiveState Perl http://www.activestate.com/activeperl] on your Windows box, the easiest thing to do is use SSH or SSL to upload event logs to a Unix host and filter them there. = Code = - [Local production version of checksyslog checksyslog.pl.txt] - [Local production version of since since.pl.txt] = Feedback = Feel free to send [comments mailto:vogelke+unix@pobox.com]. ----------------------- //Generated from [%%infile %%infile] by// //[txt2tags http://txt2tags.sourceforge.net]//{br} //$Revision: 1.11 $//