My sysadmin toolbox
By Karl Vogel
I've been a system administrator since 1988, working
mainly with Solaris and one or two versions of BSD. Here
are some of the things I use all the time; they're not
flashy, but they save me a ton of keystrokes.
Scripts can read their own source code
On every version of Unix I've ever used, shell and
Perl scripts know their own name: $0
holds the pathname of the script being run. This lets any
script read its own source code, which is useful when you
want to keep online help in sync with the program
documentation. Here's a shell script which displays its
own usage information if it gets confused:
1 #!/bin/sh
2 #
3 # $Id: doit,v 1.5 2001/08/04 21:44:39 vogelke Exp $
4 # $Source: /src/scripts/RCS/doit,v $
5 #
6 # NAME:
7 #    doit
8 #
9 # SYNOPSIS:
10 #    doit [-hv] [pattern]
11 #
12 # DESCRIPTION:
13 #    Some blather here about what this script does.
14 #
15 # OPTIONS:
16 #    -h   print this message
17 #    -v   print the version and exit
18 #
19 # EXAMPLE:
20 #    doit arg presumably does something with "arg".
21 #
22 # AUTHOR:
23 #    Based on Free Software Foundation configure scripts.
24 #    Your name <your@email.addr>
25 #    Your company, Inc.
26
27 PATH=/bin:/usr/sbin:/usr/bin:/usr/local/bin
28 export PATH
29 umask 022
30 tag=`basename $0`
31
32 # ======================== FUNCTIONS =============================
33 # die: prints an optional argument to stderr and exits.
34
35 die () {
36     echo "$tag: error: $*" 1>&2
37     exit 1
38 }
39
40 # usage: prints an optional string plus part of the comment
41 # header (if any) to stderr, and exits with code 1.
42
43 usage () {
44     lines=`egrep -n '^# (NAME|AUTHOR)' $0 | sed -e 's/:.*//'`
45
46     (
47         case "$#" in
48             0) ;;
49             *) echo "usage error: $*"; echo ;;
50         esac
51
52         case "$lines" in
53             "") ;;
54             *)  set `echo $lines | sed -e 's/ /,/'`
55                 sed -n ${1}p $0 | sed -e 's/^#//g' |
56                     egrep -v AUTHOR:
57                 ;;
58         esac
59     ) 1>&2
60
61     exit 1
62 }
63
64 # version: prints the current version to stdout.
65
66 version () {
67     lsedscr='s/RCSfile: //
68 s/.Date: //
69 s/,v . .Revision: / v/
70 s/\$//g'
71
72     lrevno='$RCSfile: doit,v $ $Revision: 1.5 $'
73     lrevdate='$Date: 2001/08/04 21:44:39 $'
74     echo "$lrevno $lrevdate" | sed -e "$lsedscr"
75 }
76
77 # ======================== MAIN PROGRAM ==========================
78
79 ac_invalid="invalid option; use -h to show usage"
80 argv=
81
82 for ac_option; do
83     case "$ac_option" in
84         -h) usage ;;
85         -v) version; exit 0 ;;
86         -*) die "$ac_option: $ac_invalid" ;;
87
88         *)  case "$argv" in
89                 "") argv="$ac_option" ;;
90                 *)  argv="$argv $ac_option" ;;
91             esac ;;
92     esac
93 done
94
95 # Real work starts here.
96
97 echo "Arguments: $argv"
98 test -f "$argv" || die "$argv: not a file"
99 exit 0
To get the version:
% doit -v
doit v1.5 2001/08/04 21:44:39
Here's the usage information:
% doit -h

 NAME:
    doit

 SYNOPSIS:
    doit [-hv] [pattern]

 DESCRIPTION:
    Some blather here about what this script does.

 OPTIONS:
    -h   print this message
    -v   print the version and exit

 EXAMPLE:
    doit arg presumably does something with "arg".
The die function (lines 35-38) makes it easy
to write tests like the one at line 98, instead of
messing around with an if-then block.
The usage function (lines 43-62) reads the
comment header, skips to the line holding NAME,
and prints everything until the line holding
AUTHOR to stderr.
The version function (lines 66-75) prints the
program name, version, and last checkin time. Since the
version information is kept by RCS, I don't have to do
anything but make sure I check my changes in.
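The checkin is the whole workflow; assuming the script lives
under RCS (an RCS subdirectory next to it), something like
this bumps the revision and re-expands the keywords in
place:

% ci -l -m'updated the usage text' doit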
Locate and xargs
I was looking for a certain CSS stylesheet, and since
I update my locate database every night, all I
needed was a one-liner. This looks through my notebook
files for any HTML documents, and checks each one for a
stylesheet entry:
% locate $HOME/notebook | grep '\.htm$' | xargs grep rel=.stylesheet
I was also looking for an example of how to create a
solid border:
% locate $HOME/notebook | grep '\.css$' | xargs grep solid
/home/vogelke/notebook/2005/0704/2col.css: border-left: 1px solid gray;
/home/vogelke/notebook/2005/0704/3col.css: border: 1px solid gray;
...
The locate databases on our fileservers are
also updated every night, so it's easy to tell if
someone's deleted something since yesterday. This comes
in handy when there's an irate customer on the phone; if
I can find their files using locate, it means
the files were on the system as of late yesterday, so the
customer can find out if someone in their workgroup did
something clever this morning.
Either someone fesses up to deleting the files, or
they've been moved; usually, the mover thought they were
putting the files in one folder, and they ended up either
hitting the parent folder by mistake or creating an
entirely new folder somewhere else. A quick find
will often fix this without my having to restore
anything.
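For example, a hypothetical hunt for a report that vanished
this morning (the path and filename here are made up) might
look like:

% find /export/home/workgroup -name 'status-report*' -mtime -2 -print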
If I can't find the files using locate, they
were zapped at least a day or two ago, which generally
means a trip to the backup server.
Shell aliases for process control
I spend most of my time in an xterm flipping around
between programs, and it's nice to be able to suspend and
restart jobs quickly. On my workstation, I always have
emacs plus a shell running as root as the first two jobs.
Under the Z-shell, I use "j" as an alias for "jobs
-dl":
% j
[1] 92178 suspended sudo ksh
(pwd : ~)
[2] - 92188 suspended emacs
(pwd : ~)
[3] + 96064 suspended vi 003-shell-alias.mkd
(pwd : ~/notebook/2006/0618/newsforge-article)
This way, I get the process IDs (in case something
gets wedged) plus the working directories for each
process.
The Z-shell lets you bring a job to the foreground by
typing a percent-sign followed by the job number. I hate
typing two characters when one's enough, so these aliases
are convenient:
alias 1='%1'
alias 2='%2'
alias 3='%3'
alias 4='%4'
alias 5='%5'
alias 6='%6'
alias 7='%7'
alias 8='%8'
alias 9='%9'
alias z='suspend'
I can type '1' to become root, quickly check
something, and then just type 'z' to become me again.
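If you'd rather not maintain nine nearly identical lines, a
short loop in ~/.zshrc builds the same aliases:

for i in {1..9}; do
    alias $i="%$i"
done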
Tools to keep a sitelog
I learned the hard way (several times) that messing
with a server and neglecting to write down what you did
can easily screw up an entire weekend.
My first few attempts at writing a site-logging
program weren't terribly successful. I've been working
with the Air Force for nearly 25 years, and when someone
from the federal government tells you that you
tend to over-design things, your process clearly needs a
touchup. A basic text file with time-stamped entries
solves 90% of the problem with about 10% of the
effort.
The sitelog file format is pretty simple: think of a
weblog with all the entries jammed together in ascending
time order. Timestamp lines are left-justified;
everything else has at least 4 leading spaces or a
leading tab. Code listings and program output are
delimited by dashed lines ending with a single capital
'S' or 'E', for start and end. The whole idea was to be
able to write a Perl parser for this in under an
hour.
Here's an example, created when I installed Berkeley
DB. I've always used LOG for the filename,
mainly because README was already taken.
BEGINNING OF LOG FOR db-4.4.20 ======================================

Fri, 23 Jun 2006 19:34:15 -0400 Karl Vogel (vogelke at myhost)

    To build:
    https://localhost/mis/berkeley-db/ref/build_unix/intro.html

    --------------------------------------------------------------S
    me% cd build_unix
    me% CC=gcc CFLAGS="-O" ../dist/configure --prefix=/usr/local
    installing in /usr/local
    checking build system type... sparc-sun-solaris2.8
    checking host system type... sparc-sun-solaris2.8
    [...]
    config.status: creating db.h
    config.status: creating db_config.h
    me% make
    /bin/sh ./libtool --mode=compile gcc -c -I. -I../dist/..
        -D_REENTRANT -O2 ../dist/../mutex/mut_pthread.c
    [...]
    creating db_verify
    /bin/sh ./libtool --mode=execute true db_verify
    --------------------------------------------------------------E

Fri, 23 Jun 2006 20:32:34 -0400 Karl Vogel (vogelke at myhost)

    Install:

    --------------------------------------------------------------S
    root# make install_setup install_include install_lib install_utilities
    Installing DB include files: /usr/local/include ...
    Installing DB library: /usr/local/lib ...
    [...]
    cp -p .libs/db_verify /usr/local/bin/db_verify
    --------------------------------------------------------------E
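Writing the parser really is quick work. As a sketch, this
awk one-liner counts the entries in a LOG file using nothing
but the left-justified timestamp rule (every timestamp
starts with a three-letter day name):

% awk '/^[A-Z][a-z][a-z], [0-9]/ { n++ } END { print n, "entries" }' LOG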
These scripts do most of the heavy lifting:

- timestamp: writes a line holding the current time in
  ARPA-standard format, your full name, your userid, and
  the name of the host you're on. A short version is
  included below; I have a longer one that can parse most
  date formats and return a line with that time instead of
  the current time.

- remark: starts VIM on the LOG file and puts me at the
  last line so I can append entries. The 'v' key (often
  unused) is mapped to call timestamp and append its output
  after the current line. (A sketch appears after the
  timestamp script below.)

- mfmt: originally stood for "make format"; intended to
  take the output of make (or any program), break up and
  indent long lines to make them more readable, and wrap
  the whole thing in dashed lines ending with 'S' and 'E'.

- site2html: read a LOG file and generate a decent-looking
  webpage, like this one.

- log2troff: read a LOG file and generate something that
  looks good on paper.
Here's a short version of timestamp:
#!/bin/sh
PATH=/usr/local/bin:/bin:/usr/bin; export PATH
name=`grep "^$USER:" /etc/passwd | cut -f5 -d:`
host=`hostname | cut -f1 -d.`
exec date "+%n%n%a, %d %b %Y %T %z $name ($USER at $host)%n"
exit 0
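And here's a bare-bones sketch of remark; the real one does
a bit more, and the mapping syntax assumes vim's default
settings:

#!/bin/sh
# remark: edit the sitelog, map 'v' to read in a fresh
# timestamp below the cursor, and jump to the last line.
exec vim '+map v :r !timestamp<CR>' '+normal G' LOG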
If I know I've logged something, it's also nice to be
able to do something like
me% locate LOG | xargs grep something
The w3m text web browser
w3m is a text-based web browser which does a wonderful
job of rendering HTML tables correctly. If I want a
halfway-decent text-only copy of a webpage that includes
tables, I run a script that calls wget to fetch the HTML
page and then w3m to render it:
1 #!/bin/ksh
2 # Fetch files via wget, w3m. Usage: www URL
3
4 PATH=/usr/local/bin:$PATH
5 export PATH
6
7 die () {
8     echo "$*" >&2
9     exit 1
10 }
11
12 #
13 # Don't go through a proxy server for local hosts.
14 #
15
16 case "$1" in
17     "")      die "usage: $0 url" ;;
18     *local*) opt="--proxy=off $1" ;;
19     http*)   opt="$1" ;;
20     ftp*)    opt="$1" ;;
21 esac
22
23 #
24 # Fetch the URL back to a temporary file using wget, then render
25 # it using w3m: better support for tables.
26 #
27
28 tfile="wget.$RANDOM.$$"
29 wget -F -O $tfile $opt
30 test -f $tfile || die "wget failed"
31
32 #
33 # Set the output width from the environment.
34 #
35
36 case "$WCOLS" in
37     "") cols=70 ;;
38     *)  cols="$WCOLS" ;;
39 esac
40
41 w3m="/usr/local/bin/w3m -no-graph -dump -T text/html -cols $cols"
42 result="w3m.$RANDOM.$$"
43 $w3m $tfile > $result
44
45 test -f "$result" && $EDITOR $result
46 rm -f $tfile
47 exit 0
Line 18 lets me specify URLs on the local subnet which
should not go through our proxy server; traffic through
that server is assumed to be coming from the outside
world, which requires a username and password.
Lines 28 and 42 create safe temporary files by taking
advantage of the Korn shell's ability to generate random
numbers.
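You can see the kind of names this produces from an
interactive Korn shell (the numbers will differ on every
run):

% echo "wget.$RANDOM.$$"
wget.26012.19263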
I call wget on line 29, using the -F option
to force any input to be treated as HTML. The -O option
lets me pick the output filename. You might be able to
use w3m to do everything, but here it seems to
have some problems with the outgoing proxies (which I
don't control), and wget doesn't.
Lines 36-39 let me specify the output width as an
environment variable:
% WCOLS=132 www http://some.host/url
would give me wider output for landscape printing.
When w3m returns, you're placed in an editor in
case you want to make any final touchups. After you exit
the editor, you should have a new file in the current
directory named something like
w3m.19263.26012.
Dealing with different archive formats
I got fed up with remembering how to deal with
archives that might be tar files, zip files, compressed,
frozen, gzipped, bzipped, or whatever bizarre format
comes along next. Three short scripts take care of that
for me:
- tc: shows the contents of an archive file
- tcv: shows the verbose contents of an archive file
- tx: extracts the contents of an archive file in the
  current directory
tc and tcv are hard-linked
together:
#!/bin/sh
# tc: check a gzipped archive file
# if invoked as "tcv", print verbose listing.

case "$#" in
    0) exit 1 ;;
    *) file="$1" ;;
esac

name=`basename $0`
case "$name" in
    tcv) opt='tvf' ;;
    *)   opt='tf' ;;
esac

case "$file" in
    *.zip)    exec unzip -lv "$file" ;;
    *.tgz)    exec gunzip -c "$file" | tar $opt - ;;
    *.bz2)    exec bunzip2 -c "$file" | tar $opt - ;;
    *.tar.gz) exec gunzip -c "$file" | tar $opt - ;;
    *.tar.Z)  exec uncompress -c "$file" | tar $opt - ;;
    *)        exec tar $opt "$file" ;;
esac
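Creating the second name is a one-time hard link:

% ln tc tcv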
tx is very similar:
#!/bin/sh
# tx: extract a gzipped archive file

case "$#" in
    0) exit 1 ;;
    *) file="$1"; pat="$2" ;;
esac

case "$file" in
    *.zip)    exec unzip -a "$file" ;;
    *.tgz)    exec gunzip -c "$file" | tar xvf - $pat ;;
    *.bz2)    exec bunzip2 -c "$file" | tar xvf - $pat ;;
    *.tar.gz) exec gunzip -c "$file" | tar xvf - $pat ;;
    *)        exec tar xvf "$file" $pat ;;
esac
Z-shell and Bash aliases
I've tried bash and tcsh, but the Z-shell is definitely my
favorite. Here are some of my aliases:
To view command-line history:

    h        fc -l 1 | less
    history  fc -l 1

To check the tail end of the syslog file:

    syslog   less +G /var/log/syslog

To beep my terminal when a job's done (i.e.,
/run/long/job && yell):

    yell     echo done | write $LOGNAME

To quickly find all the directories or executables in
the current directory:

    d        /bin/ls -ld *(-/)
    x        ls -laF | fgrep "*"

For listing dot-files:

    dot      ls -ldF .[a-zA-Z0-9]*

Largest files shown first or last:

    lsl      ls -ablprtFT | sort -n +4
    lslm     ls -ablprtFT | sort -n +4 -r | less

Smallest files shown first or last:

    lss      ls -ablprtFT | sort -n +4 -r
    lssm     ls -ablprtFT | sort -n +4 | less

Files sorted by name:

    lsn      ls -ablptFT | sort +9
    lsnm     ls -ablptFT | sort +9 | less

Newly-modified files shown first or last:

    lst      ls -ablprtFT
    lstm     ls -ablptFT | less

Converting decimal to hex and back:

    d2h      perl -e 'printf qq|%X\n|, int( shift )'
    h2d      perl -e 'printf qq|%d\n|, hex( shift )'
Most of these aliases (except for the fc stuff) work
just fine in bash, with a few minor formatting tweaks.
Some examples:
alias 1='%1'
alias 2='%2'
alias 3='%3'
alias 4='%4'
alias 5='%5'
alias 6='%6'
alias 7='%7'
alias 8='%8'
alias 9='%9'
alias d2h='perl -e "printf qq|%X\n|, int(shift)"'
alias d='(ls -laF | fgrep "/")'
alias dot='ls -ldF .[a-zA-Z0-9]*'
alias h2d='perl -e "printf qq|%d\n|, hex(shift)"'
alias h='history | less'
alias j='jobs -l'
alias p='less'
alias x='ls -laF | fgrep "*"'
alias z='suspend'
If you want to pass arguments to an alias, it might be
easier to use a function. For example, I use mk
to make a new directory with mode 755, regardless of my
umask setting. The $* will be replaced by whatever
arguments you pass:
mk () {
    mkdir $*
    chmod 755 $*
}
You can use seq to generate sequences, like
10 to 20:
seq () {
    local lower upper output;
    lower=$1 upper=$2;
    while [ $lower -le $upper ];
    do
        output="$output $lower";
        lower=$[ $lower + 1 ];
    done;
    echo $output
}
Sample use:
% seq 10 20
10 11 12 13 14 15 16 17 18 19 20
Functions can call other functions. For example, if
you want to repeat a given command some number of
times:
repeat () {
    local count="$1" i;
    shift;
    for i in $(seq 1 "$count");
    do
        eval "$@";
    done
}
Sample use:
% repeat 10 'date; sleep 1'
Wed Jul 5 21:29:18 EDT 2006
Wed Jul 5 21:29:19 EDT 2006
Wed Jul 5 21:29:20 EDT 2006
Wed Jul 5 21:29:21 EDT 2006
Wed Jul 5 21:29:22 EDT 2006
Wed Jul 5 21:29:23 EDT 2006
Wed Jul 5 21:29:24 EDT 2006
Wed Jul 5 21:29:25 EDT 2006
Wed Jul 5 21:29:26 EDT 2006
Wed Jul 5 21:29:27 EDT 2006
Using PGP to create a password safe
How many different passwords do you have to remember,
and how often do you have to change them? Lots of
organizations seem to believe that high change frequency
makes a password safe, even if the one you ultimately
pick is only three characters long or your name spelled
backwards.
PGP or the GNU
Privacy Guard can help you safely keep track of
dozens of nice, long passwords, even if you have to
change them weekly. There are several commercial packages
which serve as password safes, but PGP is free, and all
you need is a directory with one script to encrypt your
password list and one to decrypt it.
The most important thing to remember: DO
NOT use the password for your safe for anything
else!
I use GNU Privacy Guard for encryption, but any strong
crypto will do. You can set up your own private/public
key in just a few minutes by following the directions in
the GNU
Privacy Handbook. Let's say you put your passwords in
the file "pw". Follow these steps to create a GPG
public/private keypair and encrypt the password file:
Generate a keypair
% gpg --gen-key
gpg (GnuPG) 1.4.1; Copyright (C) 2005 Free Software Foundation, Inc.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. See the file COPYING for details.
Please select what kind of key you want:
(1) DSA and Elgamal (default)
(2) DSA (sign only)
(5) RSA (sign only)
Your selection? [hit return]
DSA keypair will have 1024 bits.
ELG-E keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) [hit return]
Requested keysize is 2048 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) [hit return]
Key does not expire at all
Is this correct? (y/N) y
You need a user ID to identify your key; the software constructs the
user ID from the Real Name, Comment and Email Address in this form:
"Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"
Real name: Your Name
Email address: yourid@your.host.com
Comment:
You selected this USER-ID:
"Your Name <yourid@your.host.com>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
You need a Passphrase to protect your secret key.
[enter your passphrase]
Generate a revocation certificate in case you forget your passphrase or your key's been compromised
% gpg --output revoke.asc --gen-revoke "Your Name"
sec 1024D/B3D36900 2006-06-27 Your Name <yourid@your.host.com>
Create a revocation certificate for this key? (y/N) y
Please select the reason for the revocation:
0 = No reason specified
1 = Key has been compromised
2 = Key is superseded
3 = Key is no longer used
Q = Cancel
(Probably you want to select 1 here)
Your decision? 1
Enter an optional description; end it with an empty line:
> Revoking my key just in case it gets lost
>
Reason for revocation: Key has been compromised
Revoking my key just in case it gets lost
Is this okay? (y/N) y
You need a passphrase to unlock the secret key for
user: "Your Name <yourid@your.host.com>"
1024-bit DSA key, ID B3D36900, created 2006-06-27
ASCII armored output forced.
Revocation certificate created.
Your revocation key is now in the file
revoke.asc. Store it on a medium which you can
hide; otherwise someone can use it to render your key
unusable.
Export your public key
% gpg --armor --output public.gpg --export yourid@your.host.com
will store your public key in public.gpg, if
you want to put it on your website or mail it.
Encrypt the pw file
% gpg --armor --output pw.gpg --encrypt --recipient yourid@your.host.com pw
will encrypt the pw file as pw.gpg.
To decrypt it, you must include your own key in the
--recipient list.
Test decrypting the pw file
% gpg --output testpw --decrypt pw.gpg
You need a passphrase to unlock the secret key for
user: "Your Name <yourid@your.host.com>"
2048-bit ELG-E key, ID 19DF3967, created 2006-06-27 (main key ID B3D36900)
Enter passphrase:
gpg: encrypted with 2048-bit ELG-E key, ID 19DF3967, created 2006-06-27
"Your Name <yourid@your.host.com>"
The file testpw should be identical to
pw, or something's wrong.
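cmp(1) makes a quick check:

% cmp pw testpw && echo files match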
I use one script with two hardlinks for reading and
updating passwords. When invoked as readp, the
script decrypts my password safe; after I finish checking
or editing the decrypted file, updatep encrypts
it.
#!/bin/ksh
# read or encrypt a file.
# use with GPG v1.4.1 or better.

PATH=/bin:/usr/bin:/usr/sbin:/usr/local/bin
export PATH
name=`basename $0`

case "$1" in
    "") file="pw" ;;
    *)  file=$1 ;;
esac

# clear = plaintext file.
# enc   = ascii-armor encrypted file.

case "$file" in
    *.gpg) enc=$file
           clear=`echo $file | sed -e 's/.gpg$//g'`
           ;;

    *)     clear="$file"
           enc="$file.gpg"
           ;;
esac

case "$name" in
    "readp")
        if test -f "$enc"
        then
            gpg --output $clear --decrypt $enc
        else
            echo "encrypted file $enc not found"
        fi
        ;;

    "updatep")
        if test -f $clear
        then
            # keep one backup of the old ciphertext
            # (skipped on the very first run).
            test -f $enc && mv $enc $enc.old
            gpg --armor --output $enc --encrypt \
                --recipient yourid@your.host.com $clear && rm $clear
        else
            echo "cleartext file $clear not found"
        fi
        ;;
esac

exit 0
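As with tc and tcv, the second name is just a hard link;
assuming the script was saved as readp:

% ln readp updatep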
The mutt mail-reader
Mutt is very
useful for taking a quick look at a mailbox or
correctly sending messages with attachments from
the command line; there's more to it than just
concatenating a few files together and piping the results
to mail.
If you poke around Google for a while, you can find
many setups that make mutt quite suitable for general
mail-handling. Dave
Pearson's site has some great configuration
files.
Setting up a full-text index for code and documents
I started trying to index my files for fast lookup
back when WAIS was all the rage; I also tried Glimpse and Swish-e, neither of
which really did the trick for me.
The QDBM, Estraier, and Hyper Estraier programs are
without a doubt the best full-text index and search
programs I've ever used. They're faster and less
memory-intensive than any version of Swish, and the
Hyper Estraier package includes an excellent CGI program
which lets you do things like search for similar
files.
Keeping a copy of my browser history
Having command-line access to my browser links from
any given day has occasionally been helpful. I know
Mozilla and Firefox store your history for you, but it's
either for a limited time, or you end up with the history
logfile from hell. If I have logfiles that are updated on
the fly, I'd rather keep them relatively small.
The biggest advantage is being able to search my
browser history using the same interface as I use for my
regular files (Estraier), as well as standard
command-line tools. I keep my working files in dated
folders, and I was recently looking for something I did
in June on the same day that I looked up some outlining
sites:
% locate browser-history | xargs grep -i outlin
.../2006/0610/browser-history: 19:07:27 http://webservices.xml.com/pub/a/ws/2002/04/01/outlining.html
.../2006/0610/browser-history: 19:07:27 http://www.oreillynet.com/pub/a/webservices/2002/04/01/outlining.html
.../2006/0610/browser-history: 19:08:57 http://radio.weblogs.com/0001015/instantOutliner/daveWiner.opml
.../2006/0610/browser-history: 19:10:27 http://www.deadlybloodyserious.com/instantOutliner/garthKidd.opml
.../2006/0610/browser-history: 19:10:43 http://www.decafbad.com/deus_x/radio/instantOutliner/l.m.orchard.opml
.../2006/0610/browser-history: 19:10:47 http://radio.weblogs.com/0001000/instantOutliner/jakeSavin.opml
Here's a
perl script by Jamie Zawinski which parses the
Mozilla history file. According to Jamie, the history
format is "just about the stupidest file format I've
ever seen", and after trying to write my own parser
for it, I agree.
The cron script below is run every night at 23:59 to
store my browser history (minus some junk) in my
notebook.
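The crontab entry looks something like this, assuming the
script is installed as ~/bin/mozhist:

59 23 * * * /home/vogelke/bin/mozhist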
1 #!/bin/sh
2 # mozhist: save mozilla history for today
3
4 PATH=/bin:/usr/bin:/usr/local/bin:$HOME/bin
5 export PATH
6 umask 022
7
8 # your history file.
9 hfile="$HOME/.mozilla/$USER/nwh6n09i.slt/history.dat"
10
11 # sed script
12 sedscr='
13 s/\/$//
14 /view.atdmt.com/d
15 /ad.doubleclick.net/d
16 /tv.yahoo.com/d
17 /adq.nextag.com\/buyer/d
18 '
19
20 # remove crap like trailing slashes, doubleclick ads, etc.
21 set X `date "+%Y %m %d"`
22 case "$#" in
23     4) yr=$2; mo=$3; da=$4 ;;
24     *) exit 1 ;;
25 esac
26
27 dest="$HOME/notebook/$yr/${mo}${da}"
28 test -d "$dest" || exit 2
29
30 exec mozilla-history $hfile |          # get history...
31     sed -e "$sedscr" |                 # ... strip crap ...
32     sort -u |                          # ... remove duplicates ...
33     tailocal |                         # ... change date to ISO ...
34     grep "$yr-$mo-$da" |               # ... look for today ...
35     cut -c12- |                        # ... zap the date ...
36     cut -f1,3 |                        # ... keep time and URL ...
37     expand -1 > $dest/browser-history  # ... and store
38
39 exit 0
mozilla-history (line 30) is Jamie's perl
script.
tailocal (line 33) is a program written by
Dan Bernstein which reads lines timestamped with the raw
Unix date, and writes them with an ISO-formatted date
like so:
% echo 1151637537 howdy | tailocal
2006-06-29 23:18:57 howdy
If you don't have tailocal, here's a short Perl
equivalent:
#!/usr/bin/perl
use POSIX qw(strftime);

while (<>) {
    if (m/(\d+)\s(.*)/) {
        print strftime("%Y-%m-%d %T ", localtime($1)), "$2\n";
    }
}
exit(0);
The resulting file has entries for one day which look
like this:
15:55:27 http://mediacast.sun.com/share/bobn/SMF-migrate.pdf
16:02:36 http://www.sun.com/bigadmin/content/selfheal
Reading whitespace-delimited fields
At least once a day, I need the third or fourth column
of words from either an existing file or the output of a
program. It's usually something simple like checking the
output from ls -lt, weeding a few things out by
eye, and then getting just the filenames for use
elsewhere.
I use one script with nine hard-links:
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f1*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f2*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f3*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f4*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f5*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f6*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f7*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f8*
-rwxr-xr-x 9 vogelke vogelke 546 Oct 1 2003 f9*
The script is just a wrapper for awk:
#!/bin/sh
# print space-delimited fields.

PATH=/bin:/usr/bin; export PATH
tag=`basename $0`

case "$tag" in
    f1) exec awk '{print $1}' ;;
    f2) exec awk '{print $2}' ;;
    f3) exec awk '{print $3}' ;;
    f4) exec awk '{print $4}' ;;
    f5) exec awk '{print $5}' ;;
    f6) exec awk '{print $6}' ;;
    f7) exec awk '{print $7}' ;;
    f8) exec awk '{print $8}' ;;
    f9) exec awk '{print $9}' ;;
    *)  ;;
esac
f3 gets the third field, etc.
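A short loop creates the other eight links, and a typical
use pulls just the filenames out of a long listing:

% for i in 2 3 4 5 6 7 8 9; do ln f1 f$i; done
% ls -lt | f9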
Using ifile for spam control
If you're still plagued by spam, or you need a generic
method of categorizing text files, have a look at
ifile.
It's one of many "Bayesian mail filters", but unlike
bogofilter and
spamassassin, it
can do n-way filtering rather than simply spam vs.
non-spam.
Author bio
Karl is a Solaris/BSD system administrator at
Wright-Patterson Air Force Base, Ohio.
He graduated from Cornell University with a BS in
Mechanical and Aerospace Engineering, and joined the Air
Force in 1981. After spending a few years on DEC and IBM
mainframes, he became a contractor and started using
Berkeley Unix on a Pyramid system.
He likes FreeBSD, trashy supermarket tabloids, Perl,
cats, teen-angst TV shows, and movies.