  1. Introduction
  2. Feedback

1. Introduction

I had a bunch of files in both compressed and uncompressed form, and no way of knowing whether their contents matched. Since it might happen again, I wrote a script.

Here's what the directory looked like:

-rw-r--r-- 1 bin  bin 18896387 27-Oct-2008 03:47:00 file01
-rw-r--r-- 1 bin  bin  3198352 27-Oct-2008 03:47:00 file01.xz
-rw-r--r-- 1 bin  bin 18694752 27-Oct-2008 02:35:44 file02
-rw-r--r-- 1 bin  bin  2704780 27-Oct-2008 02:35:44 file02.xz
-rw-r--r-- 1 bin  bin 19062396 27-Oct-2008 02:01:24 file03
-rw-r--r-- 1 bin  bin  3225536 27-Oct-2008 02:01:24 file03.xz
-rw-r--r-- 1 bin  bin 10717201 27-Oct-2008 02:25:19 file04
-rw-r--r-- 1 bin  bin  1561900 27-Oct-2008 02:25:19 file04.xz
-rw-r--r-- 1 bin  bin 11261877 27-Oct-2008 02:20:34 file05
-rw-r--r-- 1 bin  bin  1213712 27-Oct-2008 02:20:34 file05.xz
-rw-r--r-- 1 bin  bin  9339640 27-Oct-2008 02:39:59 file06
-rw-r--r-- 1 bin  bin  1798632 27-Oct-2008 02:39:59 file06.xz

You can run the script using either the compressed or the regular filenames. It'll figure out which counterpart it needs, compare the MD5 checksum of the uncompressed file with the checksum of the decompressed contents of the .xz file, and write an "rm" command to stdout if it's safe to remove the uncompressed version:

me% compare-compressed file01 file02 file03
rm file01
rm file02
rm file03
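
For readers who want to see the idea in code, here is a minimal sketch of the comparison in Python. It is not the script itself: the function names (md5_of_stream, safe_to_remove) are made up for illustration, and it assumes the compressed copy is always named by appending ".xz" to the regular filename, as in the listing above.

```python
import hashlib
import lzma
import os
import shlex
import sys


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 of a binary stream, read in chunks to limit memory use."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def safe_to_remove(name):
    """Return the regular filename if it matches its .xz copy, else None.

    Accepts either the regular or the compressed filename (hypothetical helper).
    """
    # Assumption: the compressed copy is the regular name plus ".xz".
    plain = name[:-3] if name.endswith(".xz") else name
    packed = plain + ".xz"
    if not (os.path.isfile(plain) and os.path.isfile(packed)):
        return None
    with open(plain, "rb") as f:
        plain_sum = md5_of_stream(f)
    # lzma.open decompresses on the fly, so this hashes the original contents.
    with lzma.open(packed, "rb") as f:
        packed_sum = md5_of_stream(f)
    return plain if plain_sum == packed_sum else None


def main(argv):
    for name in argv:
        plain = safe_to_remove(name)
        if plain is not None:
            # Only print the command; nothing is deleted here.
            print("rm", shlex.quote(plain))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Like the real script, the sketch only prints "rm" commands, so you can review the output before piping it to a shell.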

You can find the script here.

2. Feedback

Feel free to send comments.

