I was in the unfortunate position of having a bunch of files in both compressed and uncompressed form, and not knowing if their contents matched. Since it might happen again, I wrote a script.
Here's what the directory looked like:
    -rw-r--r-- 1 bin bin 18896387 27-Oct-2008 03:47:00 file01
    -rw-r--r-- 1 bin bin  3198352 27-Oct-2008 03:47:00 file01.xz
    -rw-r--r-- 1 bin bin 18694752 27-Oct-2008 02:35:44 file02
    -rw-r--r-- 1 bin bin  2704780 27-Oct-2008 02:35:44 file02.xz
    -rw-r--r-- 1 bin bin 19062396 27-Oct-2008 02:01:24 file03
    -rw-r--r-- 1 bin bin  3225536 27-Oct-2008 02:01:24 file03.xz
    -rw-r--r-- 1 bin bin 10717201 27-Oct-2008 02:25:19 file04
    -rw-r--r-- 1 bin bin  1561900 27-Oct-2008 02:25:19 file04.xz
    -rw-r--r-- 1 bin bin 11261877 27-Oct-2008 02:20:34 file05
    -rw-r--r-- 1 bin bin  1213712 27-Oct-2008 02:20:34 file05.xz
    -rw-r--r-- 1 bin bin  9339640 27-Oct-2008 02:39:59 file06
    -rw-r--r-- 1 bin bin  1798632 27-Oct-2008 02:39:59 file06.xz
You can run the script with either the compressed or the uncompressed filenames. It works out the name of the matching counterpart, uses MD5 to compare the contents of the uncompressed file with the decompressed contents of the compressed one, and writes an "rm" command to stdout if it's safe to remove the uncompressed version:
    me% compare-compressed file01 file02 file03
    rm file01
    rm file02
    rm file03
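If you'd like to see the idea without fetching the script, here is a minimal Python sketch of the same approach. It is not the original script, just an illustration under a few assumptions: the compressed files always end in ".xz", Python's standard lzma module does the decompression, and the names md5_of_stream and safe_to_remove are made up for this example. Quoting the filename with shlex.quote is also my addition, so that odd filenames still produce a usable "rm" line.

```python
import hashlib
import lzma
import os
import shlex
import sys


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of everything readable from stream."""
    digest = hashlib.md5()
    # Read in chunks so large files are never held in memory all at once.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def safe_to_remove(name, suffix=".xz"):
    """Return the uncompressed filename if its contents match the
    decompressed contents of its compressed twin, otherwise None.

    Hypothetical helper: accepts either the plain or the compressed name.
    """
    plain = name[:-len(suffix)] if name.endswith(suffix) else name
    packed = plain + suffix
    if not (os.path.isfile(plain) and os.path.isfile(packed)):
        return None
    try:
        with open(plain, "rb") as f:
            plain_sum = md5_of_stream(f)
        # lzma.open decompresses on the fly, so we hash the original data,
        # not the compressed bytes.
        with lzma.open(packed, "rb") as f:
            packed_sum = md5_of_stream(f)
    except (OSError, EOFError, lzma.LZMAError):
        # Unreadable or corrupt file: never claim it is safe to remove.
        return None
    return plain if plain_sum == packed_sum else None


def main(argv):
    for name in argv:
        target = safe_to_remove(name)
        if target is not None:
            # Print the command instead of running it, so it can be reviewed.
            print("rm", shlex.quote(target))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Like the script, the sketch only prints the "rm" commands; nothing is deleted until you look over the output and pipe it to a shell yourself.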
You can find the script here.
Feel free to send comments.