Message-ID: <4DBD0D23.1080903@hardwarefreak.com>
Date: Sun, 01 May 2011 02:34:59 -0500
From: Stan Hoeppner <stan@hardwarefreak.com>
To: debian-user@lists.debian.org
Subject: Re: file systems
References: <87wriqjd0t.fsf@towardsfreedom.com>
 <4DAE0FF0.5070805@hardwarefreak.com>	<4DB513E5.5030902@cox.net>
 <4DB5CC72.4070106@hardwarefreak.com>	<4DB5EDAB.6010706@cox.net>
 <4DB60919.6080403@hardwarefreak.com>	<4DB60C9E.4090104@cox.net>
 <4DB630D1.9040005@hardwarefreak.com>	<4DB646ED.8080909@cox.net>
 <4DB67726.1030304@hardwarefreak.com>	<4DB6D6C0.2090704@cox.net>
 <4DB73CB2.5050605@hardwarefreak.com>	<4DB749DE.8030307@cox.net>
 <BANLkTim5VUcUvrm5doFfdoY9dMBooxQLUQ@mail.gmail.com>
 <4DBCD6D2.9090809@hardwarefreak.com>
 <BANLkTi=-CUY-oMqkfgOu4RcAsUcV7ifzeQ@mail.gmail.com>
In-Reply-To: <BANLkTi=-CUY-oMqkfgOu4RcAsUcV7ifzeQ@mail.gmail.com>

On 4/30/2011 11:48 PM, shawn wilson wrote:

> i'm interested in not seeing unsubstantiated opinion on a technical
> mailing list.

That 'opinion' is based, in part, on the following facts, many of which are
in my previous posts to this list.  If you would like, to avoid expressing
'opinion' in the future, I could simply paste the following huge block of
text into every email dealing with XFS, instead of using shorthand subjective
phrases such as 'XFS is the overall best Linux FS'.  The following, and
additional evidence freely available, demonstrate this 'opinion' to be fact.

The three US National Nuclear Security Administration (NNSA) labs, LANL, LLNL,
and Sandia, along with Oak Ridge, NASA Ames, and the US Air Force Research
Laboratory in Dayton, Ohio, have all used, or still use, XFS and/or CXFS on
large scale storage, totaling dozens of petabytes of XFS disk.

NASA Ames has been using XFS for 16+ years, and still does, on the (originally)
10,240-processor Columbia supercomputer and on its archival servers.  They're
currently running an 800TB CXFS filesystem on SAN storage, and local XFS
filesystems on 215TB, 175TB, and 65TB of direct fiber-attached storage.

http://www.nas.nasa.gov/Resources/Systems/columbia.html
http://www.nas.nasa.gov/Resources/Systems/archive_storage.html

Professor Stephen Hawking's research group has used four generations of SGI
supercomputers spanning 14 years, running cosmology simulations in support of
Dr. Hawking's theories; each machine, as with all SGI supers, runs XFS:
http://www.damtp.cam.ac.uk/cosmos/hardware/

Linux Kernel Archives said:

"A bit more than a year ago (as of October 2008) kernel.org, in an ever
increasing need to squeeze more performance out of its machines, made the
leap of migrating the primary mirror machines (mirrors.kernel.org) to XFS.
We cite a number of reasons including fscking 5.5T of disk is long and painful,
we were hitting various cache issues, and we were seeking better performance
out of our file system."

"After initial tests looked positive we made the jump, and have been
quite happy with the results.  With an instant increase in performance
and throughput, as well as the worst xfs_check we've ever seen taking 10
minutes, we were quite happy.  Subsequently we've moved all primary mirroring
file-systems to XFS, including www.kernel.org , and mirrors.kernel.org.
With an average constant movement of about 400mbps around the world, and
with peaks into the 3.1gbps range serving thousands of users simultaneously
it's been a file system that has taken the brunt we can throw at it and held
up spectacularly."

The kernel code running on your system, Shawn, was originally served from an
XFS filesystem.  The Debian kernel team gets their upstream tarball from
kernel.org like everyone else, served up by XFS.  If this fact doesn't carry
weight with a Linux user, I don't know what would...

A very interesting XFS research paper from a few years ago, authored by two
of the principal XFS developers:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.114.1918&rep=rep1&type=pdf

Independent Linux filesystem tests performed by an IBM engineer to track
BTRFS performance during development.  XFS trounces the others in most tests:

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_creates_num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_creates_num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_creates_num_threads=128.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_reads._num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_reads._num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_reads._num_threads=128.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes._num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes._num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes._num_threads=128.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes_odirect._num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes_odirect._num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_random_writes_odirect._num_threads=128.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_sequential_reads._num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_sequential_reads._num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Large_file_sequential_reads._num_threads=128.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Mail_server_simulation._num_threads=1.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Mail_server_simulation._num_threads=16.html

http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Mail_server_simulation._num_threads=128.html

-- 
Stan


Message-ID: <4DBDC98A.3030402@hardwarefreak.com>
Date: Sun, 01 May 2011 15:58:50 -0500
From: Stan Hoeppner <stan@hardwarefreak.com>
To: debian-user@lists.debian.org
Subject: Re: file systems
References: <4DB646ED.8080909@cox.net> <4DB67726.1030304@hardwarefreak.com>
 <4DB6D6C0.2090704@cox.net> <4DB73CB2.5050605@hardwarefreak.com>
 <4DB749DE.8030307@cox.net>
 <BANLkTim5VUcUvrm5doFfdoY9dMBooxQLUQ@mail.gmail.com>
 <4DBCD6D2.9090809@hardwarefreak.com>
 <BANLkTi=-CUY-oMqkfgOu4RcAsUcV7ifzeQ@mail.gmail.com>
 <4DBD0D23.1080903@hardwarefreak.com> <20110501083517.GE928@think.nuvreauspam>
 <20110501125754.GC16834@Tungsten.DarkStar>
In-Reply-To: <20110501125754.GC16834@Tungsten.DarkStar>

On 5/1/2011 7:57 AM, Chen Wei wrote:

> 2) performs well on lots of small files, maildir and extracting the linux
> kernel source for example.

This was XFS's Achilles' heel until the introduction of Dave Chinner's delayed
logging patch in 2.6.35.  Prior to that, XFS was absolutely horrible with
metadata-intensive workloads, including the two you name above.  Such metadata
workload performance is now on par with EXT3/4 and ReiserFS.  delaylog is an
optional mount option up through 2.6.38 and is the default in 2.6.39.
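
For example, on a 2.6.35-2.6.38 kernel you would enable it at mount time.
A minimal sketch (the device and mount point here are placeholders, not
taken from anyone's system):

  # mount -o delaylog /dev/sdb1 /srv/mail

You can confirm it took effect by checking that 'delaylog' appears among
the active mount options in /proc/mounts:

  # grep /srv/mail /proc/mounts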

-- 
Stan


Message-ID: <4DBDC583.8090407@hardwarefreak.com>
Date: Sun, 01 May 2011 15:41:39 -0500
From: Stan Hoeppner <stan@hardwarefreak.com>
To: debian-user@lists.debian.org
Subject: Re: file systems
References: <4DB630D1.9040005@hardwarefreak.com> <4DB646ED.8080909@cox.net>
 <4DB67726.1030304@hardwarefreak.com> <4DB6D6C0.2090704@cox.net>
 <4DB73CB2.5050605@hardwarefreak.com> <4DB749DE.8030307@cox.net>
 <BANLkTim5VUcUvrm5doFfdoY9dMBooxQLUQ@mail.gmail.com>
 <4DBCD6D2.9090809@hardwarefreak.com>
 <BANLkTi=-CUY-oMqkfgOu4RcAsUcV7ifzeQ@mail.gmail.com>
 <4DBD0D23.1080903@hardwarefreak.com> <20110501083517.GE928@think.nuvreauspam>
In-Reply-To: <20110501083517.GE928@think.nuvreauspam>

On 5/1/2011 3:35 AM, Andrei Popescu wrote:
> On Sun, 01 May 11, 02:34:59, Stan Hoeppner wrote:
>
> [snip various super-stuff running xfs]
>
> I understand that xfs is great for super-computers[1] and stuff, but how
> is that relevant to a desktop computer with something like this?

The background info I provided relating to supers was in response to Shawn
calling my statement of 'quality FS' an 'opinion'.  If XFS weren't a quality
FS, people wouldn't have been using it on $100 million supercomputers for
over 13 years.  And in those 13 years it has seen vast improvements.

> $ df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda6             9.2G  7.3G  1.5G  84% /
> tmpfs                1006M  4.0K 1006M   1% /lib/init/rw
> udev                 1004M  548K 1004M   1% /dev
> tmpfs                1006M     0 1006M   0% /dev/shm
> tmpfs                1006M  164K 1006M   1% /tmp
> /dev/sda7             9.2G  2.7G  6.1G  31% /media/stable
> /dev/sda2              19G  9.9G  7.6G  57% /home
> /dev/sda8             104G   79G   26G  76% /home/amp/big
>
> (actually one of those partitions is on xfs, but that's not my point)

The only real downside to using XFS as a primary desktop filesystem is tool
familiarity and knowledge.  Because of this, XFS may not be all that suitable
for the casual desktop user, but any power user will be more than happy with
it.  As with anything computer related, one needs to read and learn about it
before taking the plunge.  Users who simply select all the defaults during
OS installation need not apply.
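
For reference, here is a quick sketch mapping everyday ext3/4 admin tasks
to their XFS equivalents (device names and mount points are placeholders):

  mkfs.xfs /dev/sdb1          create a filesystem
  xfs_info /home              show geometry, roughly 'tune2fs -l'
  xfs_repair -n /dev/sdb1     check only, unmounted; this is your 'fsck'
  xfs_repair /dev/sdb1        actually repair, unmounted
  xfs_growfs /home            grow a mounted filesystem online
  xfs_fsr                     defragment all mounted XFS filesystems

Once these map onto e2fsck, resize2fs, and friends in your head, most of
the learning curve disappears.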

Regarding desktop suitability, all SGI MIPS graphics workstations from 1994
onward, including the popular O2 and Octane, used XFS.  The CG effects in
almost every movie between ~1995 and 2002 were created on SGI workstations,
all using XFS.  ILM used SGI workstations with XFS from 1994/95 until switching
to commodity AMD Opteron systems around 2003/04.  I don't know what FS they
use on their workstations today.  Given the size of the data sets I'd bet
they still use XFS locally, though I don't know whether they use CXFS on
their SAN or another cluster filesystem.

-- 
Stan


From: "Boyd Stephen Smith Jr." <bss@iguanasuicide.net>
To: debian-user@lists.debian.org
Subject: Re: file systems
Date: Sun, 1 May 2011 10:40:14 -0500
References: <87wriqjd0t.fsf@towardsfreedom.com>
 <BANLkTi=-CUY-oMqkfgOu4RcAsUcV7ifzeQ@mail.gmail.com>
 <4DBD0D23.1080903@hardwarefreak.com>
In-Reply-To: <4DBD0D23.1080903@hardwarefreak.com>
Message-Id: <201105011040.19169.bss@iguanasuicide.net>

In <4DBD0D23.1080903@hardwarefreak.com>, Stan Hoeppner wrote:
>Independent Linux filesystem tests performed by an IBM engineer to track
>BTRFS performance during development.  XFS trounces the others in most
>tests:

These results are interesting and useful, but I think "trounces" is a poor
description for what XFS does.

Not using barriers undermines data consistency guarantees, so I think it is
best to ignore the 2.6.35-rc5-autokern1-ext3-*-ext3, 2.6.35-rc5-autokern1-
ext4-*-ext4-nobarrier, and 2.6.35-rc5-autokern1-xfs-*-xfs-nobarrier entries.

So that btrfs doesn't remain the only filesystem with 2 entries, I'll also
ignore the 2.6.35-rc5-autokern1-btrfs-*-btrfs-nocow entry, as it is
non-default.
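
For anyone trying to reproduce those configurations, the mount options
involved would roughly be the following; this is my reading of the entry
names, not something the benchmark pages spell out:

  mount -o nobarrier /dev/sdX /mnt     # xfs, write barriers off
  mount -o barrier=0 /dev/sdX /mnt     # ext3/ext4, write barriers off
  mount -o nodatacow /dev/sdX /mnt     # btrfs, copy-on-write off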

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_creates_num_threads=1.html

On the graphs, XFS is, respectively:
2nd, 4th, 2nd, 4th

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_creates_num_threads=16.html

2nd, 1st, 2nd, 1st

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_creates_num_threads=128.html

1st, 1st, 1st, 1st

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_random_writes._num_threads=1.html

1st, 4th, 1st, 4th

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_random_writes._num_threads=16.html

2nd, 1st, 2nd, 2nd

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_La
>rge_file_random_writes._num_threads=128.html

2nd, 4th, 2nd, 2nd

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Ma
>il_server_simulation._num_threads=1.html

5th, 1st, 5th, 1st, 3rd

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Ma
>il_server_simulation._num_threads=16.html

5th, 1st, 5th, 5th, 1st

>http://btrfs.boxacle.net/repository/raid/2.6.35-rc5/2.6.35-rc5/2.6.35-rc5_Ma
>il_server_simulation._num_threads=128.html

4th, 2nd, 4th, 4th, 2nd

I wouldn't say that is a "trouncing", since it doesn't even win in many 
categories.

-- 
Boyd Stephen Smith Jr.                   ,= ,-_-. =.
bss@iguanasuicide.net                   ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy         `-'(. .)`-'
http://iguanasuicide.net/                    \_/


Message-ID: <4DBCE5CC.80905@hardwarefreak.com>
Date: Sat, 30 Apr 2011 23:47:08 -0500
From: Stan Hoeppner <stan@hardwarefreak.com>
To: debian-user@lists.debian.org
Subject: Re: file systems
References: <87wriqjd0t.fsf@towardsfreedom.com>
 <4DAE0FF0.5070805@hardwarefreak.com> <4DB513E5.5030902@cox.net>
 <4DB5CC72.4070106@hardwarefreak.com> <4DB5EDAB.6010706@cox.net>
 <4DB60919.6080403@hardwarefreak.com> <4DB60C9E.4090104@cox.net>
 <4DB630D1.9040005@hardwarefreak.com> <4DB646ED.8080909@cox.net>
 <4DB67726.1030304@hardwarefreak.com> <4DB6D6C0.2090704@cox.net>
 <4DB73CB2.5050605@hardwarefreak.com> <4DB749DE.8030307@cox.net>
 <4DBAFF13.9040109@hardwarefreak.com> <4DBB0279.4020607@cox.net>
 <87fwp0riw9.fsf@psinom.home>
In-Reply-To: <87fwp0riw9.fsf@psinom.home>

On 4/29/2011 1:51 PM, prad wrote:
> Ron Johnson<ron.l.johnson@cox.net>  writes:
>
>> On 04/29/2011 01:10 PM, Stan Hoeppner wrote:
>>> On 4/26/2011 5:40 PM, Ron Johnson wrote:
>>>
>>>> But not being able to fsck the fs that I just created is unacceptable.
>>>
>>> Again, 'xfs_repair -n' is functionally equivalent to 'xfs_check'. They
>>> are two methods (paths) that (should) arrive at the same result. Either
>>> will let you know if the filesystem has errors.
>>>
>>> Have you run 'xfs_repair -n' yet to see if it trips over your per
>>> process memory limit? If it doesn't, you have your fsck and can eat it
>>> too. ;)
>>>
>>
>> I already converted to ext4.
>>
> and i have converted to xfs!
> i am very impressed so far and will try to document my experiences with
> it as a sort of 'noob guide' ... if only to help myself out. :D
> thx again stan! i'm looking forward to learning about the filesystem,
> something i never bothered with in the past.

It shines brightest with heavy multitasking/multiuser IO workloads manipulating
large files when atop Linux MD or hardware RAID.  Its performance is merely
average in most cases on a single disk system.  Its advantage over all other
Linux FSes on single disk systems is online defragmentation.  For example,
if you have a local mailbox file in mbox format (any Mozilla MUA) it will
always get heavily fragmented.  Cron'ing xfs_fsr twice a week will eliminate
the performance hit to your MUA that accompanies such fragmentation.
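
A minimal sketch of such a job in /etc/crontab; the schedule is just an
example, and note that xfs_fsr with no filesystem arguments reorganizes
every mounted XFS filesystem, with -t capping its runtime in seconds:

  # defragment XFS filesystems Mon and Thu at 3am, for at most 2 hours
  0 3 * * 1,4  root  /usr/sbin/xfs_fsr -t 7200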

To get maximum metadata performance you need kernel 2.6.36 or later with
the mount option 'delaylog' in fstab.  In 2.6.39 and later delaylog is
the default.  This will dramatically decrease the execution time of things
like unpacking a kernel tarball or running 'rm -rf' on a huge directory tree
of a few thousand files.  It will also increase maildir performance
substantially on busy IMAP servers.
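
A sketch of what that fstab line might look like; the device, mount point,
and companion options are placeholders, not taken from any system in this
thread:

  /dev/md0  /home  xfs  defaults,delaylog  0  0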

-- 
Stan

