Tuesday, February 26, 2008

XFS is 20x slower than ext3 (with default settings)

Is XFS that bad? Well, at least with default settings, XFS on Debian seems to be blown away by ext3 completely in terms of speed. I don't mind 1.5x slowdown, maybe even 2x, but 20x is a show stopper. I am already using ext3 for any pbuilder builds, because it's a difference to wait for 30s with XFS, compared to 3s with ext3 to extract the base image. And I'll probably switch to ext3 completely, unless someone finds a way how to fix this.

I recently got burned by this when running Sage on my computer, because it compiles a lot of Python files when started for the first time. Normally it should take roughly 15s, but instead it took 6 minutes on my comp and then it triggered a so far undiscovered bug in Sage, that I reported.

Michael Abshoff, the release manager of Sage, suggested that something is FUBAR (Fucked Up Beyond Any Recognition) on my shiny Debian amd64 sid system running on Intel Core Quad, so I said no way, because I really care about this machine, as I use it for larger finite elements calculations and other stuff (like compiling huge deb packages in parallel, like paraview).

So I offered a bet, that I give him an access to this compter, he finds the problem and if it's a problem in my Debian configuration, I'll write to this blog that I am lame, while if it's a problem in Sage, he will write to his blog that he is lame. And I was smiling to myself, how good I am and that I will have some fun too reading planet.sagemath.org with the top post from Michael saying that he is lame.

But then I remembered my old struggle with cowbuilder and XFS and I stopped smiling. See e.g. this wiki I created half a year ago. Something is FUBAR with XFS and Debian. I also asked on the Czech server Root, that is famous for having a lot of experts willing to share their knowledge, and it was quickly revealed, that the problem is with the "nobarrier" option of XFS (my post is here, but it's in Czech).

First, on that amd64 machine, the above problem was fixed after issuing this command:

mount -o remount,rw,nobarrier /dev/sda3 /home/

(notice the "nobarrier" option). You can read some background behind this on the lkml list. Unfortunately, I also have my laptop, and there I already use this "nobarrier" option, and it doesn't help at all. I just created a new ext3 partition and verified that on my laptop, ext3 is around 10x faster than XFS with nobarrier (that was supposed to fix this). I use the latest 2.6.24 kernel from unstable on both.

Time to move from XFS to ext3 on my laptop? Seems like that. I'll leave XFS on the other machine, because I know some other peole have good experience with XFS and the "nobarrier" option seems to fix the problem there.

But as to the bet, yeah, I am lame and I should still learn a lot from Michael. :)