Building Linux 2.6 with ElectricAccelerator and distcc

There are lots of different parallel, distributed build systems in the world besides ElectricAccelerator. In this post, I’m going to share my recent experience with one popular alternative, GNU make combined with distcc.

Distcc takes an interesting approach to accelerating builds: it leverages the parallel facilities built into GNU make itself, and adds a distribution mechanism that lets it take advantage of networked CPU resources. This week I decided to take a look at distcc 3.1, which was released in December 2008. It's been some time since I last tried distcc, so I figured it was worth seeing how the project has evolved and how it compares to Accelerator in its latest incarnation.

Setup

For this experiment, I chose to build the bzImage and modules targets of the Linux 2.6 kernel for the following reasons:

  • Including modules, the Linux 2.6 kernel build is fairly substantial — around 20,000 source files.
  • It’s freely available, should anybody want to replicate my experiments.
  • It’s well-known, so if I have made a foolish error in my tests it will be easier for other people to detect and correct.
  • The build system was deliberately designed to facilitate parallel builds.

I used the following packages in my tests:

  • GNU make 3.79.1
  • Distcc 3.1
  • Linux 2.6.28.1
  • ElectricAccelerator 4.3.1.25685

Finally, my test hardware consisted of 9 systems configured as follows:

  • Dual Xeon 2.4GHz with hyperthreading enabled
  • 8 systems with 1.5 GB RAM; one system with 2 GB RAM
  • Gigabit Ethernet connections on a dedicated switch
  • RedHat Desktop 3, update 8

Process

After downloading the kernel sources, I unpacked them and used make menuconfig to generate a .config file with all default settings, which I saved for reuse to ensure that each test run used an identical configuration. I wrote a simple driver script for the tests:

[sourcecode language="bash"]
# Capture the gmake version (e.g. "3.79.1") for use in output paths.
gver=$(make --version | head -1 | awk '{ print $4 }' | sed -e s/,//)
lver="2.6.28.1"
targets="bzImage modules"
mkdir gmake-$gver
(
rm -rf "linux-$lver"
tar xjf "linux-$lver.tar.bz2"
cd "linux-$lver"
patch -p0 -i ../"linux-$lver.patch"

for i in 1 2 3 4
do
    pfx=../gmake-$gver/gmake$i

    # Start each run from a pristine, identically configured tree.
    make distclean
    cp ../"linux-$lver.config" .config
    make silentoldconfig

    (time make $targets) < /dev/null > "$pfx.out" 2>&1
done

echo DONE
) < /dev/null > "gmake-$gver/gtest.out" 2>&1
[/sourcecode]

Attentive readers will have noticed that I'm applying a patch to the kernel sources before running the build. That patch just removes a couple of instances of the order-only prerequisite feature in the kernel makefiles, because neither gmake 3.79.1 nor Accelerator 4.3.1 supports that feature.
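For readers who haven't bumped into it, an order-only prerequisite is listed after a pipe character in GNU make 3.80 and later; it constrains build order without affecting the up-to-date check. A minimal sketch, with hypothetical file names:

[sourcecode language="bash"]
# "objdir" is an order-only prerequisite of objdir/foo.o: it must exist
# before the object file is built, but a newer timestamp on the directory
# will not by itself trigger a rebuild of the object file.
objdir:
	mkdir -p objdir

objdir/foo.o: foo.c | objdir
	$(CC) -c -o $@ $<
[/sourcecode]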

Serial build

Collecting serial build times was completely uneventful. The build ran successfully, albeit slowly, but then that’s why we’re here, right?

ElectricAccelerator build

After collecting serial build times, I tweaked the driver script to accommodate building with emake. For these tests, I used the system with 2 GB RAM as both cluster manager and emake host, and configured the remaining systems as agent nodes running three agents each. ElectricMake built the 2.6 kernel out-of-the-box, with no special configuration required. I did run one build to generate a history file, then used that history file for each of the subsequent runs. For this build, though, the impact of the history file was negligible (that is, the build has very few missing dependencies), which is to be expected given the amount of work put into making the build parallel-friendly.
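For the record, the emake command lines looked roughly like the following; "cmhost" stands in for the name of my cluster manager, and the history file name is arbitrary:

[sourcecode language="bash"]
# First run: create a history file recording the dependencies that
# emake discovers during the build.
emake --emake-cm=cmhost --emake-history=create \
      --emake-historyfile=linux26.data bzImage modules

# Timed runs: reuse the history file, merging in anything new.
emake --emake-cm=cmhost --emake-history=merge \
      --emake-historyfile=linux26.data bzImage modules
[/sourcecode]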

Distcc build

Finally, I retooled the script one more time to accommodate building with distcc. For these runs, I used the system with 2 GB RAM as the build host and the remaining systems as distcc servers, and I invoked distcc in "pump mode" with gmake -j 16. The build ran successfully, but I found errors in the build log indicating that "pump mode", the banner feature of distcc 3.x, had been disabled, which meant the build performance was negatively impacted.
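For reference, the distcc setup looked roughly like this; the host names are hypothetical, and in pump mode each host entry needs the cpp and lzo options so that preprocessing is pushed out to the compile servers:

[sourcecode language="bash"]
# Pump mode requires ",cpp,lzo" on each host so preprocessing is
# distributed along with compilation (host names hypothetical).
export DISTCC_HOSTS='host1,cpp,lzo host2,cpp,lzo host3,cpp,lzo'

# The pump wrapper starts the include server, runs the build under it,
# and shuts it down when make exits.
pump make -j16 CC=distcc bzImage modules
[/sourcecode]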

After some investigation I learned that pump mode does not automatically handle header files that are modified during the build. I applied the prescribed workaround, with the addition of the specific header file mentioned in the error message I received, and tried again… with the same result. After several more iterations, covering one and a half days and including some detailed analysis of the build made possible by Accelerator’s file-level annotation, I managed to get distcc working in pump mode. Note that plain distcc worked out-of-the-box; it was only pump mode that gave me trouble (details available on request).

Once I worked out the kinks, I reran the distcc tests with -j 8 and -j 24, and I tried including "localhost" as one of the distcc compile servers. At best these changes had no impact on performance; most of them made the build slightly slower.

Results and Analysis

Build tool            Average (4 runs)   Standard deviation   Comparison to serial
Serial gmake          32m29.25s          4.99s                1.0x
Distcc/gmake          4m25.75s           4.50s                7.35x faster
ElectricAccelerator   2m38.00s           1.41s                12.34x faster

As you can see, both distcc and Accelerator do a good job of accelerating the build, but for raw speed (and, I think, for ease of implementation) Accelerator takes the crown on this one. Why is that so? I think there are two factors that contribute to our success here:

  1. Accelerator distributes all work to the cluster: not just compiles, but code generation, links, packaging, and even makefile parsing. This significantly increases the amount of work that can be done in parallel, and reduces the load on the build host itself, preventing it from becoming a bottleneck.

  2. Accelerator aggressively parallelizes recursive make invocations. In a variety of common situations, gmake does not parallelize recursive makes, even when invoked with -j. For example:
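    Here is a representative makefile fragment (my own illustration, not taken from the kernel build) in which several sub-makes are launched from a single rule:

[sourcecode language="bash"]
# The commands in a single recipe always run one at a time, in order.
# Even under "gmake -j", these three sub-makes run serially, because
# each is just the next command in the same rule body.
all:
	$(MAKE) -C src
	$(MAKE) -C lib
	$(MAKE) -C drivers
[/sourcecode]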

    In this situation, gmake will not parallelize the work in the recursive makes, and in fact it cannot — to do so runs the risk of breaking the build. But Accelerator can and does parallelize the recursive makes, and it can do so safely because of our conflict detection and resolution technology. This often gives us an edge over other build systems, particularly if there is substantial work in recursive makes. Of course, since distcc uses gmake to handle parallelization, it is subject to the same limitations.

It's worth mentioning that Accelerator also has an edge in terms of the artifacts left over after the build completes. When the distcc builds finished, I had the build output and the standard build log. When the Accelerator builds finished, I had those same artifacts, plus an Accelerator annotation file: a gold mine of performance and dependency information about the build that I personally find indispensable.

Conclusion

Although distcc is more capable now than ever before, in the most important measure — raw speed — Accelerator still beats it hands down. I think this experiment also underscores the shortcomings of distcc’s approach to build acceleration — without distributing more work to the cluster, I don’t know that distcc will ever be able to match Accelerator for speed, at least for large builds that produce multiple outputs. For smaller, simpler builds, who knows — but then, that’s not really the target that Accelerator is aiming for.

Eric Melski

Eric Melski was part of the team that founded Electric Cloud and is now Chief Architect. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin. Eric also writes about software development at http://blog.melski.net/.

6 responses to “Building Linux 2.6 with ElectricAccelerator and distcc”

  1. Joey says:

    Would you be so kind as to describe what you did to get distcc’s pump-mode working with your kernel build? I’m getting the same exact error as you did and the manpage’s suggestions do not fix it.

    Thanks!

  2. Nils Klarlund says:

    Please see:

    http://lists.samba.org/archive/distcc/2009q3/003976.html

    for how to inform distcc’s pump mode that the ground is changing under it.

    Nils

  3. Fergus Henderson says:

    When we benchmarked distcc’s pump mode, we noticed that of all the benchmarks we tried, the Linux kernel was the one on which it gave the least speed-up.

    So I would be interested in seeing performance comparisons for other benchmark applications, such as the ones in the “bench” directory of the distcc sources.

  4. Alexk says:

    New challenge for EA and distcc – build of Android OS…

    Is it possible to do such a comparison? It would be good to see EA in action.

    • Eric Melski says:

      @Alexk: thanks for the suggestion! It is definitely possible to do an Accelerator versus gmake/distcc comparison on Android, but it will be some time before I can get to that. Unfortunately Android more-or-less requires Ubuntu, but the cluster I was using is installed with RHEL. I could easily build out a cluster of virtual machines running Ubuntu, but that’s not a very good environment for performance testing. So as I said — great idea, but I may not get to do it for a while.
