A few months ago, I took a look at the scalability of SCons, a popular Python-based build tool. The results were disappointing, to say the least. That post stirred up a lot of comments, both here and in other forums. Several people pointed out that a comparison with other build tools would be helpful. Some suggested that SCons’ forte is really incremental builds, rather than the full builds I used for my test. I think those are valid points, so I decided to revisit this topic. This time around, I’ve got head-to-head comparisons between SCons and GNU make, the venerable old workhorse of build tools, as I use each tool to perform full, incremental, and clean builds. Read on for the gory details — and lots of graphs. Spoiler alert: SCons still looks pretty bad.
As before, my test system is a dual 2.4 GHz Intel Xeon, with 2 GB RAM. But since the previous set of tests, my system has been upgraded to RHEL4; there’s a new version of Python; and there’s even a new version of SCons — the big 2.0. So the setup is now:
- RedHat Enterprise Linux 4 (update 8, kernel version 2.6.9-89.ELsmp)
- Dual 2.4 GHz Intel Xeon, with hyperthreading enabled
- 2 GB RAM
- Python 2.7
- SCons v2.0.0.final.0.r5023
- GNU make 3.81
As previously, the test builds consists of a bunch of compiles and links: N C files, each with a unique associated header, spread across N/500 directories (to ensure there are no filesystem scalability effects), are compiled, then bundled into a standard archive library. Every 20th object is linked into an executable along with the archive. The build tree is generated using a Perl script, which generates both SConstruct files and Makefiles for building. One difference between the two builds: when using GMake, I added the -MMD flag to gcc, to generate additional dependency information; each timed full build was then preceded by a full build to generate all the dependencies and a clean build to nuke all the generated output. I felt that this gave a more realistic comparison to SCons, which employs elaborate dependency analysis logic to ensure accurate incremental builds.
Round 1: Full builds
Again, my interest is primarily full, from scratch builds:
As expected, SCons’ performance is pretty miserable: by the time we get to a build with several thousand source files, the build time is already over 15 minutes, and with the same n2 growth we saw last time, the times race past one hour, then two, finally hitting nearly 5 hours for a build containing just 50,000 source files.
GNU make shows a much more sedate, linear growth — in fact it neatly keeps pace with a simple shell script that does all the same compiles and links as the regular build, but without any of the overhead of a build tool. Yes, I know that a shell script is no substitute for a proper build tool. It just serves to give us an idea of how long it takes just to do the work in the build — a lower bound on the build time. No build tool running serially can run the build any faster than this (obviously we can do better, by a constant factor, if we use parallel build features).
Using that lower bound, we can compute the amount of overhead introduced by the build tool. This is all the time that the tool spends doing things other than actually running the commands needed to execute the build: parsing SConstruct files or makefiles, building the dependency graph, traversing that graph, computing command-lines, etc. Viewing this overhead as a percentage of the total build time gives us an easier-to-digest comparison between SCons and GMake:
Even with relatively few files — around 2,000 — over 50% of the total build time is wasted by overhead with SCons. In comparison, GMake has barely any overhead on a build of that size. At the other end of the range, SCons overhead accounts for nearly 90% of the total build time. GMake overhead has increased as a percentage of the total time too, but only to a modest 20% — still significantly less than SCons. In fact, with 50,000 files, GMake overhead is less than half of SCons overhead with just 2,000 files!
OK, just a couple more things to touch on here before we look at incremental build times: first, you probably noticed the sudden hook in the graph for SCons full build times, between 45,000 and 50,000 files. That means that at that level, on my system, some other factor has kicked in to influence the times. I believe this is because the system has started to page heavily, as SCons’ memory footprint has grown to consume all available RAM. Second, if you compare the SCons times in this article with those in the previous article, you’ll see that this time around, the SCons times are a bit better than last time — about 30% on average. Keep in mind that there are several differences between this setup and the previous: a new OS, a new version of Python, a new version of SCons. Even a new version of the compiler and other tools used during the build. I did try to pin down exactly which factors contributed to this improvement by testing various combinations of versions of Python and SCons (for example, Python 2.6.2 with SCons 2.0; or Python 2.7 with SCons 1.2.0); none of these tests produced any significant change in performance, so I presume that the performance difference is most likely due to the new operating system and tools. I choose not to pursue the matter further than that.
Round 2: Incremental builds
A lot of people claimed that full builds “don’t really matter; developers only really do incremental builds”. So let’s take a look at incremental build performance:
For these builds, I ran a full build, then patched one C file and its associated header. This caused a rebuild of one object file, followed by a rebuild of the archive and the executable that the object feeds into. The actual time to execute just those commands is a paltry one-tenth of a second, so with either tool, the incremental time is dominated by the overhead added by the tool itself. Even so, it’s obvious that SCons adds considerably more overhead than GMake. Even with a small build containing just 2,000 files, SCons burns about 35 seconds to do one-tenth of a second of work. GMake does the same build in about 3 seconds. For the 50,000 file build, SCons ran for about 25 minutes, again just to do one-tenth of a second of actual work; GMake ran for about 9 minutes.
One thing I find especially interesting about this graph is that supposedly the problem with full builds is that they force SCons to constantly rescan the dependency graph. That shouldn’t be necessary in an incremental build, and yet we still see what looks like an O(n2) growth in build time.
Round 3: Clean builds
The last build comparison was a clean build: scons -c versus gmake clean:
At last we have found a build variant where SCons performance is in the same ballpark as GMake! Of course, according to the SCons docs, you’ll never need to do a clean build with SCons (because with SCons your dependencies are perfectly accurate), so maybe this data point is not actually interesting to SCons users.
Sudden death overtime: Memory usage
One last comparison: memory usage. This metric is of particular interest because it puts a hard limit on the maximum size of the build that the tool can handle. I’ve learned the importance of memory efficiency the hard way — long nights in “firefighting” mode to improve memory usage to accomodate this or that customer’s enormous build. I’d hoped that the new versions of Python and SCons used in this test would prove beneficial for SCons memory footprint. Unfortunately, the opposite is true: memory usage is about 8% worse now than it was the last time I ran these tests, and as you can see here, SCons uses about 4 1/2 times as much memory as GMake:
With 50,0000 files, SCons uses just a hair less than 2GB of memory, enough to cause my test system (with 2GB of RAM) to start swapping. In comparison, GMake needs just 440MB of memory for 50,000 files. At that rate, GMake won’t start to thrash my system until the build has grown to more than 225,000 files. Unfortunately, it seems likely that there’s not much that the SCons developers can do to fix this: because SCons is implemented in Python, they are at the mercy of the Python runtime implementation. There’s just not enough control over low-level details like memory allocation when you’re using an interpreted language.
We can clearly see now that it is not simply “the nature of the beast” that SCons performance is so bad. GMake scales much more gracefully, both in terms of elapsed time, and in memory usage, on a variety of scenarios: full, one-touch incremental, and clean builds. From the discussions that the previous post sparked, we know that the primary problem is an inefficient O(n2) algorithm in the SCons implementation. It seems that SCons is academically interesting, but ultimately not practical for any non-trivial build (as some of my customers are now finding out the hard way).
A lot of SCons fans, in defense of SCons, say things like “Well, the SCons developers made a concious decision to focus on correctness rather than performance.” But what good is a correct system that is so slow it’s unusable? How can you continuously deliver, if you’re builds take way too long. And why should you have to choose between speed and correctness in the first place. For me, the choice is a no-brainer: my time is too precious to waste on a slow build tool.