The marquee feature in ElectricAccelerator 5.0 is Electrify, a new front-end to the Accelerator cluster that can distribute work from a wide variety of processes, in addition to the make-based processes that we have always managed. One example is SCons, an alternative build system implemented in Python that has a small (compared to make) but apparently growing (slowly) market share. It’s sometimes touted as an ideal replacement for make, with a long list of reasons why it is considered superior. But not everybody likes it; some have reported significant performance problems. Even the SCons maintainers agree that SCons “can get slow on big projects”.
Of course that caught my eye, since making big projects build fast is what I do. And you can’t really practice Continuous Delivery of anything if you’re stuck waiting hours for your builds to run. What exactly does it mean that SCons “can get slow” on “big” projects? How slow is slow? How big is big? To satisfy my own curiosity, and to better advise customers seeking to use SCons with Electrify, I set out to answer those questions. All I needed was some free hardware and some time. Lots and lots and lots of time.
My test environment was as follows:
- RedHat Desktop 3 (kernel version 2.4.21-58.ELsmp)
- Dual 2.4 GHz Intel Xeon with hyperthreading enabled
- 2 GB RAM
- SCons v1.2.0.r3842
- Python 2.6.2
The test build consists of (a lot of) compiles and links. Starting from the bottom, we have N C files, each with a unique associated header file. The C files and headers are spread across N/500 directories in order to eliminate filesystem scalability concerns. Both the C files and the header files are trivial: each header includes only stdio.h; each C file includes its associated header and a second, shared header, then defines a trivial function. Objects are collected into groups of 20 and stored in a standard archive, and every 20th object is linked into an executable along with the archive. The build is generated using a script written by one of our talented QA engineers for testing Accelerator.
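The actual generator is an internal QA script, but a minimal sketch along the same lines might look like the following. All file, directory, and function names here are my own invention for illustration:

```python
import os

def generate(n, files_per_dir=500, out="src"):
    """Generate n trivial C files plus matching headers, spread across
    n // files_per_dir directories (a hypothetical stand-in for the
    real QA generator script described above)."""
    os.makedirs(out, exist_ok=True)
    # A single shared header, included by every C file.
    with open(os.path.join(out, "shared.h"), "w") as f:
        f.write("#include <stdio.h>\n")
    for i in range(n):
        d = os.path.join(out, "dir%d" % (i // files_per_dir))
        os.makedirs(d, exist_ok=True)
        # Each unique header only includes stdio.h.
        with open(os.path.join(d, "f%d.h" % i), "w") as f:
            f.write("#include <stdio.h>\n")
        # Each C file includes its associated header and the shared
        # header, then defines a trivial function.
        with open(os.path.join(d, "f%d.c" % i), "w") as f:
            f.write('#include "f%d.h"\n' % i)
            f.write('#include "../shared.h"\n')
            f.write("int func%d(void) { return %d; }\n" % (i, i))

generate(1000)
```

With n=1000 and 500 files per directory, this produces two directories of sources plus the shared header at the top of the tree.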
Overall build time
The primary concern was naturally end-to-end build time for a from-scratch build. I used the standard Linux time utility to capture this data, and averaged the results from two runs (except for the build with 40,000 C files, because that just took too long):
That doesn’t look too good! In fact, that curve looks suspiciously like the function f(x) = x². Let’s plot that on our graph:
That looks like a pretty close fit — so it seems that the build duration increases in proportion to the square of the number of input files. That’s bad news — as you can see, that very quickly adds up to outrageously long build times.
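One quick way to sanity-check that kind of growth without eyeballing a plot is to estimate the scaling exponent from two measurements on a log-log scale; an exponent near 2 indicates quadratic behavior. The timings below are placeholders chosen to be exactly quadratic, not my measured data:

```python
import math

def scaling_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)

# Placeholder timings satisfying t = c * n**2:
k = scaling_exponent(1000, 10.0, 4000, 160.0)
print(round(k, 2))  # → 2.0
```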
Ruling out other causes
It’s possible that this performance degradation is due to some factor other than SCons. To rule out that possibility, I created a shell script that runs the exact set of commands, in the same order, that the SCons-based build ran, and timed the execution of that script. The work done by that script is the actual work of the build: the bare minimum that must be done to compile and link all of the source files. By definition, everything else is overhead. Let’s add that data to the graph:
As expected, the time needed for the actual work grows linearly in proportion to the number of C files in the build. That means that the performance degradation is not due to some other component of the system — if it were, we would have seen a similar problem with the simple shell script. Instead, the problem is clearly in SCons itself.
Comparing overhead to actual work
Now that we know the amount of time for the actual work, we can compute the amount of time spent on overhead introduced by SCons — that’s just the difference between the “overall build time” and the “actual work” lines in our graph. For example, with 40,000 C files, the SCons build time is about 4 1/2 hours; the actual work time is about 25 minutes. SCons is adding more than four hours of overhead to the build!
Let’s put that into terms that are a little easier to grok: rather than looking at the absolute numbers, we’ll look at the overhead as a percentage of the total build time.
Even with only a few hundred files, SCons overhead represents 50% of the total build time; with 10,000 C files, SCons overhead represents 75% of the total build time; and with 40,000 C files, SCons overhead accounts for a whopping 90% of the total time!
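Spelled out for the largest build, the arithmetic behind that last percentage (using the approximate figures above) is:

```python
total_s = 4.5 * 3600   # ~4 1/2 hours: end-to-end SCons build time
work_s = 25 * 60       # ~25 minutes: the actual compile/link work
overhead_s = total_s - work_s
print(overhead_s / 3600)           # more than four hours of pure overhead
print(100 * overhead_s / total_s)  # → ~90.7% of the total build time
```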
The final metric that I tracked was SCons memory utilization, using the built-in --debug=memory flag. This metric is of particular interest to me, since I’ve spent a lot of time streamlining Accelerator’s memory usage so that it can accommodate truly enormous builds (millions of compiles). After the disastrous build time results, I was relieved to see that memory usage seems to grow linearly with the number of source files in the build (NB: here I’m counting total source files, including both C files and headers, not only C files):
Unfortunately, although the growth is linear, the rate of growth is quite high: each additional source file adds more than 19,000 bytes (!) to the memory footprint. At that rate, SCons will exhaust the available memory address space for a 32-bit process at only about 110,000 total source files.
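The 110,000-file ceiling follows from simple extrapolation, assuming the linear growth holds and roughly 2 GiB of usable address space for a 32-bit process (an assumption on my part; the exact usable limit varies by OS and configuration):

```python
bytes_per_file = 19_000   # observed per-source-file memory footprint
addr_space = 2**31        # ~2 GiB: assumed usable 32-bit address space
print(addr_space // bytes_per_file)  # → 113025, i.e. about 110,000 files
```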
These results paint a pretty grim picture for SCons: based on the overall build times, I can’t imagine anybody seriously using SCons for builds with more than a couple thousand files. And even if you were willing to put up with the long builds, the memory usage data indicates that SCons has a hard limit of around 110,000 total source files.
Are there any SCons experts out there able to explain why SCons seems to perform so badly here?