Subbuilds: build avoidance done right

I’ve heard it said that the best programmer is a lazy programmer. I’ve always taken that to mean that the best programmers avoid unnecessary work, by working smarter and not harder; and that they focus on building only those features that are really required now, not allowing speculative work to distract them.

I wouldn’t presume to call myself a great programmer, but I definitely hate doing unnecessary work. That’s why the concept of build avoidance is so intriguing. If you’ve spent any time on the build speed problem, you’ve probably come across this term. Unfortunately it’s been conflated with the single technique implemented by tools like ccache and ClearCase winkins. I say “unfortunately” for two reasons: first, those tools don’t really work all that well, at least not for individual developers; and second, the technique they employ is not really build avoidance at all, but rather object reuse. Because the term has been co-opted and associated with such lackluster results, many people have become dismissive of build avoidance.

Subbuilds are a more literal, and more effective, approach to build avoidance: reduce build time by building only the stuff required for your active component. Don’t waste time building the stuff that’s not related to what you’re working on now. It seems so obvious I’m almost embarrassed to be explaining it. But the payoff is anything but embarrassing. On my project, after making changes to one of the prerequisite libraries for the application I’m working on, a regular incremental build takes about 10 minutes; a subbuild incremental takes just 77 seconds:

Standard incremental: 609s
Subbuild incremental: 77s

Not bad! Read on for more about how subbuilds work and how you can get SparkBuild, a free gmake- and NMAKE-compatible build tool, so you can try subbuilds yourself.

What is a subbuild?

A subbuild is just the smallest part of a full build tree that must be built in order to completely build a single component of the build, including all its prerequisites. For example, my project consists of several applications and the libraries they depend on. Each of these components resides in a separate directory, and we use recursive make invocations to build everything. (Nota bene: if you have a non-recursive make then you probably already enjoy many of the benefits of subbuilds, but you should definitely still check out the other features of SparkBuild!)

The dependency graph for my project looks like this:

You can see that to build the agent component, for example, we only need to build the util, xml, and http libraries, and the agent application code, of course:

This subset defines the agent subbuild.
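
To make the idea concrete, here is a minimal sketch of how a subbuild can be computed as the transitive closure of a component's prerequisites. The component names come from this post, but the exact edges of the graph below are an assumption for illustration (only the agent's prerequisites are stated above), and this is not how SparkBuild itself is implemented.

```python
from typing import Dict, List, Set

# Hypothetical dependency graph: component -> direct prerequisites.
# Only the agent's prerequisites (util, xml, http) are taken from the post;
# the other edges are assumed for illustration.
DEPS: Dict[str, List[str]] = {
    "util": [],
    "xml": ["util"],
    "http": ["util"],
    "ldap": ["util"],
    "agent": ["util", "xml", "http"],
    "cm": ["util", "xml", "ldap"],
}


def subbuild(component: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return the component and all of its transitive prerequisites,
    ordered so that every prerequisite comes before its dependents."""
    order: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:
            return
        seen.add(name)
        for prereq in deps[name]:
            visit(prereq)
        order.append(name)

    visit(component)
    return order


agent_subbuild = subbuild("agent", DEPS)
print(agent_subbuild)  # ['util', 'xml', 'http', 'agent']
```

Note that cm and ldap never appear in the result: they are simply not reachable from agent, which is the whole point of a subbuild.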

Subbuilds and developers

What makes subbuilds really interesting for developers is the realization that usually you’re working on just one component at a time. For example, on any given day I might be working on the agent component, or the cm, but rarely both. Most of the edits I make will be on code in the agent directory, with occasional edits to the agent’s prerequisites. As I’m running through the edit-compile-test cycle, I have some choices about how to run the build. The most natural thing for me is to simply run make in the agent directory. After all, most of the changes I make are in that directory, so that will do the right thing most of the time. Of course, if I have made changes to any of the prerequisites, or if I resync with the source depot and pick up somebody else’s changes in one of those prerequisites, I’ll probably get a busted build.

The next most obvious approach is a rebuild from the root of my source tree. This ensures that I always update all the pieces I need for the agent, but at the cost of also building components that are irrelevant to my current focus: if I’m just trying to rebuild to run the agent’s unit tests, there’s no need for me to rebuild the cm application, or the ldap library.

The best choice is the agent subbuild, the minimum set of things that must be built to be sure that the agent component is fully up-to-date. But although it’s possible on a small project like this to execute the subbuild manually, it’s a nuisance, and on a bigger project it may not be practical or even possible. You need a build tool that can automatically determine which parts of the build make up the subbuild for any component, and then automatically execute that subbuild. That tool is SparkBuild emake.

Subbuilds with SparkBuild

Subbuilds with SparkBuild start with a full build, during which emake captures information about which targets are produced by each submake. In subsequent builds, emake references that database anytime it can’t find a rule to build a particular target. If a match is found, emake runs the corresponding submake before proceeding. For example, the rule for the actual agent target looks like this:

In a normal build, gmake would see the dependency on $(OUT)/xml/xml.a and use that file if it existed already, regardless of whether it was actually up-to-date; or report “no rule to make” if the file did not exist. With SparkBuild, emake checks the subbuild database for an entry matching $(OUT)/xml/xml.a and sees that it must run make in the xml directory before proceeding. Like magic, each of the agent’s prerequisites is updated without requiring me to take any action other than swapping emake --emake-subbuild-db=my.db for gmake in my build command line.
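
The lookup described above can be sketched roughly as follows. This is a toy model, not emake's actual implementation or database format: the `SubbuildDb` class, the `resolve_prerequisites` function, and the example paths are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Optional, Set


class SubbuildDb:
    """Toy stand-in for a subbuild database: maps each target path to
    the directory whose submake produced it during the full build."""

    def __init__(self) -> None:
        self._producer: Dict[str, str] = {}

    def record(self, target: str, directory: str) -> None:
        self._producer[target] = directory

    def lookup(self, target: str) -> Optional[str]:
        return self._producer.get(target)


def resolve_prerequisites(
    prereqs: List[str],
    local_rules: Set[str],
    db: SubbuildDb,
    run_submake: Callable[[str], None],
) -> List[str]:
    """For every prerequisite that has no rule in the current makefile,
    run the submake that the database says produces it (once per
    directory). Returns the directories that were rebuilt, in order."""
    ran: List[str] = []
    for target in prereqs:
        if target in local_rules:
            continue  # the current makefile knows how to build this one
        directory = db.lookup(target)
        if directory is None:
            raise LookupError(f"no rule to make target '{target}'")
        if directory not in ran:
            run_submake(directory)
            ran.append(directory)
    return ran


# Populated during a (pretend) full build.
db = SubbuildDb()
db.record("out/xml/xml.a", "xml")
db.record("out/util/util.a", "util")

invoked: List[str] = []  # records which submakes would have been run
ran = resolve_prerequisites(
    ["agent.o", "out/xml/xml.a", "out/util/util.a"],
    local_rules={"agent.o"},
    db=db,
    run_submake=invoked.append,
)
print(ran)  # ['xml', 'util']
```

A target that is neither covered by a local rule nor present in the database still fails with the familiar "no rule to make" error, which mirrors the fallback behavior described above.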

Still not convinced that it’s worth a look? Here are some more concrete results comparing a few different build scenarios from my project. These comparisons assume that I’m actively working on the agent component, and that I ran either a standard incremental, from the root of the source tree, or a subbuild using SparkBuild emake:

Normal builds versus subbuilds (serial build time, shorter is better)

Scenario            Standard   Subbuild
No changes          19s        0.5s
Changes in agent    81s        31s
Changes in util     609s       77s
Full build          729s       118s
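
For a quick sense of scale, the timings above can be turned into speedup ratios. The snippet below is just arithmetic on the numbers reported in this post; the `speedup` helper is a name introduced here for illustration.

```python
# Build times in seconds, as reported above: (standard, subbuild).
timings = {
    "No changes": (19.0, 0.5),
    "Changes in agent": (81.0, 31.0),
    "Changes in util": (609.0, 77.0),
    "Full build": (729.0, 118.0),
}


def speedup(standard: float, subbuild: float) -> float:
    """How many times faster the subbuild is than the standard build."""
    return standard / subbuild


for scenario, (standard, sub) in timings.items():
    print(f"{scenario}: {speedup(standard, sub):.1f}x faster")

# No changes: 38.0x faster
# Changes in agent: 2.6x faster
# Changes in util: 7.9x faster
# Full build: 6.2x faster
```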

Conclusion and Availability

Tools like ccache and ClearCase winkins have co-opted the term build avoidance, but in fact they do object reuse, not build avoidance, and they are not very useful for developer builds. Subbuilds are a simple but highly effective approach to build avoidance that saves significant time during developer builds by literally skipping parts of the build tree that are unrelated to your current focus.

If you want to try out subbuilds yourself, you can download SparkBuild from www.sparkbuild.com. It’s completely free (as in beer), so you’ve got nothing to lose… except those long coffee breaks, of course!

Eric Melski

Eric Melski was part of the team that founded Electric Cloud and is now Chief Architect. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin. Eric also writes about software development at http://blog.melski.net/.

4 responses to “Subbuilds: build avoidance done right”

  3. Jean says:

    Pity the discussion at uberVU is not available here… To sum it up, “build avoidance” vs “object reuse” muddles the water: the problem is broken makefiles. “Build avoidance” in emake tries to work around the brokenness instead of helping to do the right thing: non-recursive make. The examples above are broken themselves, see http://make.paulandlesley.org/rules.html, rule 3: “Life is simplest if the targets are built in the current working directory.”

  4. I have to echo Jean’s comment – this whole “subbuild” approach is _exactly_ what make normally does anyway, except that there’s an extra DB layer built in to work around the fact that the original makefiles were poorly constructed using recursive make, rather than non-recursive.
