Makefile performance: recursive make

Are you using the best method for invoking submakes in a recursive make based build? A simple change can turn on the afterburners by unlocking parallelism.

The central feature of a build system based on recursive make is, of course, the use of submakes to process subcomponents of the build. Typically you’ll use one submake invocation per subdirectory in the build tree. All you need to do is run $(MAKE) -C subdir and you’re off and running. That’s fine if you have just one subdirectory, but what if you have several subdirectories to make in? What’s the right way to set it up for optimum performance?

One rule to … rule them all, and in the darkness bind them

One obvious approach is something like this:

Or the slightly more sophisticated shell-loop variant:

Both of these are straightforward to implement and easy to understand — and each torpedoes your parallel build performance. Here’s why: when you run your build with gmake -j N, gmake will run up to N jobs (targets) simultaneously, limited by the number of currently runnable jobs. When one job finishes, another will be dispatched to take its place (again, limited by the number of runnable jobs). When a submake is invoked, an elaborate jobserver system is used to coordinate the parent make and the submake so that the submake can run jobs in parallel too, as long as the total number of jobs across all makes does not exceed N.

Sounds great, and it is — until you hit a construct like one of these. Then your massive parallelism comes to a screeching halt as you dribble through each submake one at a time. Why? In these makefiles, there is only one job; and, gmake doesn’t run the next command in the command script until the current command finishes, of course. That means that $(MAKE) -C partA has to run to completion before $(MAKE) -C partB will even start. Sure, you get parallelism within partA, but if partA is small, there may not be enough work there to really get the parallelism you were hoping for. Consider this contrived example:

Makefile

 

partA/Makefile, partB/Makefile, partC/Makefile

If you run this build with gmake -j 20, you might expect the entire thing to take no more than about 3 seconds — there’s fifteen total jobs that do “work”, and they are all parallelizable. In reality, thanks to the limitation I’ve described, this build runs for 9 seconds — three times as long.

Fortunately, there is a better way.

Let your make do the walking

The ideal way to setup your recursive makes is to use separate target for each submake:

Makefile

By creating a separate target for each submake, we’ve moved the management of the submakes from the shell to gmake itself, and thus enabled gmake to run all of the submakes in parallel. If we run this build with gmake -j 20, the build completes in about 3 seconds, as expected. Note that the functionality of the system is unchanged: we will still recurse into each of three directories and run the jobs in each as before. A serial build will even produce identical output, processing each directory in the same order as the original. As an added benefit, now we can invoke just one component of the build directly from the command-line, where before we had no choice but to run the entire mess.

There is one minor drawback to this approach: if there are dependencies between the submakes, you must now explicitly declare them. For example, if anything in partB depends on something from partA, you must add the dependency:

By breaking the submakes into separate targets we’ve removed the implicit serialization between them (caused by the way gmake processes commands in a makefile). To account for that we add explicit dependencies where necessary. That’s just good makefile hygiene anyway though. Note that if you are using ElectricAccelerator, you can skip adding the explicit dependencies and instead rely on Accelerator’s dependency discovery; that’s not only easier, but gives you even better performance.


This article is one of several looking at different aspects of makefile performance. If you liked this article, you may enjoy the others in the series:

Follow me

Eric Melski

Eric Melski was part of the team that founded Electric Cloud and is now Chief Architect. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin. Eric also writes about software development at http://blog.melski.net/.
Follow me

Share this:

2 responses to “Makefile performance: recursive make”

  1. Your point is very good, but no article that deals with recursive make is complete without a reference to Peter Miller’s article _Recursive Make Considered Harmful_: http://miller.emu.id.au/pmiller/books/rmch/

  2. Sivakumar says:

    How about when you use variables.
    SUBDIRS = partA partB partC

    subdirs:
    for i in ${SUBDIRS}; do
    ${MAKE} -C $$i all;
    done

    How to handle this situation?

Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe

Subscribe via RSS
Click here to subscribe to the Electric Cloud Blog via RSS

Subscribe to Blog via Email
Enter your email address to subscribe to this blog and receive notifications of new posts by email.