Untangling Parallel Build Logs

I spend most of my time with ElectricAccelerator working on the “big” features — performance, scalability, fault-tolerance. It’s easy to forget that there are a ton of “little” features that can themselves make a big difference in the value of the system. Case in point: the build log. If you have any experience with parallel build systems, you know what a mess the build log becomes because you have any number of parallel commands all dumping output to a single logfile simultaneously. The output from each command gets interleaved with the output from other commands. Worse, the error messages get jumbled up too, so it becomes difficult to tell which commands are producing the errors.

I was reminded of this issue when it popped up again on the GNU make-help mailing list. Take a look at this recent post:

In parallel mode (with -j option), the outputs from different rules are intermixed. I’m wondering if there is a way to order the outputs as if make is run in serial mode, but it should still achieve the speed of parallel mode.

Unfortunately, as this poster discovered, there is no really good solution to this problem, at least not unless there is help from the build tool itself. Some people have tried tricks like colorizing the output by piping each command through a filter that sets the font color before printing the output, but any such solution is going to be unreliable because it doesn’t address the fundamental problem of having multiple processes trying to write output simultaneously. Plus it’s a real nuisance to modify your makefiles to add the filter.

How ElectricAccelerator Does Build Logs

Seeing that message on make-help reminded me that I take for granted the way that Accelerator handles the build log. Simply put, Accelerator emits the build log in correct serial order, without any interleaving of output from different commands — even though those commands are running in parallel and frequently out-of-order. The log you get at the end of the build is identical to the log you would have gotten if you’d run a single-threaded, serialized build.

Think about that for a second: the build log is emitted in serial order. No more scrambled build logs. No more kludgy partial solutions, because we solved the problem at the level of the build tool. You just get consistent, predictable, useful logs. Every time.

Consider this trivial makefile:

Now compare the output when run by regular old serial gmake with the output when run by parallel gmake -j 4, and output from Accelerator (emake) (red text indicates places where the output differs from serial gmake, bold text indicates places where the output differs from the previous run of the same make variant):

serial parallel gmake parallel gmake parallel gmake emake emake emake
a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4
a-1
b-1
c-1
b-2
a-2
c-2
a-3
b-3
c-3
b-4
a-4
c-4
a-1
b-1
c-1
a-2
b-2
c-2
a-3
b-3
c-3
a-4
b-4
c-4
a-1
c-1
b-1
a-2
c-2
b-2
a-3
c-3
b-3
a-4
c-4
b-4
a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4
a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4
a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4

The parallel gmake output is different from serial gmake on every run, and what’s worse, the parallel runs are not even consistent with each other from one run to the next! On the other hand, the output from Accelerator is identical to serial gmake, every time.

Anything we can do, we can do better

Of course we’re never satisfied with simply solving a problem that plagues users around the world. We want to provide a solution that is so far superior to the alternatives that you’d be crazy not to want it. Therefore, we created annotated build logs, or simply annotation, in late 2003. Annotation is a version of your build log marked up with XML to provide bundles of information above and beyond the standard build output log, in a format that is easily searched and manipulated. This is the solution that everybody wishes they had, and the solution that the developers of other parallel make tools wish they had thought of first. Here’s a bit of the annotation for our example build:

There are a few things in particular that I want to point out in this snippet:

  1. Every job in the build is reported separately from every other job, with the commands and output from each easily identifiable.
  2. You can generate the traditional build output log from annotation simply by concatenating the text in the <output> tags.
  3. Annotation includes even those commands that are not echoed to the standard build output log because of the use of the “silent” prefix (@) in the Makefile!

If you are comfortable with XML you can probably already see the benefits here: you could easily put together some scripts with Perl or even just with xmlgrep and xmlfmt that search the content of <output> tags for things that look like warnings or errors, then reports precisely which job produced that output. Alternatively, you can use our annolib library to build that script:

[sourcecode language=”java”] #!/usr/bin/tclsh

load annolib.so

set anno [anno create] set xml [open emake.xml r] $anno load $xml
close $xml

$anno jobiterbegin
while { [$anno jobitermore] } {
set job [$anno jobiternext] set match false
foreach commandBlock [$anno job commands $job] {
foreach {lineNumber argv outputs} $commandBlock {
foreach {src text} $outputs {
if { [regexp -nocase {(warning)|(error)} $text] } {
set match true
}
}
}
}
if { $match } {
puts “Problem found in job $job ([$anno job name $job]):”
foreach commandBlock [$anno job commands $job] {
foreach {lineNumber argv outputs} $commandBlock {
puts “t$argv”
foreach {src text} $outputs {
if { $src == “prog” } {
puts “t$text”
}
}
}
}
}
}
[/sourcecode]

Just like that, you have a simple script you can use as a starting point for your own post-build reporting on errors and warnings in your build.

So with ElectricAccelerator, you get the best of both worlds: fast parallel builds, and build logs that you can actually use. Just another way that Accelerator rocks.

Follow me

Eric Melski

Eric Melski was part of the team that founded Electric Cloud and is now Chief Architect. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin. Eric also writes about software development at http://blog.melski.net/.
Follow me

Share this:

10 responses to “Untangling Parallel Build Logs”

  1. czrpb says:

    Our solution was to create individual log files for each parallel part as it tends to turn out that the parallel parts are owned by different people. Since they only care about “their” output, they get only the info they want.

    The actual hard part — in either of our solutions AFAICT — is tracking down when one parallel part encroaches on another. This is when it is actually important to have the log in the order of execution. In our case you simply cat all logs and sort, since the first field in our log is the date/time stamp.

    • Eric Melski says:

      @czrpb: because it uses a virtual filesystem and some sophisticated (IMHO!) technology, ElectricAccelerator protects you against the kinds of encroachment you describe, so you are guaranteed to have a build that is both fast and correct, even if your dependencies are not completely specified.

      You may also be interested in another article I wrote on this topic for CM Crossroads, which uses a small utility program in conjunction with overriding the SHELL variable in a makefile to ensure that the outputs don’t get mixed up: Descrambing parallel build logs

  2. czrpb says:

    that is cool! we have our own build system (of course!) on top of make (instead of make of makes).

    but we have found people messing around in the $INCLUDE_DIRS!! so how do you catch that stuff? you know rms, mvs, etc?

    • Eric Melski says:

      @czrpb: ElectricAccelerator’s virtual filesystem allows us to intercept all filesystem activity. Each command in the build is essentially run in a sandbox so that we can guarantee that the filesystem state it perceives is identical to what it would have seen if the build were run serially, and so that we can ensure that commands running in parallel cannot interfere with each other. Later, if we detect that two jobs which were run in parallel should not have been, we roll back the modifications made by the latter job and rerun it with the correct filesystem state. Like I said — magic!

  3. czrpb says:

    Hi! Thx! If I might drag this out with (just) one more question: Say you have three units: A, B, & C. Say A & B may be run in parallel, and C depends on A & B. Say both A & B provide a header for C: The same named and pathed header. Say that the correct one from C’s perspective is the one from B. How can the system know which one is right? Obviously, a valid serialized execution might be: A, then B, then C. But equally valid is: B, then A, then C.

    Obviously though only one unit should be providing the header, but what happens if the developers of A and B decide to change who is the owner of this header — from B to A — but B forgets to stop providing it?

    ( OT: Do you have any posts on the parts of a complete build system outside the compile stage/step? This is where most of our issues occur… )

    • Eric Melski says:

      @czrpb: That’s an excellent question. The answer is simple: the correct order is defined by the makefile itself. There may not be explicit dependencies that fully articulate the dependencies, but there is a defined order in which the jobs will run in serial mode, and Accelerator preserves that ordering. For example, if the makefile looks like this:

      There are no explicit dependencies between A, B and C, but there is a well-defined serial order between the jobs: first A, then B, then C and finally “all”. Technically gmake does not explicit guarantee this left-to-right ordering, but that is how gmake has always worked, and there are so many makefiles that are dependent on that ordering that it’s hard to imagine they would change it (although people have suggested doing things like randomizing the order in these cases, for the sole purpose of forcing users to completely express the dependencies).

      So even though both of your alternatives “may be” valid serial orderings, only one is the ordering produced by gmake run in serial mode, and that is the ordering that Accelerator uses.

  4. czrpb says:

    Great! Thx! It is of course the case that if A & B are running in parallel and A is 10x in runtime than B then certain actions during A’s execution — such as the aforementioned mvs or cps — might be executed AFTER similar cmds executed during B, even if A always starts “first” right?

    Anyway, thanks for replying!

    • Eric Melski says:

      @czrpb: Yes, that is true of course, but that’s precisely the sort of thing that Accelerator’s virtual filesystem allows us to detect. Suppose that your makefile is like this:

      In a regular parallel gmake build, A and B might start at the same time, but B will obviously finish first. A then overwrites the file “foo”. When C runs, it ends up printing “fail” instead of “pass” as it would have in a serial build.

      In a parallel build using ElectricAccelerator, we detect that B should have waited until after A finished to run, based on the fact that B was modifying the same file “foo”. Accelerator then declares B to be “in conflict” and runs it again, this time with the correct filesystem context. If C has already run by this point, Accelerator declares it to be in conflict as well (since it also used the wrong filesystem context); C is then run again, with the correct context. Of course all of this happens behind the scenes, transparent to the user. All that the user sees is that the build produces the same result that it would have had it run serially.

  5. czrpb says:

    Ok, ok, i just can not resist: You say “All that the user sees is that the build produces the same result that it would have had it run serially.” But in a realistic situation — not this or any simplified example — there IS no consistent “serial” ordering right? There may be an ordering that occurs in 80% of the cases, but what about the non-insignificant other 20%?

    2nd, I am not understanding why you say “B should have waited” for A? Why not the other way around? “[B]ased on the fact that B [or A] was modifying the same file” works as a justification for either unit to have to wait for the other or have the other wait on it yes?

    I guess if we make this small change to the current example, what would Accelerator — when run in parallel — do and why?

    • Eric Melski says:

      @czrpb: no worries, I enjoy the questions. I think you’re confusing the ordering that occurs when gmake runs the build in parallel and the ordering that occurs when gmake runs the build serially. On your last example, you are absolutely correct in that if you run “gmake -j 2” the order in which A and B finish is completely non-deterministic.

      On the other hand, if you run gmake in serial mode (simply “gmake”, or “gmake -j 1” if you prefer), the order is completely, 100% deterministic: A will always be run first. B will always be run second. This ordering, the ordering you get when you run gmake in serial mode, is what I refer to as the serial order. There is no randomness, no uncertainty, no non-determinism. The order is what it is, and it is defined by the makefile. Accelerator uses that defined ordering to make the decision about which jobs ran with an incorrect filesystem context. Any that are determined to have been run incorrectly are discarded and run again with the correct context.

      If you run your sample makefile with gmake in parallel, the results are non-deterministic, and this is exactly the problem with gmake in parallel mode (unless you have completely correctly articulated all dependencies across the entire build). If you run it with Accelerator, the results are 100% predictable: the user will see “pass” printed on the console, every time.

Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe

Subscribe via RSS
Click here to subscribe to the Electric Cloud Blog via RSS

Subscribe to Blog via Email
Enter your email address to subscribe to this blog and receive notifications of new posts by email.