ElectricCommander Feature Spotlight: The Batch API

This is the first in a series of spotlights on ElectricCommander features that you may not be using as much as you could, or may not even know about. In this installment, I illustrate why the Batch API is an important tool in your arsenal as a procedure developer and show how to use it.

What is the Batch API?

In a typical ec-perl script, you issue requests to the server one at a time and process each result before issuing the next request. For example:

[sourcecode language="perl"]
my $xpath = $ec->getProperty("foo");
my $fooVal = $xpath->findvalue("//value")->value();
# ... some logic with $fooVal ...
$xpath = $ec->getProperty("bar");
my $barVal = $xpath->findvalue("//value")->value();
# ... some logic with $barVal ...
[/sourcecode]

Each of these requests is a round-trip to the server. Thus, the script blocks twice waiting for responses. The Batch API allows you to aggregate multiple requests, send them all together in a batch, and wait for the aggregated response. The example above could be implemented like this:

[sourcecode language="perl"]
my $batch = $ec->newBatch("parallel");
my $fooReqId = $batch->getProperty("foo");
my $barReqId = $batch->getProperty("bar");
$batch->submit();

# Get the value of the foo property from its response. We can't be
# lazy and do the same "//value" xpath query as before because
# that would search from the root of the entire xml response,
# which includes *both* responses. Instead, we do a relative
# query starting at the appropriate "response" node.
my $fooVal = $batch->findvalue($fooReqId, "property/value")->value();
# ... some logic with $fooVal ...
my $barVal = $batch->findvalue($barReqId, "property/value")->value();
# ... some logic with $barVal ...
[/sourcecode]

The Batch API addresses three shortcomings of the traditional API:

  1. It allows you to “batch up” multiple requests, reducing the number of round-trips to the server.
  2. It allows you to run multiple requests in parallel, as shown above in the getProperty example.
  3. It allows you to run multiple requests in a single transaction: either all succeed, or no changes are persisted on the server.

Each of these capabilities corresponds to a batch “mode” specified when creating the batch object:

serial
Run the requests in the batch one after another, serially, each in its own transaction. If a request in the batch fails, the other requests still run. Each individual response indicates success or failure for that request.
parallel
Run the requests in the batch in parallel, each in its own transaction. If a request in the batch fails, the other requests still run. Each individual response indicates success or failure for that request.
single
Run the requests in the batch one after another, serially, all in the same transaction. If a request in the batch fails, the transaction is rolled back, reverting every request in the batch. Walk through the responses to find the one that failed and why; all subsequent responses carry a generic “batch failed” error, indicating that those requests never ran because of the earlier error.
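As a rough sketch of “single” mode, here is a hypothetical batch that sets two related properties atomically; the property paths and values are made up for illustration, so adapt them to your own project:

```perl
use ElectricCommander;

# The usual ec-perl server handle.
my $ec = ElectricCommander->new();

# "single" mode: all requests run serially in one transaction.
my $batch = $ec->newBatch("single");

# Both of these succeed together, or neither is persisted. If the
# second request fails, the first is rolled back as well.
my $req1 = $batch->setProperty("/projects/MyProject/buildNumber", "42");
my $req2 = $batch->setProperty("/projects/MyProject/lastBuildStatus", "success");

$batch->submit();

# On failure, walk the responses by request id to find the one that
# carries the real error; later responses report the generic
# "batch failed" error described above.
```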

Performance Comparison

The following tests show how much performance gain the Batch API provides in various scenarios.

Procedure Creation Test

In this test, we create procedures of various sizes with the Batch API (single-transaction mode) and without it. All ec-perl scripts establish an SSL connection with the server when submitting the first message, regardless of whether that first message is an ordinary request or a batch of requests. The test results reported below factor out this cost by issuing an innocuous getServerInfo call first, then measuring the time it takes to issue the createStep requests.
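The batched side of this test can be sketched roughly as follows; the project and procedure names are placeholders, and the step commands are deliberately trivial:

```perl
use ElectricCommander;

my $ec = ElectricCommander->new();

# Warm up the connection so the SSL handshake cost is factored out.
$ec->getServerInfo();

# One round trip, one transaction: the procedure and all of its steps
# are created together, or not at all.
my $batch = $ec->newBatch("single");
$batch->createProcedure("MyProject", "BigProcedure");
for my $i (1 .. 100) {
    $batch->createStep("MyProject", "BigProcedure", "step$i",
                       { command => "echo step $i" });
}
$batch->submit();
```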

This graph shows that using the Batch API reduces procedure creation time by about 50% for medium to large procedures, while the time difference is negligible (less than 1 second) for procedures with 10 steps. In this scenario, the local agent on the Commander server issued the createStep requests in a job. It’s interesting to note what happens if the machine issuing the requests is on a WAN; to emulate that, I ran the test from a home network VPNed into Electric Cloud, with a Commander server in our office:

As one would expect, the round-trip costs are somewhat larger now, so using the Batch API reduces procedure creation time by about 66%. Again, there is not a lot of impact for small procedures (in this case, it took about 1 second to create such a procedure either way). Also, because the Batch API uses one round trip to create the procedure, the WAN and local-agent numbers are virtually the same, showing that most of the time is spent in server processing and the I/O costs are negligible in comparison:

Parallel GetProperty Test

In this test, we time how long it takes to get varying numbers of properties with a parallel batch and without the Batch API. As in the procedure creation test, we factor out the cost of establishing the SSL connection to the server.

The difference here isn’t as dramatic as in the procedure creation case, but it’s still quite significant for medium and large numbers of getProperty calls: roughly a 30% time savings when getting properties on the local agent, and about 55% savings when getting properties over a WAN.
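To give a feel for the batched side of this test, here is a sketch that fetches N properties in one parallel batch; the property names are hypothetical:

```perl
use ElectricCommander;

my $ec = ElectricCommander->new();

# Hypothetical property names to fetch.
my @names = map { "prop$_" } (1 .. 50);

# Issue all getProperty requests in one round trip; the server runs
# them in parallel. Remember each request id so we can find its
# response node later.
my $batch = $ec->newBatch("parallel");
my %reqIds = map { $_ => $batch->getProperty($_) } @names;
$batch->submit();

# Pull each value out of its own response node with a relative query.
my %values;
for my $name (@names) {
    $values{$name} = $batch->findvalue($reqIds{$name}, "property/value")->value();
}
```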

Conclusions

The Batch API provides a significant performance benefit in both LAN and WAN environments for medium and large sets of requests. For small sets, the benefit is not very high, but it’s certainly no worse than issuing requests one at a time. Also, for scripts that contact the server repeatedly with a few requests each time, the small savings from each batch can add up. Of course, if it’s possible to restructure such scripts into a few large batches of requests instead, you’ll achieve the best speed-up.

For more details and examples, see the Using ectool and the Commander API section of online help.

Sandeep Tamhankar

Sandeep Tamhankar is Principal Engineer in Ship.io, Electric Cloud's mobile CI/CD cloud service.

He joined the Electric Cloud engineering team in 2003 to work on various facets of emake and the Accelerator Cluster Manager and wrote the first version of ElectricRunner. Sandeep was a member of the team that began to design and implement ElectricCommander in 2005, where he wrote the Commander Agent and implemented some of the core features of the Commander server.

Sandeep holds a BS in Computer Science from Cornell University and an MS in Computer Science from Stanford University.
