[Scons-users] Using scons with SGE

Peter Kerpedjiev pkerpedjiev at gmail.com
Sun Nov 30 14:13:56 EST 2014


Hi,

I would really like to use scons with the Sun Grid Engine and I'd like 
to ask you all what the best way to do so is. The issue is chaining 
build steps. For the most part, I use a series of Commands to create a 
pipeline for data analysis. So in the end I would have something like this:

inputs = ['in1.txt', 'in2.txt', 'in3.txt', 'in4.txt']
outputs = ['out1.txt', 'out2.txt', 'out3.txt', 'out4.txt']

for input, output in zip(inputs, outputs):
    Command(target=output, source=input, action='blah $SOURCE > $TARGET')

Command(target='sum.txt', source=outputs, action='blah2 $SOURCES > $TARGET')


So the easiest way to run this on SGE is to prefix the command 'blah'
with the submit command (i.e. 'qsub blah $SOURCE > $TARGET'). This
creates a problem, however, since 'qsub' is asynchronous and returns
immediately, so scons considers the step done before the job has
produced its output. We can make it synchronous using the '-sync y'
option, but then we are back to the original problem: scons waits for
each job to finish before starting the next, so nothing runs in
parallel. The final option is to make it synchronous and run scons with
a number of parallel jobs ('-j') roughly equal to the number of slots
available on the grid engine. This more or less works, but seems a
little bit hacky.
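
For concreteness, that last variant might look roughly like the sketch below. It is only an illustration under some assumptions: 'blah' and 'blah2' are the placeholder tools from above, the helper names qsub_sync and declare_pipeline are made up for this example, and the qsub flags are the ones I believe SGE provides ('-sync y' to block until the job ends, '-b y' to treat the command as a binary rather than a job script, '-cwd' to run in the current directory, '-o' to send the job's stdout to a file). The '-o $TARGET' replaces the shell redirect, since '> $TARGET' would otherwise capture the output of qsub itself rather than that of the job.

```python
# Sketch only: 'blah' and 'blah2' are placeholder tools, and the helper
# names below are hypothetical. The qsub flags are assumed to behave as in
# SGE: -sync y blocks until the job ends, -b y treats the command as a
# binary, -cwd runs in the current directory, -o FILE receives job stdout.

def qsub_sync(command, stdout='$TARGET'):
    """Wrap a command line so that it runs as a blocking SGE job."""
    return 'qsub -sync y -b y -cwd -o %s %s' % (stdout, command)


def declare_pipeline(Command, inputs, outputs, summary='sum.txt'):
    """Declare the pipeline through the Command function passed in.

    In an SConstruct, pass the Command function that SCons provides.
    """
    # One independent step per input file; these can run in parallel.
    for src, tgt in zip(inputs, outputs):
        Command(target=tgt, source=src, action=qsub_sync('blah $SOURCE'))
    # The summary step depends on every output, so scons runs it last.
    Command(target=summary, source=outputs, action=qsub_sync('blah2 $SOURCES'))


# In the SConstruct itself one would call:
#   declare_pipeline(Command,
#                    ['in1.txt', 'in2.txt', 'in3.txt', 'in4.txt'],
#                    ['out1.txt', 'out2.txt', 'out3.txt', 'out4.txt'])
# and then run 'scons -j 4', so that up to four blocking qsub calls are
# in flight at once while the summary step waits for all of them.
```

Each scons job slot then just sits waiting on one qsub call, which is why this feels like a workaround to me rather than a real integration.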

My question is, is there a preferred way of submitting jobs to a grid 
engine with scons such that dependent steps wait on each other and 
parallel steps are submitted at once?

-Peter

