[Scons-users] modifying/extending SCons to run jobs on Torque cluster

Bill Deegan bill at baddogconsulting.com
Sat Oct 4 00:42:50 EDT 2014


> Is there functionality for directing the scons command to
> use a different .sconsign.db file, and an API for
> reading/merging/writing entries from different files?

No. (Unless of course you write it.. :)

-Bill

On Fri, Oct 3, 2014 at 12:14 PM, Thomas Lippincott <tom.lippincott at gmail.com> wrote:

> It's definitely true that I've gotten a lot out of simply switching
> command-line builders to just submit the command along with some minimal
> configuration.  It sounds like it could be simplified even further, with
> your suggestion of redefining the environment variable(s).
>
> Just submitting the job isn't enough, though: SCons needs to track the
> job so that it doesn't proceed to any targets that depend on it.  One nice
> aspect of Torque is that jobs can have dependencies on other jobs: this
> could be useful if SCons dependencies could be programmatically
> translated to Torque dependencies.
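
As a rough illustration of the translation idea above, here is a minimal sketch that turns a small dependency mapping into an ordered list of qsub command lines, using Torque's "-W depend=afterok:<jobid>" option. It only builds strings and submits nothing. The function name, the "deps"/"scripts" mappings and the JOB_<target> shell variables are all made up for this example, and how the mapping would be extracted from SCons is left open. It needs Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def qsub_commands(deps, scripts):
    """Return qsub command lines in an order that respects the dependencies.

    deps maps a target name to the set of targets it depends on;
    scripts maps each target name to a (hypothetical) job script path.
    Job ids are only known after submission, so each one is captured in a
    shell variable named JOB_<target>; target names must therefore be
    valid shell identifiers.
    """
    commands = []
    # static_order() yields every target after all of its prerequisites.
    for target in TopologicalSorter(deps).static_order():
        cmd = f"JOB_{target}=$(qsub"
        parents = sorted(deps.get(target, ()))
        if parents:
            ids = ":".join(f"$JOB_{p}" for p in parents)
            # afterok: start only once the listed jobs finished successfully
            cmd += f" -W depend=afterok:{ids}"
        cmd += f" {scripts[target]})"
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    deps = {"link": {"a", "b"}, "a": set(), "b": set()}
    scripts = {"a": "a.sh", "b": "b.sh", "link": "link.sh"}
    print("\n".join(qsub_commands(deps, scripts)))
```

Note that this only hands the ordering over to Torque; SCons would still have to learn when each job has finished before it can treat the target as built.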
>
> Torque does have extensive resource-request functionality, so this can
> help mitigate jobs being too small.
>
> To me, the great appeal of deeper changes is the ability to use
> non-command-line builders without modification.  I still like the idea
> of running separate SCons invocations on each node, each building a
> fragment of the dependency tree, but I'd need to stop them from
> contending for access to the .sconsign.db file, and settle up on the
> main node.  Is there functionality for directing the scons command to
> use a different .sconsign.db file, and an API for
> reading/merging/writing entries from different files?
>
> -Tom
>
> On 10/03/2014 12:13 PM, Bill Deegan wrote:
> > Does Torque allow you to request resources? (This is one way we select
> > which type of node to run jobs on in SGE.)
> > If so, you could just specify the "BIGJOB" resource, mark only certain
> > nodes as having it, and request it when running the "big" jobs.
> >
> > -Bill
> >
> > On Fri, Oct 3, 2014 at 2:57 AM, Dirk Bächle <tshortik at gmx.de> wrote:
> >
> >> Hi Thomas,
> >>
> >> I'd like to basically second what Bill said. On a technical level, you can
> >> certainly subclass/rewrite the Node/Taskmaster classes...and there have
> >> been requests for more info about it in the past. But it's an awful lot of
> >> work, and all the people who wanted to try anyway seem to have given up
> >> at some point.
> >>
> >> (* switching to meta-level mode *)
> >> My understanding of your problem/project is that you are trying to use
> >> SCons as a "driver" for your scheduling system. In a way, you want to
> >> "traffic-shape" the single build processes, to let them run on a
> >> multiprocessor machine/cluster (I tinkered with openPBS on a 48-core
> >> Linux cluster some years ago).
> >>
> >> If your build process is based on files and their dependencies, the
> >> current Node class and the Taskmaster should provide all the information
> >> you need, for deciding whether a single part of the project has to be
> >> rebuilt or not. The Taskmaster already prepares info packets for you, in
> >> the form of the Job class instances, that then only have to be executed
> >> somewhere.
> >>
> >> And this is probably the best place where your extension could come into
> >> play. You could try to derive from the "Job" class and extend it, such that
> >> it is also able to run a single build step via your scheduler system (you
> >> seem to have that ready in your custom Builder).
> >> Quoting a part of your original email:
> >>
> >> On 02.10.2014 21:24, Thomas Lippincott wrote:
> >>
> >>> [...]
> >>>
> >>> I would like to do something like subclassing the TaskManager to be able
> >>> to examine the dependency tree and choose subtrees to submit as Torque
> >>> jobs.
> >>>
> >> This is where the real problem is: deciding which nodes to build via
> >> Torque (or another scheduler), and which not, is super hard. You're trying
> >> to implement a second scheduler...which requires you to "add more
> >> information" to the system. I don't think you'll be able to compute an
> >> efficient schedule (taking the actual cluster/machine where things are
> >> executed into account) just by traversing the dependency tree.
> >>
> >> I'll go one step further and state that I wouldn't touch this problem with
> >> a ten-foot pole.
> >> (* meta-level mode off *)
> >>
> >> Instead, I would stick to manually marking the nodes that are eligible for
> >> scheduling, within the SConscripts. You could write a
> >> wrapper/decorator method like:
> >>
> >>   prog = TorqueJob(env.Program('main', Glob('*.cpp')),
> >>                    required_mem="2GB", ..., other keys)
> >>
> >> for "tagging" the target node "main" in this case. The Node class already
> >> has the member "attributes", which you can use to store meta-information
> >> about it (this is what the Java builder does, for example).
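
A minimal sketch of what such a tagging wrapper might look like. So that it runs without SCons installed, it uses a stand-in node class that models only the "attributes" member mentioned above; whether a real SCons node accepts arbitrary attributes in the same way is an assumption to verify. The attribute name "torque" and the helper torque_request() are invented for illustration.

```python
from types import SimpleNamespace


class StandInNode:
    """Stand-in for an SCons node; only models the 'attributes' member."""

    def __init__(self, name):
        self.name = name
        self.attributes = SimpleNamespace()


def TorqueJob(targets, **resources):
    """Tag each target node with resource requests and return the targets.

    Accepts a single node or a list of nodes (builders return lists).
    """
    if hasattr(targets, "attributes"):  # a single node was passed
        targets = [targets]
    for node in targets:
        # 'torque' is a made-up attribute name used only by this sketch
        node.attributes.torque = dict(resources)
    return targets


def torque_request(node):
    """Return the tagged resources, or None if the node was never tagged."""
    return getattr(node.attributes, "torque", None)


if __name__ == "__main__":
    main = StandInNode("main")
    TorqueJob([main], required_mem="2GB")
    print(torque_request(main))
    print(torque_request(StandInNode("untagged")))
```

The code that actually executes the build step could then call something like torque_request() on the current target and fall back to a local build when it returns None.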
> >>
> >> In your custom Job class you can then take the final steps to check
> >> whether the current target should be built via Torque (or locally, if the
> >> overall load on the cluster is already too high), based on the meta-info
> >> given by the user. Then set up the correct environment for this before
> >> scheduling the actual command-line action.
> >> Just keep in mind that the Taskmaster expects these single Job executions
> >> to be blocking. So you start a build step, and when it has finished
> >> executing, it has either completed (target got built) or failed. This
> >> behaviour is crucial to the currently implemented algorithm...
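
To illustrate the blocking requirement, here is a small sketch of a submit-and-wait helper. The submit() and poll() callables are hypothetical hooks that would wrap the scheduler's own commands (for Torque, presumably qsub and qstat); nothing here is SCons or Torque API, and the polling interval is an arbitrary choice.

```python
import time


def run_blocking(submit, poll, interval=5.0, sleep=time.sleep):
    """Submit one job and block until the scheduler reports it finished.

    submit() must return a job id.
    poll(job_id) must return None while the job is queued or running,
    and the job's integer exit status once it has finished.
    Returns True if the job exited with status 0, False otherwise.
    """
    job_id = submit()
    while True:
        status = poll(job_id)
        if status is not None:
            return status == 0
        sleep(interval)  # wait before asking the scheduler again


if __name__ == "__main__":
    # Fake scheduler: reports "still running" twice, then exit status 0.
    states = iter([None, None, 0])
    ok = run_blocking(lambda: "42.server", lambda job: next(states),
                      interval=0.01)
    print("built" if ok else "failed")
```

From the caller's point of view this behaves like a local command: it returns only once the build step has either succeeded or failed.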
> >>
> >> So much for my thoughts, I hope it gives you a few new ideas.
> >>
> >> Best regards,
> >>
> >> Dirk
> >>
> >>
> >> _______________________________________________
> >> Scons-users mailing list
> >> Scons-users at scons.org
> >> https://pairlist4.pair.net/mailman/listinfo/scons-users