[Scons-users] Ensuring complete dependency coverage

William Deegan bill at baddogconsulting.com
Tue Jul 1 02:01:45 EDT 2014


Karl,

There are already scanners for Java and many other languages.
This is the preferred method to handle dependencies in SCons, and one of the reasons that SCons typically does better at handling these issues than other build systems.

Here’s a list of community-supported Builders(), which include scanners, in addition to those that are part of the SCons distribution.

http://scons.org/wiki/CustomBuilders
and
http://scons.org/wiki/ToolsIndex

-Bill

On June 30, 2014 at 7:50:09 PM, Karl Ostmo (kostmo at gmail.com) wrote:

Hi Bill,

Thanks for the link, looks like an interesting project.  Yes, the Python example is a real one, but is not the only language in our codebase for which this problem occurs.

For instance, we also have Bash scripts, Perl scripts, Java, etc.  I'm not sure that writing a custom Scanner for each language is a scalable approach to guard against dependency omission in all of them.

Karl


On Mon, Jun 30, 2014 at 3:51 PM, Bill Deegan <bill at baddogconsulting.com> wrote:
Karl,

Welcome to complicated SCons builds! You're in good company.

Assuming your Python example is a real one for you, then ideally there would be a Python source scanner that runs on the script you execute in your build. If additional dependencies are added to that script, the scanner would pick them up and the script's signature would change. Any change to the script or its dependencies would then cause the script to be executed again and regenerate whatever it produces.

So (IMHO) the "right" way to resolve your issue is to make sure it never happens rather than trying to check for errors.
A scanner would do this.

I don't think there is a Python Builder (which would have an attached scanner) at this point, though perhaps such a scanner wouldn't be too hard to make using this:
http://furius.ca/snakefood/
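
As a rough illustration of what the scanning step of such a Python scanner might do, here is a minimal sketch that uses only the standard library "ast" module rather than snakefood. The function name find_local_imports is hypothetical, and the sketch only resolves modules that sit next to the script as plain .py files. Wiring it into SCons (for example through a Scanner object whose function calls it on str(node)) is an assumption that would need checking against the SCons documentation.

```python
import ast
import os


def find_local_imports(script_path):
    """Return sorted paths of sibling .py files that script_path imports."""
    with open(script_path) as handle:
        tree = ast.parse(handle.read(), filename=script_path)

    module_names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Handles "import helper" and "import pkg.helper".
            module_names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Handles "from helper import make_artifact" (absolute form only).
            module_names.add(node.module)

    base_dir = os.path.dirname(os.path.abspath(script_path))
    found = []
    for name in sorted(module_names):
        # Map "pkg.helper" to <base_dir>/pkg/helper.py and keep it if present.
        candidate = os.path.join(base_dir, *name.split(".")) + ".py"
        if os.path.isfile(candidate):
            found.append(candidate)
    return found
```

Standard-library and third-party imports are ignored here simply because no matching file exists next to the script. Relative imports, packages with __init__.py, and transitive imports are not handled.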

-Bill



On Mon, Jun 30, 2014 at 10:37 AM, Karl Ostmo <kostmo at gmail.com> wrote:
We would like to make use of the SCons cache to speed up some of our continuous integration builds.  However, if a target's dependencies are not fully declared in SCons, it is possible that SCons may retrieve a stale artifact from the cache when a user changes an "undocumented" dependency.

As a concrete example, say that a Python script generates some artifact.  In the SConstruct file, the Python script is declared as a dependency of the artifact, and the cache works fine.  If the Python script is changed, then the existing cached artifact will not be used, and a new artifact will be placed in the cache instead, as expected.

However, say that someone decides to refactor the Python script by splitting it into two files.  The first Python file imports a function from the second Python file.  That function materially affects the generated artifact.  Unfortunately, the author forgets to declare the second Python file as a dependency in the SConstruct.  Next time a build is executed after the second script is changed, an improper, stale artifact is retrieved from the cache (using the function from before the change).
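
The failure mode described above can be sketched with a toy model. This is not how SCons computes its signatures internally. It only assumes that the cache key is derived from the contents of the declared dependencies, and the file names and contents are made up for illustration.

```python
import hashlib


def cache_key(declared_dependencies, file_contents):
    """Toy cache key: hash only the files that were declared as dependencies."""
    digest = hashlib.sha256()
    for name in sorted(declared_dependencies):
        digest.update(name.encode("utf-8"))
        digest.update(file_contents[name].encode("utf-8"))
    return digest.hexdigest()


# Hypothetical file contents before and after an edit to the helper module.
before = {"generate.py": "from helper import render\n", "helper.py": "def render(): return 1\n"}
after = {"generate.py": "from helper import render\n", "helper.py": "def render(): return 2\n"}

incomplete = ["generate.py"]             # helper.py forgotten in the SConstruct
complete = ["generate.py", "helper.py"]  # helper.py declared

# Same key despite the edit: a cache keyed this way would serve the old artifact.
stale_hit = cache_key(incomplete, before) == cache_key(incomplete, after)
# Different key once helper.py is declared: the artifact would be rebuilt.
rebuilt = cache_key(complete, before) != cache_key(complete, after)
print(stale_hit, rebuilt)  # prints: True True
```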

What is a good strategy to catch and prevent such file omissions from the dependency tree?  My first thought was to query SCons for the set of all "leaf" (non-derived) dependencies, then delete everything* that is not in that list (let's call these "superfluous files") before invoking the scons build.  Unfortunately, all of the SCons{truct,script} files, in addition to all files in the "site_scons" directory are deleted via this method, since those files are not tracked as build dependencies.  Without those files, we can't invoke the scons build.

*By "everything", I mean everything checked out from our Git repo.  Everything needed to build the final artifacts is stored in the Git repo, except for a few very "stable" dependencies (e.g. the Python binary itself).
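
The first approach can be sketched as follows, assuming the dependency tree has already been exported from SCons into a plain dictionary mapping each node to its direct dependencies. How that export is done is not shown, and the names leaf_dependencies and superfluous_files, along with the example file names, are hypothetical.

```python
def leaf_dependencies(graph, targets):
    """Collect nodes reachable from targets that have no dependencies of their own."""
    leaves, seen, stack = set(), set(), list(targets)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        children = graph.get(node, [])
        if children:
            stack.extend(children)
        else:
            leaves.add(node)
    return leaves


def superfluous_files(tracked_files, graph, targets):
    """Tracked files that are not leaf dependencies of any requested target."""
    return sorted(set(tracked_files) - leaf_dependencies(graph, targets))


graph = {"app": ["main.o"], "main.o": ["main.c", "main.h"]}
tracked = ["main.c", "main.h", "SConstruct", "site_scons/site_init.py", "notes.txt"]
print(superfluous_files(tracked, graph, ["app"]))
# prints: ['SConstruct', 'notes.txt', 'site_scons/site_init.py']
```

The output shows the problem described above: SConstruct and the site_scons files land in the deletion list because they are not nodes in the dependency graph.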

My second approach was to manually whitelist all files that SCons might need, protecting them from the deletion step.  The whitelist might use a heuristic, such as all files named "SConstruct" or "SConscript", in addition to all files inside the "site_scons" directory.  However, there are at least two problems with this approach:
(1) Not all of our "SConscript" files actually have that exact name; one of them might be named "SCons-unit-tests" and another might be "SCons-application".  These exceptions must be manually maintained.
(2) Someone might end up storing a Python script inside the "site_scons" directory and use it as part of an actual build step.  If we spare all content of the "site_scons" directory from deletion, we won't be able to catch the case where someone forgets to specify that file as a build dependency.
(1) and (2) defeat the point of this deletion exercise, which is to catch omissions in files manually maintained by human authors.
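
A minimal sketch of such a whitelist heuristic, using the standard "fnmatch" module, makes both problems visible. The pattern list, the is_protected name, and the example paths are assumptions for illustration.

```python
import fnmatch

# Heuristic whitelist: SCons entry files by name, plus everything in site_scons.
PROTECTED_PATTERNS = ["SConstruct", "SConscript", "site_scons/*"]


def is_protected(path, patterns=PROTECTED_PATTERNS):
    """True if the repo-relative path or its base name matches a protected pattern."""
    base_name = path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatchcase(path, pattern) or fnmatch.fnmatchcase(base_name, pattern)
        for pattern in patterns
    )


print(is_protected("src/SConscript"))          # prints: True
print(is_protected("tests/SCons-unit-tests"))  # prints: False, problem (1)
print(is_protected("site_scons/gen_step.py"))  # prints: True, problem (2)
```

The second call shows an oddly named SConscript file that would be deleted unless it is added by hand, and the third shows a build-step script that is shielded from the check only because of where it lives.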

My third approach was to attempt to delete all "superfluous files" while the scons build was running, as part of the SConstruct script.  In that case, it doesn't matter whether we delete any of the SConscript files, because they have already been parsed and are running from memory.  That way we don't need to bother with maintaining a "whitelist" to protect certain files.  We have a toplevel SConstruct file, and at the very end of that file is where the list of superfluous files is computed and then deleted.  At that point, scons is still in the "parsing" phase, but then immediately continues on to the "build" phase.

This third approach seems to "mostly" work, but I have observed some peculiar behavior regarding the computation of implicit dependencies (C header files) normally found by a Scanner.  Sometimes when I perform the deletion step, a few of the header files go missing from the dependency list.  I fear that there may be some computation running *after* my deletion step but *before* the "parsing" phase is complete, and I am interfering with it.

I would prefer that there was a way to make this deletion step happen during the "build" phase.  At that point, all of the implicit header dependencies should have been computed.  However, I'm not sure of a good way to declare a certain build step to be absolutely "first" -- before any other build step.  I do have a list of the "leaves" (upstream-most dependencies) already, and I thought that maybe I could rig the desired behavior by making those files depend on a deletion Command.  However, I recall things not working properly in other situations when the "target" of a build step already exists as a file checked in to one's repo.  In that case, there is no build step required, because the targets are already present on the filesystem.

Karl



_______________________________________________
Scons-users mailing list
Scons-users at scons.org
http://four.pairlist.net/mailman/listinfo/scons-users


