[Scons-users] Ensuring complete dependency coverage
Karl Ostmo
kostmo at gmail.com
Mon Jun 30 13:37:59 EDT 2014
We would like to make use of the SCons cache to speed up some of our
continuous integration builds. However, if a target's dependencies are not
fully declared in SCons, it is possible that SCons may retrieve a stale
artifact from the cache when a user changes an "undocumented" dependency.
As a concrete example, say that a Python script generates some artifact.
In the SConstruct file, the Python script is declared as a dependency of
the artifact, and the cache works fine. If the Python script is changed,
then the existing cached artifact will not be used, and a new artifact will
be placed in the cache instead, as expected.
However, say that someone decides to refactor the Python script by
splitting it into two files. The first Python file imports a function from
the second Python file. That function materially affects the generated
artifact. Unfortunately, the author forgets to declare the second Python
file as a dependency in the SConstruct. The next time a build is executed
after the second file is changed, a stale artifact is retrieved from the
cache (one generated by the function as it was before the change).
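
To make the failure mode concrete, here is a toy model of a cache keyed only on declared dependencies. It is a minimal sketch and not how SCons actually computes its signatures; the file names (`generate.py`, `helper.py`) and the `cache_key` function are hypothetical and exist only for illustration.

```python
import hashlib

def cache_key(declared_deps, contents):
    """Toy cache key: a hash over the contents of the declared dependencies only."""
    digest = hashlib.sha256()
    for path in sorted(declared_deps):
        digest.update(path.encode("utf-8"))
        digest.update(contents[path].encode("utf-8"))
    return digest.hexdigest()

# Hypothetical file contents after the script was split into two files.
contents = {
    "generate.py": "from helper import scale\nprint(scale(21))\n",
    "helper.py": "def scale(x):\n    return x * 2\n",
}

declared = ["generate.py"]               # helper.py was forgotten
complete = ["generate.py", "helper.py"]  # what should have been declared

key_before = cache_key(declared, contents)
full_key_before = cache_key(complete, contents)

# Someone changes the helper; generate.py itself is untouched.
contents["helper.py"] = "def scale(x):\n    return x * 3\n"

# With the incomplete declaration the key is unchanged, so the old artifact is reused.
stale_hit = cache_key(declared, contents) == key_before
# With the complete declaration the key changes, forcing a rebuild.
correct_miss = cache_key(complete, contents) != full_key_before
```

In this model the change to `helper.py` is invisible to the key whenever `helper.py` is missing from the declared list, which is exactly the omission described above.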
What is a good strategy to catch and prevent such file omissions from the
dependency tree? My first thought was to query SCons for the set of all
"leaf" (non-derived) dependencies, then delete everything* that is not in
that list (let's call these "superfluous files") before invoking the scons
build. Unfortunately, all of the SCons{truct,script} files, as well as
all files in the "site_scons" directory, are deleted by this method, since
those files are not tracked as build dependencies. Without those files, we
can't invoke the scons build.
*By "everything", I mean everything checked out from our Git repo.
Everything needed to build the final artifacts is stored in the Git repo,
except for a few very "stable" dependencies (e.g. the Python binary itself).
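
The set computation in this first approach can be sketched as follows. This is a minimal sketch that assumes the list of leaf dependencies has already been obtained from SCons by some means not shown here; the function names `list_repo_files` and `superfluous_files` are hypothetical.

```python
import os

def list_repo_files(root):
    """Return the relative paths of all files under root, skipping the .git directory."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the Git metadata directory in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found

def superfluous_files(root, leaves):
    """Return the checked-out files that are not in the leaf dependency list.

    `leaves` is assumed to hold paths relative to `root`.
    """
    normalized = {os.path.normpath(p).replace(os.sep, "/") for p in leaves}
    return sorted(list_repo_files(root) - normalized)
```

Note that a result computed this way would include the SConstruct file itself, since it is not a leaf dependency of any target, which is the problem described above.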
My second approach was to manually whitelist all files that SCons might
need, protecting them from the deletion step. The whitelist might use a
heuristic such as: all files named "SConstruct" or "SConscript", plus
all files inside the "site_scons" directory.
However, there are at least two problems with this approach:
(1) Not all of our "SConscript" files actually have that exact name; one of
them might be named "SCons-unit-tests" and another might be
"SCons-application". These exceptions must be manually maintained.
(2) Someone might end up storing a Python script inside the "site_scons"
directory and use it as part of an actual build step. If we spare all
content of the "site_scons" directory from deletion, we won't be able to
catch the case where someone forgets to specify that file as a build
dependency.
(1) and (2) defeat the point of this deletion exercise, which is to catch
omissions in files manually maintained by human authors.
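
Both problems show up directly in a sketch of the whitelist heuristic. This is a minimal illustration assuming repo-relative paths with forward slashes; the function name `is_protected` and the example paths are hypothetical.

```python
def is_protected(rel_path):
    """Whitelist heuristic: spare SConstruct/SConscript files and anything under site_scons."""
    parts = rel_path.split("/")
    # Any file located inside a directory named "site_scons" is spared.
    if "site_scons" in parts[:-1]:
        return True
    # Otherwise only the two exact file names are spared.
    return parts[-1] in ("SConstruct", "SConscript")

# Problem (1): a differently named SConscript file is not protected.
missed_script = is_protected("tests/SCons-unit-tests")

# Problem (2): a build-step script stored in site_scons is protected,
# so a missing dependency declaration for it would go unnoticed.
hidden_omission = is_protected("site_scons/make_table.py")
```

Here `missed_script` comes out false and `hidden_omission` comes out true, matching cases (1) and (2) respectively.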
My third approach was to attempt to delete all "superfluous files" while
the scons build was running, as part of the SConstruct script. In that
case, it doesn't matter whether we delete any of the SConscript files,
because they have already been parsed and are running from memory. That
way we don't need to bother with maintaining a "whitelist" to protect
certain files. We have a toplevel SConstruct file, and at the very end of
that file is where the list of superfluous files is computed and then
deleted. At that point, scons is still in the "parsing" phase, but then
immediately continues on to the "build" phase.
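
The deletion step itself might look like the helper below, which would be called at the very end of the toplevel SConstruct. This is a minimal sketch: the name `delete_superfluous` and the `dry_run` flag are hypothetical, and computing the list of superfluous files from SCons is not shown.

```python
import os

def delete_superfluous(root, superfluous, dry_run=True):
    """Remove each listed file (paths relative to root) and return the paths handled.

    With dry_run=True nothing is deleted; the function only reports what it would remove.
    """
    handled = []
    for rel in superfluous:
        full = os.path.join(root, rel)
        # Skip entries that are already gone or are not regular files.
        if not os.path.isfile(full):
            continue
        if not dry_run:
            os.remove(full)
        handled.append(rel)
    return handled
```

Running with `dry_run=True` first gives a chance to inspect the list before anything is actually removed from the checkout.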
This third approach seems to "mostly" work, but I have observed some
peculiar behavior regarding the computation of implicit dependencies (C
header files) normally found by a Scanner. Sometimes when I perform the
deletion step, a few of the header files go missing from the dependency
list. I fear that there may be some computation running *after* my
deletion step but *before* the "parsing" phase is complete, and I am
interfering with it.
I would prefer that there was a way to make this deletion step happen
during the "build" phase. At that point, all of the implicit header
dependencies should have been computed. However, I'm not sure of a good
way to declare a certain build step to be absolutely "first" -- before any
other build step. I do have a list of the "leaves" (upstream-most
dependencies) already, and I thought that maybe I could rig the desired
behavior by making those files depend on a deletion Command. However, I
recall things not working properly in other situations when the "target" of
a build step already exists as a file checked in to one's repo. In that
case, there is no build step required, because the targets are already
present on the filesystem.
Karl