[Scons-users] CacheDir race during parallel Windows builds?

William Blevins wblevins001 at gmail.com
Thu Aug 4 16:12:53 EDT 2016


Andrew,

I haven't gone through the links in detail, but something that *might* be
related:
https://bitbucket.org/scons/scons/pull-requests/347/avoid-using-__slots__-on-node-and-executor/diff

The link above is to a recent patch that caught several cases of files
being opened without using the "with <file> as <name>" construct to
explicitly close files after use in SCons/Node/__init__.py and
SCons/Scanner/c.py. Leaving files open this way might delay file handle
cleanup (especially on Windows, which tends to do some odd file buffering
IMHO). You may want to clone the latest and see if that makes a difference.
Ideally, the latest is still functional with all the 2->3 code changes. Or
consider just monkey patching those spots directly, for brevity's sake.
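
To illustrate the pattern I mean, here is a minimal sketch. It is not the
actual SCons code: read_contents is a made-up helper, and the temporary
".obj" file only stands in for a real build artifact. The point is that the
handle is closed as soon as the "with" block exits, rather than whenever the
garbage collector gets around to it.

```python
import os
import tempfile


def read_contents(path):
    """Hypothetical helper: read a file and close its handle deterministically.

    Returns the data plus the (now closed) file object so the caller can
    check that the handle really was released.
    """
    with open(path, "rb") as f:
        data = f.read()
    # Leaving the with block closes f, even if read() had raised.
    return data, f


if __name__ == "__main__":
    # Create a throwaway file standing in for a cached object file.
    fd, path = tempfile.mkstemp(suffix=".obj")
    os.close(fd)
    with open(path, "wb") as out:
        out.write(b"fake object file")

    data, handle = read_contents(path)
    print("handle closed:", handle.closed)

    # With no handle left open, removing the file is safe on any platform.
    os.remove(path)
```

Whether a lingering handle is actually what breaks your link step is only a
guess on my part, but the "with" form at least rules it out.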

Hope that helps,
William

On Thu, Aug 4, 2016 at 3:20 PM, Andrew C. Morrow <andrew.c.morrow at gmail.com>
wrote:

>
> Hi -
>
> At MongoDB, we recently started using CacheDir in our CI system. This has
> been a big success for reducing rebuild times for our Linux builds.
> However, we were surprised to find that our Windows builds started failing
> in a very alarming way:
>
> Please see the following log file:
> <https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_windows_64_2k8_debug_compile_81185a50aeed5b2beed2c0a81b381a482489fdb7_16_08_02_20_24_46/0?type=T>
>
> The log lines of interest are:
>
> [2016/08/02 17:31:09.642] Retrieved `build\cached\mongo\base\data_type_terminated_test.obj' from cache
>
> Here, we see that we retrieved this .obj file from the cache. Nine seconds
> later, we try to use that object in a link step:
>
> [2016/08/02 17:31:18.921] link /nologo /DEBUG /INCREMENTAL:NO
> /LARGEADDRESSAWARE /OPT:REF /OUT:build\cached\mongo\base\base_test.exe
> build\cached\mongo\base\data_range.obj ...   build\cached\mongo\base\data_type_terminated_test.obj
> ...
>
> The link fails, claiming that the data_type_terminated_test.obj file
> cannot be opened:
>
> [2016/08/02 17:31:20.363] LINK : fatal error LNK1104: cannot open file 'build\cached\mongo\base\data_type_terminated_test.obj'
> [2016/08/02 17:31:20.506] scons: *** [build\cached\mongo\base\base_test.exe] Error 1104
>
> We are using a vendored copy of SCons 2.5.0. The only modification is this:
>
> https://github.com/mongodb/mongo/commit/bc7e4e6821639ee766ada83483975668af98f367#diff-cc7aec1739634ca2a857a4d4227663aa
>
> This change was made so that the atime of files in the cache stays
> accurate at fine granularity, even if the underlying filesystem is mounted
> noatime or relatime, so that we can prune the cache based on access time.
> We would like to propose upstreaming this change, but that is a separate
> email.
>
> SCons was invoked as follows from within an SSH session into cygwin (you
> can see it at the top of the build log as well):
>
> python ./buildscripts/scons.py --dbg=on --opt=on --win-version-min=ws08r2
> -j$(( $(grep -c ^processor /proc/cpuinfo) / 2 )) MONGO_DISTMOD=2008plus
> --cache --cache-dir='z:\data\scons-cache\9d73adcd-19eb-46f2-9988-b8594ba5a3d1'
> --use-new-tools all dist dist-debugsymbols distsrc-zip
>  MONGO_VERSION=3.3.10-250-g81185a5
>
> The 'python' here is Windows python, not cygwin, and PyWin32 is installed.
>
> The system on which this build ran is running Windows 2012 on a dedicated
> spot AWS c3.4xlarge instance, and the toolchain is Visual Studio 2015. The
> Z drive, where the cache directory is located, is locally connected NTFS
> via AWS ephemeral/instance storage.
>
> We have since backed out using the Cache on our Windows builds, which is
> disappointing - Windows builds take forever compared to others, and we were
> really hoping that CacheDir would be a big help here.
>
> Has anyone seen anything like this, or does anyone have ideas about what
> may be going wrong here? I know there have been some other recent threads
> about problems with Windows and build ordering, but this seems different:
> the retrieval of the file from the Cache was correctly ordered, but it
> doesn't appear to have been effective.
>
> I'm happy to provide any additional information if it will help us get
> Windows CacheDir enabled builds working.
>
> Thanks,
> Andrew