[Scons-users] CacheDir race during parallel Windows builds?

Andrew C. Morrow andrew.c.morrow at gmail.com
Sat Aug 6 18:50:53 EDT 2016


Hi William -

Thanks for the suggestion. I picked the relevant changes from that pull
request into the MongoDB vendored copy of SCons:

$ git diff
diff --git a/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Node/__init__.py b/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Node/__init__.py
index 3ce481b..0b980a8 100644
--- a/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Node/__init__.py
+++ b/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Node/__init__.py
@@ -210,7 +210,8 @@ def get_contents_file(node):
         return ''
     fname = node.rfile().get_abspath()
     try:
-        contents = open(fname, "rb").read()
+        with open(fname, "rb") as fp:
+            contents = fp.read()
     except EnvironmentError, e:
         if not e.filename:
             e.filename = fname
diff --git a/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Scanner/C.py b/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Scanner/C.py
index 4c61187..57c8b99 100644
--- a/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Scanner/C.py
+++ b/src/third_party/scons-2.5.0/scons-local-2.5.0/SCons/Scanner/C.py
@@ -58,12 +58,11 @@ class SConsCPPScanner(SCons.cpp.PreProcessor):
         return result
     def read_file(self, file):
         try:
-            fp = open(str(file.rfile()))
+            with open(str(file.rfile())) as fp:
+                return fp.read()
         except EnvironmentError, e:
             self.missing.append((file, self.current_file))
             return ''
-        else:
-            return fp.read()

 def dictify_CPPDEFINES(env):
     cppdefines = env.get('CPPDEFINES', {})

However, the same error occurs after this change, so I think there is
something else going on here. Whether that is a bug in our SCons setup,
SCons itself, Windows and/or NTFS, or somehow AWS, I have no idea. I can
consistently reproduce this issue on an AWS instance I have set up, and we
would really like CacheDir to work. Is there some additional information I
can provide to help debug this?
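
One diagnostic that might narrow this down is sketched below. It assumes the failure is a transient inability to open the retrieved file, and it records how long the file stays unopenable. The helper name `probe_open` and its `attempts` and `delay` parameters are hypothetical. None of this is part of SCons, and it has not been run against the failing build.

```python
import os
import tempfile
import time


def probe_open(path, attempts=20, delay=0.5):
    """Try to open `path` for reading, retrying on failure.

    Returns a list of (elapsed_seconds, errno_or_None) tuples, one per
    attempt, stopping at the first successful open.
    """
    results = []
    start = time.time()
    for _ in range(attempts):
        try:
            # Use "with" so the probe itself never leaves a handle open.
            with open(path, "rb") as fp:
                fp.read(1)
            results.append((time.time() - start, None))
            break
        except EnvironmentError as e:
            # Record the error code and wait before the next attempt.
            results.append((time.time() - start, e.errno))
            time.sleep(delay)
    return results


if __name__ == "__main__":
    # Demo on a throwaway file; in a real investigation `path` would be
    # the object file that the linker reported it could not open.
    fd, path = tempfile.mkstemp(suffix=".obj")
    os.close(fd)
    try:
        for elapsed, err in probe_open(path, attempts=3, delay=0.1):
            print("%.3fs errno=%r" % (elapsed, err))
    finally:
        os.remove(path)
```

If the open succeeds only after several attempts, that would suggest something is briefly holding the file after retrieval. If it fails every time, the file is more likely missing or inaccessible.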

Thanks,
Andrew


On Thu, Aug 4, 2016 at 4:12 PM, William Blevins <wblevins001 at gmail.com>
wrote:

> Andrew,
>
> I haven't gone through the links in detail, but something that *might* be
> related:
> https://bitbucket.org/scons/scons/pull-requests/347/avoid-using-__slots__-on-node-and-executor/diff
>
> The above link is to a recent patch that caught several cases of files
> being opened without using the "with <file> as <name>" construct to
> explicitly close files after use in SCons/Node/__init__.py and
> SCons/Scanner/C.py. Leaving files open this way might cause problems with
> timely file handle cleanups (especially on Windows, which tends to do some
> odd file buffering IMHO). You may want to clone the latest and see if that
> makes a difference. Ideally, the latest is functional with all the 2->3 code
> changes. Or consider just doing a direct monkey patch for brevity's sake.
>
> Hope that helps,
> William
>
> On Thu, Aug 4, 2016 at 3:20 PM, Andrew C. Morrow <
> andrew.c.morrow at gmail.com> wrote:
>
>>
>> Hi -
>>
>> At MongoDB, we recently started using CacheDir in our CI system. This has
>> been a big success for reducing rebuild times for our Linux builds,
>> however, we were surprised to find that our Windows builds started failing
>> in a very alarming way:
>>
>> Please see the following log file:
>> https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_windows_64_2k8_debug_compile_81185a50aeed5b2beed2c0a81b381a482489fdb7_16_08_02_20_24_46/0?type=T
>>
>> The log lines of interest are:
>>
>> [2016/08/02 17:31:09.642] Retrieved `build\cached\mongo\base\data_type_terminated_test.obj' from cache
>>
>> Here, we see that we retrieved this .obj file from the cache. Nine
>> seconds later, we try to use that object in a link step:
>>
>> [2016/08/02 17:31:18.921] link /nologo /DEBUG /INCREMENTAL:NO
>> /LARGEADDRESSAWARE /OPT:REF /OUT:build\cached\mongo\base\base_test.exe
>> build\cached\mongo\base\data_range.obj ...   build\cached\mongo\base\data_type_terminated_test.obj
>> ...
>>
>> The link fails, claiming that the data_type_terminated_test.obj file
>> cannot be opened:
>>
>> [2016/08/02 17:31:20.363] LINK : fatal error LNK1104: cannot open file 'build\cached\mongo\base\data_type_terminated_test.obj'
>> [2016/08/02 17:31:20.506] scons: *** [build\cached\mongo\base\base_test.exe] Error 1104
>>
>> We are using a vendored copy of SCons 2.5.0. The only modification is
>> this:
>>
>> https://github.com/mongodb/mongo/commit/bc7e4e6821639ee766ada83483975668af98f367#diff-cc7aec1739634ca2a857a4d4227663aa
>>
>> This change was made so that the atime of files in the cache is
>> fine-grained accurate, even if the underlying filesystem is mounted noatime
>> or relatime, so that we can prune the cache based on access time. We would
>> like to propose this change to be upstreamed, but that is a separate email.
>>
>> SCons was invoked as follows from within an SSH session into cygwin (you
>> can see it at the top of the build log as well):
>>
>> python ./buildscripts/scons.py --dbg=on --opt=on --win-version-min=ws08r2
>> -j$(( $(grep -c ^processor /proc/cpuinfo) / 2 )) MONGO_DISTMOD=2008plus
>> --cache --cache-dir='z:\data\scons-cache\9d73adcd-19eb-46f2-9988-b8594ba5a3d1'
>> --use-new-tools all dist dist-debugsymbols distsrc-zip
>>  MONGO_VERSION=3.3.10-250-g81185a5
>>
>> The 'python' here is Windows python, not cygwin, and PyWin32 is installed.
>>
>> The system on which this build ran is running Windows 2012 on a dedicated
>> spot AWS c3.4xlarge instance, and the toolchain is Visual Studio 2015.
>> The Z drive, where the cache directory is located, is locally connected
>> NTFS via AWS ephemeral/instance storage.
>>
>> We have since backed out using the Cache on our Windows builds, which is
>> disappointing - Windows builds take forever compared to others, and we were
>> really hoping that CacheDir would be a big help here.
>>
>> Has anyone seen anything like this, or has some ideas what may be going
>> wrong here? I know there have been some other recent threads about problems
>> with Windows and build ordering, but this seems different - the retrieval
>> of the file from the Cache was correctly ordered, but it doesn't appear to
>> have been effective.
>>
>> I'm happy to provide any additional information if it will help us get
>> Windows CacheDir enabled builds working.
>>
>> Thanks,
>> Andrew
>>
>>
>>
>> _______________________________________________
>> Scons-users mailing list
>> Scons-users at scons.org
>> https://pairlist4.pair.net/mailman/listinfo/scons-users