[Scons-users] CacheDir race during parallel Windows builds?
Jason Kenny
dragon512 at live.com
Mon Aug 8 09:55:57 EDT 2016
I am curious about what you find; please let us know what you discover.
I am thinking more and more that the linker issue is the Windows linker trying to lock the file in a way that prevents any file handles with write permission from being open on it. That is just my gut feeling.
Jason
From: Scons-users [mailto:scons-users-bounces at scons.org] On Behalf Of Andrew C. Morrow
Sent: Monday, August 8, 2016 7:31 AM
To: SCons users mailing list <scons-users at scons.org>
Subject: Re: [Scons-users] CacheDir race during parallel Windows builds?
On Sun, Aug 7, 2016 at 6:35 PM, Jason Kenny <dragon512 at live.com> wrote:
Hi,
So let me go over what we know:
1) no cache and serial build -> worked
2) no cache and -j build -> worked
3) cache and serial build -> worked
4) cache and -j build -> fails constantly
Correct, with two caveats:
1) I've never actually attempted case 1, on any platform. I can, if you think it would provide any value, but I'm nearly certain that it works every time.
2) These are the results on Windows; we have so far never observed the case 4 errors on Linux, OS X, or Solaris.
From this, the trigger would seem to be having the cache on together with a parallel build. My guess is that a thread was doing something with the file while the main thread was doing something else, and that caused this to happen.
Then I did a simple test.
Basically, I manually opened an object file I had just built in a separate Python interactive shell. I opened it only as “r” and left it open.
I could link the program in a different shell.
If I opened the file with “a” or “r+” (anything with implied write), the program would not link, failing with “LINK : fatal error LNK1104: cannot open file 'hello.obj'”.
I am guessing that the linker has some “exclusive” read mode set that fails if the object file is open in a write mode. If I try this on Linux, it looks like it works fine even if Python has a write handle open. Also, if I do this with different processes on Windows, it seems to be fine as well. I think the linker is locking the file while it does some work, to prevent it from changing while it is busy producing the PE format of the final output.
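For anyone who wants to repeat the experiment without juggling two shells, here is a rough sketch. It is not the exact procedure described above: it assumes the handle is held by the parent Python process while the command runs as a child, and the helper name run_while_holding is made up for illustration. The link command line in the comment is also only an example.

```python
import subprocess


def run_while_holding(path, mode, cmd):
    """Run cmd while this process keeps path open in the given mode.

    Returns the exit code of cmd. The file handle is only closed after
    the child process has finished.
    """
    with open(path, mode):
        # The handle stays open for the whole lifetime of the child.
        result = subprocess.run(cmd)
    return result.returncode


# Hypothetical usage on Windows, from a directory containing hello.obj:
#   run_while_holding("hello.obj", "r",  ["link", "/nologo", "hello.obj"])
#   run_while_holding("hello.obj", "r+", ["link", "/nologo", "hello.obj"])
# If the theory above holds, the second call would be the one to fail
# with LNK1104; that outcome is a guess, not something this sketch proves.
```

Comparing the exit codes for “r” against “a” or “r+” on the same object file should show whether an open write-mode handle alone is enough to trigger the failure.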
Based on this, I would suggest we have a race in SCons with CacheDir set, in which Python has a write-mode handle open on the object file that has not been closed yet. I did this test on Windows 10 with VS 2015 (I tested Linux using the bash shell feature on Windows 10 and double-checked on Ubuntu in a VM). I would assume the race involves the actions running a link command while the main thread is doing something with that file. Or there is something else touching that file.
I don’t know enough about the pathways with CacheDir at the moment to say what would be going on.
Nor do I. I'm going to enlist the help of one of our local Windows experts to see if he can help with tooling that will show us exactly what the conflict is. I'll report back any findings.
I don’t think Parts File tweaks would help much with solving this problem at the moment. Given that 4) is the only time this happens, this seems to be a SCons issue.
I agree that it appears to be, but until we have a root cause it is of course not possible to be sure.
Thanks,
Andrew