[Scons-users] CacheDir race during parallel Windows builds?

Andrew C. Morrow andrew.c.morrow at gmail.com
Mon Aug 8 08:30:33 EDT 2016


On Sun, Aug 7, 2016 at 6:35 PM, Jason Kenny <dragon512 at live.com> wrote:

>
>
> Hi,
>
>
>
> So let me go over what we know:
>
> 1) no cache and serial build -> worked
>
> 2) no cache and -j build -> worked
>
> 3) cache and serial build -> worked
>
> 4) cache and -j build -> fails consistently
>

Correct, with two caveats:

1) I've never actually attempted case 1, on any platform. I can, if you
think it would provide any value, but I'm nearly certain that it works
every time.
2) These are the results on Windows; we have so far never observed the case
4 errors on Linux, OS X, or Solaris.
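
For anyone who wants to re-run the whole matrix, here is a rough sketch of
the four invocations. It assumes the SConstruct calls CacheDir()
unconditionally and that the cache is switched off with --cache-disable; our
own build may wire the cache up differently, and the job count is arbitrary.

```python
# Sketch only: assumes the SConstruct always calls CacheDir(), so the cache
# is switched off from the command line with SCons's --cache-disable flag.
JOBS = 8  # arbitrary parallelism for the -j cases (an assumption)


def build_matrix(jobs=JOBS):
    """Return (case number, description, argv) for the four combinations."""
    cases = []
    number = 1
    for use_cache in (False, True):
        for parallel in (False, True):
            argv = ["scons"]
            if not use_cache:
                argv.append("--cache-disable")
            if parallel:
                argv.extend(["-j", str(jobs)])
            label = "%s and %s" % (
                "cache" if use_cache else "no cache",
                "-j build" if parallel else "serial build",
            )
            cases.append((number, label, argv))
            number += 1
    return cases


if __name__ == "__main__":
    # Print the command line for each case; nothing is executed here.
    for number, label, argv in build_matrix():
        print("%d) %-28s %s" % (number, label, " ".join(argv)))
```

The numbering follows the list quoted above, so case 4 is the cache plus -j
combination that fails on Windows.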



>
> From this it would seem that the failure requires both the cache being on
> and a parallel build. My guess is that a thread was doing something with
> the file while the main thread was doing something else with it.
>
> Then I did a simple test.
>
>
>
> Basically, I manually opened an object file I had just built in a different
> Python interactive shell. I opened it only as “r” and left it open.
>
> I could link the program in a different shell.
>
> If I opened the file with “a” or “r+” (anything with implied write), the
> program would not link, failing with “LINK : fatal error LNK1104: cannot
> open file 'hello.obj'”.
>
>
>
> I am guessing that the linker has some “exclusive” read mode set that
> fails if the object file is open in a write mode. If I try this on Linux,
> it looks like it works fine even if Python has a write handle open. Also,
> if I do this with different processes on Windows, it seems to be fine as
> well. I think the linker is locking the file while it does some work, to
> prevent it from changing while it is busy producing the PE format of the
> final output.
>
>
>
> Based on this, I would suggest we have a race in SCons with CacheDir set,
> in which Python has a write-mode handle open on the object file that has
> not been closed yet. I did this test on Windows 10 with VS 2015 (I tested
> Linux using the bash shell feature on Windows 10 and double-checked on
> Ubuntu in a VM). I would assume the race involves the actions running a
> link command while the main thread is doing something with that file. Or
> there is something else touching that file.
>
>
>
> I don’t know enough about the CacheDir code paths at the moment to say
> what is going on.
>

Nor do I. I'm going to enlist the help of one of our local Windows experts
to see if he can help with tooling that will show us exactly what the
conflict is. I'll report back any findings.
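
In the meantime, the manual experiment described above could be scripted
along these lines. This is only a sketch: the helper names are made up, the
command to run is whatever link line you care about, and unlike the original
test the command is launched as a child of the process holding the handle,
which may or may not behave the same way.

```python
import subprocess
import sys


def run_while_open(path, mode, command):
    """Hold `path` open in `mode`, run `command`, and return its exit code."""
    with open(path, mode):
        # The handle stays open for the whole lifetime of the child process.
        return subprocess.call(command)


def try_modes(path, command, modes=("r", "r+", "a")):
    """Map each open mode to the exit code of `command` run under it."""
    return {mode: run_while_open(path, mode, command) for mode in modes}


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python hold_open.py hello.obj link hello.obj
    obj, cmd = sys.argv[1], sys.argv[2:]
    for mode, code in try_modes(obj, cmd).items():
        print("open(%r, %r) -> exit code %d" % (obj, mode, code))
```

If the theory holds, the link command should only fail on Windows for the
modes that imply write access.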



>
>
> I don’t think Parts File tweaks would help much with solving this problem
> at the moment. Given that 4) is the only case where this happens, this
> *seems* to be a SCons issue.
>

I agree that it appears to be, but until we have a root cause it is of
course not possible to be sure.

Thanks,
Andrew