[Scons-users] Possible 2.5.1 regression on Windows? (was: Unreliable build problem)

Hill, Steve (FP COM) Steve.Hill at cobham.com
Thu May 4 12:32:01 EDT 2017


Hi Bill,

I've spent a bit of time on-and-off looking at this today and can give a bit more information:

1) It is probably not accurate to say that this problem did not exist in 2.3.6. I have wrapped open/file/Popen to retry because we do see it very occasionally there (I put that down to some environmental issue, although we had ruled out AV and Windows search indexing). However, 2.5.1 exhibits it far, far more frequently.

2) I added the following to investigate whether race hazards around the setting of the O_NOINHERIT flag were involved:

    import __builtin__
    import os
    import subprocess
    import sys
    import threading

    # One re-entrant lock serialises every open/close/spawn (Python 2 only)
    fileLock = threading.RLock()

    class WithLock(object):
        """Callable wrapper that logs a call and runs it while holding fileLock."""
        def __init__(self, func):
            object.__init__(self)
            self._func = func

        def __call__(self, *args, **kwds):
            with fileLock:
                print self._func.__name__, args[0]
                return self._func(*args, **kwds)

    # Keep a reference to whatever file() currently is before replacing it
    oldFile = file
    class InterlockedFile(file):
        """file subclass whose open and close are logged under fileLock."""
        def __init__(self, *args, **kwds):
            with fileLock:
                oldFile.__init__(self, *args, **kwds)
                print "file(%s)" % self.name
        def close(self):
            with fileLock:
                print "file.close", self.name
                oldFile.close(self)
        @staticmethod
        def alt_open(*args, **kwds):
            return InterlockedFile(*args, **kwds)

    # Patch the built-ins and the open/spawn entry points process-wide
    sys.modules["__builtin__"].open = InterlockedFile.alt_open
    sys.modules["__builtin__"].file = InterlockedFile
    sys.modules["os"].fdopen = WithLock(os.fdopen)
    sys.modules["os"].open = WithLock(os.open)
    sys.modules["subprocess"].Popen = WithLock(subprocess.Popen)
    sys.modules["os"].spawnve = WithLock(os.spawnve)

From this I have discovered two things:
   a) The problem still occurs
   b) The most common failing sequence seems to be:
      i) One .exe/.pdb is ready to be built and another has just been built and is being considered for Installing
      ii) The source and target .exe or .pdb that is being considered for installing are opened for comparison. For various reasons, this uses a custom decider that uses filecmp.cmp(), which in turn uses open() to open the files
      iii) While the files are still open, the link to produce the unrelated .exe/.pdb is spawned (using subprocess.Popen)
      iv) The files opened by filecmp.cmp() are closed, the decider says to install the file but _rmv_existing() fails saying that the (destination) file is in use by another process

I don't know why this handle would be inherited, since _scons_file (which has been patched into file()) sets the O_NOINHERIT flag.
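
For anyone following along, the idea behind a non-inheritable open can be sketched as below. This is only an illustration, not the actual _scons_file code: the function name open_for_read_noinherit is made up, and it only handles binary reads. os.O_NOINHERIT and os.O_BINARY exist only on Windows, so the sketch falls back to 0 elsewhere.

```python
import os

# Windows-only flags; default to 0 on other platforms so the sketch still runs.
_NOINHERIT = getattr(os, "O_NOINHERIT", 0)
_BINARY = getattr(os, "O_BINARY", 0)

def open_for_read_noinherit(path):
    """Open path for binary reading without letting child processes inherit it.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDONLY | _BINARY | _NOINHERIT)
    try:
        # Wrap the raw descriptor in a normal file object.
        return os.fdopen(fd, "rb")
    except Exception:
        # Do not leak the descriptor if wrapping fails.
        os.close(fd)
        raise
```

Any open path that does not go through a wrapper like this (or that creates the handle before the flag is applied) would still be inheritable by a process spawned in the meantime.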

Anyway, I tried sys.setcheckinterval(1000) and the build still failed. I'm happy to try anything else that might shed light on the situation...
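
For reference, a retry wrapper of the kind mentioned in (1) might look roughly like the sketch below. The name retry_on_error and the default attempt count and delay are hypothetical, chosen for illustration rather than taken from our build system.

```python
import functools
import time

def retry_on_error(func, attempts=5, delay=0.1, errors=(IOError, OSError)):
    """Return a wrapper that retries func when it raises one of errors.

    Hypothetical helper; the defaults are illustrative assumptions.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        for attempt in range(attempts):
            try:
                return func(*args, **kwds)
            except errors:
                # Re-raise once the final attempt has failed.
                if attempt == attempts - 1:
                    raise
                # Give the other process a moment to release the file.
                time.sleep(delay)
    return wrapper
```

It would be applied along the lines of os.remove = retry_on_error(os.remove). Note that this only masks the symptom; it does not explain why the handle is held in the first place.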

S.

P.S. I am not using CacheDirs

-- 
From: Scons-users [mailto:scons-users-bounces at scons.org] On Behalf Of Bill Deegan
Sent: 04 May 2017 16:15
To: SCons users mailing list
Subject: Re: [Scons-users] Possible 2.5.1 regression on Windows? (was: Unreliable build problem)

Steve,
Can you try something, since you've got a pretty reproducible configuration?
Add this to your top-level SConstruct:

sys.setcheckinterval(1000)
(This changes the thread swap checking interval, default on py2.7 is 100)
Thanks,
Bill


On Thu, May 4, 2017 at 12:02 AM, William Blevins <wblevins001 at gmail.com> wrote:
Fair enough. I was just remembering the hot topic of race conditions
on Windows and trying to dredge up some recent information ;)

On Wed, May 3, 2017 at 9:53 PM, Bill Deegan <bill at baddogconsulting.com> wrote:
> Nope.. actually it's all 3.
>
> The issue (at its root) is likely either that threads are sharing the handle
> and it's not closed in a timely fashion on Windows, or that plain file closes
> aren't completing before the call returns on win32.
> Of note, looking at the C code, the GIL is released around the fclose()..
>
> -Bill
>
> On Wed, May 3, 2017 at 9:09 PM, William Blevins <wblevins001 at gmail.com>
> wrote:
>>
>> Bill,
>>
>> I think you mean:
>> http://scons.tigris.org/issues/show_bug.cgi?id=2449
>>
>> https://bitbucket.org/scons/scons/pull-requests/389/fix-race-condition-on-win32/diff
>>
>> V/R,
>> William
>>
>> On Wed, May 3, 2017 at 6:32 PM, Bill Deegan <bill at baddogconsulting.com>
>> wrote:
>> > Likely it's the same issue as this:
>> > http://scons.tigris.org/issues/show_bug.cgi?id=2124
>> >
>> >
>> >
>> > On Wed, May 3, 2017 at 3:14 PM, Arvid Rosén <arvid at softube.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I have problems with that too. Typically running 16 threads. I had to
>> >> rewrite all actions that generated source files (like header files with
>> >> version information), because they never worked reliably on Windows.
>> >> Mac has always been fine. I ended up using static header files with
>> >> defines passed on the command line instead. Somewhat ugly but it works.
>> >>
>> >> I always thought this was related to some Windows scanning or
>> >> anti-virus service, but I might give it a try with 2.3.6 and see if
>> >> that solves it.
>> >>
>> >> Cheers,
>> >> Arvid
>> >>
>> >> _____________________________
>> >> From: Bill Deegan <bill at baddogconsulting.com>
>> >> Sent: Wednesday, May 3, 2017 8:25 PM
>> >> Subject: Re: [Scons-users] Possible 2.5.1 regression on Windows? (was:
>> >> Unreliable build problem)
>> >> To: SCons users mailing list <scons-users at scons.org>
>> >>
>> >>
>> >>
>> >> Steve,
>> >>
>> >> That's useful to know 2.3.6 isn't showing this.
>> >> We've had a few reports of others running into it.
>> >>
>> >> Are you using CacheDirs?
>> >>
>> >> -Bill
>> >>
>> >> On Wed, May 3, 2017 at 10:42 AM, Hill, Steve (FP COM)
>> >> <Steve.Hill at cobham.com> wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> While looking at the unreliable build problem (which I will return
>> >>> to), I upgraded to 2.5.1 (from 2.3.6). Unfortunately, this seems to
>> >>> have introduced an issue which is preventing me from rolling the new
>> >>> version out to the development community: I am seeing the file handle
>> >>> inheritance problem again that _scons_file and _scons_open were added
>> >>> to work around.
>> >>>
>> >>> We typically use between 8 and 12 build threads (depending on the
>> >>> physical machine that we are building on) and, when building with
>> >>> 2.5.1, we quite frequently see the build fail with some sort of
>> >>> access denied issue. The build system automatically runs handle.exe
>> >>> in this case and I can see that the other (unrelated) build threads
>> >>> have an open handle on the file at this time. Reverting to 2.3.6
>> >>> results in the issue going away.
>> >>>
>> >>> I've confirmed that _scons_open and _scons_file are both in place for
>> >>> the built-in file() and open() and I've monkey patched os.open to
>> >>> assert that it is always called with the os.O_NOINHERIT flag. Does
>> >>> anyone know what other functions could be causing this that I can
>> >>> check?
>> >>>
>> >>> Thanks,
>> >>>
>> >>> S.
>> >>>
>> >>> _______________________________________________
>> >>> Scons-users mailing list
>> >>> Scons-users at scons.org
>> >>> https://pairlist4.pair.net/mailman/listinfo/scons-users
>> >>>
