[Scons-users] Intermittent Install() failure

Vasily just.one.man at yandex.ru
Mon Sep 5 06:07:07 EDT 2016


Yes, that should be enough. Also I'd suggest patching "open" as well.

Thanks,
Vasily
05 сент. 2016 г. 12:20 пользователь "Hill, Steve (FP COM)" <
Steve.Hill at cobham.com> написал:

> OK, I've a bit more information. I've reinstated the filecmp.cmp in the
> decider but with a try/catch BaseException around it. I now see the problem
> again but the filecmp.cmp is not throwing the exception. This leads me to
> conclude that there is some side-effect of calling filecmp.cmp that is
> causing (occasionally) an issue with the SCons code after the decider is
> invoked but before the copy is performed. From what I can determine,
> filecmp.cmp uses:
>
>     with open(f1, 'rb') as fp1, open(f2, 'rb') as fp2:
>         <do compare>
>
> so I do not think that it is leaving files open and a Google search didn't
> yield any reports of it doing so. Hence, I'm not really any wiser as to
> what the root cause of the problem is. I think that I need to do this:
>
> >> monkey patch built-in file() and open() and log all of them with
> timestamps
>
> Am I right in thinking that I just need to patch
> sys.modules["__builtin__"].file with my wrapper to patch this globally?
>
> Cheers,
>
> S.
>
> -----Original Message-----
> From: Scons-users [mailto:scons-users-bounces at scons.org] On Behalf Of
> Vasily
> Sent: 18 August 2016 11:03
> To: SCons users mailing list
> Subject: Re: [Scons-users] Intermittent Install() failure
>
> Hi Steve,
> I can suggest you to monkey patch built-in file() and open() and log all
> of them with timestamps. That may give you a clue if this file is being
> held open by normal Python. Next step could be logging callstacks (e.g. via
> traceback.format_tb() if I'm not mistaken) for offending open() or file()
> call, this should give you an idea of what part of SCons code is doing the
> bad thing...
> Thanks,
> Vasily
> 18 авг. 2016 г. 10:31 пользователь "Hill, Steve (FP COM)" <
> Steve.Hill at cobham.com> написал:
> OK, I've dug a bit further and have some more information on this. I've
> monkey patched SCons.Tool.install.copyFunc with a wrapper to peform retries
> and mapped SCons.Tool.install.shutil.copy2 onto win32api.CopyFile and I
> can see that copyFunc is *never* called for the failing Install() actions
> and, when it is called, it always seems to succeed at the first try.
>
> This leads me to conclude that it is the dependency checking - or some
> part of it like the MD5 calculation - that is where the problem is
> occuring. Investigations continue; suggests/observations gratefully
> received...
>
> S.
>
> -----Original Message-----
> From: Scons-users [mailto:scons-users-bounces at scons.org] On Behalf Of
> Hill, Steve (FP COM)
> Sent: 22 July 2016 08:42
> To: SCons users mailing list
> Subject: Re: [Scons-users] Intermittent Install() failure
>
> * PGP Bad Signature, Signed: 07/22/2016 at 03:41:30 AM
>
> Hi Andrew,
>
> This could certainly be related, although in our case it is usually a leaf
> file (i.e. a file that has no following actions) that exhibits the problem.
> However, like you I am only aware of having seen this with parallel builds
> - though that is the default so it may just be that we only see it doing
> what we do most often.
>
> I'm still wondering whether this is some sort of interaction with an
> external process - like a virus checker, Windows search indexing, etc. -
> since the Install() does a shutil.copy2 to copy the file followed by a
> os.chmod on the same file. When Googling, I noticed that Mozilla have a
> function to wrap various file operations for Windows that retries a number
> of times with a sleep in between (https://hg.mozilla.org/
> mozilla-central/file/e7e69cc8c07b/testing/mozbase/
> mozfile/mozfile/mozfile.py#l157). I might try patching env['INSTALL'] to
> do something similar and see if that works around the problem...
>
> Cheers,
>
> S.
>
> -----Original Message-----
>
> We have also had an issue, so far not entirely understood, in the MongoDB
> Windows CI build where we get a similar error during a highly parallel
> build. The situation there is:
>
> - an env.Program call generates the file <build>/mongo/mongod.exe
> - A hand rolled MSI generator is invoked, which accesses
> <build>/mongo/mongod.exe from the program 'light.exe'
> - An env.Install call copies <build>/mongo/mongod.exe to
> <install>/mongod.exe
>
> What we find is that sometimes (rarely), the call to light.exe will fail
> with an error:
>
> [2016/06/13 17:58:16.347] light.exe : error LGHT0001 : The process cannot
> access the file 'C:\data\mci\e0c4bbc6610beafed74ddcea6eb880
> bb\src\build\win32\mongo\mongod.exe'
> because it is being used by another process.
>
> Our belief is that this error occurs when the light.exe execution runs
> concurrently with the env.Install task, so that both are trying to
> concurrently read from <build>/mongo/mongod.exe. I would normally expect
> that to be OK, as long as both light.exe and the Install rule were opening
> the file only for read, but that is based in UNIX experience.
>
> I can see how this situation could possibly lead to an error if light.exe
> were opening <build>/mongo/mongod.exe for write, or with some sort of
> exclusive lock, and that is one avenue we plan to explore (I hope it isn't
> doing that though - why would it need to?). I guess the other hypothesis is
> that the env.Install action is somehow opening the file for write or with
> some sort of file locking? Is that a possibility?
>
> Thanks,
> Andrew
>
>
> On Wed, Jul 20, 2016 at 10:19 AM, Hill, Steve (FP COM) <
> Steve.Hill at cobham.com> wrote:
> > No, just using env.Install(); it is that operation that fails – but
> > will usually succeed if retried…
> >
> >
> >
> >
> >
> >
> >
> > Copying via what means? shutils?
> >
> >
> >
> > On Wed, Jul 20, 2016 at 9:18 AM, Hill, Steve (FP COM)
> > <Steve.Hill at cobham.com> wrote:
> >
> > We do have builders like this, though I hope that they all use the
> > ‘with file(“file”, “w”) as f:…’ construct. It is possible that one has
> > slipped in without that – but not on the files that are causing this
> > problem as some of them are literally copying a file under source
> > control to another location…
> >
> >
> >
> > S.
> >
> >
> >
> >
> >
> >
> >
> > William,
> >
> > Indeed..
> >
> > -Bill
> >
> >
> >
> > On Tue, Jul 19, 2016 at 1:48 PM, William Blevins
> > <wblevins001 at gmail.com>
> > wrote:
> >
> > Bill,
> >
> > I assume you think this may be a problem with not explicitly closing
> > the file? Relying on python garbage collection in this case would
> > certainly be dangerous.
> >
> > V/R,
> >
> > William
> >
> >
> >
> > On Tue, Jul 19, 2016 at 6:22 PM, Bill Deegan
> > <bill at baddogconsulting.com>
> > wrote:
> >
> > Steve,
> >
> > Do you have any python logic creating the files in the SCons process?
> > (a builder where the builder actuall does open('file','w').. to create
> > the file?
> >
> > -Bill
> >
> >
> >
> > On Tue, Jul 19, 2016 at 7:55 AM, William Blevins
> > <wblevins001 at gmail.com>
> > wrote:
> >
> > Steve,
> >
> > As a final thought, I have used SCons to build HT 24-core server
> > blades (so
> > 48 w/ HT) that RAM boot (no harddrive at RT). If it was specific to
> > the core package, I imagine I would have seen this issue tons.
> >
> > V/R,
> >
> > William
> >
> >
> >
> > On Tue, Jul 19, 2016 at 12:52 PM, William Blevins
> > <wblevins001 at gmail.com>
> > wrote:
> >
> > Vasily,
> >
> > I think you are referring to Precious.
> >
> > V/R,
> >
> > William
> >
> >
> >
> > On Tue, Jul 19, 2016 at 9:51 AM, Vasily <just.one.man at yandex.ru> wrote:
> >
> > Hi Steve,
> >
> > What does your build do with this file except installing it? Is it
> > used by a compiler or some other tool?
> >
> > Also, if I'm not mistaken, default SCons behavior is to remove the
> > target before performing any action to regenerate it, so this might be
> > the source of the exception you're seeing. There seems to be a way to
> > turn off the removal part of the action, so you may want to check the
> > manual and see if it helps you.
> >
> > P.S. Based on file path I assume you're working on Windows, is this
> > path a network one?
> >
> > Thanks,
> > Vasily
> >
> > 19 июля 2016 г. 11:02 пользователь "Hill, Steve (FP COM)"
> > <Steve.Hill at cobham.com> написал:
> >
> >
> >
> > Thanks William.
> >
> >
> >
> > I’ve checked the dependency tree and, as far as I can tell, it looks OK.
> >
> > We do have custom scanners – using env.Scanner(_scan_domain_header) –
> > but not for the files that are affected.
> >
> > As far as I can see, the fix for that Python bug never made it into
> > 2.6.x but it appears always to result in a “No child processes”
> > OSError so I don’t believe that is what I am seeing.
> >
> >
> >
> > Note that I have (temporarily) changed the decider below to simply
> > return True (without doing anything with the file) and I still see the
> > issue so the decider doesn’t seem to be relevant to the issue.
> >
> >
> >
> > Does anyone have any more ideas? This is becoming a major issue for
> > our automated builds…
> >
> >
> >
> > S.
> >
> >
> >
> >
> >
> > Steve,
> >
> > I of course should have asked the obvious question, do you know if you
> > dependency tree has missing dependencies? This tends to be a common
> issue.
> >
> > V/R,
> >
> > William
> >
> >
> >
> > On Fri, Jul 15, 2016 at 5:14 PM, William Blevins
> > <wblevins001 at gmail.com>
> > wrote:
> >
> > Steve,
> >
> > I'm not aware of any specific issue with install, but there are some
> > possible issues that I am aware:
> >
> > If you have custom scanners, make sure they implement from
> > SCons.Scanner.Current and not SCons.Scanner.Base; otherwise, you might
> > have concurrent file access between implicit scanning operations and
> > other processes.
> > There was a big subprocess bug in python 2.6 that carried through
> > several other major versions: https://bugs.python.org/issue1731717. I
> > would check to see that your version of 2.6 contains the patch for this
> issue.
> >
> > V/R,
> >
> > William
> >
> >
> >
> > On Fri, Jul 15, 2016 at 9:30 AM, Hill, Steve (FP COM)
> > <Steve.Hill at cobham.com> wrote:
> >
> > I’m having a problem where, with parallel builds (most people use 8,
> > 12 or
> > 16 threads), we occasionally get failures like the following:
> >
> >
> >
> > F:\<directory path>\hw_cfgs\1Server_1TM_6C66_2U.cfg: The process
> > cannot access the file because it is being used by another process
> > scons: building terminated because of errors.
> >
> >
> >
> > We are running Python 2.6.5 (with pywin32) and SCons 2.3.6. This file
> > is being copied due to a Install() call. Note that, for various
> > historical reasons, we have the following decider for these Installs:
> >
> >
> >
> > def _copy_decider(dependency, target, prev_ni):
> >
> >     target = str(target.abspath)
> >
> >     dependency = str(dependency.abspath)
> >
> >     if os.path.isfile(target) and os.path.isfile(dependency):
> >
> >         # By default, filecmp.cmp assumes that files with identical
> > os.stat signatures
> >
> >         # (which includes the inode) are the same file and, hence,
> > must be the same.
> >
> >         # However, on Windows, there is no inode - it appears to be
> > set to zero - so
> >
> >         # any two files with the same size and
> > access/creation/modification times
> >
> >         # will have the same os.stat signature, leading to a false
> positive.
> > For this
> >
> >         # reason, we must force it to do an actual file comparison by
> > setting shallow
> >
> >         # to False
> >
> >         return not filecmp.cmp(target, dependency, shallow = False)
> >
> >     else:
> >
> >         # Either one of the dependency or target isn't a file or one
> > of the files
> >
> >         # (presumably the target) isn't there so do the copy
> >
> >         return True
> >
> >
> >
> > Note also that our IT department claims that virus checkers are
> > disabled within the directory where the build is being performed (and
> > we certainly have not seen any indication in the virus checker console
> > to suggest that to be incorrect).
> >
> >
> >
> > Does anyone have any thoughts as to what the problem might be?
> >
> >
> >
> > Thanks in advance,
> >
> >
> >
> > S.
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> >
> >
> >
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >
> _______________________________________________
> Scons-users mailing list
> Scons-users at scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
> * Hill, Steve <Steve.Hill at cobham.com>
> * 0xFA34C3FC
>
> _______________________________________________
> Scons-users mailing list
> Scons-users at scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
> _______________________________________________
> Scons-users mailing list
> Scons-users at scons.org
> https://pairlist4.pair.net/mailman/listinfo/scons-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist4.pair.net/pipermail/scons-users/attachments/20160905/6993d0c9/attachment-0001.html>


More information about the Scons-users mailing list