[Scons-users] Intermittent Install() failure
Hill, Steve (FP COM)
Steve.Hill at cobham.com
Wed Sep 7 06:09:52 EDT 2016
OK, I've looked at the (Python) source code for filecmp.cmp and the (C) source code for the file class and, as far as I can see, the call into the run-time to close the file will take place as soon as the with statement completes.
I've also spotted that SCons has a --debug=stacktrace option (can't believe that I didn't spot this before...) so I can now see the stacktrace for the failure:
scons: internal stack trace:
  File "C:\Python26\Scripts\..\Lib\site-packages\scons-2.3.6\SCons\Job.py", line 387, in start
    task.prepare()
  File "C:\Python26\Scripts\..\Lib\site-packages\scons-2.3.6\SCons\Script\Main.py", line 173, in prepare
    return SCons.Taskmaster.OutOfDateTask.prepare(self)
  File "C:\Python26\Scripts\..\Lib\site-packages\scons-2.3.6\SCons\Taskmaster.py", line 197, in prepare
    t.prepare()
  File "C:\Python26\Scripts\..\Lib\site-packages\scons-2.3.6\SCons\Node\FS.py", line 2899, in prepare
    self._rmv_existing()
  File "C:\Python26\Scripts\..\Lib\site-packages\scons-2.3.6\SCons\Node\FS.py", line 2882, in _rmv_existing
    raise e
Looking at FS.py, I think that this boils down to an exception being thrown during os.unlink(), used to remove the existing target file before copying the source to the target. This leads me to suspect that the issue here is that, under some circumstances (remember that we've only seen this with parallel builds), the OS can delay actually closing the file until some time after fclose() has returned.
I'm going to try monkey patching os.unlink to see whether retrying will allow me to work around this. Any further thoughts/suggestions gratefully received...
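For reference, a minimal sketch of what such a monkey patch might look like. It assumes the failure surfaces as an OSError whose errno is EACCES (what Windows usually reports when a file is still held open), which has not been confirmed here. The helper name with_retries and the attempts/delay values are hypothetical, not anything taken from SCons.

```python
import errno
import os
import time


def with_retries(func, attempts=5, delay=0.1):
    """Wrap func so that an access-denied OSError is retried a few times.

    Hypothetical helper: the attempt count and delay are guesses and
    would need tuning against the real build.
    """
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return func(*args, **kwargs)
            except OSError as e:
                # Only retry on EACCES (assumed to be the symptom of a
                # file that is still open); re-raise anything else, and
                # re-raise EACCES too once the attempts are used up.
                if e.errno != errno.EACCES or attempt == attempts - 1:
                    raise
                time.sleep(delay)
    return wrapper


# Monkey patch: replace os.unlink with the retrying version. This has to
# run before SCons first calls os.unlink, e.g. near the top of SConstruct.
os.unlink = with_retries(os.unlink)
```

If the retries make the failure disappear, that would support the theory that the handle is released late. If the error persists after every attempt, something else is holding the file.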
S.
-----Original Message-----
From: Scons-users [mailto:scons-users-bounces at scons.org] On Behalf Of Thomas Berg
Sent: 05 September 2016 13:34
To: SCons users mailing list
Subject: Re: [Scons-users] Intermittent Install() failure
On Mon, Sep 5, 2016 at 11:20 AM, Hill, Steve (FP COM) <Steve.Hill at cobham.com> wrote:
> OK, I've a bit more information. I've reinstated the filecmp.cmp in the decider but with a try/except BaseException around it. I now see the problem again but the filecmp.cmp is not throwing the exception. This leads me to conclude that there is some side-effect of calling filecmp.cmp that is causing (occasionally) an issue with the SCons code after the decider is invoked but before the copy is performed. From what I can determine, filecmp.cmp uses:
>
>     with open(f1, 'rb') as fp1, open(f2, 'rb') as fp2:
>         <do compare>
>
> so I do not think that it is leaving files open and a Google search didn't yield any reports of it doing so. Hence, I'm not really any wiser as to what the root cause of the problem is. I think that I need to do this:
In our build we've had some indication that 'with open' can't be trusted to close the file in time. At least that's our main suspicion.
So it would be very interesting if you could test this assumption (that the problem is not with open), and try the win32 implementation here and check whether it helps.
- Thomas