[Scons-users] Out of memory writing back the database

Bill Deegan bill at baddogconsulting.com
Fri Dec 1 16:22:26 EST 2017


Mats,

JSON may help memory, but not incrementality if it's a single file.
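
To make the incrementality point concrete, here is a minimal sketch. It is not SCons code: the file names, the `entries` table and the three helper functions are hypothetical, and real .sconsign entries hold more than one signature string. With a single JSON file, changing one entry means loading and rewriting the whole document, whereas SQLite can replace one row:

```python
import json
import os
import sqlite3
import tempfile


def update_entry_json(path, key, value):
    """Single-file JSON: load everything, change one key, rewrite everything."""
    data = {}
    if os.path.exists(path):
        with open(path) as f:
            data = json.load(f)
    data[key] = value
    with open(path, "w") as f:
        json.dump(data, f)
    # Number of entries that had to be re-serialised for this one change.
    return len(data)


def update_entry_sqlite(path, key, value):
    """SQLite: only the affected row is replaced (hypothetical schema)."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS entries (node TEXT PRIMARY KEY, sig TEXT)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO entries (node, sig) VALUES (?, ?)",
            (key, value),
        )
        conn.commit()
    finally:
        conn.close()


def read_entry_sqlite(path, key):
    """Fetch one signature without reading the rest of the database."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT sig FROM entries WHERE node = ?", (key,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

The JSON helper's cost grows with the total number of entries, which is the concern with a single file; one JSON file per directory would behave differently.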

Thoughts?

-Bill


On Fri, Dec 1, 2017 at 10:50 AM, Mats Wichmann <mats at wichmann.us> wrote:

> On the transition: JSON won't help with that. SQLite might.
>
>
> On December 1, 2017 9:19:44 AM MST, Bill Deegan <bill at baddogconsulting.com>
> wrote:
>>
>> Would you consider contributing that logic in a pull request and/or
>> pointing to a commit(s) in your repo?
>>
>> Currently in the plans is to transition the .sconsign to another format,
>> possibly json or sqlite with an eye on speed and/or incrementality.
>>
>> -Bill
>>
>> On Fri, Dec 1, 2017 at 5:21 AM, Hill, Steve (FP COM) <
>> Steve.Hill at cobham.com> wrote:
>>
>>> Hi Bill,
>>>
>>>
>>>
>>> Just to follow up on this. I’ve just monkey patched the DirFile class in
>>> SConsign.py so that it writes the sconsign into a single temporary
>>> directory with each sconsign named with an adler32 hash of the directory
>>> path.
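
The naming scheme described above could look roughly like the following. This is only a sketch of the idea, not Steve's actual patch: the `sconsign_name_for` helper, the path normalisation, the 8-hex-digit format and the `.sconsign` suffix are all assumptions.

```python
import os
import zlib


def sconsign_name_for(directory, store_dir):
    """Map a build directory to a file inside one shared store directory.

    Hypothetical helper: the file name is the Adler-32 checksum of the
    normalised directory path, formatted as 8 hex digits.
    """
    normalised = os.path.normcase(os.path.abspath(directory))
    # Mask to an unsigned 32-bit value so Python 2 and 3 agree.
    checksum = zlib.adler32(normalised.encode("utf-8")) & 0xFFFFFFFF
    return os.path.join(store_dir, "%08x.sconsign" % checksum)
```

Adler-32 is only a 32-bit checksum, so two directory paths can map to the same name; anything along these lines would need some way to detect or avoid such collisions.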
>>>
>>>
>>>
>>> After doing the 4 basic builds, the directory contains 10,745 sconsigns
>>> totalling 674MB. Using --debug=memory, the peak memory usage of the worst
>>> case build was 925 MB, compared to ~1.4GB for the same build with the
>>> monolithic sconsign (so the difference is basically the cost of pickling
>>> the sconsign into memory first).
>>>
>>>
>>>
>>> Cheers,
>>>
>>>
>>>
>>> S.
>>>
>>>
>>>
>>>
>>>
>>> Hi Bill,
>>>
>>>
>>>
>>> Our code base is not that big – about 20,000 files, but most files get
>>> built multiple times (for different CPUs) and we build it in several
>>> different ways, each with many of their own VariantDirs. When the 4 most
>>> common builds are done, the .sconsign (having been deleted first) is just
>>> under 500MB but, given a bit of history or some of the less common
>>> builds, it gets big enough to blow the memory space of Python.
>>>
>>>
>>>
>>> I’ve split the .sconsign up so that largely independent builds on the
>>> same codebase are stored in a different .sconsign (initially done due to
>>> the time taken to write the .sconsign back), though I’ve had to write some
>>> custom Deciders to make this totally reliable. All in all, the .sconsign
>>> per directory (though with a .sconsign location outside the main source
>>> tree) would seem the best solution for me (and seems like it might be
>>> better performing too as not all files will need writing back for
>>> incremental builds), unless there is another option that I’m not aware of…
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> S.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Steve,
>>>
>>>
>>>
>>> How many files are you processing? 500MB is very large for a sconsign.
>>>
>>>
>>>
>>> Could you try mv .sconsign .sconsign.save, let your build run and see
>>> how big the new file is?
>>>
>>>
>>>
>>> -Bill
>>>
>>>
>>>
>>> On Tue, Nov 28, 2017 at 7:17 AM, Hill, Steve (FP COM) <
>>> Steve.Hill at cobham.com> wrote:
>>>
>>> All,
>>>
>>> We are using 32-bit Python 2.7.12 and SCons 2.5.1 on Windows (and
>>> Linux). We have now reached the case where the database is in excess of
>>> 500MB. During the build, the memory usage bubbles along at just over 1GB
>>> but, at the end of the run we sometimes get a spike in memory usage that
>>> causes an "Out of memory" exception and the build fails.
>>>
>>> Looking at the code, it appears that the database file is produced by
>>> pickling into memory before writing to the file, hence the memory usage
>>> increases by over 0.5GB as it is being written back - and, depending on
>>> the database size, this can blow the memory space of a 32-bit process.
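
The difference between the two write patterns can be sketched with plain `pickle`. This is an illustration only and not the SCons code path; the function names and the shape of `entries` are hypothetical. Serialising to a bytes object first keeps a full serialised copy in memory next to the live data, while `pickle.dump` hands the output to the file object as it is produced:

```python
import os
import pickle
import tempfile


def write_via_memory(entries, path):
    """Serialise to one bytes object first, then write it out.

    The complete pickle exists in memory alongside `entries` until the
    write has finished.
    """
    blob = pickle.dumps(entries, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "wb") as f:
        f.write(blob)
    return len(blob)


def write_streaming(entries, path):
    """Let the pickler write to the file object as it goes."""
    with open(path, "wb") as f:
        pickle.dump(entries, f, protocol=pickle.HIGHEST_PROTOCOL)
    return os.path.getsize(path)
```

How much the streaming form actually saves depends on the pickle implementation and protocol in use, so it would need measuring on the 32-bit Python 2.7 setup in question.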
>>>
>>> Firstly, I do have a medium term plan to move us to 64-bit Python with
>>> the move to Python 3 but we still need to stick with 32-bit Python at the
>>> moment (for example, we have to load 32-bit DLLs) so I need to fix this for
>>> 32-bit Python in the short term.
>>>
>>> Secondly, I've had a look at the old (?) .sconsign per directory
>>> approach, which would seem to address this problem. I assume that this is
>>> still supported and should be as robust as the single file approach?
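
For reference, the SCons man page documents selecting the per-directory behaviour from the SConstruct by passing `None` to `SConsignFile`. A minimal fragment follows; it only shows the call, and says nothing about whether this mode is as robust as the single-file database on SCons 2.5.1:

```python
# SConstruct fragment (runs under scons, not as a standalone script).
# Passing None asks SCons to keep a separate .sconsign file in each
# directory instead of one global database file.
SConsignFile(None)
```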
>>>
>>> The only issue with this is that some of the .sconsigns appear in the
>>> directories beside the source code, which breaches the contract for our
>>> build system, where no file may be produced within the directories
>>> containing source code. Where we use VariantDirs (for the actual
>>> implementation files), there is no problem but, for header files, the
>>> .sconsign ends up in the include or interface directory. I can see how I
>>> can reasonably easily monkey patch the DirFile class in SConsign.py to
>>> override the directory in which the .sconsign resides but, before I do
>>> this, I wanted to check that there is no easier way to achieve the builds
>>> without a memory issue.
>>>
>>> Thanks,
>>>
>>> Steve.
>>>
>>>
>>> _______________________________________________
>>> Scons-users mailing list
>>> Scons-users at scons.org
>>> https://pairlist4.pair.net/mailman/listinfo/scons-users
>>>
>>