[Scons-users] Thin archive Tool

Bill Deegan bill at baddogconsulting.com
Wed Jan 4 11:57:35 EST 2017


Off the top of my head:
get_contents is used by scanners, so overriding it could affect them.
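
To make the concern concrete, here is a minimal sketch that uses stand-in classes rather than real SCons nodes. PlainFile, FakeContentsArchive, CsigOnlyArchive and scan_members are hypothetical names invented for this illustration; only the method names get_contents and get_csig come from the thread. It contrasts the two approaches being asked about: overriding get_contents, or overriding only get_csig.

```python
import hashlib
import re


class PlainFile:
    """Stand-in for a file node (hypothetical, not SCons.Node.FS.File)."""

    def __init__(self, data, children=()):
        self.data = data                  # bytes that would be on disk
        self.children = list(children)

    def get_contents(self):
        return self.data

    def get_csig(self):
        # Content signature: MD5 of whatever get_contents() returns.
        return hashlib.md5(self.get_contents()).hexdigest()


class FakeContentsArchive(PlainFile):
    """Approach 1: override get_contents(); get_csig() is inherited."""

    def get_contents(self):
        # "Fake contents": the concatenated signatures of the children.
        return "".join(c.get_csig() for c in self.children).encode()


class CsigOnlyArchive(PlainFile):
    """Approach 2: override only get_csig(); get_contents() stays real."""

    def get_csig(self):
        joined = "".join(c.get_csig() for c in self.children)
        return hashlib.md5(joined.encode()).hexdigest()


def scan_members(node):
    """Toy scanner: like a real one, it reads get_contents()."""
    return re.findall(rb"[\w.]+\.o", node.get_contents())


z_o = PlainFile(b"object code v1")
on_disk = b"!<thin>\nz.o/\n"              # made-up thin archive bytes
fake = FakeContentsArchive(on_disk, [z_o])
targeted = CsigOnlyArchive(on_disk, [z_o])

before = (fake.get_csig(), targeted.get_csig())
z_o.data = b"object code v2"              # z.c changed, z.o was rebuilt
after = (fake.get_csig(), targeted.get_csig())
```

In this toy model both subclasses produce a signature that changes when z.o changes, but only CsigOnlyArchive still hands the real bytes to anything that calls get_contents(). Whether a real SCons scanner would ever be run against such a node is not established here.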

I'll have to re-digest the email thread to give a smarter response,
probably later today.

-Bill


On Wed, Jan 4, 2017 at 6:38 AM, Andrew C. Morrow <andrew.c.morrow at gmail.com>
wrote:

>
> Hi -
>
> I'm circling back to this issue for various reasons, but I'd like to
> sidestep the questions about whether thin archives are a good idea or not,
> or whether temp files are a solution. What I am more interested in is
> determining the correct way to override the signature calculation for a
> file node.
>
> One approach, which I'm now using in two custom Tools, is to override the
> get_contents and get_content_hash methods of FS.File. The alternative would
> be to directly override FS.File.get_csig, and leave get_contents and
> get_content_hash as they are.
>
> The advantage to overriding get_contents and get_content_hash is that they
> are simple methods, and by overriding them, the existing FS.File.get_csig
> mechanics just do the right thing. But I'm somewhat worried that there may
> be other parts of SCons outside the signature calculation mechanism that
> might want the actual file contents for some reason, not the "fake
> contents" I'm returning to compute a signature for the node.
>
> Overriding FS.File.get_csig seems like a more targeted approach, and so
> less likely to interfere if other parts of SCons really do need file
> contents, but it would mean that I would need to re-create the logic of
> FS.File.get_csig
> <http://scons.org/doc/HTML/scons-api/SCons.Node.FS-pysrc.html#File.get_csig>
> multiple times, and that seems likely to drift out of sync with SCons
> itself.
>
> Any guidance?
>
> Thanks,
> Andrew
>
>
> On Mon, Oct 24, 2016 at 11:36 PM, Bill Deegan <bill at baddogconsulting.com>
> wrote:
>
>> Did you try using TEMPFILE and MAXLINELENGTH?
>> I'm just now seeing that they are far from adequately documented.
>>
>> see src/engine/SCons/Platform/__init__.py
>>
>> """A callable class.  You can set an Environment variable to this,
>> then call it with a string argument, then it will perform temporary
>> file substitution on it.  This is used to circumvent the long command
>> line limitation.
>>
>> Example usage:
>> env["TEMPFILE"] = TempFileMunge
>> env["LINKCOM"] = "${TEMPFILE('$LINK $TARGET $SOURCES','$LINKCOMSTR')}"
>>
>> By default, the name of the temporary file used begins with a
>> prefix of '@'.  This may be configured for other tool chains by
>> setting '$TEMPFILEPREFIX'.
>>
>> env["TEMPFILEPREFIX"] = '-@'        # diab compiler
>> env["TEMPFILEPREFIX"] = '-via'      # arm tool chain
>> """
>>
>>
>>
>> On Mon, Oct 24, 2016 at 10:29 PM, Bill Deegan <bill at baddogconsulting.com>
>> wrote:
>>
>>> Andrew,
>>>
>>> I guess the question is would the extra work be worth it in terms of
>>> reduced build time?
>>> Would covering all binutils platforms resolve the issue for you and your
>>> team?
>>> Or would you still be left with needing to use thin archives on some
>>> platforms?
>>> Which other linkers are you dealing with?
>>>
>>> -Bill
>>>
>>> On Mon, Oct 24, 2016 at 4:47 PM, Andrew C. Morrow <
>>> andrew.c.morrow at gmail.com> wrote:
>>>
>>>>
>>>> That would probably work on binutils platforms, but I think it would
>>>> require generating the linker script from the source list, and then feeding
>>>> that into the link step. Without having really investigated, that
>>>> intuitively felt like it would be more work to get SCons to do that than
>>>> would adding some letters to ARFLAGS and synthesizing a signature for the
>>>> target.
>>>>
>>>> On Mon, Oct 24, 2016 at 2:30 PM, Bill Deegan <bill at baddogconsulting.com
>>>> > wrote:
>>>>
>>>>> Would something like this work?
>>>>> https://cygwin.com/ml/cygwin/2004-04/msg00330.html
>>>>>
>>>>> Or is this for a non-gcc compiler?
>>>>> -Bill
>>>>>
>>>>> On Mon, Oct 24, 2016 at 10:12 AM, Andrew C. Morrow <
>>>>> andrew.c.morrow at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> The short answer is that when linking by enumerating all object files
>>>>>> on the command line, we recently hit the hard-coded command line length
>>>>>> limit on certain Linux variants. By using archives to bundle groups of
>>>>>> objects into logical libraries, we can sidestep that limitation. The
>>>>>> downside was increased IO and disk utilization, but thin archives mitigate
>>>>>> that.
>>>>>>
>>>>>> I'm still curious though - any thoughts on the implementation? Is the
>>>>>> mechanism used for interposing on the signature calculations correct? Is
>>>>>> there a better, more scons-y way?
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew
>>>>>>
>>>>>>
>>>>>> On Fri, Oct 21, 2016 at 8:05 PM, Bill Deegan <
>>>>>> bill at baddogconsulting.com> wrote:
>>>>>>
>>>>>>> I'm guessing that using archives lets him skip keeping track of
>>>>>>> which object files need to be pulled in (as opposed to listing all the
>>>>>>> object files).
>>>>>>>
>>>>>>> -Bill
>>>>>>>
>>>>>>> On Fri, Oct 21, 2016 at 11:26 AM, Rob Boehne <robb at datalogics.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Andrew,
>>>>>>>>
>>>>>>>> Just out of curiosity, why are you using archives at all?  There
>>>>>>>> are legitimate reasons, and I haven’t looked at MongoDB, but perhaps
>>>>>>>> removing this step would simplify your build and solve this issue.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>> Robert Boehne
>>>>>>>>
>>>>>>>> From: Scons-users <scons-users-bounces at scons.org> on behalf of
>>>>>>>> "Andrew C. Morrow" <andrew.c.morrow at gmail.com>
>>>>>>>> Reply-To: SCons users mailing list <scons-users at scons.org>
>>>>>>>> Date: Thursday, October 20, 2016 at 3:32 PM
>>>>>>>> To: SCons users mailing list <scons-users at scons.org>
>>>>>>>> Subject: [Scons-users] Thin archive Tool
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi -
>>>>>>>>
>>>>>>>> The SCons based build system for MongoDB makes heavy use of static
>>>>>>>> linking. One consequence of static linking is that the space requirements
>>>>>>>> are basically doubled, since each translation unit produces an object file,
>>>>>>>> and then each object file is copied into an archive file. Adding CacheDir
>>>>>>>> into the mix multiplies this duplication.
>>>>>>>>
>>>>>>>> The GNU binutils tools, however, support 'thin' archives, where the
>>>>>>>> archive contents are simply a list of file references, meaning that the
>>>>>>>> archive files are very small. At link time, the linker simply dereferences
>>>>>>>> the listed files in each archive.
>>>>>>>>
>>>>>>>> To support this, we added the following Apache 2.0 licensed tool:
>>>>>>>>
>>>>>>>> https://github.com/mongodb/mongo/blob/master/site_scons/site_tools/thin_archive.py
>>>>>>>>
>>>>>>>> One subtle aspect to consider is that when using thin archives, if
>>>>>>>> application X depends on libY.a, which contains z.o produced from z.c, then
>>>>>>>> if z.c is changed, the built-in signature of libY.a will not change, since
>>>>>>>> the reference to z.o doesn't change when the archive is rebuilt, so taking
>>>>>>>> the MD5 of the file contents will yield the same result.
>>>>>>>>
>>>>>>>> To address this, our Tool creates a new Node subclass that
>>>>>>>> overrides the get_contents and get_content_hash methods, and sets the
>>>>>>>> target_factory for StaticLibrary to produce that Node subclass. The
>>>>>>>> overriding behavior computes a content hash based on the content hash of
>>>>>>>> the children of the new node.
>>>>>>>>
>>>>>>>> This all seems to work fairly well, but I was curious whether there is
>>>>>>>> a more appropriate way to accomplish this. The end goal is that we want the
>>>>>>>> content signature of these Nodes to be a hash of the signatures of all of
>>>>>>>> the Node's children, rather than the on-disk contents of the Node.
>>>>>>>>
>>>>>>>> Is there a better way to accomplish this than what we are doing
>>>>>>>> here?
>>>>>>>>
>>>>>>>> FWIW we also use a similar technique for driving ABI change linking
>>>>>>>> when doing dynamic builds:
>>>>>>>>
>>>>>>>> https://github.com/mongodb/mongo/blob/master/site_scons/site_tools/abilink.py
>>>>>>>>
>>>>>>>> Here, we would be particularly interested in interposing in such a way
>>>>>>>> as to absolutely minimize the number of times we must invoke abidw, as
>>>>>>>> that is very expensive. We want to avoid needless re-invocations of abidw
>>>>>>>> as things move in and out of the CacheDir.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Scons-users mailing list
>>>>>>>> Scons-users at scons.org
>>>>>>>> https://pairlist4.pair.net/mailman/listinfo/scons-users
>>>>>>>>
>>>>>>>>