[Scons-users] Using a Decider to prevent shared lib relinks

orenaud at coventor.com orenaud at coventor.com
Wed May 27 10:19:43 EDT 2020


Oh yes it's pretty dumb to call WhereIs more than once. Thanks. I also found that I can access the env using tgt.env from within the decider function, instead of getting it as a parameter to gen_my_decider.

Is the True vs object() a stylistic issue, or do you see a real problem using an object? I thought about using a boolean, but as I'm only interested in the existence of the attribute and never actually read the value, I thought it was misleading that both True and False would have the same meaning in this context. I confess object() has the downside of making a useless allocation. I could also have used None I guess.

Even though the solution I described in my previous message works on a small test, it does not work on my real code base. I found that my Decider does not have full control over the value of csig, presumably because other places call Node.get_csig and overwrite the custom hash I put in csig. If the Decider does not have full control over csig, and if I can't store my own hash in another field of the SConsign entry, then I still don't understand how I could write a meaningful Decider. I studied the GitDecider from MongoDB, and I think it is also based on the assumption that GitDecider has total control over csig. I don't know whether the problem is in my code base or in that assumption.
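One way around the csig conflict described above is to keep the custom hash outside the SConsign entry altogether. The following is only a minimal sketch of that idea, not something proposed in this thread and not an SCons API: `SideSignatureStore`, its JSON file, and the `target|dependency` key format are all hypothetical names chosen for illustration. It assumes signatures are recorded only after a target has been rebuilt successfully, so a failed link does not hide a pending change.

```python
import json
import os


class SideSignatureStore:
    """Hypothetical helper: maps (target, dependency) pairs to the last
    signature the target was built against, stored in its own JSON file."""

    def __init__(self, filename):
        self.filename = filename
        self._data = {}
        if os.path.exists(filename):
            with open(filename, "r", encoding="utf-8") as fh:
                self._data = json.load(fh)

    @staticmethod
    def _key(target, dependency):
        # One entry per pair, because each target tracks the library separately.
        return f"{target}|{dependency}"

    def is_changed(self, target, dependency, current_sig):
        """True when nothing is recorded yet or the recorded signature differs."""
        return self._data.get(self._key(target, dependency)) != current_sig

    def record(self, target, dependency, current_sig):
        """Remember the signature; meant to be called after a successful rebuild."""
        self._data[self._key(target, dependency)] = current_sig

    def save(self):
        with open(self.filename, "w", encoding="utf-8") as fh:
            json.dump(self._data, fh, indent=2, sort_keys=True)
```

A decider could call `is_changed(str(tgt), str(dep), sig)` and leave csig untouched. Where `record` and `save` would be hooked into the build is left open here.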

I'll stop trying to use a Decider for now, and copy the approach used by Abilink from MongoDB.

Thanks.

On 5/26/2020 6:06 PM, Bill Deegan wrote:
Also:

            dep.attributes.csig_is_lib_symbols_md5 = True # rather than object()




On Tue, May 26, 2020 at 8:24 AM Bill Deegan <bill at baddogconsulting.com> wrote:
Also, I wouldn't call WhereIs inside the function.


def compute_symbols_md5(lib_filename):
    dumpbin = WhereIs('dumpbin', env['ENV']['PATH'])
    s = subprocess.check_output([dumpbin, '/exports', lib_filename])
    return SCons.Util.MD5signature(s)



And calling subprocess inside a decider is also not really encouraged.
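As an illustration of the first point, the tool lookup can be done once and remembered rather than repeated on every call. This is a rough sketch under stated assumptions: it uses the standard library's `shutil.which` as a stand-in for SCons's `WhereIs` so that it runs outside an SConstruct, and the names `find_tool` and `compute_symbols_digest` are hypothetical. It hashes with SHA-256 instead of the SCons MD5 helper.

```python
import functools
import hashlib
import shutil
import subprocess


@functools.lru_cache(maxsize=None)
def find_tool(name, path=None):
    """Look up an executable once per (name, path) and cache the result."""
    return shutil.which(name, path=path)


def compute_symbols_digest(lib_filename, tool="dumpbin", args=("/exports",), path=None):
    """Hash the raw output of the symbol-dumping tool for one library."""
    exe = find_tool(tool, path)  # cached after the first lookup
    if exe is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    out = subprocess.check_output([exe, *args, lib_filename])
    return hashlib.sha256(out).hexdigest()
```

This only addresses the repeated lookup. The subprocess call still happens wherever `compute_symbols_digest` is invoked, so the second concern remains.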


On Tue, May 26, 2020 at 7:30 AM Andrew C. Morrow <andrew.c.morrow at gmail.com> wrote:

Hi Olivier -

You might like to read the blog post describing my investigations on this topic: https://engineering.mongodb.com/post/pruning-dynamic-rebuilds-with-libabigail

Thanks,
Andrew


On Mon, May 25, 2020 at 9:56 PM <orenaud at coventor.com> wrote:
Thanks, your message and the recent message from Andrew C. Morrow ("My own Decider function") put me on the right track. The missing piece for me was to understand that it is possible, and even necessary, to assign the value of csig in this decider function. I now realize that it is more or less mentioned in the doc, in this excerpt:
    "Note how the signature information for the dependency file has to get initialized via get_csig during each function call (this is mandatory!)"
Here is the solution I came up with. From my limited testing, it seems to do what I want:


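import os
import subprocess
import SCons.Util

def compute_symbols_md5(lib_filename):
    # `env` is the construction environment defined earlier in the SConstruct.
    dumpbin = WhereIs('dumpbin', env['ENV']['PATH'])
    s = subprocess.check_output([dumpbin, '/exports', lib_filename])
    return SCons.Util.MD5signature(s)

def is_lib(filename, env):
    # Test the base name so that a directory prefix does not defeat the $LIBPREFIX check.
    name = os.path.basename(filename)
    return name.startswith(env.subst('$LIBPREFIX')) and name.endswith(env.subst('$LIBSUFFIX'))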

def gen_my_decider(env):
    default_decider = env.decide_target

    def my_decider(dep, tgt, prev_ni, repo_node=None):
        if not is_lib(str(dep), env):
            return default_decider(dep, tgt, prev_ni, repo_node)

        dep_node_info = dep.get_ninfo()

        if not hasattr(dep.attributes, 'csig_is_lib_symbols_md5'):
            dep_node_info.csig = compute_symbols_md5(str(dep))
            dep.attributes.csig_is_lib_symbols_md5 = object()

        if prev_ni is None or not hasattr(prev_ni, 'csig'):
            return True

        return prev_ni.csig != dep_node_info.csig

    return my_decider

my_env.Decider(gen_my_decider(my_env))



I use `csig_is_lib_symbols_md5` to mark the lib nodes whose `csig` contains an MD5 of the dumped symbols, as opposed to an MD5 of the file content (it works because the lib file is always represented by the same Node instance during a build session). I introduced `gen_my_decider` so that I have a place to store the "default" decider to use for non-lib files. I still need to distinguish between static and dynamic lib files on Windows.

On 5/25/2020 11:26 PM, Bill Deegan wrote:
Olivier,

You're not the first to head down this path as it would be useful to (configurably) avoid such unnecessary relinks.
I think MongoDB has implemented something like this.

Please keep in mind that dep and tgt are not strings; they are File() nodes, which have access to the information stored in .sconsign about the file's signature from the previous build.
This is (currently) an md5sum of the full contents of the file.
So to adequately do what you're thinking you'd need a way to store the ABI info (or a hash thereof) as the content signature (also known as csig) for the node(s) in question.

So it's going to be a pretty deep dive into the code.

Perhaps someone from MongoDB will chime in here with a pointer to their code/experience.

-Bill

On Mon, May 25, 2020 at 10:11 AM <orenaud at coventor.com> wrote:

Hi,

I want to prevent unnecessary re-links against my shared libraries. By "unnecessary", I mean the re-links that happen even when the list of exported symbols did not change.

In practice, I would like to use the output of `nm --extern-only` on Linux or `dumpbin /exports` on Windows as the content to use for the signature of the shared library.
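To make the idea concrete, here is a small sketch of turning `nm --extern-only` text output into a signature that ignores symbol addresses. It is an illustration only: the function name `symbols_signature` is hypothetical, it assumes the default (non-demangled) nm output where each symbol line ends with a type letter and a name, and it uses SHA-256 as an arbitrary stable hash.

```python
import hashlib


def symbols_signature(nm_output: str) -> str:
    """Hash the set of (type, name) pairs found in `nm --extern-only` output."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and "member.o:" style headers
        # Defined symbols print "address type name"; undefined ones print "type name".
        sym_type, name = fields[-2], fields[-1]
        symbols.add((sym_type, name))

    digest = hashlib.sha256()
    for sym_type, name in sorted(symbols):  # sorted so ordering changes do not matter
        digest.update(f"{sym_type} {name}\n".encode("utf-8"))
    return digest.hexdigest()
```

With this, two builds of a library that export the same symbols at different addresses produce the same signature, while adding or removing an exported symbol changes it.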

I was able to approximate this on a simple example, by explicitly specifying the dependency of a target (myExe) that links against a specific shared library (myLib.lib):

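import subprocess

def dumpLibSymbols(target, source, env):
    # Resolve nm first, then write the external symbols of the library to the target file.
    nm = WhereIs('nm', env['ENV']['PATH'])
    with open(str(target[0]), 'w') as dumpFile:
        # check_call raises if nm fails, so the action fails instead of leaving a bad dump.
        subprocess.check_call([nm, '--extern-only', str(source[0])], stdout=dumpFile)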

dumpLibSymbolsAction = Action(dumpLibSymbols,
                              "Dumping Symbols for $SOURCE into $TARGET")
dumpLibSymbolsBuilder = Builder(action=dumpLibSymbolsAction,
                                src_suffix="$LIBSUFFIX",
                                suffix=".lib.exports")
env.Append(BUILDERS= {'DumpLibSymbols': dumpLibSymbolsBuilder})

Ignore(myExe, 'myLib.lib')
Depends(myExe, 'myLib.lib.exports')


Basically, it changes the dependency chain [myExe -> myLib.lib] to [myExe -> myLib.lib.exports -> myLib.lib], without changing the link command itself.

While this example works, I am unable to generalize it: the calls to Ignore and Depends must be explicit for all the pairs exe/lib (or lib/lib). What I want is for it to be automatic for anything that depends on a shared library.

It seems to me that the Decider function (https://scons.org/doc/production/HTML/scons-user/ch06.html#idm962) is exactly what I need. Indeed, I want to decide that a target is out of date based on the output of nm/dumpbin. The problem is that I don't understand how I can do that in practice. The example in the docs uses an unspecified `specific_part_of_file_has_changed(dep, tgt)` function, but I don't see how this function can do its job given its inputs. To me, this function needs one more piece of information: the previous content of the dependency file (or a hash of only the part it is interested in). Similarly, my custom Decider would need to know the previous output of nm/dumpbin in addition to its current output.

Can someone point me to a real-world usage of the Decider function? Is it the right tool for what I need, or did I overlook a better way to achieve my goal?

Thanks,

Olivier Renaud

_______________________________________________
Scons-users mailing list
Scons-users at scons.org<mailto:Scons-users at scons.org>
https://pairlist4.pair.net/mailman/listinfo/scons-users


