[Scons-users] Speeding up subprocess spawns

Fri Jun 26 09:19:55 EDT 2020

Interesting. I didn't see this PR earlier.

I tested it with my MongoDB build setup; in preliminary testing it appears
to perform worse. As Dirk wrote in the PR, it starts to pay off only
beyond a certain project size and memory consumption.
I think it introduces some unnecessary complications by trying to fully
mimic subprocess, while SCons doesn't use all of those features in
its exec_subprocess() and exec_popen3().

I'll continue the discussion on the PR page.

> On Thu, Jun 25 2020 at 09:18 AM Bill Deegan <bill at baddogconsulting.com> wrote:
>
> Have you tried this?
> https://github.com/SCons/scons/pull/3703
>
> On Thu, Jun 25, 2020 at 12:09 AM Yonatan <yon.goldschmidt at gmail.com> wrote:
>
> > Hello, SCons community,
> >
> > Recently, I debugged the build-process performance of a large project
> > built with SCons. Trying to figure out exactly where time is spent (gcc
> > vs. scons vs. linking), I ran a system-wide "perf". One of the things
> > that surprised me was that Python spends a significant amount of time in
> > fork() and execve() (in the kernel).
> >
> > Almost all of the time in fork() was spent in copy_page_range(), and
> > almost all of the time in execve() in unmap_page_range(). To my
> > understanding, these two kernel functions are responsible for copying the
> > process's page tables when fork()ing, and for removing the existing
> > mappings when execve()ing (as you switch to a new address space with new
> > mappings). The larger a process's resident set, the more time those two
> > functions take.
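
One way to check this relationship on a given machine is to time fork() while the process holds progressively more touched memory. The following is a rough sketch, assuming Linux or another POSIX system with os.fork(); the function names and ballast sizes are made up for illustration, and it times only the fork() half, not execve().

```python
import os
import time


def time_fork(rounds=20):
    """Return the mean wall-clock seconds the parent spends in one fork()."""
    total = 0.0
    for _ in range(rounds):
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping Python cleanup
        total += time.perf_counter() - start
        os.waitpid(pid, 0)  # reap the child outside the timed region
    return total / rounds


def measure(ballast_mib):
    """Make ballast_mib MiB resident in this process, then time fork()."""
    ballast = bytearray(ballast_mib * 1024 * 1024)
    # Write one byte per 4 KiB page so the pages are actually mapped
    # (4 KiB is an assumed page size).
    for offset in range(0, len(ballast), 4096):
        ballast[offset] = 1
    return time_fork()  # ballast stays alive until this returns


if __name__ == "__main__":
    for mib in (0, 32, 128):
        print(f"{mib:4d} MiB ballast: {measure(mib) * 1e6:8.1f} us per fork()")
```

The absolute numbers depend on the kernel, the page size, and whether huge pages are in use, so only the trend across ballast sizes is meaningful.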
> >
> > In that project, the SCons process grows very large in RAM (over 1 GB);
> > this is obviously specific to that project and what it loads into Python.
> > With a resident set that large, the fork()-execve() sequence incurs a
> > significant cost.
> >
> > To overcome this, I thought of two possible solutions:
> > 1. Use vfork()-execve() in subprocess.Popen(). This is not yet supported
> >    in CPython, but it has been suggested and a PR exists. See
> >    https://bugs.python.org/issue35823
> > 2. Run a minimal "spawner" process and execute subprocess.Popen() requests
> >    via it. I've written a PoC and tested MongoDB's compilation using it.
> >    SCons doesn't hog too much CPU/memory in the MongoDB case, but the
> >    spawner change still introduces a tiny improvement.
> >    I also tried manually increasing the memory usage of SCons (by creating
> >    many Python objects), which led to much more visible effects for this
> >    change.
> >    You can see it here: https://github.com/Jongy/scons/tree/process-spawner
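
The general shape of the second option can be sketched as follows. This is not the PoC from the branch above, only a minimal illustration under stated assumptions: the names Spawner and SPAWNER_SRC, and the one-JSON-line-per-request protocol, are hypothetical. The helper is a fresh, small interpreter, so every fork() it performs copies few page tables no matter how large the main process later becomes.

```python
import json
import subprocess
import sys

# Source of the helper. It runs in its own small interpreter and reads
# one JSON-encoded argv list per line (a hypothetical protocol).
SPAWNER_SRC = r"""
import json, subprocess, sys
for line in sys.stdin:
    argv = json.loads(line)
    try:
        done = subprocess.run(argv, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        reply = {"returncode": done.returncode,
                 "output": done.stdout.decode(errors="replace")}
    except OSError as exc:  # e.g. the command was not found
        reply = {"returncode": 127, "output": str(exc)}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class Spawner:
    """Hypothetical client that forwards commands to the helper process."""

    def __init__(self):
        # Start this early, while the calling process is still small.
        self._helper = subprocess.Popen(
            [sys.executable, "-c", SPAWNER_SRC],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def run(self, argv):
        """Run argv via the helper; return (returncode, combined output)."""
        self._helper.stdin.write(json.dumps(argv) + "\n")
        self._helper.stdin.flush()
        reply = json.loads(self._helper.stdout.readline())
        return reply["returncode"], reply["output"]

    def close(self):
        self._helper.stdin.close()  # EOF ends the helper's read loop
        self._helper.wait()
        self._helper.stdout.close()


if __name__ == "__main__":
    spawner = Spawner()
    print(spawner.run([sys.executable, "-c", "print('hello')"]))
    spawner.close()
```

This sketch handles one request at a time and buffers all output, so a parallel build (-j) would need either several helpers or an asynchronous protocol. It also ignores the environment and working directory, which a real replacement for the SCons spawn functions would have to pass along.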
> >
> > I hope the vfork() PR gets merged, and meanwhile I was wondering if the
> > "spawner" approach sounds feasible to you.
> > I'd be happy to push it towards mainlining.
> >
> > Thanks,
> > Yonatan
> > _______________________________________________
> > Scons-users mailing list
> > Scons-users at scons.org
> > https://pairlist4.pair.net/mailman/listinfo/scons-users
> >