
Allow configure option to link to tcmalloc/ jemalloc #17007

Closed

Conversation

sathvikl
Both memory allocators provide better performance on small-sized
allocations.

Provide user options to link against jemalloc or tcmalloc.
Both are known to provide better performance for small-sized memory
allocations. The impact on long-running apps must be evaluated before enabling either by default in production builds of the node binary.
Earlier versions of tcmalloc caused fragmentation, leading to overhead in the actual physical memory consumed for the same number of bytes allocated.

Checklist
  • [x] make -j4 test (UNIX), or vcbuild test (Windows) passes
  • [x] tests and/or benchmarks are included
  • [x] documentation is changed or added
  • [x] commit message follows commit guidelines
Affected core subsystem(s)

Build

@nodejs-github-bot nodejs-github-bot added the build Issues and PRs related to build files or the CI. label Nov 14, 2017
@bnoordhuis
Member

The reason we've never done this so far is that it introduces incompatibilities between node and add-ons.

A pointer allocated in A can't be freed from B unless A and B use the same allocator. If node is linked against jemalloc and an add-on against libc, things go boom.

Likewise, even when an add-on picks up the allocator from the node binary but is linked against a library that doesn't: boom!

That means this PR is only useful in custom builds and even then only with restrictions. The help options should mention that.

@sathvikl
Author

@bnoordhuis I will close this. @uttampawar said he wants to add performance results to it.

@sathvikl sathvikl closed this Nov 14, 2017
@sathvikl sathvikl reopened this Jan 12, 2018
@sathvikl
Author

sathvikl commented Jan 12, 2018

Here are results from using jemalloc vs the default libc malloc on node master git# 6aac05b
https://gist.github.com/sathvikl/a9d0cc33e57d4c7a9fd59f9f8c2280f4

Results are from running on a Skylake server core, Ubuntu 16.10, kernel 4.9.0-rc8

@sathvikl
Author

sathvikl commented Jan 12, 2018

@bnoordhuis That's how I executed it:
cat compare-pr-5134.csv | Rscript benchmark/compare.R
This command, however, printed the confidence interval and % change numbers on one line.
At the very end, the p-value is printed for the test case/filename.

Maybe something changed in the compare.R output. I can dump the raw csv into another gist if you would like to test it out.
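For reference, the full comparison workflow sketched below follows Node's benchmarking guide; the binary paths and benchmark category here are assumptions for illustration:

```shell
# Collect paired runs of the chosen benchmark category against two
# builds, writing one CSV row per run (paths/category are illustrative):
node benchmark/compare.js --old ./node-master --new ./node-jemalloc \
  buffers > compare-pr-5134.csv

# Summarize: prints per-benchmark % change, confidence interval, and
# the p-value for each test case/filename.
cat compare-pr-5134.csv | Rscript benchmark/compare.R
```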

The jemalloc and tcmalloc memory allocators provide better
performance on small-sized allocations.
Note that the comments in the help section clearly warn the user
that, when linking against jemalloc/tcmalloc, allocations and frees
must route to the same allocator to prevent a crash.

GH Issue nodejs#17007 has the performance report attached for jemalloc.
@BridgeAR
Member

@sathvikl when looking at your results, some benchmarks seem to benefit from the change while others regress. Is there really a strong benefit? Especially given the mentioned downside, which is a significant blocker as far as I can see.

@sathvikl
Author

sathvikl commented Feb 5, 2018

@BridgeAR The benchmark runs are quite unreliable. However, the +/- 10% delta pattern on some cases was consistent across runs.

I would suggest keeping it as a configure option, so a user can easily experiment with their workload if they choose to do so.

The effect on back-end API service workloads is what I was targeting, but we don't have a good proxy workload for that scenario.

@BridgeAR
Member

BridgeAR commented Feb 7, 2018

@sathvikl I personally tend to a -0 on this. So far no other collaborator has really gone for it, so I guess this is not going to land. I am therefore going to close the PR. Thanks a lot for your work nevertheless! It is much appreciated! In case you feel strongly about this PR and would still like to get it landed, please leave a comment. In that case I am going to reopen the issue and escalate it to the TSC to get a decision.

@BridgeAR BridgeAR closed this Feb 7, 2018
@addaleax
Member

addaleax commented Feb 7, 2018

@BridgeAR We can always at least ping @nodejs/collaborators in these cases and see whether anybody has an opinion.

This looks good to me code-wise, but I share Ben’s concern about the changed addons API here.

@gireeshpunathil
Member

I have been searching for patterns where pointers allocated in node's C/C++ layer escape into native add-ons and vice versa, but could not find any. Moreover, the param values shuttled between JS and add-ons are almost always either JS objects or scalar types. So to me, the core layer and the add-on layer operate in their own space in terms of self-managed memory, and thereby this change seems safe. @bnoordhuis , @addaleax - can you please clarify if this is not the case?

@BridgeAR
Member

BridgeAR commented Feb 7, 2018

Reopening due to the comment from @gireeshpunathil

@BridgeAR BridgeAR reopened this Feb 7, 2018
@ofrobots
Contributor

ofrobots commented Feb 7, 2018

Note that, at least on Linux, it is already possible to use alternative memory allocators using the LD_PRELOAD mechanism. I do this quite frequently with tcmalloc to utilize the excellent native heap profiler available with it.
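Concretely, that mechanism looks like the following; the library paths are illustrative and vary by distro and install method, and HEAPPROFILE is tcmalloc's heap-profiler switch:

```shell
# Swap the allocator for a single node process without rebuilding:
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 node app.js

# Same idea with tcmalloc, additionally enabling its heap profiler:
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 \
  HEAPPROFILE=/tmp/node.hprof node app.js
```

Because LD_PRELOAD interposes the allocator for the whole process, node, its add-ons, and their dependent libraries all resolve malloc/free to the same implementation, which sidesteps the mixing hazard discussed above.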

I also share Ben's concerns. There is at least one V8 API that transfers ownership of a returned pointer allocated by V8 to the caller. An addon using a different allocator would not work well with such an API.

@gireeshpunathil
Member

thanks @ofrobots .

  • LD_PRELOAD may be handy for certain types of deployments, but it is not feasible for all (such as cloud deployments)
  • the raw pointer return you referenced seems to be from the V8 profiler, and may be an edge case that can be covered through a doc caveat?
  • additionally, is it possible to modify node-gyp to direct the builder to build the add-on against custom malloc libraries, based on the characteristics of the prevailing node instance?
  • what would be a rough distribution of pre-compiled native add-ons versus those built on the host?

@lpinca
Member

lpinca commented Feb 7, 2018

This may also be useful to work around issues like #8871

@addaleax
Member

addaleax commented Feb 7, 2018

@gireeshpunathil Like Ben said, an Uint8Array object (or Buffer) needs to be freeable by the default ArrayBuffer allocator, which in Node's case uses free. For example, this code is always safe currently:

char* foo = static_cast<char*>(malloc(100));
Buffer::New(isolate, foo, 100);

@gireeshpunathil
Member

thanks @addaleax for the explanation. So essentially it means interoperability is guaranteed only if one of the following is true:

  • native add-ons are built with the same build config as node, or
  • the free is interoperable with the malloc from a different implementation.

@sathvikl - can you validate the second point?

@bnoordhuis
Member

the free is interoperable with the malloc from a different implementation

That's never the case.

@gireeshpunathil
Member

thanks @bnoordhuis . I was under the impression that every malloc implementation stores the number of bytes it allocated behind the pointer it returns to the program, that free could access this value to release the memory back to the system, and that implementations differ only in the way they chunk memory. But thinking more on it, I realize that free has to coordinate with malloc's internal data structures - so agreed, interoperability is impossible.

that leaves us with only one option (if we want to go ahead with this PR): documenting this limitation around the new build configurations such as --shared-jemalloc... namely that, for node binaries built with such configurations to work with native add-ons, the add-ons themselves must be built similarly, and that mixed execution results in undesired behavior such as memory corruption.

Given that:

  • we don't have plans to release node binaries built with these options,
  • users who build from source with these options do so by choice and are well aware of the implications,

are we good with the documentation approach?

@bnoordhuis
Member

There is another issue that should be addressed: ./configure --shared-jemalloc --shared-openssl is unsafe; some OpenSSL APIs take ownership of memory that is allocated by node.

node/src/node_crypto.cc

Lines 2684 to 2689 in 95a35bc

// OpenSSL takes control of the pointer after accepting it
char* data = node::Malloc(len);
memcpy(data, resp, len);
if (!SSL_set_tlsext_status_ocsp_resp(s, data, len))
free(data);

That's just one example; there are probably more, and OpenSSL may not be the only such library either.

We can either disable those apis when linking against jemalloc/tcmalloc, or make configure print a warning or abort.

A warning might be too easy to miss. configure prints a big fat warning when the compiler is too old and people still manage to completely overlook it. We know because they open issues about their failing builds.

This issue is a lot more subtle: it builds just fine but blows up at runtime, but not in a predictable manner.

@ofrobots
Contributor

ofrobots commented Feb 9, 2018

LD_PRELOAD may be handy for certain types of deployments, but not feasible for all (such as cloud deployments)

I think this is the right problem to go solve. IMO LD_PRELOAD is quite feasible in IaaS and container cloud environments. I agree that it may be harder in PaaS environments, but I guess that's the precise idea of a PaaS; you provide the code and the PaaS provides the 'runtime environment'. That may be fundamentally incompatible with letting users provide their custom built node binary.

@gireeshpunathil
Member

thanks all for the input - with so many ifs and buts, it looks like this PR cannot be realized without breaking runtime consistency.

@benjamingr
Member

@gireeshpunathil can we close the PR now?

@gireeshpunathil
Member

I guess so - yes.

@gireeshpunathil
Member

I want to thank @sathvikl for their creative thought and the effort behind this PR - unfortunately the interoperability is a challenge!

@sathvikl
Author

Thank you @gireeshpunathil and everyone for their thoughts on this.

I believe the issue that would be very hard to handle is cases where npm/yarn install builds as per the module's settings.
npm install of sqlite starts a build on the target system, and the JS code would then handle ownership of the returned objects, or it could pass them on to other node modules.

@ChALkeR ChALkeR added the memory Issues and PRs related to the memory management or memory footprint. label Jul 29, 2018
@weixingsun

I feel something is weird here; this is the feature I have been looking for in OpenJDK for years.
Open source projects should accept new ideas like this as experimental, or at the user's own risk; otherwise such a feature can never be implemented or enhanced in the future.

@theanarkh
Contributor

The reason we've never done this so far is that it introduces incompatibilities between node and add-ons.

A pointer allocated in A can't be freed from B unless A and B use the same allocator. If node is linked against jemalloc and an add-on against libc, things go boom.

Likewise, even when an add-on picks up the allocator from the node binary but is linked against a library that doesn't: boom!

That means this PR is only useful in custom builds and even then only with restrictions. The help options should mention that.

Hi, Ben. What does this mean? If I load jemalloc with LD_PRELOAD, then both Node.js and the addon use jemalloc, don't they?

@bnoordhuis
Member

That comment wasn't about LD_PRELOAD but about building node with jemalloc.
