-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large Page Support for Code Issue: 16198 #21064
Conversation
Increased number of Pages to be mapped
Changes to be committed: modified: src/node_large_page.cc
… enable node_use_large_pages=true in configure and protect the node_large_page.cc and the link using node_use_large_pages
…LO/node into intel-large-pages
src: node.cc,node_large_page.cc,node_large_page.h build: node.gyp,node.gypi,ld.script config: configure
…s/node into intel-large-pages
ld.script
Outdated
/* Copyright (C) 2014-2015 Free Software Foundation, Inc. | ||
Copying and distribution of this script, with or without modification, | ||
are permitted in any medium without royalty provided the copyright | ||
notice and this notice are preserved. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too low level for me to review, how do we make sure its correctness? are there documentations around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were 3 line modification to the default ld.script to create a variable name(s) for the start of actual node text segment.
PROVIDE (__nodetext = .);
PROVIDE (_nodetext = .);
PROVIDE (nodetext = .);
} | ||
#endif | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why we are doing it here, as opposed to the very beginning of everything, in main?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No particular reason. This was a place where other things were initialized like VTune support.
src/node.cc
Outdated
@@ -4420,6 +4431,7 @@ int Start(int argc, char** argv) { | |||
// will never be fully cleaned up. | |||
v8_platform.Dispose(); | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks will fix that.
src/node_large_page.cc
Outdated
extern char __nodetext; | ||
|
||
namespace node { | ||
namespace largepages { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure this qualifies for a separate namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I wanted the structure and code to live in a separate space and not pollute the node space.
src/node_large_page.cc
Outdated
int64_t offset; | ||
bool found_text_region; | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how many of these fields are used?
can we follow a consistent variable naming convention? (camel casing vs snake casing etc.) I suggest look around in node.cc for the prevailing nomenclature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will look and make it consistent.
src/node_large_page.cc
Outdated
} | ||
} | ||
|
||
return nregion; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the frame will be overwritten after return, so the local structure will be stale in the caller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not returning a pointer to this but the actual structure which will be returned as a copy. So we are ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for that, my mis-understanding.
src/node_large_page.cc
Outdated
} | ||
|
||
static bool isTransparentHugePagesEnabled() { | ||
std::ifstream ifs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for using C++ style file I/O and C style elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will make it consistent and use C++ everywhere.
src/node_large_page.cc
Outdated
|
||
memcpy(nmem, r.from, size); | ||
|
||
tmem = mmap(start, size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the chances you get what you want at the same location? start
is already mapped, right? how about passing a nullptr here and get the huge pages wherever available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If addr is not NULL, then the kernel takes it as a hint about where to place the mapping. Since we are using MAP_FIXED it is not a hint but a requirement. We are relying on the new mapping being at the same virtual address so we dont have to do any fix up of the code. Otherwise we will have to fixup the branches, offsets etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - I was thinking on the lines on mremap
, but looking at the man page for that and your implementation, I guess it is one and the same.
src/node_large_page.cc
Outdated
|
||
tmem = mmap(start, size, | ||
PROT_READ | PROT_WRITE | PROT_EXEC, | ||
MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1 , 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the rationale behind predefining the memory attributes? ideally we want to inherit the attribute from the old small pages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do want to be able to write to this so PROT_WRITE is needed. We are using MAP_FIXED to place the mapping at exactly that address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code comments
We already know the original page is r-xp (PROT_READ, PROT_EXEC, MAP_PRIVATE)
We want PROT_WRITE because we are writing into it.
We want it at the fixed address and we use MAP_FIXED.
We are using MAP_ANONYMOUS so it's contents are initialized to zero and it is also not backed by any file.
src/node_large_page.cc
Outdated
ifs.open("/sys/kernel/mm/transparent_hugepage/enabled"); | ||
if (!ifs) { | ||
fprintf(stderr, "WARNING: Couldn't check hugepages support\n"); | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we follow a consistent return value convention? (returning null vs. -1 vs errno) I suggest look arounf in other APIs for the prevailing convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
@gireeshpunathil thanks for the review and feedback. We will work on updating the PR with those suggestions. Let me know if you have any other questions. |
@gireeshpunathil the PR is updated based on the feedback. Would appreciate if you could take a look. |
configure
Outdated
if options.node_use_large_pages: | ||
o['variables']['node_use_large_pages'] = 'true' | ||
else: | ||
o['variables']['node_use_large_pages'] = 'false' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified to this:
use_large_pages = (flavor == 'linux' and options.node_use_large_pages)
o['variables']['node_use_large_pages'] = b(use_large_pages)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok fixed
ld.script
Outdated
.debug_macro 0 : { *(.debug_macro) } | ||
.gnu.attributes 0 : { KEEP (*(.gnu.attributes)) } | ||
/DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file can probably be reduced to ~20 lines. It's pretty well hidden in all the definitions but I infer you really only care about the nodetext
symbols.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont see how it can be reduced to 20 lines. Much of it is what is used by the default linker script. Yes the only reason for me to modify it was to find the start of the actual text. I had trouble with moving the plt regions and wanted to avoid several of those regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://sourceware.org/binutils/docs/ld/Implicit-Linker-Scripts.html#Implicit-Linker-Scripts - instead of replacing the default linker script, you use an implicit linker script that works in addition to the default linker script.
Replacing the default linker script isn't an option, IMO, because that will be a never-ending maintenance hassle for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implicit linker scripts wont really help as I cant do a symbol assignment for something that is already specified in the linker script. I will investigate if there is some other approach to get the address of the .text segment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am able eliminate the ld.script by using the LD option -Wl,-Ttext=<address to place text segment>
. This places the text segment at a specific address which I can then use as the start for the 2M mapping. This will hard code the text starting address.
The other way is to read the elf file and find the .text segment and determine the address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A fixed address is bad for security reasons and reading it from the ELF header won't work when ASLR is enabled.
I guess you could compute the ASLR slide at run-time but that gets complicated fast when code and data is spread across several segments.
Another issue with that approach: two segments with different protection might be located within < 2 MB from each other.
Are you sure implicit linker scripts don't work? I remember using that to put symbols at fixed locations.
What about retrieving the path to the default linker script with ld --verbose
and passing -T /path/to/default/script.ld -T custom.ld
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Let me look into the implicit linker script some more. Another option is to generate the linker script using the ld --verbose
and to then add my symbol to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to figure this out. GNU ld has the INSERT option which makes the externally supported script not override the default script but insert the code into the default. So I will delete the ld.script and insert a new ld.implicit.script
PROVIDE (_nodetext = .);
PROVIDE (nodetext = .);
INSERT BEFORE .text;
node.gypi
Outdated
['OS!="aix" and node_shared=="false"', { | ||
['OS!="aix" and node_shared=="false" and node_use_large_pages=="true"', { | ||
'ldflags': [ | ||
'-Wl,-T <(PRODUCT_DIR)/../../ld.script', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested this on on other platforms than x86_64 Linux?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No only on x86_64 Linux, specifically Ubuntu 16.04 is what I have done most of my measurement and testing on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I figured as much. This needs an additional guard to ensure it's only activated when OS=="linux" and target_arch="x64"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks fixed it.
node.gypi
Outdated
@@ -348,6 +356,5 @@ | |||
}, { | |||
'defines': [ 'HAVE_OPENSSL=0' ] | |||
}], | |||
|
|||
], | |||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stylistic change, please undo.
src/node_large_page.cc
Outdated
|
||
nregion.found_text_region = false; | ||
|
||
ifs.open("/proc/self/maps"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assumes /proc
is mounted. Not necessarily true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, added checks.
src/node_large_page.cc
Outdated
iss >> permission; | ||
|
||
if (permission.compare("r-xp") == 0) { | ||
start = (unsigned int64_t) &__nodetext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use C style casts, use proper C++ casts (in this case reinterpret_cast<uint64_t>
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok fixed that. in fact took out that line and use nodetext directly.
src/node_large_page.cc
Outdated
return isTransparentHugePagesEnabled(); | ||
} | ||
|
||
} // namespace node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit I kinda checked out after the first 100 lines. This file has so many style errors, can you fix those first, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you be specific? I ran the make -s lint and fixed all the ones pointed to by that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linter doesn't catch everything but to name three things:
- Inconsistent indentation.
- Star leans left in pointers (
void* p
, notvoid *p
orvoid * p
.) - Casing:
IsLargePagesEnabled
, notisLargePagesEnabled
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I have updated the PR with 2) and 3). I am using standard indentation of 2 spaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@suresh-srinivas There’s https://github.com/nodejs/node/blob/master/CPP_STYLE_GUIDE.md that would help a lot with this. Also, comparing with other code in our codebase if you’re unsure. :)
(In particular, we don’t indent code inside namespaces)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax thanks much. I will look at this and fix styles accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax @bnoordhuis @gireeshpunathil fixed the style errors. Let me know if you see anything else.
@bnoordhuis PR updated with the feedback. |
…eEnabled instead of isLargePageEnabled)
i) CamelCase for methods, functions ii) snake_case for variables/structs iii) indentation (dont indent code inside namespace)
src/node_large_page.h
Outdated
#ifndef SRC_NODE_LARGE_PAGE_H_ | ||
#define SRC_NODE_LARGE_PAGE_H_ | ||
|
||
namespace node { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be enclosed in
#if defined(NODE_WANT_INTERNALS) && NODE_WANT_INTERNALS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I fixed it.
GNU ld has the INSERT option which makes the externally supported script not override the default script but insert the code into the default. The new ld.implicit.script does just that and no modification to the default script is necessary anymore. PROVIDE (_nodetext = .); PROVIDE (nodetext = .); INSERT BEFORE .text;
@gireeshpunathil @bnoordhuis @addaleax @gabrielschulhof @Fishrock123 |
The tests were failing due to libgcc_s unwinder getting confused. I was able to debug this problem and linking and using libunwind didnt have any issues. The code MoveTextRegionToLargePages was being placed in eh_frame region initially and this was causing the tests to fail. I now create a new stub region (before .text) in the implicit script and place the MoveTextRegionToLargePages in that region and also ensure that this is ahead of the region we are moving. gdb node info file 0x0000000000a00000 - 0x0000000000a001d3 is .lpstub 0x0000000000a01000 - 0x000000000182a279 is .text
Fix the test failures. The tests were failing due to I now create a new stub region If you use gdb to inspect node you will see this new region.
`
[01:45|% 100|+ 2274|- 0]: Done |
@suresh-srinivas - give me 2 day's time please |
We have a technical problem: from Linux man pages: from our supported platforms:
so we cannot run this on kernel versions between 2.6.32 and 2.6.38
|
@gireeshpunathil thanks. i have been doing all the work under 4.4.x kernels. What do you suggest? |
It's an opt-in build time option so it's alright if it doesn't work everywhere, as long as that's noted somewhere (e.g. in |
src/node_large_page.cc
Outdated
namespace node { | ||
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1)) | ||
#define PAGE_ALIGN_UP(x, a) ALIGN(x, a) | ||
#define PAGE_ALIGN_DOWN(x, a) ((x) & ~((a) - 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: these could all be inline functions, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Yes fixed them to be inline functions.
src/node_large_page.cc
Outdated
if (permission.compare("r-xp") == 0) { | ||
start = reinterpret_cast<unsigned int64_t>(&__nodetext); | ||
char* from = reinterpret_cast<char *>PAGE_ALIGN_UP(start, hps); | ||
char* to = reinterpret_cast<char *>PAGE_ALIGN_DOWN(end, hps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: left-leaning pointers (e.g. char*
), and I would not rely on the implicit parentheses added by the macros here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed.
src/node_large_page.cc
Outdated
return -1; | ||
} | ||
|
||
bool IsLargePagesEnabled() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: there is an extra space after bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed this too.
src/node_large_page.cc
Outdated
return -1; | ||
} | ||
|
||
if (r.from > reinterpret_cast<void *> (&MoveTextRegionToLargePages)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, use void*
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Yes fixed.
CI failed with: but the above widget says I am bit confused. anyone has any idea? In either way, @suresh-srinivas - can you please rebase? [edit: |
@gireeshpunathil Suresh is on vacation but I'll try to rebase and build it. Thanks. |
What's the status on this one? |
I just got back from my sabbatical. I will work with Uttam to update per feedback. |
Support under Linux to map the text segment of node into 2M pages.
Fixes: #16198
src: node.cc,node_large_page.cc,node_large_page.h
build: node.gyp,node.gypi,ld.implicit.script
config: configure
How to enable
./configure --use-largepages
Performance Results
Node Microbenchmarks (under node/benchmarks)
total tests | 5444
total 3 stars | 589 with 82% of tests showing positive gains and 18% have low single digit regression.
Node Larger Application
Web Tooling - 2%
Node-DC-EIS - 3% - 5%
Ghost - 2% - 3%
Node-Todo - 20%
React-SSR - 2.5%
Caveats
Checklist
make -j4 test
(UNIX), passes