Cache accesses to process.env? #3104

STRML · 2015-09-28T15:30:30Z

I've found in multiple projects that accessing process.env within a hot section of the code leads to major slowdown.

This really hurts in React server-side rendering (issue) and has caused them to rearrange how they access the env.

It would make sense to cache already-accessed properties rather than reaching out to the actual environment each time, and of course to update the cache on assignment.

I would be happy to contribute a patch if anyone could point me in the right direction to get started, as I'm new to Node core dev.
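For readers who want to see what "cache on read, update on assignment" could look like, here is a minimal userland sketch using a Proxy. The name createCachedEnv is hypothetical and is not a Node.js API. As the later comments point out, a cache like this cannot see changes made behind its back (for example by native code calling setenv), so treat it as an illustration of the idea rather than a proposed patch.

```javascript
// Hypothetical helper for illustration only; not part of Node.js.
function createCachedEnv(source) {
  const cache = new Map();
  return new Proxy(source, {
    get(target, key) {
      // Symbol keys are passed straight through without caching.
      if (typeof key !== 'string') return target[key];
      // First read goes to the real environment; later reads hit the cache.
      if (!cache.has(key)) cache.set(key, target[key]);
      return cache.get(key);
    },
    set(target, key, value) {
      // Write through to the real environment, then cache what it stored
      // (process.env stores values as strings).
      target[key] = value;
      if (typeof key === 'string') cache.set(key, target[key]);
      return true;
    },
    deleteProperty(target, key) {
      delete target[key];
      cache.delete(key);
      return true;
    },
  });
}

const cachedEnv = createCachedEnv(process.env);
```

Reads and writes that go through cachedEnv stay consistent with each other; a write made directly to process.env after the first cached read is not picked up, which is exactly the trade-off debated in the rest of this thread.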


ChALkeR · 2015-09-28T16:01:25Z

See #1648.

STRML · 2015-09-28T16:16:41Z

Thanks @ChALkeR. I understand the argument for keeping it as-is, but if it is kept this way, I think it would help to document that values should be cached whenever possible as serious performance problems can result.

Alternatively, a function like process.env[freeze | unfreeze]() would be great to create this cache from inside our programs in case of misbehaving modules.

Fishrock123 · 2015-09-28T16:36:14Z

Is it actually possible to change the env externally once the process is running? If not, why not cache this?

ChALkeR · 2015-09-28T16:48:53Z

@Fishrock123 Any native module (or a library that is used by it) could call setenv directly. Though the likelihood of this is near zero, it's still possible. And it should affect process.env.

process.env behaves as it should, and caching it on the Node.js side will break it. There is no actual problem — anything that uses it in hot code path could cache it if it's fine with that.

-1 on caching, +1 on the minor note in the docs.

While we are here, the docs for process.env should state more clearly how the env works, because the current wording, «changes won't be reflected outside of your process», could be misunderstood. As with a normal env, changes affect new child processes.

silverwind · 2015-09-28T17:44:30Z

Also keep in mind that there are modules that assign to process.env.

@STRML why not clone process.env to a local object once in your application?
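A minimal sketch of the "clone once" suggestion, assuming the application is fine with never seeing later changes to the environment. The names env and isProduction are made up for this example.

```javascript
// Copy the environment into a plain object once, at startup.
// Freezing makes it obvious that this copy is read-only.
const env = Object.freeze({ ...process.env });

// Hot code reads the plain copy instead of process.env.
function isProduction() {
  return env.NODE_ENV === 'production';
}
```

This only helps code you control: a dependency that reads process.env directly in a hot path is unaffected.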

STRML · 2015-09-28T17:58:11Z

@silverwind It's really more of a problem with child modules, like React itself (which is finally caching process.env properly in new versions). And I've seen many more that do this in hot paths.

I think @ChALkeR has the right approach; I underestimated how much potential breakage there is. But documenting it properly and making it known that process.env accesses are slow would be great.

Perhaps it should have been function getter/setters, which would have been more expected to have slow behavior. Too late now of course.

ChALkeR · 2015-09-28T19:06:10Z

@silverwind

> Also keep in mind that there are modules that assign to process.env.

If that was the only possible way to set the env from node, then it could be covered by updating the cache in the setter. But it's not.

Fishrock123 · 2015-10-16T03:54:30Z

Doesn't seem worthwhile. Closing, please cache it yourself if need be. :)

STRML · 2015-12-02T20:53:12Z

Just wanted to ping this, we're having quite a hard time handling this in React, and caching NODE_ENV results in a 100% (!) speedup in benchmark code. In real-world testing, I see roughly 50-70%. It's significant.

I'm getting sick of issuing PRs to repos caching access to NODE_ENV, and there really isn't a good solution for modules that need to hide features behind debug flags and want them eliminated in bundling. If we cache the environment lookup, UglifyJS is unable to eliminate the dead code in bundling unless we use a const, which then breaks older browsers.

In reality, is it really useful to rely on external changes to the env (say, from native code)? What proportion of users really need this, and is it a valuable tradeoff compared to the performance benefits we could get ecosystem-wide?

Would the Node core contributors consider a semver-major patch? My thoughts are to make process.env a plain object of the env available at boot (still global and able to be set internally), then add either a process.refreshEnv() method or a process.liveEnv object that has the old semantics.
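The shape of that proposal can be approximated in userland. The sketch below is an assumption about how it might behave, not anything Node.js provides: getEnv and refreshEnv are hypothetical module-level helpers standing in for the proposed plain-object process.env and process.refreshEnv().

```javascript
// Plain-object snapshot taken at startup (the proposed "boot" env).
let snapshot = { ...process.env };

// Fast path: an ordinary property read on a plain object.
function getEnv(name) {
  return snapshot[name];
}

// Explicit opt-in to pick up changes made since the last snapshot.
function refreshEnv() {
  snapshot = { ...process.env };
}
```

Callers that never need live values pay nothing extra; callers that do must remember to call refreshEnv(), which is the behavioral change that would make this semver-major in core.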

Fishrock123 · 2015-12-02T21:00:18Z

@STRML could you detail what problems you are having caching it locally?

Fishrock123 · 2015-12-02T21:03:19Z

@STRML Another option, make a module that does this and load it in -r/--require:

const env = process.env
delete process.env // might not be necessary
process.env = env

STRML · 2015-12-02T21:50:10Z

I've been doing:

process.env = JSON.parse(JSON.stringify(process.env));

Which seems to work fine.

Okay - so the issue with a library like React caching locally is that there are three main usage paths and two goals.

The goals:

1. Eliminate the runtime penalty from accessing process.env in Node environments.
2. Successfully optimize out dev/debug flags so entire branches can be removed during a minification step.

As a library author, you usually deliver pre-bundled source (dist.js, dist.min.js), built source (say, with babel, in a lib/ folder), and sometimes pre-built source (ES2015 etc).

There are three main usage paths:

Path 1: Pre-bundled source in the browser (or even via require('mylibrary/dist/lib.min.js')). This is the easiest; we can easily stub process.env.NODE_ENV or __DEV__ flags with constants in our own build steps. Goals 1 and 2 are easily attained.
Path 2: Inside NodeJS. We need a build step from src/ to lib/ that replaces __DEV__ flags or accesses to process.env.NODE_ENV with a cached value, so process.env.NODE_ENV is not accessed in hot functions. That's what this PR does. You can do the same by simply hoisting the access yourself in smaller projects. Goal 1 can be attained with some fiddling; goal 2 is irrelevant.
Path 3: Through browserify/webpack by bundling source. This presents a problem in combination with Path 2 because we've just hoisted our env accesses, so the minifier needs to be smart enough to still be able to do dead code elimination. UglifyJS is smart enough to do this, but only if you use a const declaration, which is not portable to < IE11, so it's not viable in most situations. It will not be able to figure out:

// This would work if it were a const
var NODE_ENV = process.env.NODE_ENV;
//...
if(NODE_ENV !== 'production') { // this will not be eliminated
  // debug code
}

Closure compiler has a /** @const */ comment which works. I want to port this to UglifyJS which would solve the problem.

As a result, if we solve Path 2 (NodeJS), we produce larger bundles for Path 3 (Webpack/Browserify) with UglifyJS. In the case of React, the bundle can be as much as 40% larger gzipped, so it's very significant (38 vs 53kb).

If process.env didn't have this performance penalty, then we could just go on writing if (process.env.NODE_ENV !== 'production'), as most library authors already do, envify and Uglify can handle this, and there'd be no problem.
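To make the dead-code-elimination point concrete, here is a small sketch of the inline form before and after a build-time replacement such as envify performs. The function names are made up, and the "after" version is written by hand here to show what the minifier gets to see; it is not actual tool output.

```javascript
// What the library author writes: the env access sits directly in the condition.
function modeLabel() {
  if (process.env.NODE_ENV !== 'production') {
    return 'debug';
  }
  return 'prod';
}

// Hand-written illustration of the same function after process.env.NODE_ENV
// has been replaced with the literal 'production' at build time. The condition
// is now a constant false, so a minifier can drop the whole branch.
function modeLabelAfterReplace() {
  if ('production' !== 'production') {
    return 'debug';
  }
  return 'prod';
}
```

With the hoisted `var NODE_ENV = process.env.NODE_ENV` form shown earlier, the replacement still happens, but the condition compares a variable rather than two literals, which is why the branch survives.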

vkurchatkin · 2015-12-02T22:17:26Z

Is there a real reason to use env for this? You can use global.__REACT_ENV the same way

STRML · 2015-12-02T22:21:52Z

It's possible. It appears to be the attitude of the devs that they want to avoid polluting globals whenever possible.

Despite that, this is a more widespread problem that doesn't just affect React.

jtlapp · 2016-06-01T15:40:25Z

The performance penalty is still not mentioned in the docs. On the contrary, because changes don't affect the shell, the docs seem to imply that env is just like any other JS object.

bnoordhuis · 2016-06-01T15:50:17Z

> because changes don't affect the shell

They do, child processes created with the child_process module inherit the modified environment unless told otherwise.

Are you thinking of seeing changes to process.env reflected in the parent shell? That's not how setenv(3) and friends work. No program works like that.
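A quick way to check the inheritance behavior described above: set a variable on process.env, then start a child Node.js process without passing an env option and have it print the variable. GR_DEMO_VAR is a made-up name for this sketch.

```javascript
const { execFileSync } = require('child_process');

// Modify the environment of the current process.
process.env.GR_DEMO_VAR = 'set-by-parent';

// No `env` option is given, so the child starts with the parent's
// current environment, including the assignment above.
const childOutput = execFileSync(
  process.execPath,
  ['-p', 'process.env.GR_DEMO_VAR'],
  { encoding: 'utf8' }
).trim();

console.log(childOutput); // set-by-parent
```

The shell that launched the parent never sees GR_DEMO_VAR; only processes created afterwards by this process do.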

jtlapp · 2016-06-01T15:55:06Z

Sorry, I'm not much of a shell programmer. I just would rather not have to discover the performance penalty when it's a problem. Why can't the docs say reading process.env makes system calls (if that's what it's doing)? Why do we have to discover that process.env is not a normal JS object the hard way?

bnoordhuis · 2016-06-01T16:01:55Z

We accept pull requests, you're welcome to have a stab at improving the documentation.

Having said that, any JS object can be "magic" and really be a bunch of C++ code; process.env is not special in that respect.
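The "magic object" point can be shown in plain JavaScript, without any C++. In this sketch a Proxy runs arbitrary code on every property read, even though the access looks like an ordinary lookup. The names magic and reads are invented for the example, and this is not how process.env is implemented; it only illustrates that property syntax says nothing about cost.

```javascript
let reads = 0;

// Every property read on `magic` runs the `get` trap below.
const magic = new Proxy({}, {
  get(target, key) {
    reads += 1; // stand-in for arbitrary, possibly expensive, work
    return `computed:${String(key)}`;
  },
});

const first = magic.anything;
const second = magic.anything;

console.log(first, second, reads); // computed:anything computed:anything 2
```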

jtlapp · 2016-06-01T20:46:16Z

I'd have to investigate the node.js code to have confidence making a statement. Several people in this discussion are already claiming to know how it works. I'm hoping that someone who wants to be helpful wouldn't mind taking a minute to update the repo they already have checked out.

Just a small documentation update, if you're open to it. Suggesting that `NODE_ENV` be "cached" in the example `unsafeValidate` method. Reasoning / discussions here:

* nodejs/node#3104
* nodejs/node#1648

Unless you'd expect `NODE_ENV` to be editable while the application is running (debugging on a live server, perhaps?)

JDiPierro · 2018-08-20T19:51:24Z

It looks like Node is caching process.env nowadays:

⌚ 15:46:00
$ export FOO=one

⌚ 15:46:07
$ export BAR=two

⌚ 15:46:09
$ export BAZ=three

⌚ 15:46:12
$ node
> console.time("logEnv"); console.log(process.env.FOO); console.timeEnd("logEnv");
one
logEnv: 0.849ms
undefined

> console.time("logEnv"); console.log(process.env.BAR); console.timeEnd("logEnv");
two
logEnv: 0.224ms
undefined

> console.time("logEnv"); console.log(process.env.BAZ); console.timeEnd("logEnv");
three
logEnv: 0.076ms
undefined

> console.time("logEnv"); console.log(process.env.FOO); console.timeEnd("logEnv");
one
logEnv: 0.085ms
undefined

> console.time("logEnv"); console.log(process.env.BAZ); console.timeEnd("logEnv");
three
logEnv: 0.077ms
undefined

> console.time("logEnv"); console.log(process.env.BAR); console.timeEnd("logEnv");
two
logEnv: 0.085ms
undefined

$ node --version
v10.6.0

a-stepanenko · 2018-09-20T23:19:22Z

I'd like to warn you that the previous comment is not correct. Node.js does not cache env vars, so each time you use process.env.VAR you actually call getenv(). This might be a performance issue if you have lots of env vars (this is especially important if you run your Node.js app in k8s, since k8s creates env vars for all endpoints in a namespace; see kubernetes/kubernetes#60099 for details).

Proof (running node.js which prints some env var in an endless loop, attaching to the node.js process via gdb and changing this var):

root@it100msk:~# nodejs -v
v10.11.0
root@it100msk:~# cat test.js
while (true) {
    console.log(process.env.var1);
    Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, 3000);
}
root@it100msk:~# export var1=1
root@it100msk:~# nodejs test.js
1
1
1
1
1

another console on the same host
root@it100msk:~# ps auxf | grep node
root     20226  0.0  0.0  15648  1032 pts/2    S+   02:16   0:00  |       \_ grep --color=auto node
root     20200  0.6  0.1 557612 30580 pts/1    Sl+  02:16   0:00          \_ nodejs test.js
root@it100msk:~# gdb
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) attach 20200
Attaching to process 20200
[New LWP 20201]
[New LWP 20202]
[New LWP 20203]
[New LWP 20204]
[New LWP 20205]
[New LWP 20206]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f8cc38b4ed9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7ffe46e9cd00, expected=0, futex_word=0x4498250) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
142	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) call putenv("var1=2")
$1 = 0
(gdb) detach
Detaching from program: /usr/bin/node, process 20200
(gdb) quit

back to the first console
2
2
2
2
2

Removes the 'log' export from 'insight'. Accessing process.env.XXX is relatively slow in Node.js. Benchmark of a plain object property access and accessing process.env.NODE_ENV:

```
property access       500000000 iterations     0 ns/op
process.env access      5000000 iterations   246 ns/op
```

See this thread nodejs/node#3104 for some more context
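For anyone who wants to reproduce a comparison like this on their own machine, here is a rough micro-benchmark sketch. It is not the harness that produced the numbers above; the helper nsPerOp and the iteration count are assumptions, and results will vary with Node.js version, platform, and the number of environment variables set.

```javascript
// Time `fn` over many iterations and report the average cost per call.
function nsPerOp(fn, iterations) {
  let sink = 0; // use the result so the loop body is not trivially removable
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    if (fn() === 'production') sink++;
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { nsPerOp: Number(elapsedNs) / iterations, sink };
}

const iterations = 1e6;
const plain = { NODE_ENV: process.env.NODE_ENV };

const plainResult = nsPerOp(() => plain.NODE_ENV, iterations);
const envResult = nsPerOp(() => process.env.NODE_ENV, iterations);

console.log(`plain object: ${plainResult.nsPerOp.toFixed(1)} ns/op`);
console.log(`process.env:  ${envResult.nsPerOp.toFixed(1)} ns/op`);
```

A single short run like this is noisy; repeat it a few times before drawing conclusions.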

Highlight user code in internal stacktrace.

Other Changes: Avoid to use process.env at runtime because it does an expensive syscall (CF: nodejs/node#3104 (comment))

Boyscout: export Global in a TS file to avoid the necessity of building the app to run it

AlexandriaOL · 2022-01-02T19:33:59Z

> Sorry, I'm not much of a shell programmer. I just would rather not have to discover the performance penalty when it's a problem. Why can't the docs say reading process.env makes system calls (if that's what it's doing)? Why do we have to discover that process.env is not a normal JS object the hard way?

I'm writing here just to clarify this fundamental misunderstanding of how UNIX Environments work, for people who read this in the future (I came to this from a thread on an internet forum).

The Shell Environment is defined in the SYSV ABI Specification, next to or in the same section where the auxiliary (auxv) and argument (argv) vectors are defined. Upon process creation, environments are copied from the parent process into a C array called envp, which logically follows the arguments vector. You can override this when you create a process using the execve system call (so called because it is exec with an arg(V) vector and an (E)nvp). Typically, programs pass their environment down to their children. Environments are an array of strings of the form NAME=VALUE.

Typically, because of its inherent immutability (disregarding setenv) and its stringly-typed nature, you will want to access this once at program start, and then never again. setenv (which sets an environment value) is probably the most useless function there is in the POSIX standard, because there are better ways to do whatever you are trying to do!

e.g. if you want to communicate an environment to your child, then pass an environment explicitly to your child; if you want to communicate to other threads, keep in mind that setenv is not Atomic and therefore you are likely to get a corrupted read at some point, and use a different (and safer) method of threadful communication; if you want to communicate to yourself, use a global variable -- they're free and come with type safety, too.

EDIT: An addendum. One typical example of wanting to use setenv is to do something like:

setenv("some_internal_value", to_json(some_data), 1);
(your fork/execvp code here)

and then in the child:

some_data = getenv("some_internal_value");

However, this is a security risk detailed in the CERT C Coding Guidelines (which are equally applicable in this instance! Probably one of the few instances where the semantics of Windows/Linux, C, POSIX, and JavaScript are all applicable!). https://wiki.sei.cmu.edu/confluence/display/c/ENV03-C.+Sanitize+the+environment+when+invoking+external+programs and https://wiki.sei.cmu.edu/confluence/display/c/ENV02-C.+Beware+of+multiple+environment+variables+with+the+same+effective+name :)
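In Node.js terms, "pass an environment explicitly to your child" maps onto the env option of the child_process functions. A minimal sketch, with GR_CHILD_ONLY as a made-up variable name: the child sees the value, while the parent's own process.env is never modified.

```javascript
const { execFileSync } = require('child_process');

// Build the child's environment explicitly instead of mutating process.env.
const childEnv = { ...process.env, GR_CHILD_ONLY: 'explicit' };

const seenByChild = execFileSync(
  process.execPath,
  ['-p', 'process.env.GR_CHILD_ONLY'],
  { env: childEnv, encoding: 'utf8' }
).trim();

console.log(seenByChild);               // explicit
console.log(process.env.GR_CHILD_ONLY); // undefined
```

Spreading process.env keeps the inherited variables; the CERT guidance linked above argues for going further and passing only the variables the child actually needs.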


ChALkeR added the process and feature request labels and removed the feature request label on Sep 28, 2015

Fishrock123 closed this as completed Oct 16, 2015

This was referenced Jan 19, 2016

Server rendering is slower with npm react facebook/react#812

Closed

Mark vars with /** @const */ pragma as consts so they can be eliminated. mishoo/UglifyJS#928

Merged

tjwebb mentioned this issue Apr 18, 2016

cache process.env trailsjs/trails#150

Closed

keyz mentioned this issue Jun 22, 2016

dist/react-dom.js cannot get React.__SECRET_DOM_DO_NOT_USE_OR_YOU_WILL_BE_FIRED facebook/react#7092

Closed

mAiNiNfEcTiOn mentioned this issue Dec 29, 2016

Store management with createStore frintjs/frint#71

Merged

ehg mentioned this issue Jan 17, 2017

Server: Use production react in the server bundle for prod Automattic/wp-calypso#10693

Merged

gerardmrk mentioned this issue Nov 7, 2017

Some of the performance items goldbergyoni/nodebestpractices#31

Closed

dvlsg mentioned this issue Nov 28, 2017

docs: cache NODE_ENV in unsafe validate example gcanti/io-ts#92

Merged

gregmartyn mentioned this issue Mar 28, 2018

RFC: process.env.RAZZLE_RUNTIME_XXXX jaredpalmer/razzle#528

Closed

richardlau mentioned this issue Sep 21, 2018

Node 10.10.0 is not caching env variables (slow "getenv") #22960

Closed

ArfatSalman mentioned this issue Oct 15, 2018

JWT pesto-students/project-delta#8

Merged

yepninja mentioned this issue Dec 22, 2018

Warning for mismatching env variables jaegertracing/jaeger-client-node#332

Merged

BorntraegerMarc mentioned this issue Mar 17, 2020

Feature request: Caching of env variables nestjs/config#121

Closed

vankop mentioned this issue Mar 29, 2020

feat(EnvironmentPlugin): improve performance, define all keys by default webpack/webpack#10637

Closed

thetutlage mentioned this issue Jul 6, 2020

Adding support for validations and cache in Env provider adonisjs/rfcs#24

Closed

dirkdev98 mentioned this issue Oct 29, 2020

stdlib: cache environment variables compasjs/compas#454

Merged

vladar mentioned this issue Dec 1, 2020

feature(gatsby): Add experiment to run source plugins in parallel gatsbyjs/gatsby#28214

Merged

DylanVann mentioned this issue Jan 14, 2021

chore: refactor environment variables graphile/starter#234

Closed

5 tasks

Aschen mentioned this issue Jan 27, 2021

Error stacktrace overhaul kuzzleio/kuzzle#1944

Merged

alexanderbartels mentioned this issue Mar 5, 2021

feat(core): lower case http headers, based on env variable. Fixes #311 marblejs/marble#318

Closed

didiercolens mentioned this issue Sep 7, 2021

performance impact of reading process.env.X motdotla/dotenv#562

Closed

lmolkova mentioned this issue Mar 11, 2022

[instrumentation] - Suppress tracing using environment variables Azure/azure-sdk-for-js#20776

Merged

3 tasks

pieh mentioned this issue May 19, 2022

feat(gatsby, gatsby-plugin-utils): add image cdn source urls to redux gatsbyjs/gatsby#35427

Merged

cskiwi added a commit to Badminton-Apps/badman that referenced this issue Oct 7, 2023

feat: cache the config for better performance

8fefa8d

see: nodejs/node#3104

polymath-eric mentioned this issue Aug 22, 2024

Send slack messages for new Business application or application put on hold PolymeshAssociation/cdd-onboarding#102

Merged

1 task
