Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot set up cluster.schedulingPolicy in ESM #49240

Open
alexfernandez opened this issue Aug 19, 2023 · 15 comments · May be fixed by #49292
Open

Cannot set up cluster.schedulingPolicy in ESM #49240

alexfernandez opened this issue Aug 19, 2023 · 15 comments · May be fixed by #49292
Labels
cluster Issues and PRs related to the cluster subsystem.

Comments

@alexfernandez
Copy link

alexfernandez commented Aug 19, 2023

Version

v18.17.1

Platform

Linux xxx 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

cluster

What steps will reproduce the bug?

Run the following code in Node.js:

import * as cluster from 'cluster'

if (cluster.isPrimary) { 
    process.env.NODE_CLUSTER_SCHED_POLICY = 'none'
    console.log(`process.env.NODE_CLUSTER_SCHED_POLICY: ${process.env.NODE_CLUSTER_SCHED_POLICY}`)
    for (let index = 0; index < 2; index++) { 
        cluster.fork()
        setTimeout(() => console.log(`cluster.schedulingPolicy: ${cluster.schedulingPolicy}`), 1000)
    }
} else { 
    setTimeout(() => null, 1000)
}

cluster-error.txt
Attached as cluster-error.txt. You will see that it prints the following:

$ node cluster-error.js 
process.env.NODE_CLUSTER_SCHED_POLICY: 'none'
cluster.schedulingPolicy: 2
cluster.schedulingPolicy: 2

So the env variable NODE_CLUSTER_SCHED_POLICY has no effect, cluster.schedulingPolicy has the value 2 which is the default cluster.SCHED_RR. However running it with the env variable from the console works as expected:

$ NODE_CLUSTER_SCHED_POLICY='none' node cluster-error.js 
process.env.NODE_CLUSTER_SCHED_POLICY: none
cluster.schedulingPolicy: 1
cluster.schedulingPolicy: 1

I cannot set the cluster.schedulingPolicy as per the docs: cluster.schedulingPolicy = cluster.SCHED_NONE since the cluster module is imported:

import * as cluster from 'cluster'
cluster.schedulingPolicy = cluster.SCHED_NONE

so I get a TypeError:

cluster.schedulingPolicy = cluster.SCHED_NONE
                         ^

TypeError: Cannot assign to read only property 'schedulingPolicy' of object '[object Module]'
    at file:///home/alex/projects/loadtest/cluster-error.js:2:26
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)

How often does it reproduce? Is there a required condition?

With the attached code it works (as in doesn't work) always.

What is the expected behavior? Why is that the expected behavior?

I expect to see the same output as with the env variable NODE_CLUSTER_SCHED_POLICY='none', i.e. the cluster.schedulingPolicy is set to 1 cluster.SCHED_NONE:

$ node cluster-error.js 
process.env.NODE_CLUSTER_SCHED_POLICY: none
cluster.schedulingPolicy: 1
cluster.schedulingPolicy: 1

(Fake output.)

What do you see instead?

Instead I see cluster.schedulingPolicy has the default value of 2 cluster.SCHED_RR:

$ node cluster-error.js 
process.env.NODE_CLUSTER_SCHED_POLICY: none
cluster.schedulingPolicy: 2
cluster.schedulingPolicy: 2

Additional information

I am the author of the loadtest package and I am converting it to multi-core. I really see a big difference of more than 30% when running the test server in the default round-robin mode and in the none mode I want to set:

  • cluster.SCHED_RR: 9600 rps.
  • cluster.SCHED_NONE: 12700 rps.

It is true that load on processors is a bit more uneven in none, but that doesn't seem to matter much. I want my lightning fast speed! ⚡

Besides, I think that the documentation should be updated because the recommended way of setting cluster.schedulingPolicy does not work. I would think that the best solution is to have a setting in cluster.settings that can be set with cluster.setupPrimary([settings]). As suggested by @bnoordhuis in this hopeful comment.

@alexfernandez
Copy link
Author

alexfernandez commented Aug 19, 2023

I found a nasty workaround: since imports are always hoisted to the top of the file, setting the env variable before the import does not work. But if I use a dynamic import it works as expected:

process.env.NODE_CLUSTER_SCHED_POLICY = 'none'
//import * as cluster from 'cluster'
const cluster = await import('cluster')

if (cluster.isPrimary) { 
    console.log(`process.env.NODE_CLUSTER_SCHED_POLICY: ${process.env.NODE_CLUSTER_SCHED_POLICY}`)
    for (let index = 0; index < 2; index++) { 
        cluster.fork()
        setTimeout(() => console.log(`cluster.schedulingPolicy: ${cluster.schedulingPolicy}`), 1000)
    } 
} else { 
    setTimeout(() => null, 1000)
}

This is quite tricky; but it works, so I leave it here for other people to find it. Also posted to Stack Overflow for AIs to harvest other performance freaks.

@aduh95 aduh95 changed the title Cannot set up cluster.schedulingPolicy in a JS module (with import) Cannot set up cluster.schedulingPolicy in ESM Aug 19, 2023
@aduh95
Copy link
Contributor

aduh95 commented Aug 19, 2023

Can you clarify if you are reporting a difference of behavior between CJS (with require("node:cluster")) vs ESM (with import "node:cluster")? Or are you reporting that the changes to process.env.NODE_CLUSTER_SCHED_POLICY are not taken into account after loading node:cluster on CJS and ESM?

@alexfernandez
Copy link
Author

alexfernandez commented Aug 19, 2023

To clarify, I have not reported anything at all about require() in CJS because I'm not using it, it should work as usual. I am just reporting that on ESM it is impossible to change cluster.schedulingPolicy with the usual import; in fact it works with a dynamic import(). So your new title is spot on.

@aduh95
Copy link
Contributor

aduh95 commented Aug 19, 2023

Could you test with require and report back? That would help understand what is the issue and how we can fix or at least document it.

@alexfernandez
Copy link
Author

alexfernandez commented Aug 19, 2023

Sure, it works as expected. For reference, the code is:

process.env.NODE_CLUSTER_SCHED_POLICY = 'none'

const cluster = require('cluster')
console.log(cluster)
//import * as cluster from 'cluster'

if (cluster.isPrimary) {
    console.log(`process.env.NODE_CLUSTER_SCHED_POLICY: ${process.env.NODE_CLUSTER_SCHED_POLICY}`)
    for (let index = 0; index < 2; index++) {
        cluster.fork()
        setTimeout(() => console.log(`cluster.schedulingPolicy: ${cluster.schedulingPolicy}`), 1000)
    }
} else {
    setTimeout(() => null, 1000)
}

The fix is trivial: adding schedulingPolicy to cluster.settings, I have actually done it in a couple of lines. The biggest effort is to change the docs. If you want I can prepare a PR.

@aduh95
Copy link
Contributor

aduh95 commented Aug 19, 2023

Sure, sending a PR would be fantastic :)

@alexfernandez
Copy link
Author

alexfernandez commented Aug 20, 2023

Some more info about possible solutions.

The proper solution seems to me to add schedulingPolicy to cluster.settings so that it can be set using cluster.setPrimary({schedulingPolicy: cluster.SCHED_NONE}). It can be implemented fairly easily.

When using ESM import you get back a read-only copy, so that neither cluster.schedulingPolicy nor cluster.settings will show any changes after setting schedulingPolicy. It will work with cluster.setPrimary() as shown above but there will be no way to test it. Moving further would be to implement getSettings() so that it returns a read-only copy of the present settings, and deprecate both cluster.schedulingPolicy and cluster.settings. This way a test can be implemented as shown below:

import '../common/index.mjs';
import assert from 'node:assert';
import * as cluster from 'cluster';


assert.strictEqual(cluster.schedulingPolicy, cluster.SCHED_RR);
cluster.setupPrimary({ schedulingPolicy: cluster.SCHED_NONE });
const settings = cluster.getSettings();
assert.strictEqual(settings.schedulingPolicy, cluster.SCHED_NONE);

I have a preliminary implementation ready that I can post for comments, only the documentation is missing. Is this going too far?

@aduh95
Copy link
Contributor

aduh95 commented Aug 20, 2023

Is this going too far?

I would have a hard time answering you without looking at the code – and even then, I have not worked on node:cluster so I would not be the right person to ask. IMO opening a draft PR is the logical next step, then we can get early feedback before you spend too much time working on the documentation.

@alexfernandez
Copy link
Author

Sounds good. I will wait for triage and more feedback before opening the draft PR if you don't mind; meanwhile this is the branch as a RFC, with three commits:

Thanks!

@benjamingr
Copy link
Member

@nodejs/cluster

@bnoordhuis
Copy link
Member

I don't think a new cluster.getSettings() public API method is necessary or desirable. An internal method in lib/internal/cluster is sufficient for testing purposes.

Supporting cluster.setupPrimary({schedulingPolicy}) seems fine to me. Deprecating cluster.schedulingPolicy or cluster.settings not so much, just add doc notes explaining their ESM mode shortcomings and how to work around that.

@alexfernandez
Copy link
Author

@bnoordhuis Thanks for the feedback!

I don't think a new cluster.getSettings() public API method is necessary or desirable. An internal method in lib/internal/cluster is sufficient for testing purposes.

Good for me. My concern was that in ESM there is no way to programmatically access cluster.schedulingPolicy or indeed cluster.settings, since once the cluster module is imported at the top its attributes will never change. If you see other ways to address it, or maybe think it's not necessary, then let's move on.

Supporting cluster.setupPrimary({schedulingPolicy}) seems fine to me. Deprecating cluster.schedulingPolicy or cluster.settings not so much, just add doc notes explaining their ESM mode shortcomings and how to work around that.

I will do that, it's also significantly easier to implement and document. As stated above I don't know how to work around read-only cluster.schedulingPolicy or cluster.settings so I will just document that they will not yield the desired results.

@bnoordhuis
Copy link
Member

I usually take a very reactive approach: don't do anything until users start filing bug reports :-)

alexfernandez added a commit to alexfernandez/node that referenced this issue Aug 20, 2023
@alexfernandez
Copy link
Author

Excuse my ignorance @bnoordhuis, is an "internal method" just undocumented? Or do I need to mark it with some special code? 🤔

@VoltrexKeyva VoltrexKeyva added the cluster Issues and PRs related to the cluster subsystem. label Aug 22, 2023
alexfernandez added a commit to alexfernandez/node that referenced this issue Aug 22, 2023
alexfernandez added a commit to alexfernandez/node that referenced this issue Aug 22, 2023
@alexfernandez alexfernandez linked a pull request Aug 22, 2023 that will close this issue
@alexfernandez
Copy link
Author

alexfernandez commented Aug 22, 2023

I sent this PR to move this issue forward: #49292. I'm not sure if it should be a notable PR, it definitely changes the external API. Hope everything is OK!

alexfernandez added a commit to alexfernandez/node that referenced this issue Nov 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cluster Issues and PRs related to the cluster subsystem.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants