RFC: Add std::env::concurrency_hint #985

alexcrichton · 2015-03-17T17:08:11Z

This new function will return a hint to the amount of concurrency the
underlying system can support.

This new function will return a **hint** to the amount of concurrency the underlying system can support.

mahkoh · 2015-03-17T17:46:09Z

A useless compromise for compromise's sake. It is not as abstract as the C++ function which allows the C++ function to be used in exotic environments, nor is the stdlib as a whole suitable for this, nor is it as specific as the information that it actually available on the supported systems.

comex · 2015-03-17T20:57:42Z

For the record, I've never liked the approach of testing the number of CPUs and spawning that number of threads. What if you have some 1024-core NUMA CPU (hardly unheard of) and you're running, say, 64 opportunistically concurrent processes? It would be much better for each to spawn 16 threads than for each to overwhelm the kernel with 1024 threads... That's an extreme example, but I'm not sure regular systems are that different if you're running a bunch of CPU intensive applications all making their own thread pools.

It would be better if the kernel globally coordinated the threads, which I think is what Grand Central Dispatch tries to do on OS X. If a higher level thread pool ever makes it into the standard library, it should consider using it where available. Of course, this PR isn't that, and high level libraries always have situations where they're inapplicable. So just take this comment as a measure of general dissatisfaction with the status quo.

sanxiyn · 2015-03-18T04:33:39Z

I like this.

@comex, serious users with NUMA and stuffs can use hwloc, which embeds insane amounts of knowledge and is not appropriate for the standard library. This function is not intended for such uses.

theemathas · 2015-03-19T05:24:37Z

What if we do not have any information about the amount of concurrency?

I believe that the correct type signature should be this:

pub fn concurrency_hint() -> Option<u32>;

sanxiyn · 2015-03-19T05:43:40Z

I think the correct design is to return 1 if there is no information, and allowing runtime override, for example, using environment variable, with name like RUST_CONCURRENCY.

Someone could even write a program setting RUST_CONCURRENCY using information from hwloc.

reem · 2015-03-19T16:17:52Z

This sounds not very useful, in my opinion. I understand the need for stabilizing only conservative behavior, but for something like this it might as well just live in a cargo package and grow independently.

Ericson2314 · 2015-03-19T21:29:43Z

-1 for reasons given. Too simple to be useful for really low level code, and too crude for a good high-level abstraction, it's an "anti-Goldilocks". Such abstractions just tempt people to do the wrong thing.

tshepang · 2015-03-19T22:01:30Z

People are going to expect this from libstd, and the functionality is too simple to have to create a whole external crate for it.

alexcrichton · 2015-03-19T22:07:58Z

@theemathas

What if we do not have any information about the amount of concurrency?

I feel that all users of this function would use .unwrap_or(1) in the instance that None was returned, however. For example this function is largely intended to serve as simply a hint, not a rigorous statistic, so if None is returned there's nothing you can really do other than assume 1 anyway.

@Ericson2314

Too simple to be useful for really low level code, and too crude for a good high-level abstraction, it's an "anti-Goldilocks".

I agree that this is not useful for low level code, but the documentation is intended to strongly word that. The idea is that the return value is very strongly just a hint, no concrete statistic about the underlying system (hence the inappropriateness for low-level code which wants this sort of precise knowledge).

I'm not sure I agree that this is too crude, however. For example cargo wants to spin off some number of builds in parallel, but how does it determine what number of builds to spin off? Is using the number returned from this function wrong?

Ericson2314 · 2015-03-19T23:51:01Z

@alexcrichton To come clean, I had previous thought about peril of weak and leaky abstractions, but not include get_num_cpus among them. But the comments convinced me.

Looking at your cargo example, I am inclined to say it is indeed wrong. Realistically, better methods will take a good deal of developer effort, so no hard feelings if Cargo keeps on using it :). Consider that it doesn't reflect the current load of the system, and precludes the OS from scheduling more cargo-spawned processes when others are blocking. Fundamentally there is a tension between exposing the potential parallelism to the OS scheduler, and avoiding the overhead of many of processes---A classic granularity tradeoff. Num CPUs is but a crude heuristic, and a more high level way to expresses the dependency DAG, and somehow let the OS/third-party decide the best way to traverse it based on the dynamic characteristics of the system would be ideal, and work for distributed builds too.

Granted we do have convenience methods in e.g. read/writer that have pitfalls too, but at least there exist situations where those convenience methods are exactly what one wants. I can't think of a situation where Num CPUs is truly the ideal method, and not just a quick fix in liu of something harder to make but better.

carllerche · 2015-03-25T04:07:32Z

My vote is to leave this API in a dedicated crate and have it evolve as needed. There are many arguments in this thread and I agree with them.

alexcrichton · 2015-04-09T21:40:58Z

It doesn't look like there's enough consensus to add this at this time, so I'm going to close this for now. Thanks for the discussion everyone!

tshepang · 2015-04-09T23:21:51Z

I find concurrency_hint a great name even if overlong. It's certainly better than the ugly num_cpus which is now quite a popular crate. My favorite alternative is cpu_count, even though the cpu part is controversial.

Glitchy-Tozier · 2021-11-30T07:08:01Z

Does anyone know whether something like this has been implemented as of yet? I can't find a lot of information, except "use the num_cpus crate".

kennytm · 2021-11-30T07:59:53Z

@Glitchy-Tozier rust-lang/rust#74479

Glitchy-Tozier · 2021-11-30T09:31:02Z

@kennytm Thank you!

RFC: Add std::env::concurrency_hint

a036a83

This new function will return a **hint** to the amount of concurrency the underlying system can support.

alexcrichton self-assigned this Mar 17, 2015

alexcrichton closed this Apr 9, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Add std::env::concurrency_hint #985

RFC: Add std::env::concurrency_hint #985

alexcrichton commented Mar 17, 2015

mahkoh commented Mar 17, 2015

comex commented Mar 17, 2015

sanxiyn commented Mar 18, 2015

theemathas commented Mar 19, 2015

sanxiyn commented Mar 19, 2015

reem commented Mar 19, 2015

Ericson2314 commented Mar 19, 2015

tshepang commented Mar 19, 2015

alexcrichton commented Mar 19, 2015

Ericson2314 commented Mar 19, 2015

carllerche commented Mar 25, 2015

alexcrichton commented Apr 9, 2015

tshepang commented Apr 9, 2015

Glitchy-Tozier commented Nov 30, 2021

kennytm commented Nov 30, 2021

Glitchy-Tozier commented Nov 30, 2021

RFC: Add std::env::concurrency_hint #985

RFC: Add std::env::concurrency_hint #985

Conversation

alexcrichton commented Mar 17, 2015

mahkoh commented Mar 17, 2015

comex commented Mar 17, 2015

sanxiyn commented Mar 18, 2015

theemathas commented Mar 19, 2015

sanxiyn commented Mar 19, 2015

reem commented Mar 19, 2015

Ericson2314 commented Mar 19, 2015

tshepang commented Mar 19, 2015

alexcrichton commented Mar 19, 2015

Ericson2314 commented Mar 19, 2015

carllerche commented Mar 25, 2015

alexcrichton commented Apr 9, 2015

tshepang commented Apr 9, 2015

Glitchy-Tozier commented Nov 30, 2021

kennytm commented Nov 30, 2021

Glitchy-Tozier commented Nov 30, 2021