-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large structs constructed on stack #817
Comments
This will hopefully get better once we release Rand 0.7, which uses ChaCha20 for
|
We should add some Reusing stack space tightly sounds vital CPU cache performance. Yet, Rust code often returns large structures on the stack. We've RFCs for returning DSTs and VLAs that encourage this. We'll gain const generics this year, which should make methods like I'd therefore consider this an ABI bug or code generation bug in rustc. I'd expect the caller to pass the called method a pointer to memory where it may construct the returned type, but even if it simply returned the value on the stack, then you'd double, or quadruple in this case, you cache efficiency by doing an overlaping copy instead of the non-overlaping copy they obviously do. Can you create an issue on https://github.com/rust-lang/rust/ ? |
Agreed; I'll add to #815. Without testing the full application, I can only guess where these will be most effective however. |
We declare methods `#[inline(always)]` if they return a 200 byte Transcript because rustc does not handle large returns as efficently as one might like. See rust-random/rand#817
OT: Is there any "hot spot" analyzer for rust code? Or LLVM? You'd put tests into it and it identifies better inlining or whatever? |
There are no I've no idea about inlining methods from |
The At any rate, benchmarks are either a little better or similar, so I think this is going in the right direction. |
Thanks all! That was a swift response! I agree that the best solution would be some sort of language support (e.g. move elision/optimization or emplacement construction), but that doesn't seem to be a near-term thing, and I think it is already a recognized problem (e.g. there have been emplacement RFCs, and this, and others). @vks Thanks for the suggestions
I will give that a try. I will also try after the release and see...
This doesn't solve the problem because the struct is created on the stack first, and then copied to the heap.
Unfortunately, I need a crypto RNG :/ |
ChaCha only needs 136 bytes, so it hopefully works for you. We could probably use the Gimli permutation as a RNG, which only has 48 bytes of state, but this is not yet implemented. |
It's an interesting point that |
Good point @burdges but somewhat off-topic. I'll make those releases... |
This is exactly what it happens, but for soundness reasons, a copy is performed in the body of the function. (oops, I forgot to hit send 4 days ago...) |
@eddyb are there any plans to implement NRVO in rustc? would that be a breaking change? or alternately, do you think an explicit syntactic method of invoking NRVO would be better (e.g. some attribute)? |
Implementation/enablement of optimisations is, by the definition of an optimisation, not a breaking change. |
@mark-i-m Yes, see rust-lang/rust#47954 (which I first tried to get into Rust last year), it's mostly a matter of adding better debuginfo support (rust-lang/rust#56231, which I was working earlier this year before a few other things came up) before we can start doing that sort of optimizations (and IMO the need for debuginfo support would also apply if it were more manual). And no, it wouldn't be a breaking change, as @nagisa pointed out. LLVM already can technically do it sometimes, it's just not as aggressive as we'd need it to be for Rust. |
@mark-i-m please test the new I think then we can close this? |
@dhardy Thanks! Unfortunately, I'm not able to build 😕 It looks like the most recent build doesn't pass a $ make runtext
make -C kernel
make[1]: Entering directory '/nobackup/os2/kernel'
gcc -m64 -O0 -g -c -o start.o start.S
gcc -m64 -O0 -g -c -o mbr.o mbr.S
cargo xbuild -Z unstable-options --out-dir `pwd` --target x86_64-unknown-elf.json
Updating crates.io index
Compiling core v0.0.0 (/nobackup/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore)
Compiling compiler_builtins v0.1.16
Compiling rustc-std-workspace-core v1.0.0 (/nobackup/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/tools/rustc-std-workspace-core)
Compiling alloc v0.0.0 (/tmp/xargo.0gpCipQ7XgVt)
Finished release [optimized] target(s) in 21.57s
Compiling semver-parser v0.7.0
Compiling cc v1.0.35
Compiling autocfg v0.1.2
Compiling ppv-lite86 v0.2.5
Compiling lazy_static v1.3.0
error[E0463]: can't find crate for `std`
--> /nobackup/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/lazy_static-1.3.0/src/inline_lazy.rs:9:1
|
9 | extern crate std;
| ^^^^^^^^^^^^^^^^^ can't find crate
|
= note: the `x86_64-unknown-elf-1121863173107299811` target may not be installed
error: aborting due to previous error
For more information about this error, try `rustc --explain E0463`.
error: Could not compile `lazy_static`.
warning: build failed, waiting for other jobs to finish...
error: build failed
Makefile:44: recipe for target 'libkernel.a' failed
make[1]: *** [libkernel.a] Error 101
make[1]: Leaving directory '/nobackup/os2/kernel'
Makefile:8: recipe for target 'kernel' failed
make: *** [kernel] Error 2 and my [package]
name = "kernel"
version = "0.1.0"
authors = ["mark"]
[lib]
name = "kernel"
path = "lib.rs"
crate-type = ["staticlib"]
[profile.dev]
panic = "abort"
[profile.release]
panic = "abort"
[dependencies]
rlibc = "1.0.0"
spin = "0.4.8"
smallheap = { git = "https://github.com/mark-i-m/smallheap", features = ["no_std"] }
buddy = { git = "https://github.com/mark-i-m/buddy" }
x86_64 = "0.5.2"
os_bootinfo = "0.2.1"
rand = { version = "0.7.0-pre.1", default-features = false, features = ["alloc"] } |
You're right, we should depend on @kazcw how should we handle the |
@dhardy The option to disable @vks Chacha can be as light as 138 bytes, but a |
Unfortunately I think forwarding the Do we want to use the new Also, should we drop the |
@mark-i-m would you like to re-test with rand 0.7.1? I believe this should have solved the |
Hmm... I am getting the following with
I've never seen this error before. Some googling suggests that this is related to the target. My target file is:
EDIT: to clarify, I'm referring to the LLVM ERROR part. The unused function is on me. |
Is this issue still about large structs on the stack or is it about debugging your build? I tried reproducing but got the following, similar to your previous error.
However, I don't believe this has anything to do with the |
@dhardy yeah sorry about that. I suspect that it is probably fixed but i don't have a good way to check. Thanks for your help! |
It's an open rustc issue since 2014, not fixed yet. According to rust-lang/rust#32966 you should understand and try to improve upon rust-lang/rust#47954 if you really want this sooner rather than later. Yes, this issue should be closed in favor of those. |
I am trying to use
StdRng
in ano_std
context where the stack is rather small. I found that the stack is overflowed when the RNG is constructed.Basically,
rand_core::SeedableRng::seed_from_u64
calls<rand::rngs::std::StdRng as rand_core::SeedableRng>::from_seed
, which calls<rand_hc::hc128::Hc128Rng as rand_core::SeedableRng>::from_seed
, which callsBlockRng::<Hc128Core>::from_seed
... etcAnd each of these allocates a huge amount of space on the stack for the return value, which leads to a stack overflow. I imagine it is also quite slow on a normal platform too, since a lot of values will be copied on the stack.
TBH, I'm not really sure what the solution is. It seems like allocating the RNG on the heap is the right way to avoid all of this copying on the stack. On the other, perhaps marking some of the constructors
#[inline(always)]
would solve the problem?The text was updated successfully, but these errors were encountered: