perf: better cast create2 #6212
speed gud
crates/cast/bin/cmd/create2.rs
Outdated
let salt_word = unsafe { &mut *salt.0.as_mut_ptr().cast::<usize>() };
*salt_word = start;
It's not obvious what's going on here. This sets the salt's first word to the thread's index and then increments it by the number of threads, so the salt cycles through all unique values?
Needs docs and a SAFETY comment.
Does the salt use u64 internally? Should this cast to u64? Reading the B256 through a usize pointer might be misaligned, or maybe not, I'm not sure.
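On the alignment question: a byte array has alignment 1, so forming a `&mut usize` from its pointer is only sound if the address happens to be aligned for `usize`. A minimal sketch of a safe alternative follows, assuming the salt is a plain `[u8; 32]`. The helpers `set_salt_word` and `salt_word` are hypothetical names for illustration, not the PR's code; they copy the counter in and out as bytes instead of casting the pointer.

```rust
/// Hypothetical helper: overwrite the first 8 bytes of the salt with `word`.
/// A byte copy has no alignment requirement and needs no `unsafe`.
fn set_salt_word(salt: &mut [u8; 32], word: u64) {
    salt[..8].copy_from_slice(&word.to_ne_bytes());
}

/// Hypothetical helper: read the first 8 bytes of the salt back as a `u64`.
fn salt_word(salt: &[u8; 32]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&salt[..8]);
    u64::from_ne_bytes(buf)
}

fn main() {
    let mut salt = [0u8; 32];
    set_salt_word(&mut salt, 5);
    // The word round-trips, and the remaining 24 bytes are untouched.
    assert_eq!(salt_word(&salt), 5);
    assert!(salt[8..].iter().all(|&b| b == 0));
}
```

Using a fixed `u64` rather than `usize` also makes the covered range the same on 32-bit and 64-bit targets.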
n_threads=4
- (thread)i=0: word=0; word=4; word=8
- (thread)i=1: word=1; word=5; word=9
- (thread)i=2: word=2; word=6; word=10
- (thread)i=3: word=3; word=7; word=11
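The table above can be reproduced with a small sketch. It assumes each thread starts at its own index and advances by the thread count; `words_for_thread` is a hypothetical name used only for illustration.

```rust
/// First `steps` salt words visited by thread `i` when `n_threads` threads
/// each start at their own index and advance by `n_threads`.
/// `n_threads` must be non-zero (`step_by(0)` panics).
fn words_for_thread(i: usize, n_threads: usize, steps: usize) -> Vec<usize> {
    (i..).step_by(n_threads).take(steps).collect()
}

fn main() {
    let n_threads = 4;
    for i in 0..n_threads {
        println!("thread {}: {:?}", i, words_for_thread(i, n_threads, 3));
    }
}
```

Because every thread starts at a different residue modulo `n_threads`, no two threads ever visit the same word.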
haha create2 go brrrr
same as matt, just more docs would be nice!
Added
let salt_word = unsafe { &mut *salt.0.as_mut_ptr().cast::<usize>() };
*salt_word = start;

// Important: set the salt to the start value, otherwise all threads loop over the
The loop only covers a u64 or u32 depending on the arch, so not all of the available 32 bytes?
Is that fine?
>> u32::MAX / 128
33554431
>> u64::MAX / 128
144115188075855871
Those are the iterations needed to overflow; I don't think that's reachable.
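The arithmetic can be checked with a short sketch. It assumes 128 is the amount the counter advances per iteration (for example, the thread count); the thread does not say what 128 stands for, and `iterations_until_overflow` is a hypothetical name.

```rust
/// Iterations that fit before a counter with maximum value `max` wraps,
/// when every iteration advances the counter by `step` (must be non-zero).
fn iterations_until_overflow(max: u64, step: u64) -> u64 {
    max / step
}

fn main() {
    // 32-bit word
    println!("{}", iterations_until_overflow(u32::MAX as u64, 128)); // 33554431
    // 64-bit word
    println!("{}", iterations_until_overflow(u64::MAX, 128)); // 144115188075855871
}
```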
Motivation
Speed
Solution
Make faster