-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
History id start #391
History id start #391
Conversation
The OnceLock solution was the one below, it's now replaced with Once, to be 1.64 compatible. pub(crate) fn get_id() -> u32 {
static ID_CTR: OnceLock<AtomicU32> = OnceLock::new();
let atomic = ID_CTR.get_or_init(|| {
// Use the most significant 32 bits of `now`'s mantissa and reverse it.
// The time 1696926809097.0 is stored as something like 0x1.8b18b83c09000 * 2^0x28,
// and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed).
// This way the numbers change significantly when a little time is passed.
// It's not a perfect solutions but should reduce collisions significantly.
let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits();
// Using a high initial value is not an issue as `fetch_add` does wrap around.
AtomicU32::new(start)
});
atomic.fetch_add(1, Ordering::SeqCst)
} |
Can you add a test case for this? |
Unfortunately I can't add an test for it since the id is private (on purpose) and it's a static variable and thus it's not possible to start it twice and check it. (Or did I miss some kind of test.) |
You can add an integration test that ensures the bug described in #369 is fixed |
I'm sorry but I don't know how to do:
|
crates/history/src/utils.rs
Outdated
// and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed). | ||
// This way the numbers change significantly when a little time is passed. | ||
// It's not a perfect solutions but should reduce collisions significantly. | ||
let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not use JavaScript APIs to seed the counter because MemoryHistory
can be used in SSR (non-WebAssembly targets) as well.
I think it is better to use getrandom as we don't have to guarantee that an Id is u32
or sequential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can split it and let MemoryHistory use the old (and for it correct behavior) and only use this for BrowserHistory.
Or use gertandom for both or just BrowserHistory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use getrandom
for wasm32
and existing behaviour for other platforms as I feel it is safer to deal with collision using a cryptographically secure entropy. Having MemoryHistory
to use sequential slots whilst other histories use random slots might increase the chance of collision. We compare the id of 2 locations to determine of whether 2 locations are the same. So it is better to also protect against collision across history types.
getrandom
uses window.crypto.getRandomValues
on WebAssembly, which should have minimal bundle size increase for Web targets and performance is not a concern (as that is usually limited by how many clicks a user can do).
The current behaviour should be significantly faster than getrandom
(and rand::thread_rng().fill
) for SSR, and cannot be restored by browser so we can keep the existing behaviour for other platforms.
We can make location id to be an opaque type such as: struct LocationId([u8; 8])
as u64
is outside of JavaScript's safe integer range.
I think u64 or larger is better to prevent collision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first step I created a new version which uses getrandom on wasm and leaves it unchanged on all other platforms.
As for the collision: getrandom is still only used to pick the starting number and then increase by one, which removes collision within the lifespan of the app, which is a good idea.
Which is why I thought to depend on reversed time, that way it changes a lot when little time has passed.
Should I now change the result of get_id from u32
into struct LocationId([u8; 8])
, and thus the inner counter from u32 ti u64? Also in the case of BrowserHistory it's used more as an StateId
and in a HashMap<u32, Rc<dyn Any>>
and the LocationId/StateId
is stored in javascript via serde_wasm_bindgen::to_value
. While MemoryHistory uses it only for tracking a location within rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to keep it as u32 for now, so we can publish a patch release to fix the issue and then make this breaking change in the next minor release.
static ID_CTR: AtomicU32 = AtomicU32::new(0); | ||
static INIT: std::sync::Once = std::sync::Once::new(); | ||
|
||
INIT.call_once(|| { | ||
let mut start: [u8; 4] = [0; 4]; | ||
// If it fails then the start is not or only partly filled. | ||
// But since this method should not fail, we take what we get. | ||
let _ = getrandom::getrandom(&mut start); | ||
// Using a high initial value is not an issue as `fetch_add` does wrap around. | ||
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst); | ||
}); | ||
|
||
ID_CTR.fetch_add(1, Ordering::SeqCst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static ID_CTR: AtomicU32 = AtomicU32::new(0); | |
static INIT: std::sync::Once = std::sync::Once::new(); | |
INIT.call_once(|| { | |
let mut start: [u8; 4] = [0; 4]; | |
// If it fails then the start is not or only partly filled. | |
// But since this method should not fail, we take what we get. | |
let _ = getrandom::getrandom(&mut start); | |
// Using a high initial value is not an issue as `fetch_add` does wrap around. | |
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst); | |
}); | |
ID_CTR.fetch_add(1, Ordering::SeqCst) | |
let mut start: [u8; 4] = [0; 4]; | |
let _ = getrandom::getrandom(&mut start); | |
u32::from_ne_bytes(start) |
I think we can omit the initialisation logic and get a random value as id every time.
This might help us to save bundle size and I feel this might be safer.
However, the 2 methods should have the same probability for the scenarios I can come up with.
So I am fine with the current method as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).
So I'll leave it at the current method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).
2^32
is only the best case scenario, where it only happens if the start id is assigned the last id + 1.
Suppose 64
is selected for a start id with 10 navigation, then ids 64~73
are given out.
For another page to collide with currently assigned id within 10 navigations, the start id needs to be within 55~73
.
Suppose the id is u8
, which will make the calculation easier.
Assume the start id is selected based on 1/256
for every possibility, then the probability is (73 - 55 + 1 (both ends are inclusive) + 1 (assigning 11th id for the second navigation)) / 256
, which is 20 / 256
.
If every id is random, then for each unique id given out, the likelihood of collision increases by 1/256
. After 20 ids are assigned, the 21st id is 20/256
.
So the probability is the same.
crates/history/src/utils.rs
Outdated
// and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed). | ||
// This way the numbers change significantly when a little time is passed. | ||
// It's not a perfect solutions but should reduce collisions significantly. | ||
let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to keep it as u32 for now, so we can publish a patch release to fix the issue and then make this breaking change in the next minor release.
crates/history/Cargo.toml
Outdated
@@ -24,6 +24,9 @@ thiserror = { version = "1.0", optional = true } | |||
version = "0.3" | |||
features = ["History", "Window", "Location", "Url"] | |||
|
|||
[target.'cfg(target_arch = "wasm32")'.dependencies] | |||
getrandom = {version="0.2.10", features = ["js"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getrandom = {version="0.2.10", features = ["js"]} | |
getrandom = { version = "0.2.10", features = ["js"] } |
Should have spacing like other dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love it when cargo fmt would format also this. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM.
static ID_CTR: AtomicU32 = AtomicU32::new(0); | ||
static INIT: std::sync::Once = std::sync::Once::new(); | ||
|
||
INIT.call_once(|| { | ||
let mut start: [u8; 4] = [0; 4]; | ||
// If it fails then the start is not or only partly filled. | ||
// But since this method should not fail, we take what we get. | ||
let _ = getrandom::getrandom(&mut start); | ||
// Using a high initial value is not an issue as `fetch_add` does wrap around. | ||
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst); | ||
}); | ||
|
||
ID_CTR.fetch_add(1, Ordering::SeqCst) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).
2^32
is only the best case scenario, where it only happens if the start id is assigned the last id + 1.
Suppose 64
is selected for a start id with 10 navigation, then ids 64~73
are given out.
For another page to collide with currently assigned id within 10 navigations, the start id needs to be within 55~73
.
Suppose the id is u8
, which will make the calculation easier.
Assume the start id is selected based on 1/256
for every possibility, then the probability is (73 - 55 + 1 (both ends are inclusive) + 1 (assigning 11th id for the second navigation)) / 256
, which is 20 / 256
.
If every id is random, then for each unique id given out, the likelihood of collision increases by 1/256
. After 20 ids are assigned, the 21st id is 20/256
.
So the probability is the same.
This should fix #369 most of the time.
Without any changes the Cargo.lock changed after building, that's in the first commit (I can squish it into one).
Using window.performance.now() is not an option because it starts the time when the website loads and thats why I've added js-sys as a dependency.
Everything else should be documented in the code.
Instead of OnceLock it's also possible to use the unprotected AtomicU32 as before and a Once to initialize it.
Edit: the option with Once is available with 1.64 - which I didn't realize that it has/should be compatible with.