Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

History id start #391

Merged
merged 2 commits into from
Oct 14, 2023
Merged

History id start #391

merged 2 commits into from
Oct 14, 2023

Conversation

alexkazik
Copy link
Contributor

@alexkazik alexkazik commented Oct 10, 2023

This should fix #369 most of the time.

Without any changes the Cargo.lock changed after building, that's in the first commit (I can squish it into one).

Using window.performance.now() is not an option because it starts the time when the website loads and thats why I've added js-sys as a dependency.

Everything else should be documented in the code.

Instead of OnceLock it's also possible to use the unprotected AtomicU32 as before and a Once to initialize it.

Edit: the option with Once is available with 1.64 - which I didn't realize that it has/should be compatible with.

@alexkazik
Copy link
Contributor Author

alexkazik commented Oct 10, 2023

On the first try all 1.64 checks failed due to OnceLock, but all others succeeded - I don't know why Test gloo-net (stable) is now failing. (Edit: it seems to work now)

The OnceLock solution was the one below, it's now replaced with Once, to be 1.64 compatible.

pub(crate) fn get_id() -> u32 {
    static ID_CTR: OnceLock<AtomicU32> = OnceLock::new();

    let atomic = ID_CTR.get_or_init(|| {
        // Use the most significant 32 bits of `now`'s mantissa and reverse it.
        // The time 1696926809097.0 is stored as something like 0x1.8b18b83c09000 * 2^0x28,
        // and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed).
        // This way the numbers change significantly when a little time is passed.
        // It's not a perfect solutions but should reduce collisions significantly.
        let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits();
        // Using a high initial value is not an issue as `fetch_add` does wrap around.
        AtomicU32::new(start)
    });

    atomic.fetch_add(1, Ordering::SeqCst)
}

@ranile ranile requested a review from futursolo October 11, 2023 09:44
@ranile
Copy link
Collaborator

ranile commented Oct 11, 2023

Can you add a test case for this?

@alexkazik
Copy link
Contributor Author

Unfortunately I can't add an test for it since the id is private (on purpose) and it's a static variable and thus it's not possible to start it twice and check it. (Or did I miss some kind of test.)

@ranile
Copy link
Collaborator

ranile commented Oct 11, 2023

You can add an integration test that ensures the bug described in #369 is fixed

@alexkazik
Copy link
Contributor Author

I'm sorry but I don't know how to do:

  • replace_with_state (for current page)
  • push_with_state
  • reload the app (and thus resetting the id counter)
  • back
  • ensure that the state is now None

// and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed).
// This way the numbers change significantly when a little time is passed.
// It's not a perfect solutions but should reduce collisions significantly.
let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use JavaScript APIs to seed the counter because MemoryHistory can be used in SSR (non-WebAssembly targets) as well.

I think it is better to use getrandom as we don't have to guarantee that an Id is u32 or sequential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can split it and let MemoryHistory use the old (and for it correct behavior) and only use this for BrowserHistory.
Or use gertandom for both or just BrowserHistory.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use getrandom for wasm32 and existing behaviour for other platforms as I feel it is safer to deal with collision using a cryptographically secure entropy. Having MemoryHistory to use sequential slots whilst other histories use random slots might increase the chance of collision. We compare the id of 2 locations to determine of whether 2 locations are the same. So it is better to also protect against collision across history types.

getrandom uses window.crypto.getRandomValues on WebAssembly, which should have minimal bundle size increase for Web targets and performance is not a concern (as that is usually limited by how many clicks a user can do).

The current behaviour should be significantly faster than getrandom (and rand::thread_rng().fill) for SSR, and cannot be restored by browser so we can keep the existing behaviour for other platforms.

We can make location id to be an opaque type such as: struct LocationId([u8; 8]) as u64 is outside of JavaScript's safe integer range.
I think u64 or larger is better to prevent collision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a first step I created a new version which uses getrandom on wasm and leaves it unchanged on all other platforms.

As for the collision: getrandom is still only used to pick the starting number and then increase by one, which removes collision within the lifespan of the app, which is a good idea.
Which is why I thought to depend on reversed time, that way it changes a lot when little time has passed.

Should I now change the result of get_id from u32 into struct LocationId([u8; 8]), and thus the inner counter from u32 ti u64? Also in the case of BrowserHistory it's used more as an StateId and in a HashMap<u32, Rc<dyn Any>> and the LocationId/StateId is stored in javascript via serde_wasm_bindgen::to_value. While MemoryHistory uses it only for tracking a location within rust.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to keep it as u32 for now, so we can publish a patch release to fix the issue and then make this breaking change in the next minor release.

Comment on lines +16 to +28
static ID_CTR: AtomicU32 = AtomicU32::new(0);
static INIT: std::sync::Once = std::sync::Once::new();

INIT.call_once(|| {
let mut start: [u8; 4] = [0; 4];
// If it fails then the start is not or only partly filled.
// But since this method should not fail, we take what we get.
let _ = getrandom::getrandom(&mut start);
// Using a high initial value is not an issue as `fetch_add` does wrap around.
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst);
});

ID_CTR.fetch_add(1, Ordering::SeqCst)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static ID_CTR: AtomicU32 = AtomicU32::new(0);
static INIT: std::sync::Once = std::sync::Once::new();
INIT.call_once(|| {
let mut start: [u8; 4] = [0; 4];
// If it fails then the start is not or only partly filled.
// But since this method should not fail, we take what we get.
let _ = getrandom::getrandom(&mut start);
// Using a high initial value is not an issue as `fetch_add` does wrap around.
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst);
});
ID_CTR.fetch_add(1, Ordering::SeqCst)
let mut start: [u8; 4] = [0; 4];
let _ = getrandom::getrandom(&mut start);
u32::from_ne_bytes(start)

I think we can omit the initialisation logic and get a random value as id every time.
This might help us to save bundle size and I feel this might be safer.

However, the 2 methods should have the same probability for the scenarios I can come up with.
So I am fine with the current method as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).

So I'll leave it at the current method.

Copy link
Collaborator

@futursolo futursolo Oct 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).

2^32 is only the best case scenario, where it only happens if the start id is assigned the last id + 1.

Suppose 64 is selected for a start id with 10 navigation, then ids 64~73 are given out.

For another page to collide with currently assigned id within 10 navigations, the start id needs to be within 55~73.

Suppose the id is u8, which will make the calculation easier.
Assume the start id is selected based on 1/256 for every possibility, then the probability is (73 - 55 + 1 (both ends are inclusive) + 1 (assigning 11th id for the second navigation)) / 256, which is 20 / 256.

If every id is random, then for each unique id given out, the likelihood of collision increases by 1/256. After 20 ids are assigned, the 21st id is 20/256.

So the probability is the same.

// and the start value for it is 0x3c1d18d1 (0x8b18b83c reversed).
// This way the numbers change significantly when a little time is passed.
// It's not a perfect solutions but should reduce collisions significantly.
let start = ((js_sys::Date::now().to_bits() >> (52 - 32)) as u32).reverse_bits();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to keep it as u32 for now, so we can publish a patch release to fix the issue and then make this breaking change in the next minor release.

@@ -24,6 +24,9 @@ thiserror = { version = "1.0", optional = true }
version = "0.3"
features = ["History", "Window", "Location", "Url"]

[target.'cfg(target_arch = "wasm32")'.dependencies]
getrandom = {version="0.2.10", features = ["js"]}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
getrandom = {version="0.2.10", features = ["js"]}
getrandom = { version = "0.2.10", features = ["js"] }

Should have spacing like other dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love it when cargo fmt would format also this. Fixed.

Copy link
Collaborator

@futursolo futursolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM.

Comment on lines +16 to +28
static ID_CTR: AtomicU32 = AtomicU32::new(0);
static INIT: std::sync::Once = std::sync::Once::new();

INIT.call_once(|| {
let mut start: [u8; 4] = [0; 4];
// If it fails then the start is not or only partly filled.
// But since this method should not fail, we take what we get.
let _ = getrandom::getrandom(&mut start);
// Using a high initial value is not an issue as `fetch_add` does wrap around.
ID_CTR.store(u32::from_ne_bytes(start), Ordering::SeqCst);
});

ID_CTR.fetch_add(1, Ordering::SeqCst)
Copy link
Collaborator

@futursolo futursolo Oct 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to use a random value each time because then there could be a collision, which is currently impossible (unless 2^32 id's are generated).

2^32 is only the best case scenario, where it only happens if the start id is assigned the last id + 1.

Suppose 64 is selected for a start id with 10 navigation, then ids 64~73 are given out.

For another page to collide with currently assigned id within 10 navigations, the start id needs to be within 55~73.

Suppose the id is u8, which will make the calculation easier.
Assume the start id is selected based on 1/256 for every possibility, then the probability is (73 - 55 + 1 (both ends are inclusive) + 1 (assigning 11th id for the second navigation)) / 256, which is 20 / 256.

If every id is random, then for each unique id given out, the likelihood of collision increases by 1/256. After 20 ids are assigned, the 21st id is 20/256.

So the probability is the same.

@futursolo futursolo merged commit 5eb33f3 into rustwasm:master Oct 14, 2023
17 checks passed
ranile pushed a commit that referenced this pull request Oct 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[history] BrowserHistory: Loaded wrong state
3 participants