Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: 'force_le' feature flag? #1179

Closed
dhardy opened this issue Sep 11, 2021 · 5 comments
Closed

Suggestion: 'force_le' feature flag? #1179

dhardy opened this issue Sep 11, 2021 · 5 comments

Comments

@dhardy
Copy link
Member

dhardy commented Sep 11, 2021

For the sake of portable results, we force the use of LE on byte-int and int-byte conversions. This isn't always desired, but to avoid multiple code-paths we do it anyway.

However, by using a feature flag, we don't need multiple compiled code paths (and the multiple code versions exist anyway, where this is a significant optimisation on LE). We could add a force_le feature flag to cause current behaviour on BE platforms, using NE by default.

This would be a breaking change (target 0.9; #1165).

@vks
Copy link
Collaborator

vks commented Sep 11, 2021

What's the motivation for this? Better performance on big-endian platforms?

@newpavlov
Copy link
Member

I don't think it's a good idea. If force_le is enabled by default, it will be virtually impossible to disable it in any sizable project, and if not, it certainly will be a portability headache. Also enabling it for a whole rand looks like overly coarse-grained to me.

@dhardy
Copy link
Member Author

dhardy commented Sep 11, 2021

Motivation? I'm looking at #1170, specifically fill_via_chunks, and notice that using a simple copy_nonoverlapping on little-endian is around 50% faster than the copy-by-word-sized-chunks variant (after a pretty significant optimisation to that), for the gen_bytes_chacha* benchmarks. But BE platforms aren't exactly popular, so maybe we just don't care.

@vks
Copy link
Collaborator

vks commented Sep 12, 2021

Ok, so the goal is to offer big-endian platforms better performance at the cost of losing reproducibility with respect to little-endian platforms. I think this is fair, but I would be hesitant to add this without knowing whether anyone will use it, because it increases complexity by adding another feature, and it's a potential footgun for people upgrading rand_core with no-default-features = true.

@dhardy
Copy link
Member Author

dhardy commented Sep 13, 2021

It's also less of a performance boost now (see #1180). Agreed, we don't appear to need this.

@dhardy dhardy closed this as completed Sep 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants