Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster float conversion operations #370

Closed
wants to merge 1 commit into from

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Aug 12, 2020

Hi!

I've been playing around a bit with int->float and float->int conversions lately. The implementations I have now seem to produce shorter assembly and perform faster than those currently in compiler-builtins. Some of them make a very significant difference. For example: u128->f64 is about 4 to 5 times faster on my x86_64 desktop, and u64->f64 runs about a factor 3 faster for the i686 target. Many others seem to provide roughly 30% improvement in my initial benchmarks, at least on the hardware I have here.

For now I put them in this crate: https://docs.rs/floatconv/

I still want to optimize some more, do better benchmarks, and verify correctness better. But I figured it might be good to already ask some questions about potential integration into compiler-builtins:

  • Should these optimized versions be integrated into compiler-builtins? If so:
  • Can this crate simply depend on an external crate (like in this (draft) PR)?
  • How would we check the performance properly? I can run benchmarks on the computers I have here, but I'm not sure if that's enough to prove it's better on all processor models, etc.
  • How would we check correctness? For the conversions from a 32-bit value, I've checked literally every possible value against the current implementation, but for 64 and 128 bit that's infeasible. Testing a lot of (generated) edge cases helps, but would that be enough? Or would this require some kind of formal proof?

@bjorn3
Copy link
Member

bjorn3 commented Aug 12, 2020

Can this crate simply depend on an external crate (like in this (draft) PR)?

I don't think so.

@m-ou-se m-ou-se changed the title Use float conversion operations from floatconv. Faster float conversion operations Aug 12, 2020
@eddyb
Copy link
Member

eddyb commented Aug 18, 2020

cc @Amanieu @japaric @alexcrichton How would you like to proceed here?
Alternatively, should this be a @rust-lang/compiler decision, perhaps with an MCP?

@nagisa
Copy link
Member

nagisa commented Aug 18, 2020

The original (Rust) implementations of anything in this crate were ported without any concern for performance on any platform. There was no effort in formally proving their correctness either.

As thus I wouldn’t really hold any new changes to a significantly higher level of standard. So testing/benchmarking just on one or two different architectures would be good enough IME.

@Amanieu
Copy link
Member

Amanieu commented Aug 18, 2020

I think it's possible to depend on an external crate but that crate will need to be modified to support Rust's build system. Have a look at how hashbrown does it.

@Amanieu
Copy link
Member

Amanieu commented Aug 18, 2020

Then again compiler-builtins involves quite a bit of magic so you'll want to test this in a rustc build.

@bjorn3
Copy link
Member

bjorn3 commented Aug 18, 2020

compiler-builtins is special in that all crates except for libcore implicitly depend on it.

@alexcrichton
Copy link
Member

I think it's a good idea to run benchmarks to verify that this performs as expected, but otherwise I agree that we don't need hugely rigorous testing. It'd be great to take this time to invest in the test suite to expand it from what it already is, but we should already at least lightly test these functions relative to the default implementation of compiler builtins.

Unfortunately I don't think depending on a crate will work. The compiler-builtins crate is specially handled with respect to codegen and how it's linked, but that logic doesn't apply to the dependencies of compiler-builtins. If this can't be easily integrated into this crate, though, we could try making submodules work.

@AaronKutch
Copy link
Contributor

I have greatly improved integer to float testing in compiler-builtins. Does cargo test --release still pass the tests with the latest master?

@AaronKutch
Copy link
Contributor

Actually, since I am refactoring the floating point code right now, I think I will tackle this and try to implement any speedups from floatconv

@AaronKutch
Copy link
Contributor

AaronKutch commented Dec 10, 2020

Now that I look deeply into it, it seems floatconv has a refined int-to-float conversion, and configures native or soft conversion depending on platform. However, it seems to me that LLVM is already doing the native/soft configuration? I see it using fcvtzu instructions for all appropriate conversions with aarch64. note: compiler-builtins has some hardcoded configuration for x86_64, but none for aarch64, so does this indicate no hard-coded configuration is needed at all?

@m-ou-se
Copy link
Member Author

m-ou-se commented Dec 11, 2020

I have greatly improved integer to float testing in compiler-builtins.

@AaronKutch Thanks :)

I'll check if it passes all the new tests, and update the PR to integrate the implementations directly into this crate.

@Shnatsel
Copy link
Member

Shnatsel commented May 10, 2022

For the conversions from a 32-bit value, I've checked literally every possible value against the current implementation, but for 64 and 128 bit that's infeasible. Testing a lot of (generated) edge cases helps, but would that be enough?

Fuzzing can help discover interesting edge cases. The Rust Fuzz Book will get you started.

You can use fuzzing to determine that your implementation matches the values produced by the current implementation with a simple assert!().

Update: I've opened an initial PR for fuzzing: m-ou-se/floatconv#4

@AaronKutch
Copy link
Contributor

That LLVM based fuzzer is good for general structural fuzzing where it tries to hit all code paths. There is also a fuzzer in compiler-builtins I specially built for the purpose of fuzzing floats https://github.com/rust-lang/compiler-builtins/blob/master/testcrate/src/lib.rs. This fuzzer is structured around the sign, exponent, and significand, has an edge case tester for extremes that random values would be unlikely to trigger, and has my or-and-xor-rotate fuzzer that is better than feeding plain random values

@m-ou-se
Copy link
Member Author

m-ou-se commented May 20, 2022

Closing in favour of #464

@m-ou-se m-ou-se closed this May 20, 2022
@m-ou-se m-ou-se deleted the floatconv branch November 30, 2023 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants