Integration with VectorizationBase.jl for mixed-precision and saturating intrinsics? #245
Comments
There is a plan to implement saturating operations for integer types in CheckedArithmeticCore and use them in this package (#239). WIP: https://github.com/kimikage/CheckedArithmetic.jl/pull/1/files I was addressing a few performance issues before the release of Julia v1.6.0, so that plan has been on hold for a while, but I haven't abandoned it.
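For readers unfamiliar with the term, here is a minimal scalar sketch of what a saturating add does. The name `saturating_add` is hypothetical and is not taken from CheckedArithmeticCore or the linked PR; it only relies on `Base.Checked.add_with_overflow` from the standard library.

```julia
using Base.Checked: add_with_overflow

# Hypothetical helper, NOT the CheckedArithmeticCore API.
# Adds two integers of the same bit type and clamps to the type's range
# instead of wrapping around on overflow.
function saturating_add(x::T, y::T) where {T<:Base.BitInteger}
    z, overflowed = add_with_overflow(x, y)
    overflowed || return z
    # Overflow needs both operands to have the same sign, so the sign of `y`
    # tells us which bound was crossed. For unsigned types `y < 0` is never
    # true, so the result is always typemax(T).
    return y < zero(T) ? typemin(T) : typemax(T)
end
```

With wrapping arithmetic `Int8(100) + Int8(100)` gives `-56`; the sketch above returns `Int8(127)` instead.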
On the LoopVectorization/VectorizationBase side of things, I think I'll define specialized methods for functions like @inline Base.:(+)(x::Vec{8,Int32}, y::Vec{16,Int16})::Vec{8,Int32} and then check destination types to define the accumulators, so that it can use the specialized intrinsics.
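The idea of letting the destination type pick the accumulator can be illustrated without any SIMD types. The following is only a plain-Julia sketch under that assumption: `widening_sum` is a hypothetical name, and it does not use VectorizationBase's `Vec` or any of its methods.

```julia
# Hypothetical illustration: the caller names the destination (accumulator)
# type, and each narrow element is widened before being added.
function widening_sum(::Type{Acc}, xs::AbstractVector{T}) where {Acc<:Integer,T<:Integer}
    acc = zero(Acc)
    @inbounds @simd for i in eachindex(xs)
        # `% Acc` converts without a range check; when Acc is wider than T
        # (and of matching signedness) no information is lost.
        acc += xs[i] % Acc
    end
    return acc
end

widening_sum(Int32, fill(Int16(30_000), 10))  # 300000, which does not fit in Int16
```

Whether the compiler lowers such a loop to a dedicated widening instruction is not guaranteed; that gap is what the specialized methods discussed above would address.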
@kimikage if you're willing to define methods for The benefit of this vs me defining
We are now talking about two orthogonal things: vectorization and overflow handling. (To be fair, though, I was the one who tied the latter to the former.) I believe that having just a single definition (of the function name) for each arithmetic operation is a great advantage. Issue #239 aims to do exactly that. Although, it is debatable whether We might want to ask for opinions on that in a place like Discourse.
FixedPointNumbers has the same problem of not having enough maintainers, but development of VectorizationBase.jl and LoopVectorization.jl is expected to be discontinued. Development of CheckedArithmeticCore.jl has also stopped, but there I would at least have commit privileges.
LLVM provides access to some useful vectorized intrinsics (e.g. saturating arithmetic or http://0x80.pl/notesen/2018-10-24-sse-sumbytes.html#sadbw), and those are easy enough to add to VectorizationBase.jl, but some coordination will be needed before those intrinsics can be used to painlessly redo various Images.jl kernels/reductions with @avx. Tagging in @chriselrod, who has a much better handle on how to wrestle with LLVM.
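As a rough sketch of what reaching such an intrinsic from Julia can look like, the snippet below calls LLVM's `llvm.uadd.sat` family directly through `ccall` with the `llvmcall` convention. The function name `uadd_sat` and the alias `U8x16` are made up for this example; this is not how VectorizationBase.jl defines its methods, and it assumes a Julia build whose LLVM provides these intrinsics.

```julia
# 16 lanes of UInt8; Julia passes NTuple{N,VecElement{T}} to LLVM as <N x T>.
const U8x16 = NTuple{16,VecElement{UInt8}}

# Scalar unsigned saturating add via the LLVM intrinsic (hypothetical wrapper).
uadd_sat(x::UInt8, y::UInt8) =
    ccall("llvm.uadd.sat.i8", llvmcall, UInt8, (UInt8, UInt8), x, y)

# The same operation on a whole 16-lane vector at once.
uadd_sat(x::U8x16, y::U8x16) =
    ccall("llvm.uadd.sat.v16i8", llvmcall, U8x16, (U8x16, U8x16), x, y)

a = ntuple(_ -> VecElement(0xfa), 16)          # every lane is 250
b = ntuple(i -> VecElement(UInt8(i)), 16)      # lanes 1, 2, ..., 16
r = uadd_sat(a, b)                             # lanes clamp at 255 from lane 5 on
```

The coordination problem raised above is that a macro like @avx would need to know when an expression may be replaced by a call of this kind, which a bare wrapper does not solve.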