
Speedup and fix of multiplication by OneHotMatrix #1756

Merged Oct 28, 2021 (22 commits)
Conversation

racinmat (Contributor)

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable
  • API changes require approval from a committer (different from the author, if applicable)

Fixes #1355.
Also fixes the bug mentioned in #1355 (comment).
Adds tests for both GPU and CPU.
Adds multiplication by a sparse matrix to the benchmarks.
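For context, the core of the speedup: multiplying by a one-hot matrix is just column selection, so no general matmul is needed. A minimal sketch of the idea, assuming the one-hot column indices are available as a plain integer vector (`onehot_mul`, `ix`, and `nclasses` are hypothetical names; the PR itself dispatches on `Base.:(*)` and reads the indices out of the `OneHotMatrix` wrapper):

```julia
# Sketch only: B is conceptually an nclasses × length(ix) one-hot matrix
# with B[k, j] = (k == ix[j]), so A * B just gathers columns of A.
function onehot_mul(A::AbstractMatrix, ix::AbstractVector{<:Integer}, nclasses::Integer)
    size(A, 2) == nclasses || throw(DimensionMismatch(
        "A has $(size(A, 2)) columns, but the one-hot matrix has $nclasses rows"))
    return A[:, ix]  # output column j is column ix[j] of A
end
```

This turns an O(M·L·N) dense multiplication into an O(M·N) indexing operation.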

@CarloLucibello (Member)

@racinmat do you want to include #1355 (comment) as well?

@racinmat (Contributor, Author)

Yes, I do. Pushing right now.

@CarloLucibello (Member)

@racinmat GPU tests (Buildkite) are still failing, not sure why.

@racinmat (Contributor, Author)

I see, I messed up the dimension validation before multiplying by the adjoint; fixing it and also adding it to the CPU tests.
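For reference, the validation in question, as a hedged sketch (`check_adjoint_dims` is a hypothetical helper, not a Flux function): for `A * B'` with `B` of size L×N, the adjoint swaps the dimensions, so the inner dimension that must match is `size(B, 2)`, not `size(B, 1)`.

```julia
# Sketch: validate C = A * B'. Since adjoint swaps B's dimensions,
# A's column count must equal B's column count.
function check_adjoint_dims(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 2) || throw(DimensionMismatch(
        "matrix A has dimensions $(size(A)), matrix B' has dimensions $(reverse(size(B)))"))
    return nothing
end
```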

@racinmat (Contributor, Author)

I'm a bit unsure about adding code that would work for ReshapedArrays without adding a test case that actually exercises that code and calls the method with a reshaped array. Is that OK, or should I add tests for it?

@ToucheSir (Member)

That's a good question; are we clear on what the semantics would be with reshaped arrays?

@darsnack (Member)

Reshaped arrays in this context mean N-dimensional OneHotArrays that are reshaped to look like a OneHotMatrix, with the first dimension untouched.

So using

reshape(OneHotArray(rand(1:10, 5, 5), 10), 10, :)

should work for adding tests to exercise it. I do think we should add tests.
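A test along those lines could look like this (a sketch; it checks the fast path against the dense fallback obtained via `Array`):

```julia
using Flux, Test

b = reshape(Flux.OneHotArray(rand(1:10, 5, 5), 10), 10, :)  # 10×25, still one-hot per column
A = rand(3, 10)
@test A * b ≈ A * Array(b)  # optimized path must agree with dense multiplication
```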

@racinmat (Contributor, Author)

racinmat commented Oct 26, 2021

While testing reshaped arrays I realized there is no _onehot implemented for the adjoint of a reshaped one-hot matrix, and I got a bit lost there.
So I guess I'll need a bit of help.
There should be something like _isonehot(B) || return invoke(*, Tuple{AbstractMatrix, AbstractMatrix}, A, B), right? But how should _isonehot look in order to distinguish, e.g., these?

```julia
julia> b4 = reshape(Flux.OneHotMatrix([1 2 3; 2 2 1], 3), 3, :)
3×6 reshape(OneHotArray(::Matrix{Int64}), 3, 6) with eltype Bool:
 1  ⋅  ⋅  ⋅  ⋅  1
 ⋅  1  1  1  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  1  ⋅

julia> b5 = reshape(b4, 6, :)
6×3 reshape(OneHotArray(::Matrix{Int64}), 6, 3) with eltype Bool:
 1  0  0
 0  1  0
 0  0  1
 0  0  1
 1  1  0
 0  0  0

julia> b5'
3×6 adjoint(reshape(OneHotArray(::Matrix{Int64}), 6, 3)) with eltype Bool:
 1  0  0  0  1  0
 0  1  0  0  1  0
 0  0  1  1  0  0
```

I thought I could check the indices, but they are the same for both, and one is one-hot and the other is not.

I'm currently not sure which way to go:

  1. Should I add these cases to the tests and try to make them work in this PR? I don't know how to make an efficient _isonehot check for adjoint reshaped arrays, so I'm a bit lost there.
  2. Should I make the new method for the adjoint of OneHotMatrix only, leave support for reshaped arrays to a separate PR, and add these test cases so we know we still need to cover them?
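For illustration, the non-adjoint reshape case can be distinguished with a size-based check: a reshape is still one-hot exactly when its first dimension equals the number of classes `L` carried in the `OneHotArray` type, so the wrapped indices alone are not enough. A sketch under that assumption:

```julia
using Flux: OneHotArray

# Sketch: L is the number-of-classes type parameter of OneHotArray.
_isonehot(x::OneHotArray) = true
_isonehot(x::Base.ReshapedArray{<:Any, <:Any, <:OneHotArray{<:Any, L}}) where L =
    size(x, 1) == L
```

Applied to the example above: b4 is 3×6 with L = 3, so it passes; b5 is 6×3, so it fails and multiplication can fall back to the generic AbstractMatrix method. The adjoint of a reshape is the part this does not cover.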

@racinmat (Contributor, Author)

In the end I decided to keep it as function Base.:(*)(A::AbstractMatrix, B::Adjoint{Bool, <:OneHotMatrix}), but I added tests for multiplication by a reshaped one-hot matrix, so that in the future people have test data to check whether an additional optimized version behaves correctly.
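The adjoint method itself amounts to a scatter-add over the columns of A. A minimal CPU sketch, assuming the wrapped index vector is reachable as `B.indices` (`onehot_adjoint_mul` is a hypothetical name; the PR defines this behavior on `Base.:(*)`):

```julia
using Flux: OneHotMatrix

# Sketch: computes A * B' for an L×N one-hot B by accumulating each
# column of A into the output column selected by its one-hot index.
function onehot_adjoint_mul(A::AbstractMatrix, B::OneHotMatrix)
    size(A, 2) == size(B, 2) || throw(DimensionMismatch(
        "A has $(size(A, 2)) columns, but B' has $(size(B, 2)) rows"))
    C = zeros(eltype(A), size(A, 1), size(B, 1))
    for (j, i) in enumerate(B.indices)
        C[:, i] .+= A[:, j]  # B'[j, i] == 1 exactly when B.indices[j] == i
    end
    return C
end
```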

@CarloLucibello (Member)

bors r+

@bors (Contributor)

bors bot commented Oct 28, 2021

Build succeeded.

Development

Successfully merging this pull request may close these issues.

Usage of OneHotMatrix for input to neural network is very slow.
5 participants