Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scatter operations #255

Merged
merged 57 commits into from
Mar 5, 2021
Merged

Add scatter operations #255

merged 57 commits into from
Mar 5, 2021

Conversation

yuehhua
Copy link
Member

@yuehhua yuehhua commented Dec 26, 2020

I have generalized scatter operations to every dimensions, instead of excluding the first dimension.
Related to yuehhua/ScatterNNlib.jl#32

@mcabbott
Copy link
Member

Could these easily be combined to something like scatter!(+, dst, indices, src)?

@yuehhua
Copy link
Member Author

yuehhua commented Dec 27, 2020

@mcabbott Sure! I have API like scatter!(:add, dst, indices, src) in my ScatterNNlib.jl. Should I move it here as well?

@CarloLucibello
Copy link
Member

CarloLucibello commented Dec 27, 2020

I have API like scatter!(:add, dst, indices, src) in my ScatterNNlib.jl. Should I move it here as well?

I think he is suggesting to directly pass the operator (not a symbol), the same way you would use it in

julia> reduce(+, ones(10))
10.0

We could have

julia> function scatter!(op, ys::Array{T}, us::Array{T}, xs::Array{<:IntOrTuple}) where {T<:Real}
           @simd for k = 1:length(xs)
               k = CartesianIndices(xs)[k]
               ys_v = view(ys, xs[k]...)
               us_v = view(us, k)
               @inbounds ys_v .= (op).(ys_v, us_v)
           end
           ys
       end

and a few specialized methods:

function scatter!(op::typeof(mean), ys::Array{T}, us::Array{T}, xs::Array{<:IntOrTuple}) where {T<:Real}
...

src/utils.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
@yuehhua yuehhua force-pushed the scatter branch 2 times, most recently from ab03446 to be4fa94 Compare December 28, 2020 12:29
src/scatter.jl Outdated Show resolved Hide resolved
src/gather.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/gather.jl Outdated Show resolved Hide resolved
@chengchingwen
Copy link
Member

Maybe we should specified how our scatter/gather different from TF or Torch?

src/gather.jl Outdated Show resolved Hide resolved
@yuehhua
Copy link
Member Author

yuehhua commented Dec 30, 2020

I discussed with @chengchingwen yesterday. We hope to introduce some new features to generalize scatter/gather operations. New features are listed in the following:

  • Add dims argument for specifying which dimension to start scattering and the dimensions will be contiguous block, like TensorFlow does.
  • The operational unit of scatter/gather could be a scalar or (at least) a vector. Both of them will be provided. The scalar unit will be designed by assigning dims=0, while the vector unit will be assigning with integer dims=1.
  • Following above, dims=1 will be a default option.
  • Scalar version: dst[i, j, k] = src[idx[i, j, k]...]
  • Array version:
    • dims=1: dst[i, j, k] = src[:, idx[i, j, k]...]
    • dims=2: dst[i, j, k] = src[:, :, idx[i, j, k]...]
    • dims=3: dst[i, j, k] = src[:, :, :, idx[i, j, k]...]
    • ...

src/scatter.jl Outdated Show resolved Hide resolved
src/gather.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/gather.jl Outdated Show resolved Hide resolved
src/gather.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
Project.toml Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/scatter.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
dims = Nsrc - Nidx
dstsize = (size(src)[1:dims]..., maximum_dims(idx)...)
dst = similar(src, T, dstsize)
fill!(dst, typemax(T))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if typemax is the right thing to do here. The problem is if there are positions in dst which receive no contributions for src they will end up holding typemax, which doesn't seem meaningful. Maybe we should error out in such cases, but doing this check may have a performance impact

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought this issue before. Checking the position of dst is properly covered by idx is the way to avoid holding typemax. But still, it is necessary to check values in src is smaller than the value we assigned, either typemax or similar. similar gives the value existing in bare memory, so we have no idea knowing if the values are smaller enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we give the maximum of src? Thus, the value is at least smaller or equals to the maximum of src.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maximum(src) would be more surprising, in un-visited entries. typemax seems OK to me.

@yuehhua
Copy link
Member Author

yuehhua commented Mar 4, 2021

If it is ready to go, just let it go.

@CarloLucibello
Copy link
Member

ok, I'll merge this so that work can proceed, I'll open a pr to revisit the docstrings

@CarloLucibello CarloLucibello merged commit b59cf53 into FluxML:master Mar 5, 2021
bors bot added a commit to FluxML/Flux.jl that referenced this pull request Jul 13, 2021
1516: add Embedding layer r=CarloLucibello a=CarloLucibello

Basic implementation. 

Maybe could be improved when FluxML/NNlib.jl#255 lands

### PR Checklist

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).


Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
bors bot added a commit to FluxML/Flux.jl that referenced this pull request Jul 13, 2021
1516: add Embedding layer r=CarloLucibello a=CarloLucibello

Basic implementation. 

Maybe could be improved when FluxML/NNlib.jl#255 lands

### PR Checklist

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).


Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
bors bot added a commit to FluxML/Flux.jl that referenced this pull request Jul 13, 2021
1516: add Embedding layer r=DhairyaLGandhi a=CarloLucibello

Basic implementation. 

Maybe could be improved when FluxML/NNlib.jl#255 lands

### PR Checklist

- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] Final review from `@dhairyagandhi96` (for API changes).


Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
Co-authored-by: Carlo Lucibello <carlo.lucibello@unibocconi.it>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants