Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SubIndex, LinearSubIndex, and PermutedIndex types #202

Closed
wants to merge 18 commits into from

Conversation

Tokazama
Copy link
Member

@Tokazama Tokazama commented Sep 6, 2021

  • Added PermutedIndex for Adjoint, Transpose, PermutedDimsArray
  • Added SubIndex for SubArray
  • Added LinearSubIndex for FastSubArray and FastContiguousSubArray
  • Define getindex(::ArrayIndex, ::ArrayIndex) methods: convenient way of creating rules for combining subtypes of ArrayIndex
  • ArrayIndex{N}(A) constructor methods:
    • ArrayIndex{1}(A) constructs a subtype appropriate for linear indexing
    • ArrayIndex{1+}(A) construct a subtype appropriate for cartesian indexing

Hopefully this PR helps make the utility of ArrayIndex subtypes a little more clear without committing to any controversial syntax or interface. I'm sure we will want a more flexible approach for constructing an ArrayIndex in the future, but this at least gets things up and running.

@codecov
Copy link

codecov bot commented Sep 6, 2021

Codecov Report

Merging #202 (fc95c21) into master (63c86af) will decrease coverage by 5.39%.
The diff coverage is 61.94%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #202      +/-   ##
==========================================
- Coverage   86.20%   80.80%   -5.40%     
==========================================
  Files          11       11              
  Lines        1812     1730      -82     
==========================================
- Hits         1562     1398     -164     
- Misses        250      332      +82     
Impacted Files Coverage Δ
src/ArrayInterface.jl 84.61% <ø> (-1.03%) ⬇️
src/axes.jl 65.30% <0.00%> (-6.04%) ⬇️
src/stridelayout.jl 84.61% <ø> (-4.66%) ⬇️
src/array_index.jl 78.07% <52.63%> (-20.11%) ⬇️
src/size.jl 79.54% <66.66%> (-4.55%) ⬇️
src/indexing.jl 66.50% <70.29%> (-11.49%) ⬇️
src/dimensions.jl 93.15% <72.00%> (-3.96%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 63c86af...fc95c21. Read the comment docs.

@Tokazama Tokazama requested a review from chriselrod September 7, 2021 12:27
function ArrayIndex{1}(x::PermutedDimsArray{<:Any,N,perm,iperm}) where {N,perm,iperm}
ComposedIndex(_to_cartesian(x), PermutedIndex{N,perm,iperm}())
end

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because many SubArrays and PermutedDimArrays can be represented by StridedIndex, can we have ArrayIndex for these return a StridedIndex?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I'm treating ArrayIndex here is a one layer deep index representation. The problem with going straight to StrideIndex is that whatever is calling ArrayIndex(::AbstractArray) doesn't know how deeply nested the buffer is. I figured the user level interface would be more like layout(A, accessor) = ArrayIndex(A), buffer(A) and we could special case for cartesian indexing going straight to StrideIndex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chriselrod, should this PR have something like the layout method implemented so that it is immediately more useful? I was trying to make a PR that was still functional but didn't create too many new things at once, but I can see how that lack of a fully implemented interface runs the risk of breaking changes in the near future. Alternatively, I could add warnings to the docs that this is experimental, so that we can drop/change stuff without worrying about breaking changes for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could add warnings to the docs that this is experimental, so that we can drop/change stuff without worrying about breaking changes for now.

Sure, although why not have these things live in a separate, experimental repo?
Are they part of the array interface, or something other libraries would want to extend?

It's hard for me to assess changes like these if I don't know the vision for their future use.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully recent commits involving the indexing pipeline make it more clear that these changes allow unique interactions between layers of nested arrays. If we were to have a separate package for this we'd need ArrayInterface.jl to depend on it so that we could complete the indexing interface.

That being said, we don't need to strictly use what I've proposed here. The point of the ArrayInterface.compose methods here is to support a simple rule based system for composing simplified/efficient index transformations. I'm sure will need the new types here, but I'm completely open to there being a more optimal way to doing this.

Project.toml Outdated Show resolved Hide resolved
c2 = nothing
elseif C === 1
c2 = 2
else
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add

elseif C === 2
  c2 = 1

Although given the getfield(iperm, C) fallback in the other method, is this somehow not allowed?
Not sure what PermutedIndex is without looking at the code more closely.

src/array_index.jl Show resolved Hide resolved
function ArrayIndex{1}(x::PermutedDimsArray{<:Any,N,perm,iperm}) where {N,perm,iperm}
ComposedIndex(_to_cartesian(x), PermutedIndex{N,perm,iperm}())
end

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could add warnings to the docs that this is experimental, so that we can drop/change stuff without worrying about breaking changes for now.

Sure, although why not have these things live in a separate, experimental repo?
Are they part of the array interface, or something other libraries would want to extend?

It's hard for me to assess changes like these if I don't know the vision for their future use.

@Tokazama
Copy link
Member Author

There's a lot that I'd like to try out with this but don't have a formalized interface prepared for. Some of it will be straightforward but some will take some more discussion:

  • Get rid of overly specific code that can just be represented as an ArrayIndex (e.g., to_parent_dims and from_parent_dims which were originally written to consolidate @generated code).
  • I think the final unsafe_getindex method should be somethind like unsafe_getindex(A, inds::Vararg{Int}) = layout(A, inds)[inds...]. This would make completing the typical array interface just involve composing a single ArrayIndex subtype given inds (or some representation of accessors).
  • Some simple implementation of Base.similar(::Type{T}, ::ArrayIndex) would support an array reconstruction interface that works with what's in base already and helps with propagating certain traits onto newly constructed arrays.

While we could probably have a completely new package for this, I think there's enough codependence between these subtypes of ArrayIndex and what this package is trying to do that it would be more difficult to separate it out in the long run. I apologize that I haven't provided a fully interface for any of these interfaces to justify the extra code at the moment, but I think it's a bit unwieldy to have a PR that's implementing all of these things at once.

@timholy and @johnnychen94, you had some input when I first proposed ShapedIndex (and made sure we didn't need it with a nice PR to base). Might be helpful to get feedback here if you're still interested.

@Tokazama
Copy link
Member Author

Tokazama commented Oct 1, 2021

The most recent commit here illustrates why I think this PR is necessary for having a general array indexing interface.
There are lot of things that should probably be reviewed individually within "src/Experimental" before actually being pushed to this repo.
Therefore, I haven't spent too much time optimizing that code given it's more of a working concept at this point.
The overall design of the indexing system here is lot like Base.Broadcasted, but instead of translating a lazy operation across every index of multiple arrays Layouted translates indexing across multiple layers of an array.
Another thing we could do with this is something like Base.similar(A, ::Type{T}, ::Layouted) for reconstructing arrays, hopefully allowing nested array parameters to be appropriately propagated to new instances (e.g., dimension names, offsets, etc.).

@Tokazama
Copy link
Member Author

Tokazama commented Oct 4, 2021

This now composes an efficient layout represent for indexing.
I'm going to hold off on making the internal components like layout public b/c I'm not sure if it's the final design we should have for recomposing layouts from individual layers of nested arrays.
I've also added relayout_constructor and relayout, which allow rebuilding arrays even if they are nested.
For example, adding this code to the StaticArrays package:

function static_size(x, inds)
    StaticArrays.Size(ArrayInterface._mapsub(ArrayInterface.known_length, inds))
end

function ArrayInterface.relayout_constructor(::Type{<:StaticArrays.SArray})
    static_size
end

function ArrayInterface.:()(x::NTuple{N}, y::StaticArrays.Size{S}) where {N,S}
    StaticArrays.SArray{Tuple{S...},eltype(x),length(S),N}(x)
end

Makes it so we can do:

julia> A = view(SArray{Tuple{3,3}}((1, 2, 3, 4, 5, 6, 7, 8, 9)), :, static(1):static(2));

julia> B = ArrayInterface.getindex(A, static(1):static(2), 1)
2-element SVector{2, Int64} with indices SOneTo(2):
 1
 2

Using this design any array type that defines an appropriate relayout_constructor should be able to propagate parameters to the output from getindex no matter how deeply nested the array is.
This provides a pretty clear use case for arrays that carry parameters that should be propagated when indexing (keys, dimension names, offsets, etc.).
The use of here has a lot of other use cases.
Some of them aren't as obvious because I'm not eager to heavily document and settle on several methods yet (like getlayout).
Once they are finalized we could do things like simplify all of Adapt.jl to just:

function adapt(::Type{T}, A) where {T<:AbstractArray}
    inds = ArrayInterface.to_indices(A, ntuple(_, :, Val(ndims(A))))
    buf = ArrayInterface.getlayout(device(T), A, layout(A, static(ndims(A))))
    ArrayInterface.relayout(buf, A, inds)
end

@rafaqz
Copy link
Contributor

rafaqz commented Oct 4, 2021

Since you are defining relayout_constructor do you know about ConstructionBase.constructorof ? It may be useful here.
https://github.com/JuliaObjects/ConstructionBase.jl

It's already possible to remove Adapt.jl using constructorof, such as with Flatten.jl: Flatten.modify(CuArray, x, Array). This also isn't limited to arrays, but most objects. It will work even more reliably soon in Accessors.jl whenever I can finish that PR.

GeoData.jl uses Flatten.jl/constructorof to replace very deeply nested dummy arrays with an actually opened disk-based array - through lazy broadcast/view/permuteddims wrappers.

See here:
https://github.com/rafaqz/GeoData.jl/blob/ba2b71872d25f1752430d0c258b58457ef78848e/src/series.jl#L140-L144
And a little more complicated, using flatten/reconstruct instead of modify here:
https://github.com/rafaqz/GeoData.jl/blob/ba2b71872d25f1752430d0c258b58457ef78848e/src/array.jl#L149-L161

ConstructionBase.jl is probably a dependency of most of SciML already.

@Tokazama
Copy link
Member Author

Tokazama commented Oct 4, 2021

I'm aware of ConstructionBase.jl and considered the potential overlap, but I think there's an important difference.
The return from relayout_constructor isn't called on the resulting collection of values after indexing.
It's called on the original array within the context of the indexing arguments and the result of this is then the new array, like what you get from constructorof.
There are some other important differences too.
There's the possibility that we don't want the reconstruct a layer (like PermutedDimsArray and SubArray) or want to construct a different wrapper type entirely.
If we wanted to get keys that changed after indexing that were attached to an array that was deeply nested we could still reconstruct them with this method.

That being said, it could make sense for specific array types to use constructorof when defining relayout_constructor.

@rafaqz
Copy link
Contributor

rafaqz commented Oct 4, 2021

Ok good to know you've been thinking about that already.

I can see how wanting more specific behaviors for some types is useful, like returning an entirely different wrapper - and constructorof has to stay generic. It may at least be useful to use as a fallback constructor for wrappers that don't implement relayout_constructor or don't affect the index (not that I really understand your design at this stage).

Reducing the amount of interface methods we have to define is good.

@Tokazama
Copy link
Member Author

Tokazama commented Oct 4, 2021

relayout_constructor is more of a useful example that also helps refine the indexing interface a bit more. The point of it in the context of this PR is that having many small simple subtypes of ArrayIndex is not only useful but necessary for defining an array interface that uses types from base. So if relayout_constructor needs to be changed here it shouldn't be a terribly big deal to the overall direction of this PR.

@rafaqz
Copy link
Contributor

rafaqz commented Oct 4, 2021

Sure! My comment was much more general, for the long term of widely-used methods for reconstructing things.

I wish I understood more of the details of your PRs to comment at a more immediately useful level, but currently I don't. So carry on :)

@Tokazama Tokazama mentioned this pull request Oct 15, 2021
@ChrisRackauckas
Copy link
Member

Anything static should go to https://github.com/JuliaArrays/StaticArrayInterface.jl

@ChrisRackauckas ChrisRackauckas deleted the PermutedIndex branch February 18, 2023 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants