Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subarrays/views support #172

Closed
andreasnoack opened this issue Apr 2, 2018 · 7 comments
Closed

Subarrays/views support #172

andreasnoack opened this issue Apr 2, 2018 · 7 comments
Labels
cuda array Stuff about CuArray. enhancement New feature or request

Comments

@andreasnoack
Copy link
Member

I've been trying to make more of the highlevel linear algebra routines work similarly to the Base versions. For this to really work well, we'll need support for strided views where stride(A,1)==1 but stride(A,2)>size(A,1). Currently, I see two approaches to support view for CuArrays: 1) Make Base's SubArrays work for CuArray or 2) include strides information in the CuArray to support this.

The CuArray struct already includes a field for offset and it wraps a memory buffer so it might make sense to make CuArrays more general and add a strides field. The cost is that people working explicitly with the buffer would have to be more careful about the strides information. These views would also be less general than Base's SubArrays but I think they'll cover the more important cases.

Last night I tried the other approach, i.e. to just wrap CuArrays in SubArray but it doesn't work with ccall and it might require some controversial changes to how CuArrays are converted to Ptr if we want to go this direction. Basically, the problem is that unsafe_convert(Ptr{T}, CuArray{T}) isn't defined and therefore you can't pass a SubArray{CuArray} to ccall.

What do people think? Which one is the better solution or have a missed a third better option? Personally, I lean towards including stride information in CuArrays since we already wrap a memory buffer and attach shape information so also doing this as part of a SubArray is kind of double wrapping.

@maleadt
Copy link
Member

maleadt commented Apr 2, 2018

I vaguely remember a discussion at JuliaCon about a unified array class (ie. across GPU/DL libraries) where such information would obviously need to be encoded in the struct. So that might be a valid short-term fix, because the whole array wrapping business is pretty unsolved wrt. GPU types (eg. how to broadcast over this container, Adapt.jl, etc).

@andreasnoack
Copy link
Member Author

I just noticed Mem.view so that might be the simplest solution, i.e. just define

Base.cconvert(::Type{Ptr{T}}, V::SubArray{T,N,P,<:Tuple{Vararg{RangeIndex}}}) where {T,N,P<:CuArray} =
    Mem.view(unsafe_buffer(V.parent), (Base.first_index(V) - 1)*sizeof(T))

and rely on the SubArray machinery. That seems to be sufficient for CUSOLVER.

@andreasnoack
Copy link
Member Author

Is this solution acceptable or are there GPU specific caveats?

@MikeInnes
Copy link
Contributor

Correct me if I'm mistaken, but that only supports a special case of strides, right? [But I'm on board if it solves an issue for you now, of course; I don't think it would cause issues elsewhere]

In principle I'd rather support general SubArrays and other wrapper types here. The nested wrapping issue has come up a lot, and if we had a good solution to those problems it should be very easy to do. Sadly, since we don't have that, supporting wrapper types leads to hacks on top of hacks.

So in the shorter term I think it's fine to just put strides and offsets in the CuArray struct (either as a tuple or as a small GPU buffer). I actually don't think this'll be that bad to support; overloading indexing takes care of the Julia kernels in one swoop, so we just need to update the BLAS calls etc.

@andreasnoack
Copy link
Member Author

Correct me if I'm mistaken, but that only supports a special case of strides, right?

Are you referring to the cconvert method definition? I believe, it supports the type of strides that are compatible with BLAS and LAPACK which would probably be sufficient for most applications of CuArrays.

@maleadt
Copy link
Member

maleadt commented Sep 20, 2018

This is implemented now.
EDIT: false, only support for broadcast has been improved.

@maleadt maleadt closed this as completed Sep 20, 2018
@maleadt maleadt reopened this Oct 24, 2018
@maleadt maleadt transferred this issue from JuliaGPU/CuArrays.jl May 27, 2020
@maleadt maleadt added cuda array Stuff about CuArray. enhancement New feature or request labels May 27, 2020
@maleadt
Copy link
Member

maleadt commented Oct 29, 2020

the problem is that unsafe_convert(Ptr{T}, CuArray{T}) isn't defined and therefore you can't pass a SubArray{CuArray} to ccall.

This has been implemented now, e.g.:

## copy
for (fname, elty) in ((:cublasDcopy_v2,:Float64),
(:cublasScopy_v2,:Float32),
(:cublasZcopy_v2,:ComplexF64),
(:cublasCcopy_v2,:ComplexF32))
@eval begin
function copy!(n::Integer,
x::StridedCuArray{$elty},
y::StridedCuArray{$elty},)
$fname(handle(), n, x, stride(x, 1), y, stride(y, 1))
y
end
end
end

@maleadt maleadt closed this as completed Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda array Stuff about CuArray. enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants