Adding a function to get device from array (type)? #268

Closed
oschulz opened this issue Oct 7, 2021 · 17 comments · Fixed by #269

Comments

oschulz (Collaborator) commented Oct 7, 2021

It would be nice to have a function that returns a KernelAbstractions.jl device given an array (type), e.g.

get_device(A::AbstractArray) = get_device(typeof(A))

get_device(::Type{<:AbstractArray}) = CPUDevice()
get_device(T::Type{<:AbstractGPUArray}) = throw(ArgumentError("No KernelAbstractions.Device type defined for arrays of type $(T.name.name)"))

get_device(::Type{<:CuArray}) = CUDADevice()
get_device(::Type{<:AbstractCuSparseArray}) = CUDADevice()

get_device(::Type{<:ROCArray}) = ROCDevice()

so that users can write device-agnostic code like

function foo(A::AbstractArray)
    device = KernelAbstractions.get_device(A)
    kernel = mul2(device, 16)
    event = kernel(A, ndrange=size(A))
    ...

Would that fit into the KernelAbstractions concept?
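For reference, a mul2 kernel as assumed above could be written like this minimal sketch (adapted from the KernelAbstractions examples; it simply doubles every element in place):

using KernelAbstractions

@kernel function mul2(A)
    I = @index(Global)
    A[I] = 2 * A[I]
end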

oschulz (Collaborator, Author) commented Oct 7, 2021

CC @lmh91

oschulz changed the title from "Function to get device from array (type)" to "Adding a function to get device from array (type)?" on Oct 7, 2021
vchuravy (Member) commented Oct 7, 2021

Yes, now that we have split the packages between front-end and backend, I think this is feasible.

oschulz (Collaborator, Author) commented Oct 7, 2021

Thanks! Should I do a PR?

vchuravy (Member) commented Oct 7, 2021

Yes please.

oschulz (Collaborator, Author) commented Oct 7, 2021

Will do :-)

oschulz (Collaborator, Author) commented Oct 7, 2021

@vchuravy should this go into a single PR on this repo, or should I do a PR with the changes to KernelAbstractions first, then you tag KernelAbstractions, and then I do the changes to CUDAKernels and ROCKernels (so CI can run)?

vchuravy (Member) commented Oct 7, 2021

You should be able to do it all at once; CI runs across the whole commit.

oschulz (Collaborator, Author) commented Oct 7, 2021

> You should be able to do it all at once; CI runs across the whole commit.

Oh, neat!

Oh, here we go: #269

chriselrod commented

Note that ArrayInterface also implements a device function.

julia> using ArrayInterface

julia> ArrayInterface.device([1.2])
ArrayInterface.CPUPointer()

oschulz (Collaborator, Author) commented Oct 7, 2021

> Note that ArrayInterface also implements a device function.

Ah, right! But to write generic code like

function naive_transpose!(a, b)  # wrapper name assumed from the kernel name below
    if size(a)[1] != size(b)[2] || size(a)[2] != size(b)[1]
        println("Matrix size mismatch!")
        return nothing
    end
    device = KernelAbstractions.get_device(a)
    n = device isa GPU ? 256 : 4
    kernel! = naive_transpose_kernel!(device, n)
    kernel!(a, b, ndrange=size(a))
end

We need the device as a subtype of KernelAbstractions.Device. There's probably no conversion from ArrayInterface devices yet, right?

Having KernelAbstractions.Device(dev::ArrayInterface.AbstractDevice) would be neat; we wouldn't need get_device then. But we'd need to add ArrayInterface as a dependency, and it looks like that would increase the load time of KernelAbstractions from about 1.1 s to about 1.6 s.
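A minimal sketch of what that mapping might look like (nothing like this exists in either package; the GPU side is exactly what's missing, as the next comment shows):

import ArrayInterface
import KernelAbstractions

# Hypothetical conversion method; ArrayInterface.CPUPointer maps naturally
# to the KernelAbstractions CPU device:
KernelAbstractions.Device(::ArrayInterface.CPUPointer) = KernelAbstractions.CPU()

# A GPU method would also be needed, but ArrayInterface.GPU() does not say
# which vendor backend (CUDA, ROCm, ...) the array lives on.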

What do you think, @vchuravy?

oschulz (Collaborator, Author) commented Oct 7, 2021

> Having KernelAbstractions.Device(dev::ArrayInterface.AbstractDevice) would be neat; we wouldn't need get_device then.

Ah, no, I don't think that could work currently: ArrayInterface.device is not specific enough; there are no subtypes for different GPU vendors:

julia> ArrayInterface.device(cu(rand(5)))
ArrayInterface.GPU()

vchuravy (Member) commented Oct 8, 2021

Yeah, I am not a fan of taking on another dependency for this. The long-term development goal is to use KA in GPUArrays.

oschulz (Collaborator, Author) commented Oct 8, 2021

> The long-term development goal is to use KA in GPUArrays.

Do you plan to move some code into additional packages like lib/CPUKernels and/or lib/KernelCompilers to make KA really lightweight (removing deps like Cassette from KA itself, and so on)?

chriselrod commented

> Ah, no, I don't think that could work currently: ArrayInterface.device is not specific enough; there are no subtypes for different GPU vendors.

I've never really done anything with GPUs, so I hadn't really thought about what's actually needed.
That could probably be changed without breaking any actual code. But I also get not wanting to take on a dependency.

oschulz (Collaborator, Author) commented Oct 8, 2021

Personally, I think it would be nice if these things converged in the longer term, though.

Bit OT: Why does ArrayInterface have a (comparatively) long load time, actually? For an interface package, I mean?

chriselrod commented

> Bit OT: Why does ArrayInterface have a (comparatively) long load time, actually? For an interface package, I mean?

Discussion: JuliaArrays/ArrayInterface.jl#211 (comment)

Personally, I think the library is too big. But a big problem with respect to load times is that it uses Requires.jl to add methods for some other libraries. Requires.jl's load times have improved drastically as of Julia 1.7, so it'll get better.
Because ArrayInterface is already using Requires, it could @require KernelAbstractions to define device to return (CUDA/ROC)Device.
Although:

  1. What if someone loaded CUDA but not KernelAbstractions -- what should ArrayInterface.device do then?
  2. What about CPUs? ArrayInterface.device is currently mostly used to say whether we can get a pointer to an array, or whether we can get one after wrapping it in a Ref, so the granularity that exists now is needed; KernelAbstractions.CPUDevice, by contrast, is used for dispatch and doesn't need that granularity.
     This means that some coordination would be needed, e.g. defining an AbstractCPUDevice that KernelAbstractions could use for dispatch (see the sketch below).
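A rough sketch of the coordination idea from point 2 (all names here are illustrative, not existing API in either package):

# Illustrative only: an abstract CPU device type both packages could agree on.
# ArrayInterface keeps its finer granularity (pointer vs. index access), while
# KernelAbstractions-style code only dispatches on the abstract type.
abstract type AbstractCPUDevice end

struct CPUPointer <: AbstractCPUDevice end  # a raw pointer can be obtained
struct CPUIndex   <: AbstractCPUDevice end  # index-based access only

# Dispatch that doesn't care about the CPU-side granularity:
default_workgroupsize(::AbstractCPUDevice) = 4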

What's the benefit of converging?

oschulz (Collaborator, Author) commented Oct 8, 2021

> What's the benefit of converging?

I think it would be nice to have a single API to query what kind of device an array lives on, across the ecosystem. I think a truly lightweight package that defines "array traits" would be a good thing to have, long term. Packages like CUDA, AMDGPU, and oneAPI could then implement those traits together with their array types. And KernelAbstractions (for example) could implement support for those device types in the vendor-specific kernel packages.
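Purely as an illustration (every name below is invented; no such package exists), such a trait-core package might look roughly like:

# Hypothetical "array device traits" core package; names are made up.
module ArrayDeviceCore

abstract type AbstractComputeDevice end
struct CPUDevice <: AbstractComputeDevice end
abstract type AbstractGPUDevice <: AbstractComputeDevice end

# Vendor packages (CUDA, AMDGPU, oneAPI) would add their own GPU device
# subtypes and methods for their array types.
function compute_device end
compute_device(A::AbstractArray) = compute_device(typeof(A))
compute_device(::Type{<:Array}) = CPUDevice()

end # module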

IMHO the occurrence of many @require blocks in the ecosystem is often a sign of a missing lightweight "interface-only" package (like ChainRulesCore) that people are willing to depend on.

Just long-term ramblings ... for now I'll be happy to have get_device in KA.
