Add function get_device #269

Merged: 14 commits, merged Mar 2, 2022

Conversation

@oschulz (Collaborator) commented Oct 7, 2021

Closes #268

@vchuravy (Member) commented Oct 7, 2021

bors try

bors bot added a commit that referenced this pull request Oct 7, 2021
bors bot (Contributor) commented Oct 7, 2021

try

Build failed:

Review thread on src/KernelAbstractions.jl (outdated, resolved)
@vchuravy (Member) commented Oct 7, 2021

@jpsamaroo can you unbork AMDGPU on nightly?

@oschulz (Collaborator, Author) commented Oct 7, 2021

In the examples, we could now simplify code like

    if isa(a, Array)
        kernel! = naive_transpose_kernel!(CPU(),4)
    else
        kernel! = naive_transpose_kernel!(CUDADevice(),256)
    end

to

    device = KernelAbstractions.get_device(A)
    n = isa(A, AbstractGPUArray) ? 256 : 4
    kernel! = naive_transpose_kernel!(device,n)

but it would require a dependency (at least for the example) on GPUArrays.jl.

I could also add a function isgpuarray (or similar), then we could do

    device = KernelAbstractions.get_device(A)
    n = KernelAbstractions.isgpuarray(A) ? 256 : 4
    kernel! = naive_transpose_kernel!(device,n)

without additional dependencies. What do you think?

@vchuravy (Member) commented Oct 7, 2021

Hm, we could also add a default static launch size, but it is just an example. What I would do there is not check for GPU-ness, but check for Array.

@oschulz (Collaborator, Author) commented Oct 7, 2021

but check for Array

But what if the user passes a SubArray or similar?

Hm we could also add a default static launch size

My thinking was that the author of naive_transpose!(a, b) can judge best how much parallelism would make sense, given the input size. Now they just need to know what's possible on the hardware side - do we have a cross-vendor API function to query device capabilities?

@oschulz (Collaborator, Author) commented Oct 7, 2021

But as for the example - silly me, we can just do

    device = KernelAbstractions.get_device(A)
    n = device isa GPU ? 256 : 4
    kernel! = naive_transpose_kernel!(device, n)

Very literate, I think. :-)
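
A minimal end-to-end sketch of that pattern, for context (assuming the event-based launch API KernelAbstractions used at the time; the kernel body and sizes are illustrative, not the PR's example code, and it runs on the CPU so no GPU package is needed):

    using KernelAbstractions

    # Naive transpose: each work item copies one element into the transposed slot.
    @kernel function naive_transpose_kernel!(b, @Const(a))
        i, j = @index(Global, NTuple)
        @inbounds b[j, i] = a[i, j]
    end

    a = rand(Float32, 1024, 1024)
    b = similar(a, reverse(size(a)))

    device = KernelAbstractions.get_device(a)        # CPU() for a plain Array
    n = device isa KernelAbstractions.GPU ? 256 : 4  # workgroup size
    kernel! = naive_transpose_kernel!(device, n)
    wait(kernel!(b, a, ndrange = size(a)))           # kernels return an event to wait on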

@oschulz (Collaborator, Author) commented Oct 7, 2021

I changed the examples accordingly.

@oschulz (Collaborator, Author) commented Oct 8, 2021

Looks like bors is still running. :-)

@vchuravy (Member) commented Oct 8, 2021

bors try

bors bot added a commit that referenced this pull request Oct 8, 2021
bors bot (Contributor) commented Oct 8, 2021

@oschulz (Collaborator, Author) commented Oct 14, 2021

@vchuravy , does this design look better to you now?

@vchuravy (Member) left a review comment

I guess so. I am not a fan that this requires us to pull LinearAlgebra and SparseArrays in, even though these are currently stdlibs. I would rather have the question of "what is your storage array" be solved somewhere like Adapt.jl.
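
To illustrate the dependency concern, a hypothetical sketch of the kind of per-wrapper unwrapping methods that pull those stdlibs in - this is not the PR's actual method list, just an illustration:

    using KernelAbstractions: CPU
    using LinearAlgebra: Adjoint, Transpose
    using SparseArrays: SparseMatrixCSC

    # Without an Adapt.jl-style "what is your storage array" query, every wrapper
    # type needs its own method that recurses into the underlying storage.
    get_device(A::Array) = CPU()
    get_device(A::SubArray) = get_device(parent(A))
    get_device(A::Adjoint) = get_device(parent(A))
    get_device(A::Transpose) = get_device(parent(A))
    get_device(A::SparseMatrixCSC) = get_device(A.nzval)  # device of the stored values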

Review thread on src/KernelAbstractions.jl (outdated, resolved)
@vchuravy (Member) commented

bors try

bors bot added a commit that referenced this pull request Oct 17, 2021
bors bot (Contributor) commented Oct 17, 2021

try

Build failed:

@oschulz (Collaborator, Author) commented Oct 17, 2021

I would rather have the question of "what is your storage array" be solved somewhere like Adapt.jl

Yes, that would be nice - I've also been thinking that Adapt.jl could, in the future, get a "read/inspect" functionality in addition to its "write/transform" functionality. In the end, the designer of a data structure knows best what "underlying" data should mean (especially if there are several arrays), so the clean way of doing a recursive "where is this stored?" query would be to implement an "inspect" companion to adapt().
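
A rough sketch of what such an "inspect" companion could look like - all names here are hypothetical, not an existing Adapt.jl API: the author of a composite type declares which field holds the underlying data, and a recursive query walks down to the storage array:

    # Hypothetical read-only counterpart to adapt(); names are illustrative only.
    storage_array(a::Array) = a
    storage_array(a::SubArray) = storage_array(parent(a))

    struct Measurements{T<:AbstractVector}
        samples::T
        label::String
    end

    # The author of the type decides what its "underlying" data is.
    storage_array(m::Measurements) = storage_array(m.samples)

    m = Measurements(view(rand(100), 1:10), "run 1")
    storage_array(m) isa Array  # true; a GPU-backed samples field would recurse to its device array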

@oschulz (Collaborator, Author) commented Oct 18, 2021

Is bors hanging?

@vchuravy (Member) commented

bors try

bors bot added a commit that referenced this pull request Oct 18, 2021
bors bot (Contributor) commented Oct 18, 2021

try

Build failed:

@oschulz (Collaborator, Author) commented Oct 19, 2021

Is bors stalled again?

@vchuravy (Member) commented

You can click on the red x next to the "Try #": https://github.com/JuliaGPU/KernelAbstractions.jl/runs/3927630339

@jpsamaroo (Member) commented

ROCm CI got hung after I updated the local buildkite repo; I've restarted the runners, which are now processing jobs.

@oschulz (Collaborator, Author) commented Oct 19, 2021

Thanks!

@oschulz (Collaborator, Author) commented Feb 14, 2022

I think the remaining test failures are unrelated to this PR (correct?).

@oschulz (Collaborator, Author) commented Feb 15, 2022

I saw JuliaGPU/AMDGPU.jl#187 got merged with a new AMDGPU.jl release. Should we try bors again?

@oschulz (Collaborator, Author) commented Feb 16, 2022

@jpsamaroo , could you kick bors again, with the new AMD fixes?

@jpsamaroo (Member) commented

@oschulz you would need to rebase on #288, although that PR isn't yet CI-green.

@oschulz (Collaborator, Author) commented Feb 16, 2022

rebase on #288, although that PR isn't yet CI-green.

Ah, OK - maybe it's easiest to just wait until #288 is in, then?

Review thread on Project.toml (outdated, resolved)
@oschulz (Collaborator, Author) commented Mar 1, 2022

I saw you made some changes, @vchuravy. Do I need to do anything in addition?

(Am I correct that the current test failures are unrelated to this PR?)

@vchuravy (Member) commented Mar 2, 2022

The ROCm failure looks related; only Julia nightly is currently expected to fail.

@vchuravy merged commit 5c1444f into JuliaGPU:master on Mar 2, 2022
@vchuravy (Member) commented Mar 2, 2022

Thank you Oliver!

@oschulz (Collaborator, Author) commented Mar 3, 2022

Thank you Valentin!

Development

Successfully merging this pull request may close these issues.

Adding a function to get device from array (type)?