Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove Manifest #1657

Merged
merged 3 commits into from
Jul 10, 2021
Merged

remove Manifest #1657

merged 3 commits into from
Jul 10, 2021

Conversation

CarloLucibello
Copy link
Member

let's be free of the Manifest

mcabbott
mcabbott previously approved these changes Jul 10, 2021
Copy link
Member

@mcabbott mcabbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

bors bot added a commit that referenced this pull request Jul 10, 2021
1657: remove Manifest r=mcabbott a=CarloLucibello

let's be free of the Manifest


Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
@bors
Copy link
Contributor

bors bot commented Jul 10, 2021

Build failed:

@mcabbott
Copy link
Member

mcabbott commented Jul 10, 2021

Failure might be new to this PR. It involves

[4] ctc_alpha(ŷ::CuArray{Float64, 2}, y::Vector{Int64})
  @ Flux.Losses /var/lib/buildkite-agent/builds/rtx4000-gpuci4-julia-csail-mit-edu/julialang/flux-dot-jl/src/losses/ctc-gpu.jl:211

at /test/ctc-gpu.jl:26. Looks odd to call this with a Vector & a CuArray, but seemingly worked before? Paging @maetshju

@testset "ctc-gpu" begin
  x = rand(10, 50)
  y = rand(1:9, 30)
  x_cu = CuArray(x)
  g1 = gradient(ctc_loss, x_cu, y)[1]

Last pass I can see was on CUDA v3.2.1, NNlibCUDA v0.1.3
this failure is on CUDA v3.3.3, NNlibCUDA v0.1.6

@CarloLucibello
Copy link
Member Author

yeah I'm seeing the same failure in #1516

@mcabbott
Copy link
Member

mcabbott commented Jul 10, 2021

Oh right. Slightly different stack trace but this line appears. CUDA v3.2.1 & NNlibCUDA v0.1.3, which is what the manifest there specifies.

Tests have had CUDA.allowscalar(false) for at least a year. But most recent passing GPU test runs seem to have many "Warning: Performing scalar operations on GPU arrays", e.g. here:
https://buildkite.com/julialang/flux-dot-jl/builds/1257#d95b78a4-6453-48cc-8891-eeacb4cb8b37

@CarloLucibello
Copy link
Member Author

Even removing the ctc test I see those errors everywhere. cc @maleadt

@DhairyaLGandhi
Copy link
Member

DhairyaLGandhi commented Jul 10, 2021

Errors are due to kernels doing scalar indexing disallowed by new cuda and are tracked in #1648

@mcabbott
Copy link
Member

mcabbott commented Jul 10, 2021

Ok, I've updated the title.

This PR should perhaps just add an upper bound to CUDA? Then #1648 will mark tests broken on the new version.

@CarloLucibello
Copy link
Member Author

Even removing the ctc test I see those errors everywhere. cc @maleadt

actually, that's wrong, there was a problem with my cuda setup.

I fixed the ctc_loss, it shouldn't be doing scalar indexing now and tests should pass

@@ -10,6 +10,7 @@
@test outputsize(m, (10,); padbatch=true) == (2, 1)
@test outputsize(m, (10, 30)) == (2, 30)

@info "Don't mind the following error, it's for testing purpose."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@CarloLucibello
Copy link
Member Author

@mcabbott need an approval again

Copy link
Member

@mcabbott mcabbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

@bors
Copy link
Contributor

bors bot commented Jul 10, 2021

Build succeeded:

@bors bors bot merged commit 46b73a8 into master Jul 10, 2021
@CarloLucibello CarloLucibello deleted the cl/manifest branch April 7, 2022 07:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants