Get benchmarks working again #186

Merged
thomasfaingnaert merged 4 commits into master from tf/fix-benchmarks
Jan 29, 2024

thomasfaingnaert
Member

No description provided.

@maleadt
Member

maleadt commented Jan 24, 2024

Hmm, this is unfortunate. It's only there for the validation, right? Can't we do something lazy, like https://github.com/JuliaGPU/GPUCompiler.jl/blob/962b84ed84cffc268ce1c28b1c71037566622f24/src/reflection.jl#L60-L62 ?

@thomasfaingnaert
Member Author

The Octavian dependency is only in benchmarks/Project.toml, not in the root Project.toml, so that doesn't matter, right?

@maleadt
Member

maleadt commented Jan 24, 2024

Ah, I missed that. Carry on.

@thomasfaingnaert
Member Author

Benchmarks bump into an illegal memory access now, which is weird, because the configuration where it happens should also be run in CI, and it's fine there:

[ Info: Running benchmark WMMA GEMM Float16*Float16+Float32=Float32 (256×256) · (256×256) (TN) Block (256, 64, 64) Warps (2, 2) OP (16, 16, 16)...
ERROR: LoadError: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
throw_api_error(res::CUDA.cudaError_enum)
   @ CUDA ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/lib/cudadrv/libcuda.jl:27
check
   @ ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/lib/cudadrv/libcuda.jl:34 [inlined]
cuCtxSynchronize
   @ ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/lib/utils/call.jl:26 [inlined]
profile_internally(f::var"#473#profiled_code#24"{Configuration, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float16, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float16, 2, CUDA.Mem.DeviceBuffer}}; concurrent::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ CUDA.Profile ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/src/profile.jl:269
profile_internally(f::Function)
   @ CUDA.Profile ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/src/profile.jl:239
macro expansion
   @ ~/.cache/julia-buildkite-plugin/depots/3105e5d3-28f0-4cf0-b90b-02786f04b8f6/packages/CUDA/htRwP/src/profile.jl:62 [inlined]
   @ /var/lib/buildkite-agent/builds/ripper/julialang/gemmkernels-dot-jl/benchmarks/runbenchmarks.jl:120
include(fname::String)
   @ Base.MainInclude ./client.jl:478
   @ none:10
🚨 Error: The command exited with status 1

@maleadt
Member

maleadt commented Jan 24, 2024

> Benchmarks bump into an illegal memory access now, which is weird, because the configuration where it happens should also be run in CI

Benchmarks don't run with --check-bounds=yes.
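
For context: Pkg.test passes --check-bounds=yes by default, so if CI runs the tests that way, it compiles different code than a plain benchmark run does. A quick way to confirm which mode a given run uses is to print the flag at the top of the script. This is a minimal sketch, not code from this PR; `bounds_check_mode` is a hypothetical helper name.

```julia
# Hypothetical helper (not part of GemmKernels.jl): report the --check-bounds
# mode that this Julia session was started with.
function bounds_check_mode()
    flag = Base.JLOptions().check_bounds  # 0 = default, 1 = yes, 2 = no
    return flag == 1 ? "yes" : flag == 2 ? "no" : "default"
end

println("--check-bounds mode: ", bounds_check_mode())
```

Printing this from both the test run and the benchmark run would show whether the two are really compiling the kernel under the same settings.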

thomasfaingnaert changed the title from "Add Octavian to benchmark Project.toml" to "Get benchmarks working again" on Jan 24, 2024

@thomasfaingnaert
Member Author

I can only reproduce the illegal memory access locally on 1.9, seems to be fixed in 1.10. Let's bump benchmarks to Julia 1.10 and see if that fixes things.

@maleadt
Member

maleadt commented Jan 24, 2024

> I can only reproduce the illegal memory access locally on 1.9, seems to be fixed in 1.10. Let's bump benchmarks to Julia 1.10 and see if that fixes things.

image

@thomasfaingnaert
Member Author

thomasfaingnaert commented Jan 26, 2024

@maleadt Expired GH token?

2024-01-26 18:56:26 CEST	ERROR: LoadError: Error found in GitHub reponse:
2024-01-26 18:56:26 CEST		Status Code: 422
2024-01-26 18:56:26 CEST		Message: Validation Failed
2024-01-26 18:56:26 CEST		Docs URL: https://docs.github.com/rest/issues/comments#create-an-issue-comment
2024-01-26 18:56:26 CEST		Errors: Any[Dict{String, Any}("message" => "Body is too long (maximum is 65536 characters)", "field" => "data", "code" => "unprocessable", "resource" => "IssueComment")]

EDIT: Never mind, I missed the actual error message: the comment body is too long.
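
One possible workaround is to trim the report before posting it. The sketch below assumes the benchmark report is built as a single string; `truncate_comment`, `MAX_COMMENT_CHARS`, and the notice text are hypothetical and not code from this PR. The error only says "65536 characters", so the sketch counts Julia characters. Whether GitHub counts them the same way is an assumption.

```julia
# Maximum comment length reported by the GitHub API error above.
const MAX_COMMENT_CHARS = 65536

# Hypothetical helper: shorten `body` so it fits in a single issue comment,
# appending a short notice whenever something was cut off.
function truncate_comment(body::AbstractString;
                          limit::Integer = MAX_COMMENT_CHARS,
                          notice::AbstractString = "\n\n(output truncated)")
    length(body) <= limit && return String(body)
    keep = limit - length(notice)  # characters left for the original text
    return first(body, keep) * notice
end
```

Calling `truncate_comment(report)` just before creating the comment would keep the request under the limit. Splitting the report across several comments is another option this sketch does not cover.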

@thomasfaingnaert
Member Author

Let's just merge this for now so we at least have some form of benchmarking. I'll keep track of the remaining problems in an issue.

thomasfaingnaert merged commit 480924c into master on Jan 29, 2024
3 of 5 checks passed
thomasfaingnaert deleted the tf/fix-benchmarks branch on January 29, 2024, 10:09