Limit time held by CUDNN locks. #1491

Merged
maleadt merged 3 commits into master from tb/cudnn_hotfix
May 9, 2022

maleadt (Member) commented Apr 29, 2022

Hotfix for #1461, introduced by #1430. See #1461 (comment) and JuliaLang/julia#38487.

maleadt added 3 commits May 9, 2022 10:11
CUDNN locks inhibit finalizers while held, resulting in OOMs.
Locks taken in a finalizer should never be contended,
as a ReentrantLock inhibits finalizers when it is
taken from a non-finalizer task.
maleadt force-pushed the tb/cudnn_hotfix branch from d35ad58 to 92873e4 May 9, 2022 08:41
maleadt (Member, Author) commented May 9, 2022

I've added some small clean-ups: use of @spinlock shouldn't be required anymore, since locking now inhibits finalizers (the very crux of this issue), which should mean that locks taken in finalizers are always uncontended.

codecov bot commented May 9, 2022

Codecov Report

Merging #1491 (92873e4) into master (554dcc4) will increase coverage by 0.06%.
The diff coverage is 96.92%.

@@            Coverage Diff             @@
##           master    #1491      +/-   ##
==========================================
+ Coverage   72.64%   72.70%   +0.06%     
==========================================
  Files         131      131              
  Lines        9805     9812       +7     
==========================================
+ Hits         7123     7134      +11     
+ Misses       2682     2678       -4     
| Impacted Files | Coverage Δ |
|---|---|
| lib/cudnn/CUDNN.jl | 37.50% <0.00%> (ø) |
| lib/utils/threading.jl | 78.94% <ø> (+4.87%) ⬆️ |
| lib/cudadrv/memory.jl | 78.59% <100.00%> (ø) |
| lib/cudnn/convolution.jl | 74.75% <100.00%> (+5.43%) ⬆️ |
| lib/cudnn/descriptors.jl | 100.00% <100.00%> (ø) |
| lib/utils/cache.jl | 87.50% <100.00%> (-1.39%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 554dcc4...92873e4.

maleadt merged commit dc8a11e into master May 9, 2022
maleadt deleted the tb/cudnn_hotfix branch May 9, 2022 12:28