Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CUFFT] Preallocate a buffer for complex-to-real FFT #2578

Merged
merged 8 commits into from
Dec 14, 2024

Conversation

Copy link
Member

@maleadt maleadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of minor comments. CI failures are also related.

lib/cufft/fft.jl Outdated Show resolved Hide resolved
lib/cufft/fft.jl Outdated Show resolved Hide resolved
Copy link

codecov bot commented Dec 13, 2024

Codecov Report

Attention: Patch coverage is 95.65217% with 2 lines in your changes missing coverage. Please review.

Project coverage is 73.69%. Comparing base (ca8f6cf) to head (6f82697).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
lib/cufft/fft.jl 95.65% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2578      +/-   ##
==========================================
+ Coverage   73.67%   73.69%   +0.01%     
==========================================
  Files         156      156              
  Lines       15030    15048      +18     
==========================================
+ Hits        11074    11090      +16     
- Misses       3956     3958       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -25,32 +25,34 @@ Base.:(*)(p::ScaledPlan, x::DenseCuArray) = rmul!(p.p * x, p.scale)

# N is the number of dimensions

mutable struct CuFFTPlan{T<:cufftNumber,S<:cufftNumber,K,inplace,N} <: Plan{S}
mutable struct CuFFTPlan{T<:cufftNumber,S<:cufftNumber,K,inplace,N,R,B} <: Plan{S}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dedicated typevar instead of simply Union{Nothing,CuArray} seems overkill, but I guess you want to eliminate the partial type information?

Copy link
Member Author

@amontoison amontoison Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and I also take this PR as an opportunity to have a concrete type for region in the structure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I trust you have good reasons for this, but it's not always advantageous to do so. The CUTENSOR wrappers, for example, cause excessive specialization for little gain, since these calls are very coarse-grained anyway (yet leading to very long test times). In this case, though, the specialization is already there (since B doesn't introduced new information already contained in inplace and T).

Copy link
Member Author

@amontoison amontoison Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't changed the dispatch for the high-level FFT routines, so I didn't expect any slowdown. I simply made the low-level structure fully inferable.
I'll run a quick regression test for fft/ifft before and after this PR to check if there's any noticeable difference.

I didn't test it without the type B in the structure, but I observed a nice speed-up in my code when adding the buffer inside for rfft/irfft.
Julia also stopped crashing due to OOM issues.

lib/cufft/fft.jl Outdated Show resolved Hide resolved
Co-authored-by: Tim Besard <tim.besard@gmail.com>
@amontoison amontoison enabled auto-merge (squash) December 14, 2024 11:31
@amontoison amontoison merged commit 19a08ef into JuliaGPU:master Dec 14, 2024
1 check passed
@amontoison amontoison deleted the cufft_buffer branch December 14, 2024 15:59
maleadt referenced this pull request Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants