Add CPUTuple <: AbstractCPU as new device type #131
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #131      +/-   ##
==========================================
+ Coverage   85.45%   85.47%   +0.02%
==========================================
  Files           9        9
  Lines        1409     1411       +2
==========================================
+ Hits         1204     1206       +2
  Misses        205      205
==========================================
```
Continue to review the full report at Codecov.
So I know that you can't go from `buffer = Ref{NTuple{N,T}}()` to `a = SArray{S,T,N,L}(buffer[])`... does the dereferencing always reallocate that entire buffer to create `a`?
No, sometimes LLVM is in a good mood.
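For anyone who wants to check this themselves, here is a minimal sketch (assuming StaticArrays and InteractiveUtils are available; `roundtrip` is just an illustrative name) that makes it easy to inspect whether the dereference-and-construct copy gets elided:

```julia
using StaticArrays, InteractiveUtils

# Construct an SVector from the dereferenced Ref; whether the intermediate
# tuple copy survives is visible in the emitted IR (look for alloca/memcpy).
roundtrip(buffer::Base.RefValue{NTuple{4,Float64}}) = SVector{4,Float64}(buffer[])

buf = Ref(ntuple(Float64, Val(4)))   # (1.0, 2.0, 3.0, 4.0)
@code_llvm debuginfo=:none roundtrip(buf)
```

Whether the copy disappears tends to depend on the element count and the surrounding code, which matches the "good mood" observation above.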
So is the issue that we want to be able to do something like...
I'll redefine `VectorizationBase.memory_reference` to return a tuple containing the "ptr" and the memory, something like:

```julia
@inline memory_reference(A::BitArray) = (Base.unsafe_convert(Ptr{Bit}, A.chunks), A.chunks)
@inline memory_reference(A::AbstractArray) = memory_reference(device(A), A)
@inline memory_reference(::CPUPointer, A) = (pointer(A), preserve_buffer(A))
@inline function memory_reference(::CPUTuple, A::AbstractArray{T}) where {T}
    ref = Ref(A)
    Base.unsafe_convert(Ptr{T}, ref), ref
end
```

Then the suggested use would be something like:

```julia
# either
ptra, presa = stridedpointer_and_buffer(A);
ptrb, presb = stridedpointer_and_buffer(B);
# or
(ptra, ptrb), (presa, presb) = groupedstridedpointer((A, B), (#= description of axis similarity =#));

GC.@preserve presa presb begin
    # use ptra and ptrb
end
```

This would allow […]. However, the […]. I'll make a breaking change to […].
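To make the intent of the `CPUTuple` method concrete, here is a self-contained sketch of the same Ref-then-pointer pattern (the function name is made up for illustration and this is not the package's API): copy the isbits array into a `Ref`, take a raw pointer to it, and keep the `Ref` rooted while reading through the pointer.

```julia
using StaticArrays  # only needed for the usage example at the bottom

# Copy the isbits array into a Ref so it has a memory address, take a raw
# pointer to that Ref, and keep the Ref rooted with GC.@preserve while
# reading through the pointer.
function sum_via_ref_pointer(A::AbstractArray{T}) where {T}
    isbits(A) || throw(ArgumentError("expected an isbits (tuple/struct-backed) array"))
    ref = Ref(A)  # heap-allocated box holding a copy of A
    ptr = convert(Ptr{T}, Base.unsafe_convert(Ptr{Cvoid}, ref))
    s = zero(T)
    GC.@preserve ref begin
        for i in 1:length(A)
            s += unsafe_load(ptr, i)
        end
    end
    return s
end

sum_via_ref_pointer(@SVector [1.0, 2.0, 3.0])  # == 6.0
```

The second element returned by `memory_reference` plays the role of `ref` here: whatever object owns the memory has to stay inside `GC.@preserve` for as long as the pointer is in use.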
LGTM
Related to #130
I'm open to suggestions on a better name, e.g., if someone were to define a struct-backed array type (the original code example is not shown here; see the hypothetical sketch below), it should also be a `CPUTuple`. So maybe something like `CPUStruct` is a better name? Or `CPUStructMemory`? Or `CPULLVMArray`? Most homogeneous tuples lower to LLVM Arrays.

The idea is to represent something that should really be `CPUPointer`, except for the niggling detail that Julia semantics don't allow us to get a pointer to it, while C(++) would allow us to just use the `&` operator to get the address. Julia's semantics actually more closely match LLVM there, and what Clang does when you `&` is basically the same as calling `Ref` in Julia and then using that `Ptr`, except maybe Clang optimizes away the copy more consistently for some reason? That'll take some more exploration; I just recall that in my tests some time ago Julia often failed to optimize them away, but I didn't actually check whether a similar example in C succeeds.

So the `CPUTuple` type means to say that this is what we should do. That's to distinguish it from other `AbstractArray` representations that aren't actually backed by memory underlying their loads, like `Fill` arrays, ranges, etc.
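As a purely hypothetical illustration of the kind of definition alluded to above (the type name and fields are invented, not taken from the PR): a fixed-size vector stored as plain struct fields has no `pointer` method, yet a `Ref` of it does have an address, which is exactly the situation `CPUTuple` (or `CPUStruct`) is meant to describe.

```julia
# Hypothetical example: an AbstractVector backed by struct fields rather than a
# heap buffer. `pointer(v)` is undefined for it, but `Ref(v)` has an address,
# so it would be classified as CPUTuple (or CPUStruct / CPUStructMemory).
struct Vec3{T} <: AbstractVector{T}
    x::T
    y::T
    z::T
end
Base.size(::Vec3) = (3,)
Base.getindex(v::Vec3{T}, i::Int) where {T} = getfield(v, i)::T

v = Vec3(1.0, 2.0, 3.0)
v[2]  # 2.0, loaded from a struct field, not from pointer-backed memory
```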