Support for JLD2 #1833
Comments
JLD2 has never really been supported. I guess the fact it worked was just sheer luck? In any case, I'm not familiar with JLD2, so I'll defer to anybody who is to take a look 🙂
Hi @denizyuret, from the perspective of
Alas, my hope was short-lived :( I get the same error with CUDA v4.1.2.
I still can't reproduce your error. (I tried Julia 1.8.5 and 1.9.0-rc1 with CUDA 3.13.1 and JLD2 v0.4.31.)
Can you send your CUDA.versioninfo() output so I can see what the difference may be? (Library/driver version, GPU type, etc. could be a factor.)
I tried JLD2.writeas(), JLD2.wconvert(), and JLD2.rconvert() as you suggested. Now I get the following error message:
What is "refcount"? What purpose does it serve? How can one alter its value, if altering it is necessary?
This here (and also the refcount) makes me think that this is a problem with the memory management when creating the CuArray. JLD2 allocates the underlying array and passes it to the
The f() function you suggested works without problems. refcount of the resulting array is 1.
CuArray copies the contents of data (stored in RAM) to the GPU memory, and once the GPU array is constructed I don't think it cares about what happens to the RAM array. But I am not sure what refcount is for and how it is set, so I may be talking nonsense. For example, if I manually change the value of refcount to 0, things don't break. @maleadt any idea how refcount=0 may appear and whether it may be the source of our problems?
The refcount field is to keep track of the underlying buffer, so that multiple CuArrays can share the same memory (e.g., when you take a view, or reinterpret an array, or reshape it). refcount=0 may happen when you're serializing a freed array.
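To make the idea of a shared, reference-counted buffer concrete, here is a toy sketch in plain Julia. It is not CUDA.jl's actual implementation: `ToyBuffer`, `ToyArray`, `toy_reshape`, and `release!` are hypothetical names invented for illustration, and a `Vector{UInt8}` stands in for device memory.

```julia
# Toy model only: NOT CUDA.jl internals. All names here are hypothetical.
mutable struct ToyBuffer
    bytes::Vector{UInt8}   # stands in for the device allocation
    refcount::Int          # number of live arrays sharing `bytes`
end

struct ToyArray
    buf::ToyBuffer
    dims::Tuple{Vararg{Int}}
end

# Creating an array from a fresh buffer takes the first reference.
ToyArray(nbytes::Int) = ToyArray(ToyBuffer(zeros(UInt8, nbytes), 1), (nbytes,))

# A reshape shares the same buffer and bumps the count.
function toy_reshape(a::ToyArray, dims::Int...)
    prod(dims) == prod(a.dims) || throw(DimensionMismatch("size must not change"))
    a.buf.refcount += 1
    return ToyArray(a.buf, dims)
end

# Releasing an array drops one reference; the memory goes away at zero.
function release!(a::ToyArray)
    a.buf.refcount -= 1
    if a.buf.refcount == 0
        empty!(a.buf.bytes)
    end
    return a.buf.refcount
end

a = ToyArray(12)
b = toy_reshape(a, 3, 4)   # a and b now share one buffer
release!(a)                # buffer stays alive, b still uses it
release!(b)                # last reference gone, buffer is emptied
```

In this model, an array whose buffer shows a count of 0 is one whose memory has already been released, which matches the "serializing a freed array" reading above.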
Thank you for this info.
Hmm, I was misunderstanding how JLD2 serializes objects. If we're really just calling
Yeah, that's the curious bit. Let me summarize it quickly:
We define a struct When you give Therefore, with this code, we store the data in
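For readers without the original snippets, the general shape of a JLD2 custom-serialization hook for GPU arrays can be sketched as follows. This is an assumption-laden illustration, not the code posted in this thread: `HostCopy` is a hypothetical wrapper name, the example needs a working GPU, and whether such a round trip loads cleanly is exactly what this issue is investigating.

```julia
using CUDA, JLD2

# Hypothetical host-side stand-in that JLD2 writes instead of the CuArray.
struct HostCopy{T,N}
    data::Array{T,N}
end

# Tell JLD2 to store any CuArray{T,N,D} as a HostCopy{T,N}.
JLD2.writeas(::Type{CuArray{T,N,D}}) where {T,N,D} = HostCopy{T,N}

# On save: copy the device memory into an ordinary host Array.
JLD2.wconvert(::Type{HostCopy{T,N}}, x::CuArray{T,N,D}) where {T,N,D} =
    HostCopy{T,N}(Array(x))

# On load: upload the host data into a newly allocated device array.
JLD2.rconvert(::Type{CuArray{T,N,D}}, x::HostCopy{T,N}) where {T,N,D} =
    CuArray(x.data)

# Round trip, only attempted when a usable GPU is present.
if CUDA.functional()
    path = joinpath(mktempdir(), "x.jld2")
    x = CUDA.rand(Float32, 4)
    jldsave(path; x)
    y = jldopen(f -> f["x"], path, "r")
end
```

With hooks like these, only the host `Array` ever reaches the file, so the loaded `CuArray` is expected to own a fresh buffer rather than reuse the original one.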
The fact that the deserialized object contains a different buffer pointer indicates that the @denizyuret since only you seem to be able to reproduce this, I'd add some logging to the CuArray finalizer that decrements the refcount, to see when and from where it gets run (e.g. by adding
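As a generic illustration of finalizer logging (not the specific change to CUDA.jl proposed above, which is cut off here), the sketch below attaches a logging finalizer to a hypothetical `Tracked` object. It uses `Core.println` because regular `println` may switch tasks, which is not allowed inside a finalizer; `Tracked`, `track!`, and `FINALIZED` are made-up names.

```julia
# Generic Julia sketch; `Tracked`, `track!` and `FINALIZED` are hypothetical.
mutable struct Tracked
    id::Int
end

# Record of which objects have been finalized, in order.
const FINALIZED = Int[]

# Attach a finalizer that records and logs its own execution.
function track!(obj::Tracked)
    finalizer(obj) do o
        push!(FINALIZED, o.id)
        # Core.println avoids the task switch that regular println can cause.
        Core.println("finalizing Tracked #", o.id)
    end
    return obj
end

t = track!(Tracked(1))
finalize(t)   # run the finalizer now instead of waiting for the GC
```

Calling `finalize` forces the finalizer to run immediately, which makes the log deterministic; under normal operation the garbage collector decides when it runs.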
Here is what I do to be able to save/load CuArrays with JLD2 files:
This used to work with CuArray{T,N} but no longer works with CuArray{T,N,D}. Here is the error I get:
When I compare the original array with the loaded version they seem similar except for the refcount:
Finally, if I assign the value read to a global variable in rconvert it works without any errors: