Reduce precision conversion when packing #124
Conversation
Thanks Thomas!! I always wanted to clean this up but didn't find the chance to do it. I think it makes sense to enforce this conversion inside the cuda kernels, but I'm also fine with your current way in Python. The test failed because of the format issue. You can fix it with …
…already broken since it was reverted
Cool, I updated the kernels to support that convention. I guess we can also have a conversion whether …
Also, a few tests are broken due to the removal of …
Sorry for the late reply; I got distracted by traveling. Thanks for the update. You raised a good point about the #samples. In the example code I was using up to 2^20 samples, which is far from the GPU memory limit. I didn't test it, but 2^32 is likely feasible with a small network. In any case, I agree int64 is safer for … I'm going to merge this PR now; feel free to open a new one if you want.
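To make the #samples point concrete, here is a small illustrative check (the 2^20 and 2^32 figures come from the comment above; the tensor names are assumptions from this thread, not the library's actual API):

```python
import torch

# int32 offsets overflow past 2**31 - 1 samples, so sample counts at the
# 2**32 scale mentioned above would not fit; int64 (Long) would.
INT32_MAX = 2**31 - 1
print(2**20 <= INT32_MAX)  # True:  the scale used in the example code
print(2**32 <= INT32_MAX)  # False: would overflow an int32 offset

# Keeping ray_indices (assumed name) as Long also matches torch's usual
# index dtype, so per-sample gathers work without an extra cast:
ray_indices = torch.randint(0, 1024, (2**20,), dtype=torch.long)
values = torch.randn(1024)
gathered = values[ray_indices]  # shape: (2**20,)
```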
Assume that:

- `ray_indices` are always Long. Indices are always long in torch; this makes indexing much easier.
- `packed_info` are always Int.

Depending on if you agree, we could enforce this convention below as well (in cuda kernels).
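A minimal sketch of what this convention could look like on the Python side, assuming `packed_info` holds per-ray `(start, n_samples)` pairs derived from `ray_indices` (an assumption based on this thread, not a confirmed layout):

```python
import torch

# ray_indices: per-sample ray id, kept as int64 (Long) so it can be used
# directly for indexing.
ray_indices = torch.tensor([0, 0, 1, 2, 2, 2], dtype=torch.long)
n_rays = 3

# packed_info: per-ray (start, n_samples), kept as int32 (Int).
counts = torch.bincount(ray_indices, minlength=n_rays).int()  # samples per ray
starts = torch.cumsum(counts, 0, dtype=torch.int32) - counts  # exclusive prefix sum
packed_info = torch.stack([starts, counts], dim=-1)           # (n_rays, 2), int32

# Long ray_indices index per-ray tensors with no dtype conversion:
ray_origins = torch.randn(n_rays, 3)
per_sample_origins = ray_origins[ray_indices]  # (n_samples, 3)
```

Enforcing the same convention inside the cuda kernels, as suggested above, would then amount to expecting `int64_t` for `ray_indices` and `int32_t` for `packed_info` at the kernel boundary.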