-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nearest neighbor operation #58
Conversation
raimis
commented
May 18, 2022
•
edited
Loading
edited
- Move code from TorchMD-NET (POC: neighbor search torchmd/torchmd-net#61)
- Integration
- Tests
- Documentation
@peastman could you review? The PBC (torchmd/torchmd-net#92) will be add with a separate PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! My comments at torchmd/torchmd-net#61 (comment) still apply. For very small molecules, this should be as fast as anything. Once you get up to several hundred atoms, there are ways it could be made faster. Having each thread loop over multiple atoms or having multiple threads work together could reduce the number of row/column computations, position loads, and atomic adds.
If `max_num_neighbors == -1` (default), all the atom pairs are returned, | ||
(i.e. `num_atoms * (num_atoms + 1) / 2` pairs). If `max_num_neighbors > 0`, | ||
only `num_atoms * max_num_neighbors` pairs are returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit ambiguous. Is it saying that if max_num_neighbors
is -1 then the cutoff is ignored and all pairs are returned regardless of distance? Or is it simply returning the result in a different shape tensor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I improved the text and added a few example.
------- | ||
neighbors: `torch.Tensor` | ||
Atom pair indices. The shape of the tensor is `(2, num_pairs)`. | ||
If an atom pair is separated by a larger distance than the cutoff, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is num_pairs
? Does it depend on the data? Or is it always the same regardless of positions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defined num_pairs
template<> __device__ __forceinline__ float sqrt_(float x) { return ::sqrtf(x); }; | ||
template<> __device__ __forceinline__ double sqrt_(double x) { return ::sqrt(x); }; | ||
|
||
// Support pre-Kepler GPUs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's serious dedication to maintaining old hardware! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until we support CUDA 11, we have to support these cards.
For molecules of 100-200 atoms, the speeds improve with more parallelization exposed, even it means some duplicate calculations. For larger molecules, maybe it is possible to improve a bit as you suggest, but the scaling of the current algorithm is quadratic. So it won't go far. I thing, a better approach is two write another neighbor kernel with the cell list (some preliminary work torchmd/torchmd-net#68), which will be more efficient for large molecule and leave this one optimized for small ones. |
@peastman any more comments? |
Looks good. |