prod_force: support multiple frames in parallel #2600

njzjz · 2023-06-10T00:41:56Z

The previous prod_force did not support multiple frames in parallel, which was slow when the batch size was large.

This PR adds that support, so `prod_force` can now be parallelized over the frames (samples) in a batch.

When the batch size is about 70, the prod_force op is 10x faster than before on GPU cards.
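To illustrate why a per-frame dimension is easy to parallelize, here is a minimal pure-Python sketch of a batched force accumulation. It is not the code from this PR: the function name `prod_force_frames`, the flat array layout, and the simplification to one descriptor component per neighbor slot are all assumptions made for illustration. The point is only that each frame reads and writes its own slice of the arrays, so the outer loop over frames can be distributed across threads or GPU blocks.

```python
def prod_force_frames(net_deriv, in_deriv, nlist, nframes, nloc, nall, nnei):
    """Toy force accumulation over a batch of frames stored in flat lists.

    Hypothetical layout (one descriptor component per neighbor slot):
      net_deriv: nframes * nloc * nnei
      in_deriv:  nframes * nloc * nnei * 3
      nlist:     nframes * nloc * nnei, neighbor index inside the frame, -1 = empty
    Returns a flat force list of length nframes * nall * 3.
    """
    force = [0.0] * (nframes * nall * 3)
    # Frames do not share any output entries, so this loop is the
    # dimension that can be run in parallel.
    for kk in range(nframes):
        f_off = kk * nall * 3  # start of this frame's force slice
        for ii in range(nloc):
            row = (kk * nloc + ii) * nnei  # start of atom ii's neighbor slots
            for jj in range(nnei):
                j_idx = nlist[row + jj]
                if j_idx < 0:
                    continue  # empty neighbor slot
                for dd in range(3):
                    contrib = net_deriv[row + jj] * in_deriv[(row + jj) * 3 + dd]
                    force[f_off + ii * 3 + dd] -= contrib      # central atom
                    force[f_off + j_idx * 3 + dd] += contrib   # neighbor atom
    return force
```

Under these assumptions, running the function on a batch of two frames gives the same result as running it on each frame separately and concatenating the outputs, which is the property a frame-parallel kernel relies on.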

codecov · 2023-06-10T00:53:10Z

Codecov Report

Patch coverage: 87.75% and project coverage change: +0.02 🎉

Comparison is base (4b822b8) 76.66% compared to head (b7787ee) 76.68%.

Additional details and impacted files

@@            Coverage Diff             @@
##            devel    #2600      +/-   ##
==========================================
+ Coverage   76.66%   76.68%   +0.02%     
==========================================
  Files         233      233              
  Lines       24177    24182       +5     
  Branches     1711     1695      -16     
==========================================
+ Hits        18536    18545       +9     
  Misses       4518     4518              
+ Partials     1123     1119       -4

Impacted Files	Coverage Δ
source/op/prod_force_multi_device.cc	`91.40% <33.33%> (+0.82%)`	⬆️
source/lib/src/prod_force.cc	`83.58% <95.34%> (+6.65%)`	⬆️

... and 1 file with indirect coverage changes

denghuilu

LGTM

Add an experimental model called pairwise DPRc, which is fragment-based and integrated with QM/MM. Compression inference and training are supported. Unit tests and documentation have been added.

Features and bug fixes needed to implement this PR were merged in #2549, #2600, #2601, #2604, #2631, #2635, #2665, #2666, #2667, and #2679.

This PR additionally makes some changes to `model.build_descrpt`:

- fix errors when the suffix is not empty
- fix errors when `fparam` or `aparam` are given
- support model-customized `input_map`

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

prod_force: support multiple frames in parallel

0e864a8

The previous `prod_force` did not support multiple frames in parallel, which was slow when the batch size was large.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

github-actions bot added Core CUDA OP ROCM labels Jun 10, 2023

revert double_vec

fec8548

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

njzjz added 3 commits June 9, 2023 21:18

fix prod_force_a_cpu

7aa61f9

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

fix prod_force_a_cpu when start_index is not zero

145e31c

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

typo

736051b

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

njzjz marked this pull request as ready for review June 10, 2023 02:23

njzjz mentioned this pull request Jun 10, 2023

prod_force_grad: support multiple frames in parallel #2601

Merged

denghuilu approved these changes Jun 10, 2023

add docs and overload

b7787ee

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

wanghan-iapcm reviewed Jun 12, 2023

source/lib/src/prod_force.cc (review thread resolved)

wanghan-iapcm pushed a commit that referenced this pull request Jun 12, 2023

prod_force_grad: support multiple frames in parallel (#2601)

046a5a4

Similar to #2600.

wanghan-iapcm merged commit bb0d02b into deepmodeling:devel Jun 12, 2023

njzjz mentioned this pull request Jun 16, 2023

[Feature Request] Compute prod_env_mat OP in parallel in the dimension of the frame #2618

Open

njzjz mentioned this pull request Jul 17, 2023

add pairwise DPRc #2682

Merged
