[Improve] Migrating DIVUP to GET_BLOCKS #1586

teamwong111 · 2021-12-14T11:44:35Z

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from maintainers.

Motivation

Resolves #1400.
Migrates DIVUP to GET_BLOCKS because GET_BLOCKS looks safer.

Modification

As above.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

Before PR:

I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
Bug fixes are covered by unit tests; the case that causes the bug should be added to the unit tests.
New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

If the modification has potential influence on downstream or other related projects, this PR should be tested with some of those projects, like MMDet or MMCls.
CLA has been signed and all committers have signed the CLA in this PR.

zhouzaida · 2021-12-14T12:21:32Z

@AllentDan, please have a look.

AllentDan

Directly replacing DIVUP with GET_BLOCKS may then require a kernel loop, because GET_BLOCKS can return fewer blocks than the data needs.
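The concern can be illustrated with a host-only C++ sketch. It assumes that GET_BLOCKS is a ceiling division clamped to 4096 blocks (the limit raised later in this review) and that THREADS_PER_BLOCK is 512; `div_up`, `get_blocks`, and `covered_elements` are hypothetical names, and the loops only simulate a CUDA launch on the CPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr int kThreadsPerBlock = 512;  // THREADS_PER_BLOCK in common_cuda_helper.hpp
constexpr int kMaxBlocks = 4096;       // assumed cap applied by GET_BLOCKS

// Plain ceiling division, equivalent to the DIVUP macro for positive inputs.
inline int div_up(int m, int n) { return m / n + (m % n > 0); }

// Assumed behaviour of GET_BLOCKS: ceiling division clamped to kMaxBlocks.
inline int get_blocks(int n, int num_threads = kThreadsPerBlock) {
  assert(n >= 0 && num_threads > 0);
  return std::min(div_up(n, num_threads), kMaxBlocks);
}

// Simulates a launch of `blocks` blocks and counts how many of the n elements
// are processed. Without a loop each thread handles at most one index; with a
// grid-stride loop each thread keeps stepping by the total thread count.
inline std::int64_t covered_elements(int n, int blocks, bool grid_stride) {
  const std::int64_t total_threads =
      static_cast<std::int64_t>(blocks) * kThreadsPerBlock;
  std::int64_t covered = 0;
  for (std::int64_t tid = 0; tid < total_threads; ++tid) {
    if (grid_stride) {
      for (std::int64_t i = tid; i < n; i += total_threads) ++covered;
    } else if (tid < n) {
      ++covered;
    }
  }
  return covered;
}
```

Under these assumptions, once n exceeds 4096 * 512 elements the clamped launch leaves the tail unprocessed unless the kernel body is wrapped in a loop, whereas the unclamped DIVUP launch covers everything with one index per thread.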

AllentDan · 2021-12-15T02:51:48Z

mmcv/ops/csrc/common/cuda/common_cuda_helper.hpp

@@ -9,10 +9,10 @@

 #define THREADS_PER_BLOCK 512

-#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))
+// #define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))


This line can be removed if it is no longer used.

AllentDan · 2021-12-15T02:57:16Z

mmcv/ops/csrc/parrots/iou3d.cpp

@@ -134,7 +134,7 @@ void iou3d_nms_forward(Tensor boxes, Tensor keep, Tensor keep_num,
    int64_t *keep_data = keep.data_ptr<int64_t>();
    int64_t *keep_num_data = keep_num.data_ptr<int64_t>();

-    const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);
+    const int col_blocks = GET_BLOCKS(boxes_num, THREADS_PER_BLOCK_NMS);


What if the actual col_blocks is greater than 4096?

I think it may be unsafe and we should only use GET_BLOCKS for block allocation. I will fix it.
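A host-only C++ sketch of why a clamped count is risky outside block allocation. It assumes that col_blocks sizes a bitmask with one bit per box, that THREADS_PER_BLOCK_NMS is 64 (one 64-bit word), and that GET_BLOCKS clamps to 4096; none of these are stated in this thread, and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int kThreadsPerBlockNms = 64;  // assumed: bits in one 64-bit mask word
constexpr int kNmsMaxBlocks = 4096;      // assumed cap applied by GET_BLOCKS

// Exact number of 64-bit words needed to hold one bit per box.
inline int words_needed(int boxes_num) {
  return (boxes_num + kThreadsPerBlockNms - 1) / kThreadsPerBlockNms;
}

// The same count clamped the way GET_BLOCKS is assumed to clamp it.
inline int words_clamped(int boxes_num) {
  return std::min(words_needed(boxes_num), kNmsMaxBlocks);
}

// Returns true if a mask of `words` 64-bit words has a bit for box_index.
inline bool mask_can_hold(int words, int box_index) {
  return box_index / kThreadsPerBlockNms < words;
}

// Sets the bit for box_index in a removal mask sized with the exact count.
inline std::vector<std::uint64_t> mark_removed(int boxes_num, int box_index) {
  assert(box_index >= 0 && box_index < boxes_num);
  std::vector<std::uint64_t> remv(words_needed(boxes_num), 0);
  remv[box_index / kThreadsPerBlockNms] |=
      std::uint64_t{1} << (box_index % kThreadsPerBlockNms);
  return remv;
}
```

With these assumed values the two counts diverge above 4096 * 64 = 262144 boxes, and a mask sized with the clamped count would have no word for the last boxes.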

AllentDan · 2021-12-15T02:57:49Z

mmcv/ops/csrc/parrots/iou3d.cpp

@@ -189,7 +189,7 @@ void iou3d_nms_normal_forward(Tensor boxes, Tensor keep, Tensor keep_num,
    int64_t *keep_data = keep.data_ptr<int64_t>();
    int64_t *keep_num_data = keep_num.data_ptr<int64_t>();

-    const int col_blocks = DIVUP(boxes_num, THREADS_PER_BLOCK_NMS);
+    const int col_blocks = GET_BLOCKS(boxes_num, THREADS_PER_BLOCK_NMS);


What if the actual col_blocks is greater than 4096?

mmcv/ops/csrc/common/pytorch_cpp_helper.hpp

mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh

AllentDan

LGTM after removing useless lines

AllentDan · 2021-12-23T05:57:27Z

mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh

+  const int blocks =
+      (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
+  CUDA_2D_KERNEL_BLOCK_LOOP(col_start, blocks, row_start, blocks) {
+    // if (row_start > col_start) return;


This commented-out line may be removed if it is not needed.

This commented-out line was already there before this PR. Shall we delete it? It also resembles https://github.com/open-mmlab/mmcv/blame/master/mmcv/ops/csrc/common/cuda/nms_cuda_kernel.cuh#L37, and I don't know which is more suitable.

Well, that line seems to depend on how the block is initialized. It is useful when the block is a square or a wide rectangle. Since we don't know how the user may initialize the block, just keep it commented out.
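How the launch shape interacts with that early exit can be sketched on the host. The simulation below assumes CUDA_2D_KERNEL_BLOCK_LOOP expands to two nested block-stride loops (columns stepping by the grid width, rows by the grid height); the thread does not show the macro, so this is an assumption, and `Skip` and `count_processed_tiles` are hypothetical names.

```cpp
#include <cassert>

enum class Skip { kNone, kContinue, kReturn };

// Simulates every block of a grid_x by grid_y launch walking a
// blocks-by-blocks tile grid with the assumed block-stride loops, and returns
// how many tiles are processed in total. Tiles with row > col are either
// processed (kNone), skipped (kContinue), or end the block's work (kReturn).
inline int count_processed_tiles(int blocks, int grid_x, int grid_y, Skip skip) {
  assert(blocks > 0 && grid_x > 0 && grid_y > 0);
  int processed = 0;
  for (int bx = 0; bx < grid_x; ++bx) {
    for (int by = 0; by < grid_y; ++by) {
      bool exited = false;  // models `return` leaving the kernel for this block
      for (int col = bx; col < blocks && !exited; col += grid_x) {
        for (int row = by; row < blocks; row += grid_y) {
          if (skip != Skip::kNone && row > col) {
            if (skip == Skip::kReturn) {
              exited = true;
              break;
            }
            continue;
          }
          ++processed;
        }
      }
    }
  }
  return processed;
}
```

In this model, an early `return` reaches the full upper triangle (10 of 16 tiles for blocks = 4) only when the grid has one block per tile. With a 2 x 2 grid it stops after 3 tiles, because each block abandons the tiles that later loop iterations would have visited, which is consistent with leaving the line commented out when the launch shape is unknown.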

AllentDan · 2021-12-23T05:57:56Z

mmcv/ops/csrc/common/cuda/iou3d_cuda_kernel.cuh

+  const int blocks =
+      (boxes_num + THREADS_PER_BLOCK_NMS - 1) / THREADS_PER_BLOCK_NMS;
+  CUDA_2D_KERNEL_BLOCK_LOOP(col_start, blocks, row_start, blocks) {
+    // if (row_start > col_start) return;


This commented-out line may be removed if it is not needed.

ZwwWayne · 2021-12-29T15:16:53Z

Need to resolve conflicts.

teamwong111 · 2021-12-30T02:21:36Z

Need to resolve conflicts.

Done.

grimoire

LGTM

[Improve] migrating DIVUP to GET_BLOCKS

7b145d3

teamwong111 linked an issue Dec 14, 2021 that may be closed by this pull request

Suggest migrating all DIVUP to GET_BLOCKS as they essentially do the same thing but GET_BLOCKS looks safer. #1400

Closed

AllentDan reviewed Dec 15, 2021

zhouzaida requested review from zhouzaida, grimoire and ZwwWayne December 15, 2021 03:05

grimoire reviewed Dec 16, 2021

mmcv/ops/csrc/common/pytorch_cpp_helper.hpp (outdated, resolved)

[Fix] use GET_BLOCKS only for block alloc and del useless statements

6487360

grimoire reviewed Dec 21, 2021

mmcv/ops/csrc/common/cuda/assign_score_withk_cuda_kernel.cuh (outdated, resolved)

[Fix] add kernel loop for nms and del useless statements

86e258c

AllentDan approved these changes Dec 23, 2021

[Fix] Resolve conflicts

5bdc079

grimoire approved these changes Dec 30, 2021

zhouzaida approved these changes Jan 2, 2022

zhouzaida merged commit b586cc2 into open-mmlab:master Jan 8, 2022

teamwong111 deleted the fix-block-alloc branch January 10, 2022 06:29
