Fix cuda kernel launch of grid sampler #33100

Merged: 3 commits, May 31, 2021

Conversation

Contributor

@wanghaoshuang wanghaoshuang commented May 25, 2021

PR types

Bug fixes

PR changes

OPs

Describe

Fix cuda kernel launch of grid sampler

Fix #29066

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wanghaoshuang wanghaoshuang requested a review from jerrywgz May 25, 2021 06:22
Contributor

@jerrywgz jerrywgz left a comment

I suggest adding a unit test case with a larger input.

jerrywgz previously approved these changes May 27, 2021
@wanghaoshuang
Contributor Author

wanghaoshuang commented May 31, 2021

@Avin0323

Explanation of the failing benchmark CI

For the inputs:

grid (Variable) - dtype: float32, shape: [4, 256, 246, 2]
x (Variable) - dtype: float32, shape: [4, 1, 128, 128]

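Before this PR, block_size = 4 * 256 * 246 / 512 and grid_size = 512. block_size exceeded max_thread_num, so the computed result was actually wrong.
After this PR, block_size = 512 and grid_size = 4 * 256 * 246 / 512. The result is correct, and the benchmark time increases.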
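
The effect of swapping the two launch parameters can be sketched numerically. This is a minimal illustration, not the PR's kernel code: `ceil_div`, `launch_config`, and `fits_block_limit` are hypothetical helpers, the 1024-thread limit is an assumption that should be queried from the device in real code, and the element count is a made-up value rather than the benchmark input.

```python
from typing import Tuple

# Assumed limit on threads per block; real code should query the device.
MAX_THREADS_PER_BLOCK = 1024


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, so every element gets a thread."""
    return (a + b - 1) // b


def launch_config(count: int, threads: int = 512) -> Tuple[int, int]:
    """Return (grid_size, block_size) using a fixed block size."""
    return ceil_div(count, threads), threads


def fits_block_limit(block_size: int) -> bool:
    """True if block_size is a legal number of threads per block."""
    return 0 < block_size <= MAX_THREADS_PER_BLOCK


# Hypothetical element count, chosen only for illustration.
count = 4 * 512 * 512

grid_size, block_size = launch_config(count)

# Intended launch: many blocks of 512 threads each.
print(grid_size, block_size, fits_block_limit(block_size))

# Swapped launch: the two values trade places, so the "block size"
# grows with the input and can exceed the per-block limit.
swapped_grid, swapped_block = block_size, grid_size
print(swapped_grid, swapped_block, fits_block_limit(swapped_block))
```

Keeping the block size fixed and letting the grid size scale with the input is the usual choice, because only the grid dimension is allowed to grow that large.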

The relevant part of the benchmark timeout log is as follows:

2021-05-26 13:02:59 [check_op_benchmark_result.py:80] [INFO] ------ OP: grid_sample_3 (forward) ------
2021-05-26 13:02:59 [check_op_benchmark_result.py:82] [INFO] GPU time change: 7.08968% (develop: 0.0168862 -> PR: 0.0180833)
2021-05-26 13:02:59 [check_op_benchmark_result.py:84] [INFO] Total time change: -2.24423% (develop: 0.0382667 -> PR: 0.0374079)
2021-05-26 13:02:59 [check_op_benchmark_result.py:85] [INFO] backward: False
2021-05-26 13:02:59 [check_op_benchmark_result.py:86] [INFO] parameters:
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	grid (Variable) - dtype: float32, shape: [4, 256, 246, 2]
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	x (Variable) - dtype: float32, shape: [4, 1, 128, 128]
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	align_corners (bool): False
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	mode (string): bilinear
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	out_shape (list): [4, 1, 256, 256]
2021-05-26 13:02:59 [check_op_benchmark_result.py:88] [INFO] 	padding_mode (string): reflection
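
The percentages in the log look consistent with a relative change measured against the develop baseline. A minimal sketch under that assumption follows; the helper `percent_change` is hypothetical and is not taken from check_op_benchmark_result.py. Small differences from the logged percentages are expected, since the log prints rounded timings.

```python
def percent_change(develop: float, pr: float) -> float:
    """Relative change of the PR timing against the develop timing, in percent."""
    return (pr - develop) / develop * 100.0


# Timings copied from the log above.
gpu_change = percent_change(0.0168862, 0.0180833)
total_change = percent_change(0.0382667, 0.0374079)

print(f"GPU time change: {gpu_change:.3f}%")
print(f"Total time change: {total_change:.3f}%")
```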

        self.mode = "bilinear"

    def test_check_grad_normal(self):
        pass
Contributor

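  1. Is the large input slow because the unit test framework computes the expected gradient slowly? Have you measured roughly how long it takes?
  2. If skip_check_grad_ci is already used, lines 259~260 below are not needed.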

Contributor Author

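  1. Yes, the unit test framework computes the expected gradient slowly. A single case ran for 20 minutes without finishing.
  2. Lines 259~260 have been removed.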

zhangting2020 previously approved these changes May 31, 2021
Contributor

@zhangting2020 zhangting2020 left a comment

LGTM for skip_check_grad_ci

Contributor

@jerrywgz jerrywgz left a comment

LGTM

@wanghaoshuang wanghaoshuang merged commit f61e6ee into PaddlePaddle:develop May 31, 2021
wanghaoshuang added a commit to wanghaoshuang/Paddle that referenced this pull request May 31, 2021
@wanghaoshuang wanghaoshuang deleted the fix_grid_sampler branch May 20, 2022 03:54
Successfully merging this pull request may close these issues.

[2.0RC] F.grid_sample causes a CUDA exception error under certain conditions
4 participants