[Hackathon 5th No.35] Add histogramdd API to Paddle -part #57880
Your PR was submitted successfully. Thank you for contributing to the open source project!
self.bins = tuple([paddle.to_tensor(bin) for bin in self.bins])
hist, edges = paddle.histogramdd(self.sample_dy, bins=self.bins, weights=self.weights_dy, range=self.range, density=self.density)

np.testing.assert_allclose(self.expect_hist, hist.numpy(), rtol=1e-4, atol=1e-4)
Can't the precision match exactly?
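The expect_hist in the unit test was obtained by running torch manually and pasting the result in. It looks like torch automatically rounds the values it prints, for example:
Result computed in torch: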
>>> import torch
>>> sample = torch.tensor([[0., 1.], [1., 0.], [2., 0.], [2., 2.]])
>>> bins = [3, 3]
>>> weights = torch.tensor([1., 2., 4., 8.])
>>> torch.histogramdd(sample, bins=bins, weight=weights)
torch.return_types.histogramdd(
hist=tensor([[0., 1., 0.],
[2., 0., 0.],
[4., 0., 8.]]),
bin_edges=(tensor([0.0000, 0.6667, 1.3333, 2.0000]), tensor([0.0000, 0.6667, 1.3333, 2.0000])))
Result computed in paddle:
>>> import paddle
>>> sample = paddle.to_tensor([[0., 1.], [1., 0.], [2., 0.], [2., 2.]])
>>> bins = [3,3]
>>> weights = paddle.to_tensor([1., 2., 4., 8.])
>>> paddle.histogramdd(sample, bins=bins, weights=weights)
W1007 19:39:17.011324 1623 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 11.8
W1007 19:39:17.042716 1623 gpu_resources.cc:149] device: 0, cuDNN Version: 8.6.
(Tensor(shape=[3, 3], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[[0., 1., 0.],
[2., 0., 0.],
[4., 0., 8.]]), [Tensor(shape=[4], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0. , 0.66666669, 1.33333337, 2. ]), Tensor(shape=[4], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0. , 0.66666669, 1.33333337, 2. ])])
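I have now set the print precision to 8 digits with torch.set_printoptions, but assert_allclose still needs an atol: the printed values look the same, but the underlying values differ, so atol cannot stay at its default of 0. For the results to match exactly, the expected values would have to be computed with the NumPy API, but NumPy's histogramdd supports fewer cases than PyTorch's. For example, NumPy only supports 2-D input, while PyTorch supports multi-dimensional input (2-D and above); NumPy's bins only supports int and int[], while PyTorch supports int, int[], and a tuple of tensors. So for now the printed PyTorch output is used directly as the expected output, which leaves an absolute error.
It has now been changed to: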
np.testing.assert_allclose(hist_out, self.expect_hist, atol=1e-8)
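The rounding issue described above can be reproduced without either framework. The following is a minimal sketch in NumPy, assuming the expected bin edges were copied from a 4-digit printout; the helper `matches` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Bin edges for 3 equal bins over [0, 2], stored as float32 like the tensors above.
edges = np.linspace(0.0, 2.0, 4, dtype=np.float32)

# Values copied from a 4-digit printout, as in the torch output above.
printed_4 = np.array([0.0, 0.6667, 1.3333, 2.0])


def matches(actual, desired, **tol):
    """Hypothetical helper: True if assert_allclose accepts the pair."""
    try:
        np.testing.assert_allclose(actual, desired, **tol)
        return True
    except AssertionError:
        return False


# With the defaults (rtol=1e-7, atol=0) the rounded values are rejected.
default_ok = matches(edges, printed_4)
# An absolute tolerance larger than the rounding error accepts them.
loose_ok = matches(edges, printed_4, atol=1e-4)
print(default_ok, loose_ok)
```

The more digits the pasted reference carries, the smaller the tolerance the comparison needs.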
python/paddle/tensor/linalg.py
Outdated
# weights
__check_weights(sample, weights)
D = sample.shape[-1]
Does the shape of sample need to be checked? Are all shapes supported?
Inputs with 2 or more dimensions are supported. I added a check for this; please review~
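A shape check of the kind discussed here might look like the sketch below. It uses NumPy arrays instead of Paddle tensors so that it runs standalone; the function name `check_sample_shape` and the error message are assumptions for illustration, not the code from the PR.

```python
import numpy as np


def check_sample_shape(x):
    """Hypothetical check: require at least 2 dimensions, the last being D."""
    x = np.asarray(x)
    if x.ndim < 2:
        raise ValueError(
            f"input must have at least 2 dimensions, but got {x.ndim}"
        )
    # Flatten all leading dimensions into N points of dimension D.
    return x.reshape(-1, x.shape[-1])


points = check_sample_shape(np.zeros((2, 3, 4)))
print(points.shape)
```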
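I suggest adding data type checks to the code, that is, filtering out unsupported data types. A check that sample and weight have the same data type should also be added. Besides the inputs already covered by check_type, the other inputs should ideally be checked as well. The test code needs to verify the corresponding errors.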
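The two dtype checks requested here can be sketched as follows. The set of supported dtypes and the name `check_dtypes` are assumptions made for this example; the PR's actual supported types are not stated in this conversation.

```python
import numpy as np

# Assumed set of supported dtypes, chosen only for illustration.
SUPPORTED_DTYPES = (np.dtype(np.float32), np.dtype(np.float64))


def check_dtypes(x, weights=None):
    """Hypothetical check: filter unsupported dtypes and dtype mismatches."""
    if x.dtype not in SUPPORTED_DTYPES:
        raise TypeError(f"unsupported dtype for x: {x.dtype}")
    if weights is not None and weights.dtype != x.dtype:
        raise TypeError(
            f"weights dtype {weights.dtype} must match x dtype {x.dtype}"
        )


# Matching float32 inputs pass silently.
check_dtypes(np.zeros((3, 2), dtype=np.float32), np.ones(3, dtype=np.float32))
```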
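Thanks for the review. A question: does "The test code needs to verify the corresponding errors" mean that passing a wrong type should raise an error as expected? If so, how is such a test usually judged as passing? (If the input is wrong, it just raises an error directly.)
Sorry to inform you that 90488ae's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
I added some type checks and an error test; please review~
The format check in the PR-CI-Codestyle-Check pipeline needs to pass.
Sorry to inform you that 0ef396b's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
You could try an approach similar to the one in test/legacy_test/test_reduce_op.py.
OK, thanks. That was done in the earlier commit "add some type check && add error test".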
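On the question of how an error test passes: a common pattern is `unittest`'s `assertRaises`, which makes the test succeed only when the expected exception is raised. The sketch below is self-contained and uses a hypothetical stand-in, `histogramdd_stub`, rather than the real API or the code in test_reduce_op.py.

```python
import unittest

import numpy as np


def histogramdd_stub(x, weights=None):
    """Hypothetical stand-in for the API under test; it only validates inputs."""
    if x.ndim < 2:
        raise ValueError("input must have at least 2 dimensions")
    if weights is not None and weights.dtype != x.dtype:
        raise TypeError("weights must have the same dtype as x")


class TestHistogramddErrors(unittest.TestCase):
    def test_low_rank_input(self):
        # The test passes only if ValueError is raised inside the block.
        with self.assertRaises(ValueError):
            histogramdd_stub(np.zeros(4, dtype=np.float32))

    def test_dtype_mismatch(self):
        x = np.zeros((4, 2), dtype=np.float32)
        w = np.ones(4, dtype=np.float64)
        with self.assertRaises(TypeError):
            histogramdd_stub(x, weights=w)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHistogramddErrors)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the call does not raise, `assertRaises` fails the test, so a missing check is caught as well.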
LGTM
python/paddle/tensor/linalg.py
Outdated
def histogramdd(
    sample, bins=10, range=None, density=False, weights=None, name=None
According to the API naming conventions, the input Tensor should be named x, and the RFC should be updated accordingly.
OK, I will modify it soon~
python/paddle/tensor/linalg.py
Outdated
_range = range
Writing it this way hurts the readability of the code. If the goal is to avoid a conflict with the name of the range parameter in histogramdd, the range parameter can be renamed to ranges.
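The name conflict behind this comment can be shown in a few lines. Both helpers below are hypothetical and exist only to illustrate the shadowing; they are not taken from the PR.

```python
def edges_with_ranges(bins, ranges):
    """Hypothetical helper: `ranges` leaves the builtin `range` usable."""
    lo, hi = ranges
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]


def edges_with_range(bins, range):
    """Same logic, but the parameter shadows the builtin `range`."""
    lo, hi = range
    step = (hi - lo) / bins
    # `range` is the (lo, hi) tuple here, so calling it raises TypeError.
    return [lo + i * step for i in range(bins + 1)]


print(edges_with_ranges(2, (0.0, 1.0)))
```

Renaming the parameter removes the need for a module-level alias such as `_range = range`.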
@jeff41404 I then removed all
LGTM
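Please submit the corresponding Chinese documentation. The CodeStyle pipeline did not pass; you can wait for @sunzhongkai588's documentation review comments and fix everything together.
The example here is the same as PyTorch's. @sunzhongkai588, could you check whether that is acceptable?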
python/paddle/tensor/linalg.py
Outdated
>>> x = paddle.to_tensor([[0., 1.], [1., 0.], [2.,0.], [2., 2.]])
>>> bins = [3,3]
>>> weights = paddle.to_tensor([1., 2., 4., 8.])
>>> paddle.histogramdd(x, bins=bins, weights=weights)
(Tensor(shape=[3, 3], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[[0., 1., 0.],
[2., 0., 0.],
[4., 0., 8.]]), [Tensor(shape=[4], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0.        , 0.66666669, 1.33333337, 2.        ]), Tensor(shape=[4], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0.        , 0.66666669, 1.33333337, 2.        ])])

>>> y = paddle.to_tensor([[0., 0.], [1., 1.], [2., 2.]])
>>> bins = [2,2]
>>> ranges = [0., 1., 0., 1.]
>>> density = True
>>> paddle.histogramdd(y, bins=bins, ranges=ranges, density=density)
(Tensor(shape=[2, 2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[[2., 0.],
[0., 2.]]), [Tensor(shape=[3], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0.        , 0.50000000, 1.        ]), Tensor(shape=[3], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0.        , 0.50000000, 1.        ])])
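For simple 2-D inputs like the two examples above, NumPy's `histogramdd` can serve as an independent cross-check. This is a sketch under one assumption: that the flat `ranges` list is read as consecutive (min, max) pairs, one pair per dimension. The helper `flat_ranges_to_pairs` is hypothetical.

```python
import numpy as np


def flat_ranges_to_pairs(ranges):
    """Hypothetical helper: [l0, r0, l1, r1, ...] -> [(l0, r0), (l1, r1), ...]."""
    return [(ranges[i], ranges[i + 1]) for i in range(0, len(ranges), 2)]


# First example: weighted counts over automatically chosen bin edges.
x = np.array([[0., 1.], [1., 0.], [2., 0.], [2., 2.]])
weights = np.array([1., 2., 4., 8.])
hist1, edges1 = np.histogramdd(x, bins=[3, 3], weights=weights)

# Second example: explicit ranges and density normalization.
y = np.array([[0., 0.], [1., 1.], [2., 2.]])
hist2, edges2 = np.histogramdd(
    y,
    bins=[2, 2],
    range=flat_ranges_to_pairs([0., 1., 0., 1.]),
    density=True,
)
print(hist1)
print(hist2)
```

In the second example the point (2, 2) lies outside the range and is dropped; the two remaining points each fall in a bin of area 0.25, so the density in those bins is 1 / (2 * 0.25) = 2.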
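Add Examples and a code block, and mind the indentation; refer to the API example code guide.
If the examples are split into multiple code blocks, add :name: to each.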
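A docstring laid out along these lines might look like the sketch below. The function is a placeholder that does nothing, and the block names `examp1` and `examp2` are illustrative; the authoritative layout is whatever the project's API example code guide specifies.

```python
def histogramdd_doc_sketch():
    """Placeholder function; only the docstring layout matters here.

    Examples:
        .. code-block:: python
            :name: examp1

            >>> import paddle
            >>> x = paddle.to_tensor([[0., 1.], [1., 0.], [2., 0.], [2., 2.]])
            >>> weights = paddle.to_tensor([1., 2., 4., 8.])
            >>> hist, edges = paddle.histogramdd(x, bins=[3, 3], weights=weights)

        .. code-block:: python
            :name: examp2

            >>> import paddle
            >>> y = paddle.to_tensor([[0., 0.], [1., 1.], [2., 2.]])
            >>> hist, edges = paddle.histogramdd(
            ...     y, bins=[2, 2], ranges=[0., 1., 0., 1.], density=True
            ... )
    """


doc = histogramdd_doc_sketch.__doc__
print(doc.count(".. code-block:: python"))
```

Each block carries its own `:name:` and repeats the import so that it can be read on its own.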
Co-authored-by: zachary sun <70642955+sunzhongkai588@users.noreply.github.com>
[0.        , 0.66666669, 1.33333337, 2.        ]), Tensor(shape=[4], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[0.        , 0.66666669, 1.33333337, 2.        ])])

:name: examp2
According to the API example code guide, it has to be written this way; also mind the indentation~
LGTM. Please provide the Chinese version.
PR types
New features
PR changes
APIs
Description
histogramdd rfc: