
add API paddle.linalg.eig #35674

Merged 51 commits into PaddlePaddle:develop on Sep 28, 2021

Conversation

AshburnLee
Contributor

@AshburnLee AshburnLee commented Sep 13, 2021

PR types

New features

PR changes

APIs

Describe

Purpose

Adds the eig operator to PaddlePaddle's linear algebra library. The operator computes the eigendecomposition of a general square matrix.

Implementation

The forward pass calls the corresponding LAPACK routine; the backward pass is implemented from the mathematical derivation.
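
The PR does not spell out the backward derivation. For reference, a standard formulation of the eigendecomposition gradient (used in several autograd frameworks; the PR's actual derivation may differ in details such as eigenvector normalization) for $A = V \operatorname{diag}(\lambda)\, V^{-1}$ is:

```latex
\bar{A} = V^{-H}\left(\operatorname{diag}(\bar{\lambda}) + F \circ \left(V^{H}\bar{V}\right)\right)V^{H},
\qquad
F_{ij} =
\begin{cases}
  (\lambda_j - \lambda_i)^{-1}, & i \neq j,\\
  0, & i = j,
\end{cases}
```

where a bar denotes the gradient of the loss with respect to that quantity, $\circ$ is the elementwise product, and $V^{-H}$ is the inverse conjugate transpose. Note $F$ is undefined for repeated eigenvalues, which is a known limitation of this formulation.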

Results

Tested with both real and complex inputs, with shapes ranging from (3,3) to (6,6,128,128); forward and backward results meet expectations. For real inputs, the maximum backward precision error is 1e-3.
Below is a local comparison of backward precision against torch.linalg.eig; results for a small matrix are shown for readability. The test script is attached:

  • Real input

[Screenshot 2021-09-28 10 22 41: gradient comparison for real input]

  • Complex input

[Screenshot 2021-09-28 10 22 09: gradient comparison for complex input]

Notes

  • For real inputs: when numpy's eig returns a complex result, paddle.linalg.eig matches numpy; when numpy's eig returns a real result, paddle.linalg.eig may not match numpy exactly.
  • Since LAPACK runs only on the CPU, the device must be set explicitly with paddle.device.set_device("cpu") before calling the eig operator in Paddle.
  • Currently eig computes only on the CPU. For network-composition purposes, follow-up work will support computing eig on the CPU and transferring the result to the GPU.
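
To illustrate the first note above (a real matrix can legitimately have complex eigenvalues), here is a minimal pure-Python sketch; the eig2x2 helper is written for this example only and is not part of the PR:

```python
import cmath

def eig2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic
    polynomial lambda^2 - tr*lambda + det = 0."""
    tr = a + d                       # trace
    det = a * d - b * c              # determinant
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# A 90-degree rotation matrix has real entries but purely imaginary
# eigenvalues +i and -i, so eig must return a complex dtype here.
w = eig2x2(0.0, -1.0, 1.0, 0.0)
print(w)
```

numpy.linalg.eig on the same matrix likewise returns a complex array; this is exactly the case in which paddle.linalg.eig aligns with numpy.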

Attachment

Test script:

import numpy as np
import paddle
import torch
paddle.device.set_device("cpu")

# real matrices
a = np.random.random((3,3)).astype('float64')
#a = np.random.random((2,6,6,32,32)).astype('float32')

# complex matrices
#a = np.random.random((2,2,3,3)).astype(np.complex64)
#a = np.random.random((2,3,3,3)).astype(np.complex128)
#a = np.random.random((2,2,5,5)) + np.random.random((2,2,5,5)) * 1j

a_pd = paddle.to_tensor(a)
a_torch = torch.from_numpy(a)
a_pd.stop_gradient = False
a_torch.requires_grad = True

# paddle backward
w2, v2 = paddle.linalg.eig(a_pd)
dx_pd = paddle.grad([w2, v2], a_pd)

# torch backward
w1, v1 = torch.linalg.eig(a_torch)
torch.autograd.backward([w1,v1], [torch.ones_like(w1), torch.ones_like(v1)])
dx_torch = a_torch.grad

# grad results
print(">>dx_paddle: \n", dx_pd)
print(">>dx_pytorch: \n", dx_torch)

def CheckGrad(dx_pd, dx_torch):
    np_pd_res = dx_pd[0].numpy().flatten()
    np_torch_res = dx_torch.numpy().flatten()

    flag = True
    for i in range(np_pd_res.shape[0]):
        if not np.allclose(np_pd_res[i].real, np_torch_res[i].real, rtol=1e-5, atol=1e-5):
            print(np_pd_res[i].real, " ", np_torch_res[i].real)
            flag = False
    if flag:
        print("aligned")
    else:
        print("not aligned")

CheckGrad(dx_pd, dx_torch)

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Files reviewed:

  • paddle/fluid/operators/eig_op.cc
  • paddle/fluid/operators/eig_op.h
  • paddle/fluid/operators/svd_helper.h
  • python/paddle/fluid/tests/unittests/test_eig_op.py
  • python/paddle/tensor/linalg.py
@jeff41404
Contributor

should consider REGISTER_OP_CUDA_KERNEL in order to support CUDA

@AshburnLee
Contributor Author

should consider REGISTER_OP_CUDA_KERNEL in order to support CUDA

As we discussed earlier, I'll do some research to deal with this problem before the end of version 2.2

Contributor

@zhangting2020 zhangting2020 left a comment


For cases the unit-test framework cannot verify, consider adding a script to op_benchmark that checks forward and backward results against other frameworks, so that future changes to the OP don't cause precision regressions.

@AshburnLee
Contributor Author

AshburnLee commented Sep 27, 2021

For cases the unit-test framework cannot verify, consider adding a script to op_benchmark that checks forward and backward results against other frameworks, so that future changes to the OP don't cause precision regressions.

  • The PR description includes the script and results of the local backward comparison against PyTorch.
  • The unit test compares against numpy and cannot fully align; a comparison against PyTorch will be added in op-benchmark.

Contributor

@JamesLim-sy JamesLim-sy left a comment


The functional implementation looks fine overall, but some earlier review suggestions were not addressed and some variable names don't follow conventions; please follow up with further fixes.

T computed_work_size;
math::lapackEig<T, math::Real<T>>(
    jobvl, jobvr, order, input_data, lda, values_data, lvector_data, ldvl,
    rvector_data, ldvr, &computed_work_size, lwork, rwork_data, &info);
Contributor


If lvector_data is only ever used as nullptr, there's no need to declare this variable.

Contributor Author

@AshburnLee AshburnLee Sep 28, 2021


  • This operator computes only the right eigenvectors, so lvector_data here is always nullptr. The variable is kept anyway to stay consistent with the parameter names in the LAPACK documentation.
  • Apart from the inputs and outputs, the parameter names in this function follow the LAPACK documentation, which makes reading and debugging easier.
  • Regarding the info parameter: the documentation does not say whether it is modified during the computed_work_size query. To be safe, a check should be added after the call.

These will be addressed in the follow-up work supporting CPU computation with results transferred back to the GPU.

Contributor

@jzhang533 jzhang533 left a comment


sample code will be improved later.

@AshburnLee
Contributor Author

AshburnLee commented Sep 28, 2021

sample code will be improved later.

Roger that. The input-data generation in the sample code will be replaced with paddle functions in the follow-up PR (data transfer from CPU to GPU).

@JamesLim-sy JamesLim-sy merged commit bc7e2b9 into PaddlePaddle:develop Sep 28, 2021
@AshburnLee AshburnLee deleted the api_eig branch September 28, 2021 09:07
AshburnLee added a commit to AshburnLee/Paddle that referenced this pull request Sep 28, 2021
* Add paddle.linalg.eig op

* remove comments

* remove comments

* extend batch_size to the origin

* add real times complex functor & destroy the backward complex output bug

* terminate output diff when input real tensors

* correct tiny doc errors

* move functions from eig_helper to svd_helper and remove eig_helper

* remove tensor.Resize

* remove no longer used code

* use existing lapack functions

* reply review comments 21/27

* remove .cu as this op is only executed on CPU

* remove const_cast & add const in argument list for read-only references

* fix sample code error in CI

* remove template typename Tbase and more

* remove eig exposure in paddle.*

* add 'name=None' in eig python implementation

* handle the unittest

* try to solve the unittest

* solve CI coverage

* remove no longer used code

* polish API doc and more

* reply review comments

* polish unittest, commit plan B

* polish unittest
AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this pull request Sep 29, 2021
lanxianghit pushed a commit that referenced this pull request Sep 29, 2021
Adds the eig operator to PaddlePaddle's linear algebra library, computing the eigendecomposition of general square matrices.
Cherry-picked from #35674.