Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

License issue with 1.6.0.rc1 #17329

Closed
9 of 11 tasks
roywei opened this issue Jan 15, 2020 · 36 comments
Closed
9 of 11 tasks

License issue with 1.6.0.rc1 #17329

roywei opened this issue Jan 15, 2020 · 36 comments
Assignees
Labels

Comments

@roywei
Copy link
Member

roywei commented Jan 15, 2020

Following are the license issue found by mentors during 1.6.0.rc1 vote @ general.

https://lists.apache.org/thread.html/rb83ff64bdac464df2f0cf2fe8fb4c6b9d3b8fa62b645763dc606045f%40%3Cgeneral.incubator.apache.org%3E

I'm compiling this list of issues so the community can help fix them.

  • NOTICE incorrectly states "Copyright 2017 and onwards”. This has been pointed out in a number of releases. Is someone going to fix it?
  • I can see several files now contain "Copyright 2019 Intel Corporation” and " Copyright (c) 2019 by Contributors” it looks like some 3rd party code have been updated but the LICENSE file has not.
  • license for this files [2] is missing and these [9] and these [10]
  • files copyright The MathJax Consortium e.g [3]
  • file copyright Martin Hensel [4]
  • file copyright Bytedance Inc [5] of unknown license
  • file copyright NumPy Developers [6] this file look to incorrectly have an ASF header on it
  • file copyright Xiao Liu + others [7] of unknown license which also incorrectly looks to have an ASF header on it
  • file copyright Manuel Carrasco Moñin [8] which again seems to incorrectly have two headers
  • 1. Pybind11 license issue from onnx tensorrt #15547
  • 2. ./docs/python_docs/themes/mx-theme/mxtheme/static/material-design-icons-3.0.1/iconfont/MaterialIcons-Regular.*
  • 3. ./3rdparty/mkldnn/doc/assets/mathjax/jax/output/CommonHTML/fonts/TeX/SansSerif-Regular.js
  • 4. ./3rdparty/mkldnn/doc/assets/mathjax/extensions/TeX/mhchem3/mhchem.js
  • 5. ./3rdparty/ps-lite/src/ibverbs_van.h
  • 6. ./src/operator/numpy/np_einsum_path_op-inl.h
  • 7 ./example/image-classification/predict-cpp/image-classification-predict.cc
  • 8. ./tests/nightly/sh2ju.sh
  • 9. ./docs/python_docs/themes/mx-theme/mxtheme/static/font/Roboto/Roboto-Regular.*
  • 10. ./docs/python_docs/themes/mx-theme/mxtheme/static/webfonts/fa-solid-900.*
  • 11. Revisit white list for rat license check.

How to fix them

Here is the guideline from Apache: https://www.apache.org/legal/src-headers.html
Note the copy right and license header requirements are different for 3rdparty files and mxnet source files.

More references on license issues:
https://www.apache.org/legal/resolved.html

@roywei roywei mentioned this issue Jan 15, 2020
7 tasks
@roywei
Copy link
Member Author

roywei commented Jan 15, 2020

@PatricZhao could you help with the "Copyright 2019 Intel Corporation” and 3 and 4 under mkldnn? Thanks

@roywei
Copy link
Member Author

roywei commented Jan 15, 2020

@hzfan @reminisce @haojin2 Looks like we need to remove the ASF header if this file is directly copied from numpy, and add it to whitelist of the header check.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/numpy/np_einsum_op-inl.h

@roywei
Copy link
Member Author

roywei commented Jan 15, 2020

@aaronmarkham any idea how to fix the font files in #2? Can we move it to s3 and download it when building the docs?

@aaronmarkham
Copy link
Contributor

@aaronmarkham any idea how to fix the font files in #2? Can we move it to s3 and download it when building the docs?

I'd research the fonts' licenses, add the appropriate license references, and leave them where they are. Moving to s3 would just add to site latency, and we don't want that.

@leezu
Copy link
Contributor

leezu commented Jan 15, 2020

@hzfan @reminisce @haojin2 Looks like we need to remove the ASF header if this file is directly copied from numpy, and add it to whitelist of the header check.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/numpy/np_einsum_op-inl.h

@roywei The file includes modifications, such as

https://github.com/apache/incubator-mxnet/blob/bd67723da96e6d36e72c9a42535a4fe68f234a71/src/operator/numpy/np_einsum_op-inl.h#L1088

It seems valid to re-license the file under Apache license given that the original license header is included. (Above link suggests this approach is not recommended though)

If we choose not to relicense files with minor modifications, https://github.com/apache/incubator-mxnet/blob/master/cmake/Modules/FindCUDAToolkit.cmake also needs to be updated to remove the ASF header.
The file contains the following minor modifications

https://github.com/apache/incubator-mxnet/blob/bd67723da96e6d36e72c9a42535a4fe68f234a71/cmake/Modules/FindCUDAToolkit.cmake#L713-L715

@roywei
Copy link
Member Author

roywei commented Jan 15, 2020

@hzfan @reminisce @haojin2 Looks like we need to remove the ASF header if this file is directly copied from numpy, and add it to whitelist of the header check.
https://github.com/apache/incubator-mxnet/blob/master/src/operator/numpy/np_einsum_op-inl.h

@roywei The file includes modifications, such as

https://github.com/apache/incubator-mxnet/blob/bd67723da96e6d36e72c9a42535a4fe68f234a71/src/operator/numpy/np_einsum_op-inl.h#L1088

It seems valid to re-license the file under Apache license given that the original license header is included. (Above link suggests this approach is not recommended though)

If we choose not to relicense files with minor modifications, https://github.com/apache/incubator-mxnet/blob/master/cmake/Modules/FindCUDAToolkit.cmake also needs to be updated to remove the ASF header.
The file contains the following minor modifications

https://github.com/apache/incubator-mxnet/blob/bd67723da96e6d36e72c9a42535a4fe68f234a71/cmake/Modules/FindCUDAToolkit.cmake#L713-L715

I'll leave the authors to decide whether to add Apache license if it's modified by MXNet contributors. Either way, we also need to acknowledge the original License and CopyRight in the LICENSE file.

@leezu
Copy link
Contributor

leezu commented Jan 15, 2020

@roywei thanks. When removing the ASF header from incubator-mxnet/cmake/Modules/FindCUDAToolkit.cmake, how to pass the RAT license check? There is no .rat-excludes files in the mxnet repository. How to exclude the file?

@roywei
Copy link
Member Author

roywei commented Jan 15, 2020

@roywei thanks. When removing the ASF header from incubator-mxnet/cmake/Modules/FindCUDAToolkit.cmake, how to pass the RAT license check? There is no .rat-excludes files in the mxnet repository. How to exclude the file?

here it is: https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/apache_rat_license_check/rat-excludes

It's used by CI step here: https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L1457

More reference and history: https://cwiki.apache.org/confluence/display/MXNET/MXNet+Source+Licenses

@TaoLv
Copy link
Member

TaoLv commented Jan 16, 2020

@roywei Regarding [3], [4], do you think we need list the copyrights/license in MXNet top LICENSE file or ask MKL-DNN itself to change? I noticed that in the latest version of MKL-DNN, it already added MathJax copyrights/license to its LICENSE file. See https://github.com/intel/mkl-dnn/blob/master/LICENSE#L352

@roywei
Copy link
Member Author

roywei commented Jan 16, 2020

@roywei Regarding [3], [4], do you think we need list the copyrights/license in MXNet top LICENSE file or ask MKL-DNN itself to change? I noticed that in the latest version of MKL-DNN, it already added MathJax copyrights/license to its LICENSE file. See https://github.com/intel/mkl-dnn/blob/master/LICENSE#L352

Hi @TaoLv , yes, the existing MathJax is for the docs file, we need to also include mkldnn part(add additional file path that use the same license). Basically for all newly added source files, if there is a different license or copyright statement (different than the Apache contributor copyright) we need to state in the top level license file clearly: which file used what license and what’s the copyright. So we also need to list all files or file paths that has intel copyright

@roywei
Copy link
Member Author

roywei commented Jan 18, 2020

Hi @skylook , could you update your copyright on the file listed as No. 7 above?
./example/image-classification/predict-cpp/image-classification-predict.cc

This file was contributed by you before MXNet was donated to Apache Software Foundation. This file only has your copyright, but there is no license. The Apache Software Foundation does not require individual copyright notice from contributors. If you choose to license this file under Apache-2.0, do you mind adding the license ?
If you do not want to license using Apache-2.0, please also state in the file which license you want.
It seems the CPP package are licensed under Apache-2.0, you can add the same text on top of this file.
https://github.com/apache/incubator-mxnet/blob/master/cpp-package/LICENSE

Thanks a lot!

@roywei
Copy link
Member Author

roywei commented Jan 18, 2020

Now we have either created a PR to fix the above issues or contacted the original author to help with files we can't directly change.

Next step is find a script to automatically find files with license/copyright issue.

@szha
Copy link
Member

szha commented Jan 19, 2020

Note that the list from Justin is not an exhaustive one and we will still need to search for issues again by ourselves.

@szha
Copy link
Member

szha commented Jan 22, 2020

  1. ./docs/python_docs/themes/mx-theme/mxtheme/static/material-design-icons-3.0.1/iconfont/MaterialIcons-Regular.*
  2. ./3rdparty/mkldnn/doc/assets/mathjax/jax/output/CommonHTML/fonts/TeX/SansSerif-Regular.js
  3. ./3rdparty/mkldnn/doc/assets/mathjax/extensions/TeX/mhchem3/mhchem.js
  4. ./3rdparty/ps-lite/src/ibverbs_van.h
  5. ./src/operator/numpy/np_einsum_path_op-inl.h
    7 ./example/image-classification/predict-cpp/image-classification-predict.cc
  6. ./tests/nightly/sh2ju.sh
  7. ./docs/python_docs/themes/mx-theme/mxtheme/static/font/Roboto/Roboto-Regular.*
  8. ./docs/python_docs/themes/mx-theme/mxtheme/static/webfonts/fa-solid-900.*

There are multiple files with multiple headers and an automated tool that checks for multiple headers would have helped. In addition, the checks for these files (except 6) are currently ignored by the following rat-exclude rules:

3rdparty/*
.*svg
themes/*
image-classification/*

It means that these rules should be revisited and we may find more issues hidden by these rules.

@roywei
Copy link
Member Author

roywei commented Jan 22, 2020

Yes, agree we need a script and the exclude rules for rat-check need to be revisited. We should avoid bulk exclusion entire folder. Anything in the exclusion need to be manually checked for license and copyright compliance.

It's quite tricky to translate the following rules to scripts.
https://www.apache.org/legal/src-headers.html
https://www.apache.org/legal/resolved.html

Right now I can think of the following rules. Any suggestion is welcome.
For our own source files:

  1. files that has two license headers
  2. files that has copy right statement, and not acknowledged in top level LICENSE file. (normal ASF contribution does not require specific copyright notice in source file)
  3. files that has license header other than ASF license, and not acknowledged in top level LICENSE file.

For 3rd party source files, it's more tricky and each dependency may come with files with several license and copy right.

We may need to rely on some existing tools.
https://github.com/nexB/scancode-toolkit

@szha
Copy link
Member

szha commented Jan 24, 2020

Let's adopt the work-in-progress disclaimer as suggested by Justin, restart the vote once the disclaimer and the above known issues are updated, and continue to work on this.

This was referenced Jan 24, 2020
@szha
Copy link
Member

szha commented Feb 20, 2020

Now that mshadow is donated to mxnet, its license should be updated: https://github.com/apache/incubator-mxnet/blob/master/3rdparty/mshadow/LICENSE#L1

@mjwall
Copy link
Member

mjwall commented Feb 20, 2020

Now that mshadow is donated to mxnet, its license should be updated: https://github.com/apache/incubator-mxnet/blob/master/3rdparty/mshadow/LICENSE#L1

And the license in all the source files. Thanks @szha

@mjwall
Copy link
Member

mjwall commented Feb 20, 2020

Also should update the following, unsure if these are being tracked elsewhere

./cpp-package/LICENSE
./example/rcnn/LICENSE
./example/gluon/tree_lstm/LICENSE
./docs/python_docs/themes/mx-theme/LICENSE

@ciyongch
Copy link
Contributor

@szha do you know if there's anyone working on updating mshadow license? I can update this part if it's still not started.

@ciyongch
Copy link
Contributor

Also should update the following, unsure if these are being tracked elsewhere

./cpp-package/LICENSE
./example/rcnn/LICENSE
./example/gluon/tree_lstm/LICENSE
./docs/python_docs/themes/mx-theme/LICENSE

@mjwall , ./cpp-package/LICENSE and ./docs/python_docs/themes/mx-theme/LICENSE was updated in #18109, while the ./example/rcnn/LICENSE is already with Apache License, and ./example/gluon/tree_lstm/LICENSE is MIT License which was traced in top-level LICENSE.

@ciyongch
Copy link
Contributor

ciyongch commented Jun 3, 2020

Hi @szha @roywei @leezu @TaoLv , as almost all of the listed issues here were addressed. I tried to enhance the current license check tool to cover multiple license issue in PR #18478, please help to take a review.
With this enhancement, there're several files with multiple license header were exposed on v1.7.x branch, please check below:

  1. python/mxnet/contrib/onnx/mx2onnx/_op_translations.py
  2. python/mxnet/contrib/onnx/mx2onnx/export_onnx.py
  3. src/operator/contrib/nn/modulated_deformable_im2col.cuh
  4. src/operator/contrib/nn/modulated_deformable_im2col.h
  5. src/operator/numpy/np_einsum_op.cc
  6. src/operator/numpy/np_einsum_path_op-inl.h

Regarding 1 and 2, not sure it's covered by https://github.com/apache/incubator-mxnet/blob/v1.7.x/LICENSE#L475 or not, or do we need to update the header?

Regarding 3 and 4, should we need to remove the ASF header and add them in top level license file similar to https://github.com/apache/incubator-mxnet/blob/v1.7.x/LICENSE#L389-L395

Regarding 5 and 6, based on the discussion here, I'm going to remove the ASF header and add them in top level license file, is it ok?

What do you think?
Thanks.
-Ciyong

@leezu
Copy link
Contributor

leezu commented Jun 3, 2020

@ciyongch in general the following applies:

The term "third-party work" refers to a work not submitted directly to the ASF by the copyright owner or owner's agent. This includes parts of a work submitted directly to the ASF for which the submitter is not the copyright owner or owner's agent.
Do not modify or remove any copyright notices or licenses within third-party works.
Do ensure that every third-party work includes its associated license, even if that requires adding a copy of the license from the third-party download site into the distribution.
Do not add the standard Apache License header to the top of third-party source files.
Minor modifications/additions to third-party source files should typically be licensed under the same terms as the rest of the rest of the third-party source for convenience.
Major modifications/additions to third-party should be dealt with on a case-by-case basis by the PMC.

https://www.apache.org/legal/src-headers.html

I would thus recommend to remove the ASF headers from the files that do not contain substantial modifications compared to their 3rd-party original. This will need to be checked for each file and can only happen with agreement of the code authors (see the respective git history who added and modified the files).

Unfortunately I'm not sure what the "Major modifications/additions to third-party should be dealt with on a case-by-case basis by the PMC." process is. How do we handle the license headers in this situation?

Regarding #17329 (comment): In that example I added small modifications to the third-party work and was happy to re-license it under the 3rd-party license after becoming aware of the ASF policy.

For the examples in question here:

  • mx2onnx: The problem here is that the code added by MXNet developers was licensed to the ASF. So formally we can't just remove the license header without asking the developers of the code.

  • modulated_deformable_im2col is a single commit that added the two license headers as well as the majority of the implementation. @zhreshold can you please clarify if the commit introduces substantial modifications/additions compared to the CAFFE version?

  • einsum operator files @hzfan may clarify the extent of their difference to the upstream version and why the Numpy headers are included.

@ciyongch
Copy link
Contributor

ciyongch commented Jun 4, 2020

Thanks for your inputs and help to ping the original author @leezu!

Regarding mx2onnx stuff, the following python/license files and claim in top level license file were added in the very first submit #11213 by @Roshrini .

I'm wondering if it's ok to leave the following files (includes multiple license header) as is now as they're claimed in the top level license file. Any comments? @szha @TaoLv

python/mxnet/contrib/onnx/mx2onnx/_op_translations.py
python/mxnet/contrib/onnx/mx2onnx/export_onnx.py
contrib/onnx/mx2onnx/LICENSE

@ciyongch
Copy link
Contributor

ciyongch commented Jun 5, 2020

Hi @zhreshold , @hzfan , can you help to have a check on the license header of the following files? Thanks.

src/operator/contrib/nn/modulated_deformable_im2col.cuh
src/operator/contrib/nn/modulated_deformable_im2col.h
src/operator/numpy/np_einsum_op.cc
src/operator/numpy/np_einsum_path_op-inl.h

@zhreshold
Copy link
Member

@ciyongch
Copy link
Contributor

ciyongch commented Jun 6, 2020

Thanks @zhreshold for the link.
As the original file only contains MIT License from the link you provided, while the following MXNet's files stated MIT License, ASF License and Caffe Copyright Notice and Disclaimer.
Besides, the new codes in MXNet more like the majority modification to the original version, @zhreshold can you please help to confirm this?

src/operator/contrib/nn/modulated_deformable_im2col.cuh
src/operator/contrib/nn/modulated_deformable_im2col.h

So the above two files matches the term of Major modifications/additions to third-party should be dealt with on a case-by-case basis by the PMC., may I have your comments for how to deal with this case @leezu @szha @TaoLv , Thanks!

Ping @hzfan to help clarify this:

src/operator/numpy/np_einsum_op.cc
src/operator/numpy/np_einsum_path_op-inl.h
*einsum operator files @hzfan may clarify the extent of their difference to the upstream version and why the Numpy headers are included.

@hzfan
Copy link
Contributor

hzfan commented Jun 6, 2020

@ciyongch The two files originate from https://github.com/numpy/numpy/blob/master/numpy/core/einsumfunc.py . I translated them from python to cpp. The original files are subject to the the following license: https://github.com/numpy/numpy/blob/master/LICENSE.txt

@ciyongch
Copy link
Contributor

ciyongch commented Jun 7, 2020

Thanks @hzfan for the update :)

Based on the inputs from @hzfan and @zhreshold , the following files are not minor modifications but major modification/addition compared to the origin version. Any suggestions for these files or just leave them as it is? @leezu @roywei @szha @TaoLv , thanks!

src/operator/contrib/nn/modulated_deformable_im2col.cuh
src/operator/contrib/nn/modulated_deformable_im2col.h
src/operator/numpy/np_einsum_op.cc
src/operator/numpy/np_einsum_path_op-inl.h

@szha
Copy link
Member

szha commented Jun 8, 2020

@ciyongch what are the requirements regarding modification from the original licenses of these source code?

@ciyongch
Copy link
Contributor

ciyongch commented Jun 9, 2020

Hi @szha, for the two im2col files, they're based on Caffe Copyright Notice and Disclaimer and MIT license, for the two np_einsum files, they're based on NumPy Developers Copyright. The common requirement of them is

 * 1. Redistributions of source code must retain the above copyright notice, this
 * list of conditions and the following disclaimer.

Please check below details:
Caffe Copyright Notice and Disclaimer
https://github.com/apache/incubator-mxnet/blob/v1.7.x/src/operator/contrib/nn/modulated_deformable_im2col.cuh#L45

MIT License
https://github.com/apache/incubator-mxnet/blob/v1.7.x/LICENSE#L967

NumPy Developers Copyright
https://github.com/apache/incubator-mxnet/blob/v1.7.x/src/operator/numpy/np_einsum_path_op-inl.h#L28

So the current multiple license header seems valid, what do you think? Thanks!

@szha
Copy link
Member

szha commented Jun 9, 2020

relicensing them would require approval from copyright owners. Both the MIT license and the BSD 3-clause for numpy are in category A which we can include, so there shouldn't be a requirement for relicensing them. https://www.apache.org/legal/resolved.html#category-a

@leezu
Copy link
Contributor

leezu commented Jun 9, 2020

I request a recommendation of the MXNet ASF Mentors for handling the "2 license headers" issue. According to the ASF policy, there should not be two license headers. But maybe it's fine in this case. Please see https://lists.apache.org/thread.html/r6d8f62b1ad34f6dbe109a01e715723e8996cd478a01a08498aa02a2e%40%3Cdev.mxnet.apache.org%3E

@szha
Copy link
Member

szha commented Jul 22, 2020

what's the conclusion for this issue?

@ciyongch
Copy link
Contributor

Hi @szha, the way of handling dual license header was passed with lazy consensus, and I updated the related files based on the following conclusion.

Including the BSD-3 style license in releases wouldn't be a problem, as it's compatible with Apache License 2. As there are >substantial changes, I believe PPMC would prefer to put the ASF header on top of the file (ie. 2 headers) [72 hours lazy >consensus if there are no concerns]. We still need to declare all the numpy einsum derived files in the LICENSE and fix the >inconsistency that ASF header was removed in src/operator/numpy/np_einsum_op-inl.h but remains in >src/operator/numpy/np_einsum_path_op-inl.h

For more details, please see https://lists.apache.org/thread.html/r6d8f62b1ad34f6dbe109a01e715723e8996cd478a01a08498aa02a2e%40%3Cdev.mxnet.apache.org%3E

@szha
Copy link
Member

szha commented Jul 22, 2020

Thanks. I think most of the license issues have been addressed, and the open issue on pybind11 license is already tracked separately.

Thanks for the efforts!

@szha szha closed this as completed Jul 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

9 participants