Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using efficient np.linalg.multi_dot #271

Merged
merged 8 commits into from
Nov 10, 2022
Merged

Conversation

maldil
Copy link
Contributor

@maldil maldil commented Jul 5, 2022

It is more succinct and effective to refactor code to use np.linalg.multi_dot rather than numerous np.dot routines.
What do you think about this change that has practical value?

  • Are there docstrings ? Do they follow the [numpydoc] (https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) style ? There are no docstrings
  • Have you run the tests by doing nosetests or py.test at the root of the repo ?
  • Have you checked that the doc builds properly and that any new file has been added to the repo ? How to do that is covered in the documentation. Yes
  • Is there a unit test for the proposed code modification ? If the PR addresses an issue, the test should make sure the issue is fixed. No, this is a refactoring.
  • Last but not least, did you document the proposed change in the CHANGELOG file ? It should go under "Unreleased". No, this is a refactoring.

Happy PR 😃

@fakufaku
Copy link
Collaborator

fakufaku commented Jul 6, 2022

Hi @maldil , thank you so much for taking the time to open a pull request.

I didn't know about multi_dot. Thank you for pointing this out. It is new from numpy v1.19.0.

This would break compatibility for older versions of numpy, so I don't think we should use it, unless there is a large speed improvement for example.
Have you measured the runtime difference ?

@maldil
Copy link
Contributor Author

maldil commented Jul 6, 2022

Hi,
Thank you for the comment. In the API doc, It is recommended to use np.linalg.multi_dot instead of multiple dot due to efficiency. Following example further proved to me that it is 4.7x faster than using multiple dot s.

    from datetime import datetime

    A = np.random.random((10000, 100))
    B = np.random.random((100, 1000))
    C = np.random.random((1000, 5))
    D = np.random.random((5, 333))

    multi_dot_start = datetime.now()
    _ = np.linalg.multi_dot([A, B, C, D])
    multi_dot_end = datetime.now()
    print("multi_dot execution time",(multi_dot_end-multi_dot_start).microseconds)

    dot_start = datetime.now()
    _ = np.dot(np.dot(np.dot(A, B), C), D)
    dot_end = datetime.now()
    print("multi_dot execution time",(dot_end - dot_start).microseconds)
    np.linalg.multi_dot([A[k], SS.ppT[k], A[k].T])```

@fakufaku
Copy link
Collaborator

fakufaku commented Jul 6, 2022

Right, I understand it should be faster because it will pick the optimal ordering of parenthesis.
I'm more interested in the practical impact on the performance of DOA computations. If we can get 4.5x improvement overall on the DOA module that would be worth thinking about it.
Do you have a performance comparison on the DOA module ?

@maldil
Copy link
Contributor Author

maldil commented Jul 6, 2022

No, I do not have performance comparison for DOA module right now.

@fakufaku
Copy link
Collaborator

fakufaku commented Jul 6, 2022

I think you could quickly make one by timing the doa algorithms in the examples/doa_algorithms.py script. For runtime measurements, it is recommended to use time.perf_counter, repeat the computation you want to measure several times and take the mean or median.

@maldil
Copy link
Contributor Author

maldil commented Jul 6, 2022

I'm still looking for spare time to do this experiment. Regarding your concern about the NumPy version, it appears to have been around for a long, at least since 1.13. This is the oldest API doc I could locate. It was present in 1.13, according to that.

To be sure, I installed the oldest installable NumPy version on my Mac. I was able to use np.linalg.multi_dot as seen below.

Screen Shot 2022-07-06 at 12 25 16 PM

On a side topic, if you use a certain NumPy version, I would urge that you declare it here. Because it now downloads the most recent version based on the requirement.txt.

@fakufaku
Copy link
Collaborator

fakufaku commented Jul 7, 2022

Hum, you are right, numpy 1.13.0 was released in June 2017. We can probably reasonably expect most people to use something more recent 😄
But then, we need to update the requirements in

to numpy>=1.13.0.

@fakufaku fakufaku changed the base branch from master to robin/pr271 November 10, 2022 01:04
@fakufaku fakufaku merged commit 4404591 into LCAV:robin/pr271 Nov 10, 2022
fakufaku added a commit that referenced this pull request Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants