reduction_op with axis + keepdim #845

Closed
mtar opened this issue Jul 26, 2021 · 0 comments · Fixed by #846

Labels: bug (Something isn't working), redistribution (Related to distributed tensors), reduction operators (Operators for calculations that reduce tensor dimension(s))

mtar (Collaborator) commented Jul 26, 2021

Description
A clear and concise description of the bug and the associated functionality.

reduce_op fails when the axis argument passed to a reduction function is exactly one less than the DNDarray's split axis and keepdim is True.

To Reproduce
Steps to reproduce the behavior:

  1. Which module/class/function is affected?
    _operations.reduce_op and the functions that call it (sum, max, min, ...).
  2. What are the circumstances under which the bug appears?
    axis == split - 1 together with keepdim == True
  3. What is the exact error message / erroneous behavior?
    With two processes:
[1,0]<stderr>:  File "mpi4py/MPI/Comm.pyx", line 652, in mpi4py.MPI.Comm.Allgatherv
[1,0]<stderr>:mpi4py.MPI.Exception: MPI_ERR_TRUNCATE: message truncated

With more than two processes, the call hangs.
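
The triggering condition from item 2 can be written as a small predicate. This is only a sketch for reasoning about which argument combinations are affected: `matches_reported_case` is a hypothetical helper, not part of Heat, and it assumes `axis` and `split` are non-negative integers or None.

```python
def matches_reported_case(axis, split, keepdim):
    """Return True if (axis, split, keepdim) matches the combination reported here.

    Hypothetical helper for illustration only; it is not part of Heat.
    Assumes axis and split are non-negative integers or None.
    """
    if axis is None or split is None:
        return False
    return bool(keepdim) and axis == split - 1


# A few (axis, split, keepdim) combinations to classify.
combinations = [(0, 1, True), (1, 2, True), (0, 1, False), (0, 2, True)]
for axis, split, keepdim in combinations:
    print(axis, split, keepdim, matches_reported_case(axis, split, keepdim))
```

Only the first two combinations satisfy the condition; dropping keepdim or moving the axis further from the split axis falls outside the reported case.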

Expected behavior
A clear and concise description of what you expected to happen.

The call returns a DNDarray, with the reduced axis kept as a dimension of size 1.

Illustrative
If applicable, add screenshots or minimal examples to help explain your problem.

# Run on two or more MPI processes to trigger the failure.
import heat as ht

ht.sum(ht.ones((3,3), split=1), axis=0, keepdim=True)
ht.max(ht.ones((3,3,3), split=1), axis=0, keepdim=True)
ht.sum(ht.ones((3,3,3), split=2), axis=1, keepdim=True)
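
For reference, the global shapes these three calls would be expected to return can be worked out without Heat or MPI. The sketch below assumes the usual NumPy-style keepdim semantics (the reduced axis stays as a dimension of size 1); `keepdim_shape` is a hypothetical helper introduced only for this illustration.

```python
def keepdim_shape(shape, axis):
    """Shape after reducing `axis` while keeping it as a dimension of size 1."""
    return tuple(1 if i == axis else n for i, n in enumerate(shape))


# (input shape, reduction axis) for the three calls in this report
calls = [((3, 3), 0), ((3, 3, 3), 0), ((3, 3, 3), 1)]
expected_shapes = [keepdim_shape(shape, axis) for shape, axis in calls]
print(expected_shapes)  # [(1, 3), (1, 3, 3), (3, 1, 3)]
```

A regression test for the fix could compare the shape of each returned DNDarray against these values.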

Version Info
Which version are you using?
master

Additional comments
Any other comments here.

mtar added the labels bug, reduction operators, redistribution Jul 26, 2021
mtar changed the title reduction_op with axis + keepdim multi-process reduction_op with axis + keepdim Jul 26, 2021
mtar self-assigned this Jul 27, 2021
mtar mentioned this issue Jul 27, 2021
mtar linked a pull request Jul 27, 2021 that will close this issue