Add `out` and `where` args for `ht.div` #945

Conversation
GPU cluster tests are currently disabled on this Pull Request.
Codecov Report

@@            Coverage Diff            @@
##             main     #945    +/-   ##
=========================================
  Coverage   95.39%   95.39%
=========================================
  Files          65       65
  Lines        9965     9976    +11
=========================================
+ Hits         9506     9517    +11
  Misses        459      459
…nhan/heat into features/870-divide-kwargs
Found a cleaner way to handle indexing of the …
@neosunhan thank you so much for taking this on!

I have a few comments; the main thing is that `where` should probably be addressed within `_operations.__binary_op()`. This way, it would be available to all binary operations, and it would be easier to satisfy the condition that `ht.divide(t1, t2)` returns `out`, not `t1`, where `where` is False (which also holds for all numpy binary operations).

Please let me know if you need help with `__binary_op`!
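A small numpy illustration of the semantics described here may help (this example is the editor's, not from the PR): wherever the condition is False, the result keeps the values of `out`, not those of the first operand.

```python
import numpy as np

t1 = np.array([2.0, 4.0, 6.0])
t2 = np.array([2.0, 2.0, 2.0])
out = np.full(3, -1.0)
mask = np.array([True, False, True])

np.divide(t1, t2, out=out, where=mask)
print(out)  # [ 1. -1.  3.] -> position 1 keeps out's original -1.0, not t1's 4.0
```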
heat/core/arithmetics.py (Outdated)

@@ -438,6 +444,10 @@ def div(t1: Union[DNDarray, float], t2: Union[DNDarray, float]) -> DNDarray:
        The first operand whose values are divided
    t2: DNDarray or scalar
        The second operand by whose values is divided
    out: DNDarray, optional
        The output array. It must have a shape that the inputs broadcast to
shape and split dimension
heat/core/arithmetics.py (Outdated)

    out: DNDarray, optional
        The output array. It must have a shape that the inputs broadcast to
    where: DNDarray, optional
        Condition of interest, where true yield divided value else yield original value in t1
We should follow `numpy.divide`, so `ht.divide` should actually yield `out` where `where` is False (and uninitialized values when `out=None`).
heat/core/arithmetics.py
Outdated
if where is not None: | ||
t2 = indexing.where(where, t2, 1) | ||
|
||
return _operations.__binary_op(torch.true_divide, t1, t2, out) |
We should return `out` instead of `t1` where `where` is False. And in fact this applies to all binary operations, as I admittedly had not realized when I created this issue.

So the way to go here would be to modify `_operations.__binary_op` to accommodate the `where` kwarg once and for all. Do you need help with that?
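A rough, torch-level sketch of what accommodating `where` inside a generic binary operation could look like (names and structure are assumed by the editor, not taken from heat's actual `__binary_op`):

```python
import torch

def binary_op_sketch(operation, t1, t2, out=None, where=None):
    """Apply `operation(t1, t2)` elementwise, honouring out/where semantics."""
    result = operation(t1, t2)
    if where is None:
        if out is None:
            return result
        out.copy_(result)
        return out
    if out is None:
        # fresh buffer: entries where `where` is False stay uninitialized
        out = torch.empty_like(result)
    # keep out's previous values wherever the condition is False
    out[...] = torch.where(where, result, out)
    return out

# e.g. binary_op_sketch(torch.true_divide, a, b, out=buf, where=mask)
```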
@ClaudiaComito Thanks for the feedback! I have taken a look at `__binary_op`. The main idea is to create a new DNDarray with uninitialized values (using …)
Great job @neosunhan! 👏 I have left some comments in the code.
heat/core/_operations.py (Outdated)

        out.larray[:] = operation(
            t1.larray.type(promoted_type), t2.larray.type(promoted_type), **fn_kwargs
    else:
        out_tensor = torch.empty(output_shape, dtype=promoted_type)
2 comments here:

- `output_shape` is the global (memory-distributed) shape of the output DNDarray; here you're initializing a potentially huge torch tensor. In this case you should call `factories.empty(output_shape, dtype=..., split=..., device=...)` and that will take care of only initializing slices of the global array on each process (I think this is also why the tests fail, btw).
- If I understand the numpy docs correctly, this empty `out` DNDarray only needs to be initialized if `where` is not None.
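For illustration, a minimal sketch (editor's own, using heat's public `ht.empty` factory; internally the call would go through `factories.empty`) of the distributed allocation suggested above:

```python
import heat as ht

output_shape = (10000, 10000)
# Each MPI process only allocates its local slice of the global array,
# instead of a full-size torch tensor per process.
out = ht.empty(output_shape, dtype=ht.float32, split=0)
print(out.shape, out.lshape)  # global shape vs. this process's local chunk
```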
@ClaudiaComito I have modified the code accordingly and added a few more tests.

run tests
@neosunhan thanks a lot. This looks good, as far as I can tell. Can you update the CHANGELOG? When you're done, we can run the GPU tests.
Updated CHANGELOG for the new `out`/`where` kwargs. Also, I believe there was a typo in the CHANGELOG where the "Feature Additions" heading was repeated 3 times, which I have fixed.

run tests
@neosunhan please bear with me. While we're trying to figure out why the GPU tests fail, it occurred to me that we are not checking for `where`'s distribution scheme and whether it fits the way `out` is distributed.
heat/core/_operations.py
Outdated
@@ -43,6 +44,8 @@ def __binary_op( | |||
The second operand involved in the operation, | |||
out: DNDarray, optional | |||
Output buffer in which the result is placed | |||
where: DNDarray, optional | |||
Condition of interest, where true yield the result of the operation else yield original value in out (uninitialized when out=None) |
We can use numpy's docs for `where`, I think they are a bit clearer. But we must expand on them a bit, e.g. is `where` supposed/expected to be distributed, and how.
@ClaudiaComito I've added several test cases with different split configurations of …
@neosunhan If you look at the list of checks below, you will see that the continuous-integration/jenkins/pr-merge has failed. Click on details, and you can see that the tests passed on 1 process, but failed on 2. Do you need help setting up your environment for multi-process? If you have installed …
@ClaudiaComito Thanks for the help! I have added the check for …

run tests
I'm ready to approve this, great job @neosunhan. Just a small documentation update needed.
heat/core/_operations.py (Outdated)

        will be set to the result of the operation. Elsewhere, the `out` array will retain its original
        value. If an uninitialized `out` array is created via the default `out=None`, locations within
        it where the condition is False will remain uninitialized. If distributed, must be distributed
        along the same dimension as the `out` array.
This is only correct if `where` and `out` have the same shape.

For example, if `out` is (100, 10000) and distributed along the columns (`out.split` is 1), and `where` is (10000,), their shapes are broadcastable but `where` must be distributed along 0.
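To make that example concrete, here is a small heat sketch (editor's own, with assumed values) of those shapes and splits:

```python
import heat as ht

out = ht.empty((100, 10000), split=1)               # distributed along columns
where = ht.zeros((10000,), dtype=ht.bool, split=0)  # 1-D condition, broadcast across rows

# After broadcasting, where's single axis lines up with axis 1 of out,
# so where must be split along its own axis 0 to match out's distribution.
```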
"If distributed, the split axis (after broadcasting if required) must match that of the `out` array."

Would this phrasing be accurate?
yes, sounds accurate
run tests
One check can be saved
Co-authored-by: mtar <m.tarnawa@fz-juelich.de>
run tests

run tests
thanks a lot @neosunhan!
Description

Implementation of `out` and `where` functionality for `ht.divide` is fairly straightforward. However, using both arguments at the same time causes complications due to the lack of support for an in-place `where` in pytorch. Currently, I use a workaround that replaces the underlying `torch.Tensor` of the `out` DNDarray. Possible alternatives include updating `indexing.where` to allow in-place modification, or implementing a function that allows in-place modification using a mask (similar to `numpy.copyto`).

Issue/s resolved: #870
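A minimal torch-level sketch of the workaround described above (editor's own illustration with hypothetical names, not the actual heat code): the division is computed out of place, merged with `out`'s previous values under the mask, and the resulting tensor would then replace the `out` DNDarray's underlying `torch.Tensor`.

```python
import torch

def div_with_out_and_where(t1, t2, out, where):
    # pytorch has no in-place masked divide, so compute the quotient out of place
    result = torch.true_divide(t1, t2)
    # keep out's original entries wherever the condition is False
    merged = torch.where(where, result, out)
    return merged  # in heat, this tensor would become the new out.larray
```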
Changes proposed:

- `out` functionality via `true_divide` in the pytorch backend
- `where` functionality via `indexing.where`
Type of change
New feature (non-breaking change which adds functionality)
Due Diligence
Does this change modify the behaviour of other functions? If so, which?
no