Add segment sum Op to relay and 7 corresponding TF Ops, fix scatter_add dynamic bug #7562
Conversation
@masahi @tkonolige @mbrookhart @ymwangg PTAL.
Nice, are you going to add the frontend?
Yes, do you prefer I add it in this PR or the next one? I want to add frontends for multiple framework ops based on this relay op.
Yes, I think it's better to add frontends (TF, PT) to make sure they are supported by this op.
@masahi I have added 3 TF Ops to the frontend, all of which use this op. Let me know if that's enough.
Can you also try PT EmbeddingBag?
Hey @masahi, upon closely reading the EmbeddingBag documentation, it seems that: (referencing the
Now all of these ops exist except
Let me know your thoughts on the best way to reuse existing code. After that, the implementation would be only a trivial few lines.
OK, let's do EmbeddingBag later, then.
Looks pretty good. A couple of documentation improvements would be nice, though.
@tkonolige I have finished addressing your comments, please re-review.
Actually I would like to add another related op in this PR. I will ping you after I am done with that.
@tkonolige @masahi, I am done with the PR. Please review/re-review.
A couple of minor comments.
Overall LGTM.
Could you add a direct test for scatter_add with dynamic inputs? That would help identify problems in the future.
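For illustration, such a test might look something like the following sketch (hypothetical code, using `relay.Any()` for the dynamic dimension and the VM executor, which supports dynamic shapes; the actual test in the PR may be structured differently):

import numpy as np
import tvm
from tvm import relay

# Build scatter_add with a dynamic first dimension on every input.
data = relay.var("data", shape=(relay.Any(),), dtype="float32")
indices = relay.var("indices", shape=(relay.Any(),), dtype="int64")
updates = relay.var("updates", shape=(relay.Any(),), dtype="float32")
out = relay.scatter_add(data, indices, updates, axis=0)
mod = tvm.IRModule.from_expr(relay.Function([data, indices, updates], out))

# The VM executor can run functions whose shapes are only known at runtime.
ex = relay.create_executor("vm", mod=mod, target="llvm")
result = ex.evaluate()(
    np.zeros(4, dtype="float32"),
    np.array([0, 1, 0], dtype="int64"),
    np.array([1.0, 2.0, 3.0], dtype="float32"),
)
# out[i] = data[i] + sum of updates[j] where indices[j] == i
np.testing.assert_allclose(result.numpy(), [4.0, 2.0, 0.0, 0.0])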
@tkonolige int64 is not allowed with TF sparse ops; I added it to the relay op tests and the TF math ops.
assert len(inputs) == 3, "There should be 3 input tensors"
# Gather the rows of the table selected by the indices, then sum them per segment.
data = _op.take(inputs[0], inputs[1], axis=0)
return _op.segment_sum(data, inputs[2])
This is OK for now, but we definitely want a fused implementation here, just like TF/PT/C2 have. I don't expect this would work for the huge embedding tables people want to use in practice.
I agree. When you say a "fused implementation", do you mean that all of it happens in a single IR?
Do you have any examples of what a "fused implementation" is? Does this mean that in a fused implementation, the frontend will always just be a one-liner?
In this case, I understand we must do the `take` and the addition from `segment_sum` simultaneously for performance. So a fused implementation in that case would be a new op?
By "fused" I meant we shouldn't materialize the result of take
, which can be huge. In a fused implementation, we need to look up indices and accumulate the sum on the fly. This is why PT has EmbeddingBag
op, see their doc https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html.
Yes, a complicated op like this will not likely be feasible if we rely only on Relay-level op fusion. We need a dedicated sparse_segment_sum
TOPI/Relay op.
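To make the on-the-fly accumulation concrete, here is a minimal NumPy sketch (an editor's illustration, not code from this PR) of an EmbeddingBag-style sum: each output row is accumulated directly from the table, so the `take` result is never materialized.

import numpy as np

def embedding_bag_sum(weight, indices, offsets):
    # Bag b sums the rows of `weight` selected by indices[offsets[b]:offsets[b+1]],
    # mirroring torch.nn.EmbeddingBag(mode="sum") without include_last_offset.
    out = np.zeros((len(offsets), weight.shape[1]), dtype=weight.dtype)
    bounds = np.append(offsets, len(indices))
    for b in range(len(offsets)):
        for i in indices[bounds[b]:bounds[b + 1]]:
            out[b] += weight[i]  # look up one row and accumulate on the fly
    return out

weight = np.arange(8, dtype="float32").reshape(4, 2)
print(embedding_bag_sum(weight, np.array([0, 3, 1]), np.array([0, 2])))
# bag 0 = rows 0 and 3, bag 1 = row 1 -> [[6. 8.], [2. 3.]]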
I think he meant that `scatter_nd` exactly realizes the fused `take` and `segment_sum` above. I haven't put deep thought into this, but it made sense to me. But I remember that parallelizing `scatter_nd` looked harder than `scatter_add`.
Yes, I am having a bit of a mental block understanding how `take` plus `segment_sum` is essentially `scatter_nd`. Would any of you mind writing some small pseudocode?
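A minimal NumPy sketch of the equivalence in question (illustrative only, not code from this PR): the unfused path materializes the gathered rows before reducing, while the scatter-style loop consumes indices on both the input and the output side and materializes nothing in between.

import numpy as np

params = np.arange(12, dtype="float32").reshape(4, 3)
indices = np.array([0, 1, 3])      # rows to gather from params
segment_ids = np.array([0, 0, 1])  # destination row for each gathered row

# Unfused: materialize take(params, indices, axis=0), then segment_sum.
gathered = params[indices]
unfused = np.zeros((2, 3), dtype="float32")
np.add.at(unfused, segment_ids, gathered)

# Scatter-style: one pass, reading params[src] and adding into out[dst].
fused = np.zeros((2, 3), dtype="float32")
for src, dst in zip(indices, segment_ids):
    fused[dst] += params[src]

assert np.allclose(unfused, fused)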
FWIW I did a few variants of `torch.nn.EmbeddingBag`, `c2::sparse_length_sum`, etc. in TVM IR in https://github.com/ajtulloch/tvm/blob/4b98beb75ca1505ec81ddca358ad61282ab6a05b/topi/python/topi/x86/sparse.py#L162-L257, https://github.com/ajtulloch/tvm/blob/sparse-ops/topi/python/topi/sparse/sparse_lengths_sum.py#L45-L98, and https://github.com/ajtulloch/sparse-ads-baselines/blob/a495ea076882615d454d27a1a5b191ec675d3acc/lxu_cache_cpu_funcs.py#L8-L149, if that's of interest.
Thinking about this more, I believe the `take` is necessary if we are using `scatter_nd`. We could make a more generic version of `scatter_nd` and `gather_nd` that has indices in both the input and output buffers. That would cover this case.
OK, I'll merge this as it is then.
Thanks @codeislife99 @tkonolige @mbrookhart
…add dynamic bug (apache#7562)

* Add segment sum Op
* Remove unnecessary
* Documentation
* Black
* Add GPU
* Uncomment
* Add documentation
* Add dynamic tests
* Add TF Op
* Add Sparse Segment Sum
* Add test coverage
* PR Comments
* Int64 tests
* Add SparseSegmentSqrtN
* Add SparseSegmentSqrtNOp
* Deduplicate code
* Add SparseSegmentMean
* Parametrize Tests
* Remove
* Modularize
* Black
* Modularize Code
* Pylint
* PR Comments
* Add scatter add tests
* Remove Test

Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-251.us-east-2.compute.internal>
This PR adds the Segment Sum Op, which will serve as a generic op for multiple framework-specific ops:
TensorFlow -- tf.math.segment_sum, tf.sparse.segment_sum
Caffe2 -- sparse lengths sum
PyTorch -- EmbeddingBag
Since this PR uses scatter_add, it also makes some small changes to make it work for dynamic inputs.
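For reference, a NumPy sketch of the segment sum semantics (mirroring tf.math.segment_sum, which assumes segment ids are sorted):

import numpy as np

def segment_sum_ref(data, segment_ids):
    # Output row i is the sum of all rows of `data` whose segment id equals i.
    num_segments = int(segment_ids[-1]) + 1  # ids assumed sorted, as in TF
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    np.add.at(out, segment_ids, data)  # unbuffered scatter-add handles repeated ids
    return out

data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(segment_sum_ref(data, np.array([0, 0, 1])))
# [[4. 6.]
#  [5. 6.]]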