
Kernels for GroupNorm #353

Merged

merged 23 commits into linkedin:main on Nov 7, 2024
Conversation

@pramodith (Collaborator) commented Nov 5, 2024

Summary

Implementation of a GroupNorm kernel that achieves output parity with torch's GroupNorm.

This feature is part of #285.

Details

The formulas/equations involved in GroupNorm are the same as in LayerNorm/BatchNorm; the main differences lie in the axes along which the mean and standard deviation are computed and the shapes of the affine transformation parameters.

In GroupNorm, W and B have shape (n_channels); however, the mean and std are calculated over all the channels in a given group.
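As a point of reference (not the Triton kernel in this PR), here is a minimal PyTorch sketch of the math described above, assuming a (batch, n_channels, hidden_dim) input; the channel count below is illustrative, with 4 channels per group as in the benchmarks:

```python
import torch

def group_norm_ref(x, weight, bias, num_groups, eps=1e-6):
    # x: (batch, n_channels, hidden_dim); weight/bias: (n_channels,)
    b, c, h = x.shape
    xg = x.view(b, num_groups, -1)                      # flatten each group of channels
    mean = xg.mean(dim=-1, keepdim=True)                # one mean per (sample, group)
    var = xg.var(dim=-1, keepdim=True, unbiased=False)  # one variance per (sample, group)
    y = ((xg - mean) / torch.sqrt(var + eps)).view(b, c, h)
    return y * weight.view(1, c, 1) + bias.view(1, c, 1)  # per-channel affine

x = torch.randn(128, 64, 512)                   # batch 128, 64 channels (illustrative), hidden dim 512
weight, bias = torch.ones(64), torch.zeros(64)  # affine parameters are per channel
torch_gn = torch.nn.GroupNorm(num_groups=16, num_channels=64, eps=1e-6)  # 4 channels per group
assert torch.allclose(group_norm_ref(x, weight, bias, 16), torch_gn(x), atol=1e-5)
```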

Testing Done

Testing was done on an A100 PCIe and an A100 SXM4.

We see an increase in speed, while total memory usage remains about the same. Note that benchmarking was done with a batch size of 128, a hidden dimension of 512, and the number of channels per group fixed at 4.

These results look very similar to the LayerNorm benchmarks as well.

Figure: group_norm_memory (memory benchmark plot)

Figure: group_norm_speed (speed benchmark plot)
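For context, a rough timing sketch under the stated configuration (batch 128, hidden dim 512, 4 channels per group; the channel count below is arbitrary). The LigerGroupNorm import path and constructor signature are assumptions, and the repo's actual benchmark harness may differ:

```python
# Rough timing sketch, not the repo's benchmark script.
import torch
from liger_kernel.transformers import LigerGroupNorm  # name/path assumed

batch, channels, hidden, group_size = 128, 64, 512, 4
x = torch.randn(batch, channels, hidden, device="cuda", requires_grad=True)

torch_gn = torch.nn.GroupNorm(channels // group_size, channels).cuda()
liger_gn = LigerGroupNorm(num_channels=channels, num_groups=channels // group_size).cuda()  # signature assumed

def bench_ms(module, iters=100):
    # Time forward + backward with CUDA events.
    torch.cuda.synchronize()
    start, end = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        module(x).sum().backward()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

print(f"torch GroupNorm: {bench_ms(torch_gn):.3f} ms")
print(f"Liger GroupNorm: {bench_ms(liger_gn):.3f} ms")
```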

  • Hardware Type: A100 (PCIe and SXM4)
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

@pramodith marked this pull request as ready for review November 5, 2024 19:44
c2 += tl.sum(wdy)

# Need to ensure additions to the same channel are atomic
tl.atomic_add(DW_ptr + channel_idx, dW.to(dtype))
A collaborator commented:

@ByronHsu is it possible for us to test on multiple GPUs, specifically around

scope (str, optional) – Defines the scope of threads that observe the synchronizing effect of the atomic operation. Acceptable values are “gpu” (default), “cta” (cooperative thread array, thread block), or “sys” (stands for “SYSTEM”). The default value is “gpu”.

whether the default value works for multi-GPU.
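For illustration only (not from this PR's kernel), here is a tiny standalone Triton kernel exercising the scope argument quoted above; it assumes a Triton version whose atomic ops accept the scope keyword:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def _atomic_sum_kernel(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program reduces its block locally, then atomically folds the partial
    # sum into a single accumulator shared by all blocks.
    pid = tl.program_id(0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask, other=0.0)
    partial = tl.sum(x, axis=0)
    # scope="gpu" (the default) makes the update visible to every block on this
    # device, which is all a single-device launch needs; "sys" widens visibility
    # to other devices/the host.
    tl.atomic_add(out_ptr, partial, scope="gpu")

x = torch.randn(4096, device="cuda")
out = torch.zeros(1, device="cuda")
grid = (triton.cdiv(x.numel(), 1024),)
_atomic_sum_kernel[grid](x, out, x.numel(), BLOCK_SIZE=1024)
print(out.item(), x.sum().item())  # should agree up to float32 rounding
```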

A collaborator replied:

What kind of testing? Running in a 4-GPU environment to ensure the kernel works fine on a single GPU? I'm not sure how this relates to multi-GPU; my understanding is that the kernel only runs on one GPU.

@lancerts (Collaborator) commented Nov 7, 2024

Very solid PR!

lancerts previously approved these changes Nov 7, 2024
@lancerts requested a review from ByronHsu November 7, 2024 19:59
@ByronHsu merged commit a954b73 into linkedin:main Nov 7, 2024
2 checks passed
@ByronHsu (Collaborator) commented Nov 7, 2024

@pramodith can you update the readme to include groupnorm

@pramodith (Collaborator, Author) replied:

> @pramodith can you update the readme to include groupnorm

Will do tomorrow!

@ByronHsu mentioned this pull request Nov 8, 2024