
Dynamic scaling triton kernel #28

Merged · 4 commits from dynamic-scaling into main on Apr 12, 2024
Conversation

Owner

@drisspg drisspg commented Apr 12, 2024

Summary

Adds a dynamic-scaling kernel that performs the amax reduction and the fp8 cast in a single pass, using a spinlock for cross-block synchronization. The numerics don't feel quite right yet, though.
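The single-pass pattern described above can be sketched in plain Python threads. This is a hedged illustration, not the PR's actual Triton kernel: each "block" publishes its partial amax, spins until every block has arrived (the spinlock), then scales and saturates its own chunk. All names here are illustrative assumptions.

```python
import math
import threading

FP8_E4M3_MAX = 448.0  # largest finite value of torch.float8_e4m3fn

def dynamic_scale_quantize(x, num_blocks=4):
    """Illustrative single-pass dynamic scaling (assumes a nonzero input)."""
    n = len(x)
    chunk = math.ceil(n / num_blocks)
    state = {"amax": 0.0, "done": 0}
    lock = threading.Lock()
    out = [0.0] * n

    def block(i):
        lo, hi = i * chunk, min((i + 1) * chunk, n)
        local_amax = max((abs(v) for v in x[lo:hi]), default=0.0)
        # Phase 1: publish this block's partial amax and count arrivals.
        with lock:
            state["amax"] = max(state["amax"], local_amax)
            state["done"] += 1
        # Phase 2: spin until every block has contributed (the "spinlock").
        while True:
            with lock:
                if state["done"] == num_blocks:
                    break
        scale = FP8_E4M3_MAX / state["amax"]
        for j in range(lo, hi):
            # Scale and saturate; a real kernel would also round to fp8 bits.
            out[j] = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x[j] * scale))

    threads = [threading.Thread(target=block, args=(i,)) for i in range(num_blocks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out, FP8_E4M3_MAX / state["amax"]

# The element with the largest magnitude maps to +/-FP8_E4M3_MAX after scaling.
vals, scale = dynamic_scale_quantize([1.0, -2.0, 0.5, 4.0, -0.25, 3.0, 0.0, 1.5])
```

In the real Triton version the two phases would presumably use atomics (e.g. an atomic max for the partial reduction and an atomic counter for the arrival barrier) rather than a Python lock; the fused single-kernel structure is what avoids a second pass over the tensor.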

100%|…| 8/8 [00:20<00:00, 2.57s/it]

   numel  high_precision_dtype    low_precision_dtype      triton_time    pytorch_time    compiled_pytorch_time
--------  ----------------------  ---------------------  -------------  --------------  -----------------------
 2097152  torch.float32           torch.float8_e4m3fn          65.2814         60.3875                  86.7003
 2097152  torch.float32           torch.float8_e5m2            65.1289         60.8456                  85.949
 4194304  torch.float32           torch.float8_e4m3fn          64.6113         80.8327                  92.3341
 4194304  torch.float32           torch.float8_e5m2            66.3706         81.2847                  92.8501
 8388608  torch.float32           torch.float8_e4m3fn          66.6676        139.581                   45.354
 8388608  torch.float32           torch.float8_e5m2            64.9911        139.302                   41.2224
16777216  torch.float32           torch.float8_e4m3fn          64.3637        298.605                   87.7525
16777216  torch.float32           torch.float8_e5m2            64.8017        298.626                   80.7421
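A back-of-envelope read of the table: the Triton kernel's time stays roughly flat (~64–66) across sizes while eager PyTorch grows linearly with numel, so the gap widens at 16M elements. Assuming the times are in microseconds and the kernel reads fp32 (4 bytes/element) and writes fp8 (1 byte/element), the effective bandwidth can be estimated as follows; the unit assumption is mine, not stated in the PR.

```python
# Hedged back-of-envelope: convert benchmark rows into effective memory
# bandwidth. Assumes times are microseconds; reads fp32 (4 B/elem) and
# writes fp8 (1 B/elem). Input numbers are taken from the table above.

def effective_gbps(numel, time_us, read_bytes=4, write_bytes=1):
    total_bytes = numel * (read_bytes + write_bytes)
    return total_bytes / (time_us * 1e-6) / 1e9

# Largest size from the table (16777216 elements, float8_e4m3fn row).
triton_bw = effective_gbps(16_777_216, 64.3637)
pytorch_bw = effective_gbps(16_777_216, 298.605)
```

Under that assumption the Triton kernel sustains on the order of 1.3 TB/s at the largest size, versus roughly 280 GB/s for eager PyTorch.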

drisspg added 3 commits April 12, 2024 15:12
@drisspg drisspg merged commit 0a2d2ae into main Apr 12, 2024
2 checks passed
@drisspg drisspg deleted the dynamic-scaling branch April 12, 2024 23:01