Pull requests: NVIDIA/TransformerEngine

Pull requests list

[PyTorch] Skip t3hd/th3d for MQA/GQA tests
#1293 opened Oct 28, 2024 by cyanguwa
8 of 13 tasks
expose cp params to jax DPA api
#1292 opened Oct 27, 2024 by kocchop
6 of 13 tasks
[TE/JAX] XLA FFI calls for layer norm and RMS norm
#1290 opened Oct 26, 2024 by huanghua1994
6 of 13 tasks
[TE/JAX] Custom call with FFI - lowering all attributes with bind all
#1289 opened Oct 25, 2024 by phu0ngng
6 of 13 tasks
Add check for GPU availability in attention
#1287 opened Oct 24, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Fix get_swa_mask() for padding masks
#1281 opened Oct 21, 2024 by cyanguwa
6 of 13 tasks
[PyTorch] MultiheadAttention: Pass cu_seqlens to apply_rotary_pos_emb
#1279 opened Oct 21, 2024 by Marks101
1 of 13 tasks
[PyTorch] Fix autocast deprecation warnings
#1277 opened Oct 21, 2024 by yaox12
13 tasks
attention_mask fill with -inf for UnfusedDotProductAttention
#1268 opened Oct 18, 2024 by Agoniii
1 of 13 tasks
Draft: reduce cudagraph mem via preallocations
#1253 opened Oct 15, 2024 by JimmyZhang12
13 tasks
fused out correction in CP
#1248 opened Oct 14, 2024 by xiaoyao0115
12 tasks
Save CUDA Graph memory by reusing input and output tensors
#1234 opened Oct 9, 2024 by buptzyb
5 of 13 tasks
Support CUDA Graph for MoE models
#1233 opened Oct 9, 2024 by buptzyb
6 of 13 tasks
[PyTorch] Improve CP P2P efficiency
#1208 opened Sep 26, 2024 by yenchenlin
1 of 6 tasks
Draft: Use fused push_send_recv kernel for TP AG and RS overlaps
#1200 opened Sep 24, 2024 by erhoo82
13 tasks
[PyTorch] Fused dbias-cast-transpose in bias operation
#1168 opened Sep 6, 2024 by timmoon10
7 of 13 tasks
Fix autocast deprecation warning.
#1167 opened Sep 6, 2024 by jondeaton
[PyTorch] Activation operations
#1164 opened Sep 6, 2024 by timmoon10
6 of 13 tasks
[PyTorch] Avoid saving fp8_tensors in certain scenarios
#1143 opened Aug 28, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Userbuffers support in operation-based API
#1142 opened Aug 27, 2024 by timmoon10
7 of 13 tasks
Norms Refactor
#1140 opened Aug 27, 2024 by phu0ngng Draft
5 of 13 tasks