Fix PyTorch-1.12.1-foss-2022a (CUDA) on POWER
Showing 3 changed files with 79 additions and 0 deletions.
easybuild/easyconfigs/p/PyTorch/PyTorch-1.11.0_fix-fp16-quantization-without-fbgemm.patch (25 additions, 0 deletions)
Fix use-after-free leading to random failures in nn/test_embedding
on e.g. POWER platforms where FBGEMM isn't used

From https://github.com/pytorch/pytorch/pull/84750

Author: Alexander Grund (TU Dresden)

diff --git a/aten/src/ATen/native/quantized/cpu/qembeddingbag_prepack.cpp b/aten/src/ATen/native/quantized/cpu/qembeddingbag_prepack.cpp
index 224a66f8abf..f4d018007bf 100644
--- a/aten/src/ATen/native/quantized/cpu/qembeddingbag_prepack.cpp
+++ b/aten/src/ATen/native/quantized/cpu/qembeddingbag_prepack.cpp
@@ -252,9 +252,10 @@ Tensor& qembeddingbag_byte_prepack_out(Tensor& output, const Tensor& weight) {
   }
 
 #else
-  const auto weight_data = weight_contig->scalar_type() == at::ScalarType::Half
-      ? weight_contig->to(at::ScalarType::Float).data_ptr<float>()
-      : weight_contig->data_ptr<float>();
+  const Tensor& float_weight = weight_contig->scalar_type() == at::ScalarType::Half
+      ? weight_contig->to(at::ScalarType::Float)
+      : *weight_contig;
+  const auto weight_data = float_weight.data_ptr<float>();
   constexpr float kEpsilon = 1e-8f;
   for (auto row : c10::irange(embedding_rows)) {
     const float* input_row = weight_data + row * embedding_cols;
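The pre-patch code is a classic C++ temporary-lifetime bug: weight_contig->to(at::ScalarType::Float) materializes a new temporary Tensor, and calling .data_ptr<float>() on it returns a pointer into storage that is freed once the temporary is destroyed at the end of the full expression. On the non-FBGEMM (#else) path taken on POWER, weight_data could therefore point at freed memory, which is why the failures were random rather than deterministic. The fix names the (possibly converted) tensor first so its storage stays alive for every later use of the raw pointer. Below is a minimal self-contained sketch of the same pattern; Buffer and make_float_copy are hypothetical stand-ins for Tensor and Tensor::to, not ATen code.

#include <cstdio>
#include <vector>

// Hypothetical stand-in for at::Tensor: owns some float storage.
struct Buffer {
  std::vector<float> storage;
  const float* data() const { return storage.data(); }  // like data_ptr<float>()
};

// Hypothetical stand-in for Tensor::to(): returns a NEW buffer by value,
// so at the call site the result is a temporary.
Buffer make_float_copy(const Buffer& src) { return Buffer{src.storage}; }

int main() {
  const Buffer half_weights{{1.0f, 2.0f, 3.0f}};

  // BUG (pre-patch shape): the temporary returned by make_float_copy() is
  // destroyed at the end of this statement, so `dangling` points at freed
  // storage; any later read is undefined behavior and fails "randomly".
  const float* dangling = make_float_copy(half_weights).data();
  (void)dangling;  // do NOT dereference

  // FIX (post-patch shape): give the converted buffer a name first, as the
  // patch does with `const Tensor& float_weight = ...`, so the storage
  // outlives every use of the raw pointer.
  const Buffer float_weight = make_float_copy(half_weights);
  const float* weight_data = float_weight.data();
  std::printf("%.1f\n", static_cast<double>(weight_data[0]));
  return 0;
}

In the actual fix, binding the ternary's result to const Tensor& float_weight is safe because C++ extends the lifetime of a temporary bound to a const reference, and a Tensor is a cheap reference-counted handle in either branch.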
easybuild/easyconfigs/p/PyTorch/PyTorch-1.12.0_fix-EmbeddingBag-without-fbgemm.patch (48 additions, 0 deletions)
There is a bug in the fallback path for the case where FBGEMM isn't available (e.g. on POWER)
which leads to a race condition:
Data is "copied" for the full buffer while it is processed in chunks by different threads.
This a) duplicates the work and b) might write incomplete/wrong data to the output.

Found in failing test_embedding_bag_half_cpu_* of nn/test_embedding:
ERROR: test_embedding_bag_half_cpu_int32_int32 (__main__.TestEmbeddingNNDeviceTypeCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/dev/shm/s3248973-EasyBuild/PyTorch/1.13.1/foss-2022a/pytorch-v1.13.1/test/nn/test_embedding.py", line 936, in _test_EmbeddingBag_vs_Embedding
    self.assertEqual(output, ref_output, atol=dtype2prec_DONTUSE[wdtype], rtol=0)
  File "/tmp/eb-tmp-2022a/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2470, in assertEqual
    assert_equal(
  File "/tmp/eb-tmp-2022a/lib/python3.10/site-packages/torch/testing/_comparison.py", line 1093, in assert_equal
    raise error_metas[0].to_error(msg)
AssertionError: Tensor-likes are not close!

Mismatched elements: 1 / 4 (25.0%)
Greatest absolute difference: 1.18359375 at index (1, 1) (up to 0.01 allowed)
Greatest relative difference: 1.0 at index (1, 1) (up to 0 allowed)

Introduced by https://github.com/pytorch/pytorch/pull/74844

Author: Alexander Grund (TU Dresden)

diff --git a/aten/src/ATen/native/EmbeddingBag.cpp b/aten/src/ATen/native/EmbeddingBag.cpp
index 6d8cea26f52..604ea16bace 100644
--- a/aten/src/ATen/native/EmbeddingBag.cpp
+++ b/aten/src/ATen/native/EmbeddingBag.cpp
@@ -246,7 +246,7 @@ index_select_add(const Tensor &select_indices,
         /*scale_bias=*/nullptr,
         /*normalize_by_lengths=*/false,
         /*out=*/output_data_fp32 + start_idx * ddim);
-    for (const auto i : c10::irange(output_size)) {
+    for (const auto i : c10::irange(start_idx, end_idx)) {
       // Convert FP32 intermediate buffer result back to FP16 for output dtype
       for (const auto d : c10::irange(ddim)) {
         (output_data + i * ddim)[d] = static_cast<at::Half>((output_data_fp32 + ddim * i)[d]);
@@ -590,7 +590,7 @@ index_select_scale_add(const Tensor &select_indices,
         /*scale_bias=*/nullptr,
         /*normalize_by_lengths=*/false,
         /*out=*/output_data_fp32 + start_idx * ddim);
-    for (const auto i : c10::irange(output_size)) {
+    for (const auto i : c10::irange(start_idx, end_idx)) {
       // Convert FP32 intermediate buffer result back to FP16 for output dtype
       for (const auto d : c10::irange(ddim)) {
         (output_data + i * ddim)[d] = static_cast<at::Half>((output_data_fp32 + ddim * i)[d]);
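For context, both index_select_add and index_select_scale_add run this block inside a parallel region in which each thread is handed one chunk [start_idx, end_idx) of the output rows. The half-precision fallback accumulates into an FP32 scratch buffer and then narrows back to FP16; before the fix, the narrowing loop ran over all output_size rows in every thread, so each thread also re-read rows that other threads were still writing (the race), and every row was converted once per thread (the duplicated work). Below is a minimal self-contained sketch of the chunked pattern with the corrected loop bounds; plain std::thread workers and names like scratch_fp32 are illustrative stand-ins for ATen's parallel loop, not the real code.

#include <cstddef>
#include <thread>
#include <vector>

int main() {
  // Illustrative sizes: 8 output rows of width 4, processed by 2 workers.
  const std::size_t output_size = 8, ddim = 4, num_threads = 2;
  std::vector<float> scratch_fp32(output_size * ddim);  // FP32 intermediate buffer
  std::vector<float> output(output_size * ddim);        // stands in for the FP16 output

  std::vector<std::thread> workers;
  const std::size_t chunk = output_size / num_threads;
  for (std::size_t t = 0; t < num_threads; ++t) {
    const std::size_t start_idx = t * chunk;
    const std::size_t end_idx = (t + 1 == num_threads) ? output_size : start_idx + chunk;
    workers.emplace_back([&, start_idx, end_idx] {
      // Each worker produces only its own rows of the FP32 scratch buffer.
      for (std::size_t i = start_idx; i < end_idx; ++i)
        for (std::size_t d = 0; d < ddim; ++d)
          scratch_fp32[i * ddim + d] = static_cast<float>(i + d);
      // Narrowing pass, post-fix: only this worker's rows, i.e.
      // c10::irange(start_idx, end_idx). The buggy version looped over
      // c10::irange(output_size), touching rows owned by other workers.
      for (std::size_t i = start_idx; i < end_idx; ++i)
        for (std::size_t d = 0; d < ddim; ++d)
          output[i * ddim + d] = scratch_fp32[i * ddim + d];
    });
  }
  for (auto& w : workers) w.join();
  return 0;
}

With the loop bounded by [start_idx, end_idx), each output row is written by exactly one thread, which removes both the race and the redundant conversions.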