Looking at the results of the 2.6.0 vs 2.5.1 run: https://github.com/pytorch/benchmark/actions/runs/12878326305/job/35904096937
Benchmark,pytorch-2.5.1-cuda-12.4,pytorch-2.6.0-cuda-12.4
mnist-cpu_memory,1118.67,1146.76
mnist-gpu_memory,0.0,0.0
mnist-latency,42.46,40.00
mnist_hogwild-cpu_memory,556.57,601.289
mnist_hogwild-gpu_memory,0.0,0.0
mnist_hogwild-latency,671.28,586.02
wlm_cpu_lstm-cpu_memory,885.141,907.066
wlm_cpu_lstm-gpu_memory,0.0,0.0
wlm_cpu_lstm-latency,1266.83,1079.37
wlm_cpu_trans-cpu_memory,852.113,899.531
wlm_cpu_trans-gpu_memory,0.0,0.0
wlm_cpu_trans-latency,1081.98,1078.99
wlm_gpu_lstm-cpu_memory,995.402,954.391
wlm_gpu_lstm-gpu_memory,0.0,0.0
wlm_gpu_lstm-latency,54.78,52.76
wlm_gpu_trans-cpu_memory,1007.86,993.949
wlm_gpu_trans-gpu_memory,0.0,0.0
wlm_gpu_trans-latency,56.41,55.54
Run 2.4.1 vs 2.5.0 (mnist_hogwild only): https://github.com/pytorch/benchmark/actions/runs/12895573722
Benchmark,pytorch-2.4.1-cuda-12.4,pytorch-2.5.0-cuda-12.4
mnist_hogwild-cpu_memory,561.797,556.758
mnist_hogwild-gpu_memory,0.0,0.0
mnist_hogwild-latency,613.91,610.53
Run 2.5.1 vs 2.6.0 (mnist_hogwild only): https://github.com/pytorch/benchmark/actions/runs/12894636482
Benchmark,pytorch-2.5.1-cuda-12.4,pytorch-2.6.0-cuda-12.4
mnist_hogwild-cpu_memory,561.73,579.324
mnist_hogwild-gpu_memory,0.0,0.0
mnist_hogwild-latency,592.67,599.23
Comparing the mnist_hogwild-latency numbers with a run on an A100 hosted on GCP, I see a ~10x difference:
Run 2.4.1 vs 2.5.0:
mnist_hogwild-latency,61.42,62.19
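For reference, a minimal sketch of how the ~10x ratio above (613.91 / 61.42 ≈ 10) can be computed from two such result CSVs. The file names and the three-column layout are assumptions based on the snippets in this issue, not part of the benchmark harness:

```python
import csv

def load_results(path):
    """Read a benchmark CSV (name, old_version, new_version) into a dict of float pairs."""
    results = {}
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip header, e.g. Benchmark,pytorch-2.4.1-cuda-12.4,pytorch-2.5.0-cuda-12.4
        for row in reader:
            if len(row) >= 3:
                results[row[0].strip()] = (float(row[1]), float(row[2]))
    return results

# Hypothetical file names for the two runners being compared.
aws = load_results("aws_a100_results.csv")
gcp = load_results("gcp_a100_results.csv")

for name in sorted(aws.keys() & gcp.keys()):
    aws_val, _ = aws[name]
    gcp_val, _ = gcp[name]
    print(f"{name}: AWS {aws_val:.2f} vs GCP {gcp_val:.2f} -> {aws_val / gcp_val:.1f}x")
```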
atalman changed the title from "[release-test] A100 3803mnist_hogwild-latency increase 10x" to "[release-test] A100 3803mnist_hogwild-latency increase 10x on linux.aws.a100 vs [a100-runner]" on Jan 21, 2025
Yes, those AWS instances are much weaker in terms of CPU than the GCP A100 machine; they only have 10 or 11 cores available.
Maybe we should consider moving these simple CPU-bound benchmarks to cheaper instances, given the very limited pool we have at hand and the constrained budget for A100 instances.
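As a quick sanity check, something like the snippet below, run inside the benchmark job, would show how many CPU cores the runner actually exposes to the process. This is a hypothetical diagnostic, not part of the existing workflow:

```python
import os

import torch

# Logical CPUs on the machine vs. CPUs the process is actually allowed to use
# (containers/cgroups on CI runners often restrict the latter).
total = os.cpu_count()
usable = len(os.sched_getaffinity(0)) if hasattr(os, "sched_getaffinity") else total

print(f"os.cpu_count():          {total}")
print(f"usable cores (affinity): {usable}")
print(f"torch intra-op threads:  {torch.get_num_threads()}")
```

If the usable core count really is 10-11 on linux.aws.a100 versus a much larger count on the GCP a100-runner, that alone would explain a ~10x gap on a CPU-bound, multi-process benchmark like mnist_hogwild.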