Benchmarks
Hüseyin Tuğrul BÜYÜKIŞIK edited this page May 16, 2017 · 15 revisions
Generally, adding just one CPU core to a GPU doesn't add performance; it decreases performance if the data doesn't fit in cache and both devices share the same memory. I wish I had a VEGA or VOLTA GPU to compare, but these are all I've got.
Celeron N3060 with 2 cores @1.6GHz and 1 iGPU(HD-400) with 12 cores @600MHz (laptop on battery):
| Benchmark | CPU-C# | CPU-1core-OpenCL | iGPU OpenCL | CPU+iGPU OpenCL |
|---|---|---|---|---|
| c=sqrt(a)+pow(b,0.5) for 1M elements, 100 repeats | 18118ms | 22409ms | 3211ms | 4450ms |
| c=exp(sin(a)+cos(b)) | 3523ms | 1196ms | 174ms | 292ms |
FX8150 with R7-240(320 cores):
| Benchmark | CPU-C# | CPU-7core-OpenCL | GPU OpenCL | CPU+GPU OpenCL |
|---|---|---|---|---|
| c=a+b for 4M elements | 39.9ms | 4.7ms | 12.4ms | 4.5ms |
| image resize from 2592x1944 to 512x512 | 23.5ms (System.DrawImage) | 6.4ms | 6.3ms | 5.7ms |
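
To make the tables easier to compare, the timings can be turned into speedups relative to the plain C# column. The sketch below only does that arithmetic on the numbers listed above; the dictionary keys and the `speedups` helper are shorthand names made up for this example and are not part of any library.

```python
# Timings in milliseconds, copied from the two tables above.
# Keys are illustrative shorthand for the table columns.
timings_ms = {
    "N3060: c=sqrt(a)+pow(b,0.5)": {
        "CPU-C#": 18118, "CPU-OpenCL": 22409, "GPU": 3211, "CPU+GPU": 4450},
    "N3060: c=exp(sin(a)+cos(b))": {
        "CPU-C#": 3523, "CPU-OpenCL": 1196, "GPU": 174, "CPU+GPU": 292},
    "FX8150+R7-240: c=a+b": {
        "CPU-C#": 39.9, "CPU-OpenCL": 4.7, "GPU": 12.4, "CPU+GPU": 4.5},
    "FX8150+R7-240: image resize": {
        "CPU-C#": 23.5, "CPU-OpenCL": 6.4, "GPU": 6.3, "CPU+GPU": 5.7},
}


def speedups(row, baseline="CPU-C#"):
    """Return each column's speedup relative to the baseline column."""
    return {name: row[baseline] / t for name, t in row.items()}


for benchmark, row in timings_ms.items():
    ratios = {name: round(r, 2) for name, r in speedups(row).items()}
    print(benchmark, ratios)
```

On both Celeron rows the combined CPU+iGPU run comes out slower than the iGPU alone (4450ms vs 3211ms, and 292ms vs 174ms), while on the FX8150 the combined run is the fastest column in both rows.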
- Pipelining increases performance when compute latency is comparable to buffer access/copy latency, and only if there are enough CPU threads to schedule all the work fast enough. Otherwise it decreases performance, as in the "low-end laptop on battery" example.
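
The pipelining trade-off can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions, not how any real scheduler works: the work is split into equal chunks, copy and compute form a two-stage pipeline, and `sched_ms` is a hypothetical per-chunk scheduling cost paid by the CPU. All numbers in the example are invented for illustration.

```python
def serial_time(n_chunks, copy_ms, compute_ms):
    """No overlap: every chunk is copied, then computed."""
    return n_chunks * (copy_ms + compute_ms)


def pipelined_time(n_chunks, copy_ms, compute_ms, sched_ms=0.0):
    """Two-stage pipeline: copy of chunk i+1 overlaps compute of chunk i.

    The first copy and the last compute cannot be hidden; in between,
    each step takes as long as the slower stage. sched_ms is an assumed
    per-chunk scheduling overhead (large when CPU threads are scarce).
    """
    overlapped = (n_chunks - 1) * max(copy_ms, compute_ms)
    return copy_ms + compute_ms + overlapped + n_chunks * sched_ms


# Comparable copy and compute latency: overlap hides almost half the time.
print(serial_time(8, 1.0, 1.0), pipelined_time(8, 1.0, 1.0))

# Compute dominates: there is little copy time left to hide.
print(serial_time(8, 1.0, 10.0), pipelined_time(8, 1.0, 10.0))

# Slow scheduling (too few or too slow CPU threads): pipelining loses.
print(serial_time(8, 1.0, 1.0), pipelined_time(8, 1.0, 1.0, sched_ms=2.0))
```

In this model the gain is largest when the two latencies are equal, shrinks as one stage dominates, and turns into a loss once the per-chunk scheduling cost exceeds what the overlap saves.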