[QST] There are two questions about TPCxBB Like query results in README.md #5374

Answered by abellina
YeahNew asked this question in General

  1. Do your results exclude IO and network bottlenecks and simply compare the computational efficiency of CPU and GPU modes?

Our results in the README include IO and network bottlenecks; they are wall-clock times. The cluster is 2 DGX-2 machines, so the GPUs have very fast NVLink, and each GPU shares a 100 Gb/s RoCE connection with 1 other GPU. Additionally, the files are stored locally as Parquet on the DGX-2 RAID. The shuffle was done using UCX in this scenario, so we used both NVLink and RoCE to send the shuffle data. To some degree, then, we are minimizing the IO/network bottleneck and trying to feed the GPUs as much as we can. We ran at 10TB and had generously sized partitions (we …
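To put the network figure in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the two numbers stated in the answer (a 100 Gb/s link shared by 2 GPUs); the helper name `per_gpu_bandwidth_gbytes` is made up for illustration, and the calculation assumes both GPUs saturate the link at the same time and ignores protocol overhead and NVLink traffic.

```python
def per_gpu_bandwidth_gbytes(link_gbits_per_s: float, gpus_per_link: int) -> float:
    """Worst-case per-GPU share of one network link, in gigabytes per second.

    Hypothetical helper for illustration; not part of any RAPIDS or Spark API.
    """
    if gpus_per_link < 1:
        raise ValueError("gpus_per_link must be at least 1")
    link_gbytes_per_s = link_gbits_per_s / 8  # 8 bits per byte
    return link_gbytes_per_s / gpus_per_link


# Figures from the answer: one 100 Gb/s RoCE link shared by 2 GPUs.
share = per_gpu_bandwidth_gbytes(100, 2)
print(f"{share:.2f} GB/s per GPU when both GPUs saturate the link")
```

When only one of the two GPUs is sending, it could in principle use the whole link, so this is a lower bound on the per-GPU share rather than a measured number.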

Answer selected by sameerz

This discussion was converted from issue #1212 on April 28, 2022 22:52.