-
Notifications
You must be signed in to change notification settings - Fork 928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add num_iterations
axis to the multi-threaded Parquet benchmarks
#17231
Conversation
@@ -109,12 +111,15 @@ void BM_parquet_multithreaded_read_common(nvbench::state& state, | |||
|
|||
nvtxRangePushA(("(read) " + label).c_str()); | |||
state.exec(nvbench::exec_tag::sync | nvbench::exec_tag::timer, | |||
[&](nvbench::launch& launch, auto& timer) { | |||
[&, num_files = num_files](nvbench::launch& launch, auto& timer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated: we used to capture a structured binding variable in lambdas, which is not supported in C++17.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I wasn't aware of that!
@@ -267,6 +278,7 @@ NVBENCH_BENCH(BM_parquet_multithreaded_read_mixed) | |||
.add_int64_axis("cardinality", {1000}) | |||
.add_int64_axis("total_data_size", {512 * 1024 * 1024, 1024 * 1024 * 1024}) | |||
.add_int64_axis("num_threads", {1, 2, 4, 8}) | |||
.add_int64_axis("num_iterations", {1}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What values do we want to use here? Are the results interesting in comparing 1 to e.g. 8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing very interesting results on my system (checked 1, 2, 4 and 8), but that might be because of low H2D/D2H transfer rates on it.
I added the axis so we can easily change the number of iterations when we're looking into pipelining. So, for now, I think we should stick to a single value to keep the number of benchmarks from growing.
I'm fine with defaulting to a larger value. Thoughts? CC @GregoryKimball
/merge |
Description
Added an axis that controls the number of times each thread reads its input. Running with a higher number of iterations should better show how work from different threads pipelines.
The new axis, "num_iterations", is added to all multi-threaded Parquet reader benchmarks.
Checklist