Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add google benchmark for NormalEstimation #4506

Merged
merged 17 commits into from
Jun 28, 2021

Conversation

mvieth
Copy link
Member

@mvieth mvieth commented Nov 11, 2020

So far, this is in a very early state. I am not sure when this will be review-ready (maybe never). This is mainly to start a discussion.

@mvieth mvieth force-pushed the benchmarks branch 2 times, most recently from e334f44 to 9125824 Compare November 11, 2020 12:08
@mvieth mvieth mentioned this pull request Nov 11, 2020
@mvieth mvieth force-pushed the benchmarks branch 3 times, most recently from 7bb72b4 to 310ba9b Compare November 11, 2020 16:23
@kunaltyagi
Copy link
Member

For ubuntu, you'll have to modify the docker image first

benchmarks/example.cpp Outdated Show resolved Hide resolved
@mvieth mvieth force-pushed the benchmarks branch 2 times, most recently from 9c8c095 to 88131d4 Compare November 13, 2020 14:02
kunaltyagi
kunaltyagi previously approved these changes Nov 16, 2020
Copy link
Member

@kunaltyagi kunaltyagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (IMO it's a good base for future PRs to improve upon)
Nice work @mvieth

@SunBlack could you PTAL at the cmake changes

pcl::PointCloud<pcl::Normal>::Ptr cloud_normals (new pcl::PointCloud<pcl::Normal>);
for (auto _ : state) {
// This code gets timed
ne.compute (*cloud_normals);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this give satisfactory graphs? Or does the compiler optimize the loop to no-op?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this give satisfactory graphs? Or does the compiler optimize the loop to no-op?

I'm not sure if this is the same, but the method is only run for one iteration on the CI:
https://dev.azure.com/PointCloudLibrary/pcl/_build/results?buildId=17871&view=logs&jobId=cd6332f9-d180-53f5-8d9a-3b5d9c3252a2&j=cd6332f9-d180-53f5-8d9a-3b5d9c3252a2&t=7e929b77-dd35-56c3-fbaf-db0044c91b15

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input file is too large for the benchmark to run it multiple times

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried with the bun0.pcd and here it runs for about 224 times - so it just seems to be saturated when using 2.5 sec for a benchmark, ie using table_scene_mug_stereo_textured.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input file is too large for the benchmark to run it multiple times

Saw this after my own test 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not optimized away, I assume that the for (auto _ : state) has some hidden do-not-optimize-me-magic.
I think it is possible to request it to run more than once, or for a minimum time, if you think once is not enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More runs are better since they remove the noise due to ambiguousness of the new cpu architectures (and client sharing on the cloud )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true. The number of repetitions can be set for each benchmark individually in the code by calling Repetitions, or for all benchmarks in one executable with --benchmark_repetitions=x. I would suggest the latter, since more flexible (adding it to ARGUMENTS of PCL_ADD_BENCHMARK)

benchmarks/CMakeLists.txt Outdated Show resolved Hide resolved
set_target_properties(benchmark_${_name} PROPERTIES RUNTIME_OUTPUT_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})
add_custom_target(run_benchmark_${_name} benchmark_${_name} ${PCL_ADD_BENCHMARK_ARGUMENTS})
add_dependencies(run_benchmarks run_benchmark_${_name})
endmacro()
Copy link
Contributor

@SunBlack SunBlack Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes to the CMake code when I take only a short look at it:

  • Why it is a macro and not a function?
  • IDE_FOLDER is missing (we should create a new group for it)
  • is there a error handling necessary in case the developer is missing a parameter?
  • I dislike add_executable(benchmark_${_name} ${_name}.cpp) as therefore you have for each benchmark a fixed source name. In general I would prefer to have a benchmark per PCL module (e.g. pcl_benchmark_common). In case someone want to run just one test, he can use --benchmark_filter=...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your notes!

  • what advantages would a function have over a macro? I mainly oriented myself at the existing macros, e.g. PCL_ADD_TEST
  • error handling: optional IMO. Then it would make sense to implement it for all macros in pcl_targets.cmake
  • my current plan was to have a cpp file for each module (e.g. features.cpp, common.cpp, filters.cpp, ...), and corresponding executables (benchmark_features, benchmark_common, benchmark_filters, ...). If a module ever has that many benchmarks that it makes sense to split it, we would have to rethink the structure
    Please also take another look at the current macro after I applied @larshg's suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • what advantages would a function have over a macro? I mainly oriented myself at the existing macros, e.g. PCL_ADD_TEST

It is like in C++: Content of a macro will be injected where it is called, during function are creating a new scope. So in general: Use function except there is a reason, that you need really a macro

  • error handling: optional IMO. Then it would make sense to implement it for all macros in pcl_targets.cmake

Indeed it makes sense to introduce it for all. Question here was more: Is everything finde, when LINK_WITH & ARGUMENTS is empty

  • my current plan was to have a cpp file for each module (e.g. features.cpp, common.cpp, filters.cpp, ...), and corresponding executables (benchmark_features, benchmark_common, benchmark_filters, ...). If a module ever has that many benchmarks that it makes sense to split it, we would have to rethink the structure

In general I would prefer to start in a clean way. So your current benchmark would be within benchmarks/features/normal_3d.cpp. Reason: Normally when someone is adding sth. he just add it and don't think: Oh well, no the file is to big, let's restructure the project. So it is better to start clean (and it is better for the Git history ;-) ).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take another look, if I missed any of your suggestions 😃

@mvieth
Copy link
Member Author

mvieth commented Nov 17, 2020

I considered adding UseRealTime() as discussed with @larshg here, but now that I look at the benchmarks I don't really see how they would be measured differently. What's your take on this?

cmake/pcl_targets.cmake Outdated Show resolved Hide resolved
#Only applies to MSVC
if(MSVC)
#Requires CMAKE version 3.13.0
if(CMAKE_VERSION VERSION_LESS "3.13.0" AND (NOT ArgumentWarningShown))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArgumentWarningShown will not work as excepted currently. The variable will be expanded within scope of benchmarks/CMakeLists.txt , so it is only available there (and in sub directories of it). So when calling PCL_ADD_BENCHMARK first wihin a subdirectory and the in the main directory, the variable is unset at first time. So you should use a global variable instead for it (I don't recommend CACHE INTERNAL here, as then the warning will be only shown when the CMakeCache will be created and not anytime configure will be called).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it works if its a macro, so its inserted in benchmarks/CMakeLists.txt in which the parent scope will be our main CMakeLists?
Its at least the same solution I used for the unit tests and I'm pretty sure it worked as expected, but my memory might fail me here.

However, if its changed to a function, we probably should change it to a global variable.

Copy link
Contributor

@SunBlack SunBlack Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it works if its a macro, so its inserted in benchmarks/CMakeLists.txt in which the parent scope will be our main CMakeLists?

No, this our main CMakeLists will not see this variable.

Time for a short explanation.

Example 1

Let's asume we have following CMake files:

benchmark/CMakeList.txt

PCL_ADD_BENCHMARK(features_normal_3d FILES ...)

add_subdirectory(another_dir_with_benachmark)
add_subdirectory(another_dir_with_benachmark2)

benchmark/another_dir_with_benachmark/CMakeList.txt

PCL_ADD_BENCHMARK(another_benchmark FILES ...)

benchmark/another_dir_with_benachmark2/CMakeList.txt

PCL_ADD_BENCHMARK(another_benchmark2 FILES ...)

This works as you want

Example 1 - Explanation

benchmark/CMakeList.txt

# ArgumentWarningShown is not set until now => PCL_ADD_BENCHMARK will show a warning
PCL_ADD_BENCHMARK(features_normal_3d FILES ...)
# ArgumentWarningShown is now TRUE after calling PCL_ADD_BENCHMARK
add_subdirectory(another_dir_with_benachmark)
add_subdirectory(another_dir_with_benachmark2)

benchmark/another_dir_with_benachmark/CMakeList.txt

# ArgumentWarningShown is TRUE like in parent CMakeLists => no warning
PCL_ADD_BENCHMARK(another_benchmark FILES ...)
#

benchmark/another_dir_with_benachmark2/CMakeList.txt

# ArgumentWarningShown is TRUE like in parent CMakeLists => no warning
PCL_ADD_BENCHMARK(another_benchmark2 FILES ...)

Example 2

Now let us a modify the snippet a little bit, by moving first PCL_ADD_BENCHMARK below both calls to add_subdirectory :

benchmark/CMakeList.txt

add_subdirectory(another_dir_with_benachmark)
add_subdirectory(another_dir_with_benachmark2)

PCL_ADD_BENCHMARK(features_normal_3d FILES ...)

All other files are equal

And now you are getting the warning 3 times. Why? See below

Example 2 - Explanation

benchmark/CMakeList.txt

# ArgumentWarningShown is not set until now
add_subdirectory(another_dir_with_benachmark)
add_subdirectory(another_dir_with_benachmark2)

# ArgumentWarningShown is not set until now, as changes from child scope are not injected into parent scope => PCL_ADD_BENCHMARK will show a warning
PCL_ADD_BENCHMARK(features_normal_3d FILES ...)

benchmark/another_dir_with_benachmark/CMakeList.txt

# ArgumentWarningShown is not set until now => PCL_ADD_BENCHMARK will show a warning
PCL_ADD_BENCHMARK(another_benchmark FILES ...)
# ArgumentWarningShown is now TRUE after calling PCL_ADD_BENCHMARK
#

benchmark/another_dir_with_benachmark2/CMakeList.txt

# ArgumentWarningShown is not set until now => PCL_ADD_BENCHMARK will show a warning
PCL_ADD_BENCHMARK(another_benchmark2 FILES ...)
# ArgumentWarningShown is now TRUE after calling PCL_ADD_BENCHMARK

Conclusion

Calling add_subdirectory is like passing all existing variables to a C++ function as copy and not as reference. The caller will not see the changes of the callee.

Nevertheless it is possible in CMake to modify parent scope:

  • via global states
  • using PARENT_SCOPE

PARENT_SCOPE will be usually used in functions, where you pass the variable name and want that the function stores the result into this variable. In other cases I don't recommend using PARENT_SCOPE. We had it in our framework a long time, but it was bug prune, as in case someone is missing one time a call to PARENT_SCOPE, you don't first see why sth. is not working like excepted.

As the variable ArgumentWarningShown is only related to benchmark I would use:

set_target_properties(ArgumentWarningShown run_benchmark PCL_ARGUMENTS_WARNING_SHOWN)
if(NOT ArgumentWarningShown)
  ..
  set_target_properties(run_benchmark PROPERTIES PCL_ARGUMENTS_WARNING_SHOWN TRUE)
endif()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thats true if its organized differently as first intended, it will stop working. Thanks for the thorough explanation.
@mvieth can you reorganize appropriately?
And maybe change the name ArgumentWarningShown to BenchArgumentWarningShown so it doesn't clash with the one in Unit Tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, macros are basically evil in almost all languages. (:crab:)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SunBlack I assume you mean get_target_property in the first line, and run_benchmarks, or did I misunderstand something?

cmake/pcl_targets.cmake Outdated Show resolved Hide resolved
@mvieth mvieth mentioned this pull request Nov 25, 2020
@larshg
Copy link
Contributor

larshg commented Dec 20, 2020

Should this be marked as ready for review soon 😄 ? or maybe just ready for merge 😁

@mvieth mvieth marked this pull request as ready for review March 30, 2021 19:06
@mvieth
Copy link
Member Author

mvieth commented Mar 30, 2021

Ok, it's ready for review, but I think before actually merging it, I would disable the benchmarks on the CI again - until we find a way to fix that, it seems to be difficult to compare two runs because they could have been on different machines or under different CPU loads. I think currently the benchmarks are most useful for developers to run on their own computers.

@kunaltyagi
Copy link
Member

I would disable the benchmarks on the CI again

Seconded

@mvieth mvieth requested review from kunaltyagi and larshg June 28, 2021 12:19
Copy link
Member

@kunaltyagi kunaltyagi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

cmake/pcl_targets.cmake Show resolved Hide resolved
@kunaltyagi kunaltyagi merged commit 2d7ebf1 into PointCloudLibrary:master Jun 28, 2021
@mvieth mvieth deleted the benchmarks branch August 5, 2021 08:39
tin1254 pushed a commit to tin1254/pcl that referenced this pull request Aug 10, 2021
* Add google benchmark for NormalEstimation

* Add benchmarks step to ubuntu and windows ci

* Suggested changes.

* More suggestions.

* Add NormalEstimationOMP

* Make cmake set lowercase

* Change macro to function

* Fix unit test->benchmark

Co-authored-by: Lars Glud <larshg@gmail.com>

* Restructure benchmarks

* Add more benchmark variants for NormalEstimation

* Correct use of ArgumentWarningShown

* Hopefully fix build error on 18.04 GCC

* Improve formatting

* Fix Windows tests/benchmarks

* Do not run benchmarks on CIs

Co-authored-by: Lars Glud <larshg@gmail.com>
mvieth added a commit to mvieth/pcl that referenced this pull request Dec 27, 2021
* Add google benchmark for NormalEstimation

* Add benchmarks step to ubuntu and windows ci

* Suggested changes.

* More suggestions.

* Add NormalEstimationOMP

* Make cmake set lowercase

* Change macro to function

* Fix unit test->benchmark

Co-authored-by: Lars Glud <larshg@gmail.com>

* Restructure benchmarks

* Add more benchmark variants for NormalEstimation

* Correct use of ArgumentWarningShown

* Hopefully fix build error on 18.04 GCC

* Improve formatting

* Fix Windows tests/benchmarks

* Do not run benchmarks on CIs

Co-authored-by: Lars Glud <larshg@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants