Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve VoxelGrid filter runtime by ~20% by using spread sort #3853

Merged

Conversation

facontidavide
Copy link
Contributor

@facontidavide facontidavide commented Apr 3, 2020

VoxelGrid spends a lot of time sorting.

Spread sort implemented in boost seems to be faster than std::sort.

I get an average speed improvement of 20%, but it would be nice if anyone can reproduce this.

VoxelGrid spends a lot of time sorting.
Radix sort implemted in boost seems to be faster than std::sort
@kunaltyagi kunaltyagi added changelog: enhancement Meta-information for changelog generation needs: feedback Specify why not closed/merged yet labels Apr 3, 2020
@kunaltyagi kunaltyagi changed the title Optimize VoxelGrid Optimize VoxelGrid Filter Apr 3, 2020
@Morwenn
Copy link
Contributor

Morwenn commented Apr 3, 2020

As long as you're sorting integers it should indeed always be faster than std::sort. I've got a benchmark where several sorting algorithms are used to sort an std::vector<int> with 10^7 elements:

spread_sorter in the image above is Boost integer_sort and as you can see it's consistently faster than std::sort except for the Alternating pattern.

@facontidavide
Copy link
Contributor Author

@Morwenn . thanks for adding some context.

You may find a reproducible benchmark here:

https://github.com/facontidavide/PCL_VoxelGrid_enhanced

image

@kunaltyagi
Copy link
Member

How about making the sort algorithm a policy or input of the filter? Since other places in PCL could have bad defaults too, and this can be a start in giving the users more freedom.

PCL should still provide good defaults (ala this PR), but we can let the users choose an algorithm that works best for them.

@facontidavide
Copy link
Contributor Author

facontidavide commented Apr 4, 2020

Honestly, if no one cared to optimize this so far, no one will care to customize it in the future.

The general user doesn't need to know that VoxelGrid filter uses a sorting algorithm and even less to know which is the best sorting algorithm for his/her use case.

And if std::sort is always slower, why should I have the "freedom" to select it?

Copy link
Member

@taketwo taketwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from small inline remarks, I'm in favor of merging this.

filters/include/pcl/filters/impl/voxel_grid.hpp Outdated Show resolved Hide resolved
filters/include/pcl/filters/impl/voxel_grid.hpp Outdated Show resolved Hide resolved
filters/include/pcl/filters/impl/voxel_grid.hpp Outdated Show resolved Hide resolved
filters/include/pcl/filters/impl/voxel_grid.hpp Outdated Show resolved Hide resolved
Copy link
Member

@taketwo taketwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@SergioRAgostinho SergioRAgostinho self-requested a review April 6, 2020 07:16
@kunaltyagi kunaltyagi added needs: code review Specify why not closed/merged yet and removed needs: feedback Specify why not closed/merged yet labels Apr 16, 2020
@SergioRAgostinho SergioRAgostinho merged commit 4e2e9af into PointCloudLibrary:master Apr 27, 2020
@SergioRAgostinho SergioRAgostinho removed the needs: code review Specify why not closed/merged yet label Apr 27, 2020
@taketwo taketwo changed the title Optimize VoxelGrid Filter Improve VoxelGrid filter runtime by ~20% by using spread sort May 10, 2020
@awesomebytes
Copy link

@facontidavide I had the chance to test the efficacy of your PR (I did need to backport it to PCL 1.8.1 and build from source) and my testing with real data from Velodyne pointclouds gave 23-24% speed improvement, with 0.8% of the computations taking equal or longer than with the previous implementation (could just be my CPU having hickups). I computed it by doing doing an average over comparing pointclouds processed one-to-one and also over total times running some relatively big custom datasets.

@facontidavide
Copy link
Contributor Author

Good to know the actual numbers 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog: enhancement Meta-information for changelog generation module: filters
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants