Vector3dvector and other vector Eigen bindings speed up #657
This tries to address #403.
Update: squeezed in another optimization (ed122ac) with direct memory mapping, for 30% more speedup. Timings were posted for 2e6 points and for 2e5 points (as used in #403).
Reviewed 2 of 4 files at r1, 2 of 2 files at r2.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @takanokage and @qianyizh)
This is a great PR. It would be good to have a unit test for this as well. As a sanity check, could you also check the other tutorial examples that use VectorXXVectors?
@syncle For vectors and matrices, we have:
- Vector3dVector
- Vector3iVector
- Vector2iVector
- Matrix4dVector
The first three have been optimized. The fourth, Matrix4dVector, could be optimized in the same way; however, it is only used for converting camera parameters, so performance shouldn't be an issue for now.
Improves speed of open3d.Vector3dVector, Vector3iVector, Vector2iVector, and Matrix4dVector by 40-200x, resolving issue #403 and pybind/pybind11#1481.
Special thanks to Wenzel for his feedback. According to Wenzel, the slowness is due to "casting millions of small vectors, which requires a proportional amount of Python API calls".
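To make Wenzel's point concrete, here is a minimal pure-Python sketch of the two conversion strategies. It is only an analogy, not the binding code from this PR: `pack_per_element` and `pack_bulk` are hypothetical names, and the standard-library `array` module stands in for the C++ container. The first function makes a number of calls proportional to the number of coordinates, while the second copies one contiguous block in a single call.

```python
import array


def pack_per_element(points):
    """Convert coordinate by coordinate: 3 * len(points) append calls."""
    buf = array.array("d")
    for point in points:
        for coord in point:
            buf.append(float(coord))
    return buf


def pack_bulk(raw_bytes):
    """Copy an already contiguous block of float64 values in one call."""
    buf = array.array("d")
    buf.frombytes(raw_bytes)
    return buf
```

Both functions produce the same flat array of doubles; the difference is only how many calls are needed to get there, which is what dominates the cost for millions of points.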
Comparison
[Before and after timing screenshots]
Discussions
This solution is not ideal yet:
1) Ideally, we would map vector<Eigen::Vector3d> to one blob of buffer. However, this requires significant rework of the code base.
2) We pay the penalty of accessing each numpy array index individually. We can do more aggressive optimizations (e.g. more direct memory mapping) if we handle double and int types separately (i.e. handle Vector3dVector and Vector3iVector separately) instead of using a generic function, and assert that the incoming array is contiguous. Preliminary tests show about a 20% - 30% speedup.
Edit: 2) is addressed in the improved direct mapping approach.
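The checks described in point 2 can be sketched in plain Python using the buffer protocol. This is an illustration under stated assumptions, not the code in this PR: `copy_vec3_buffer` is a hypothetical helper, a `memoryview` stands in for the incoming numpy array, and `array.array` stands in for the destination container. The element type is checked explicitly ("d" for doubles, "i" for ints), the buffer must be C-contiguous, and the data is then copied as one block rather than index by index.

```python
import array


def copy_vec3_buffer(buf, fmt):
    """Copy an (n, 3) buffer of `fmt` items into a flat array in bulk.

    `fmt` is a struct-style item code: "d" for double, "i" for int.
    """
    view = memoryview(buf)
    # A strided or sliced buffer cannot be copied as one block.
    if not view.c_contiguous:
        raise ValueError("expected a C-contiguous buffer")
    # Each element type is checked explicitly instead of being converted.
    if view.format != fmt:
        raise TypeError(f"expected item format {fmt!r}, got {view.format!r}")
    if view.ndim != 2 or view.shape[1] != 3:
        raise ValueError("expected a buffer of shape (n, 3)")
    out = array.array(fmt)
    out.frombytes(view.tobytes())  # one bulk copy, no per-index access
    return out
```

A caller handling doubles would pass `"d"` and a caller handling ints would pass `"i"`, so a mismatched or non-contiguous input fails early instead of silently taking a slow path.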
Future works
Some of the templated functions can be further merged. Please let me know if you have suggestions.