Test 85 failing under Windows in Debug configuration #2570
Detailed explanation of the issue in PointCloudLibrary#2570 Closes PointCloudLibrary#2570 See also PointCloudLibrary#2399
I'm having trouble connecting all the dots. I need to look at the stack trace to be able to orient myself. Regarding your proposal, it feels to me like we're sweeping the problem under the carpet. I'm not as deep into the problem as you are, but intuitively it feels like it needs to be fixed somewhere else. Inside the
Sorry for the confusion, I updated my initial post and hopefully it is a bit clearer. Please go over it again 😄 To be fair, I'm not sure my fix is correct because I'm unaware of the expected behaviour. I agree that you should also have a look at the stack trace. My intuition is the following:
Just want to quickly report my partial findings. The first occurrence of NaNs in the features happens on line 868 of pcl/test/registration/test_registration.cpp (lines 863 to 868 in 50ca79c).
The first and last points of the input cloud pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp Lines 294 to 299 in 50ca79c
I was able to verify the same with My next steps would be to understand if:
@SergioRAgostinho we are synchronized on the stack trace 👍 I also noticed that the first and last elements are always NaN, but I fear it is something related to the test or to a common use case of the feature (evaluating the feature on a PointCloud without providing indices). If you look at how NaNs are added in pcl/features/include/pcl/features/impl/ppf.hpp (lines 69 to 118 in cd65105), it depends on the content of the indices, which cannot be known a priori. As a result (as I understand it), there could be multiple NaNs in the feature.
Let me reply in order:
I tried skipping NaNs altogether at L298 of pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp (lines 294 to 299 in 50ca79c), but the tests failed because the tolerances of the recognition tests were not respected. My fear is that what is computed in pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp (lines 237 to 243 in cd65105) is then accumulated differently, which alters something the algorithm relies on to work properly. That is the reason why I forced NaNs to be pushed into bin 0: it is what actually happens in the current implementation.
My initial reasoning suggests that there may be other NaN points.
I should read the paper on the feature in question to understand what is going on, so that I can propose a fix with more detailed reasoning 😭 Anyway, I'll try skipping the NaN points again to see whether I did something wrong during my first attempts and let you know. It is important, however, that we agree that there is an issue in the code.
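To check whether NaNs appear only at the first and last positions or elsewhere too, a small diagnostic like the following could be run over the computed features. This is a minimal sketch: `Feature4`, `hasNaN` and `nanPositions` are hypothetical names chosen for illustration, and `Feature4` is a stand-in struct, not the actual PCL point type.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a four-component feature; not the PCL type.
struct Feature4 {
  float f1, f2, f3, f4;
};

// True if any component of the feature is NaN.
inline bool hasNaN(const Feature4& f) {
  return std::isnan(f.f1) || std::isnan(f.f2) ||
         std::isnan(f.f3) || std::isnan(f.f4);
}

// Returns the positions of all features that contain a NaN component.
inline std::vector<std::size_t> nanPositions(const std::vector<Feature4>& cloud) {
  std::vector<std::size_t> positions;
  for (std::size_t i = 0; i < cloud.size(); ++i)
    if (hasNaN(cloud[i]))
      positions.push_back(i);
  return positions;
}
```

If the returned list contains more than the first and last index, the NaNs are not specific to the test setup.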
Marking this as stale due to 30 days of inactivity. It will be closed in 7 days if no further activity occurs.
Casting a floating-point NaN or Inf to an integer is undefined behaviour (UB). That means we have to check the value at the cast location and manually convert it to 0 in order to get consistent behaviour across platforms. That said, @claudiofantacci already mentioned trying to shoehorn NaNs into bin 0.
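A general way to avoid the undefined cast is to test the value before converting it. The sketch below is not PCL code: `toIndex` is a hypothetical helper, it requires C++17 for `std::optional`, and it leaves the choice of fallback (for example 0) to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <optional>

// Hypothetical helper: converts a floating-point value to std::size_t only
// when the result is representable. Returns std::nullopt for NaN, infinities,
// negative values, and values too large for std::size_t.
inline std::optional<std::size_t> toIndex(double value) {
  if (!std::isfinite(value) || value < 0.0)
    return std::nullopt;
  // 2^N, with N the bit width of std::size_t, is exactly representable as a double.
  const double limit = std::ldexp(1.0, std::numeric_limits<std::size_t>::digits);
  if (value >= limit)
    return std::nullopt;
  return static_cast<std::size_t>(value);  // truncates toward zero
}
```

A caller that wants the behaviour discussed here could write `toIndex(x).value_or(0)`.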
This problem was fixed by pull request #4711.
Here is the error:
This error is quite subtle, so please let me know whether my understanding of the error, and of its possible fix, is correct.
The main point of failure is this line

pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp
Line 265 in cd65105

that pushes an "unknown" number into the `access` variable when `feature` is `nan`. `feature` is never checked for `nan`, and this may happen because of this code snippet from the `PPFEstimation` feature used for the test (and maybe in user code too):

pcl/features/include/pcl/features/impl/ppf.hpp
Lines 69 to 118 in cd65105

The `access` variable is then used to evaluate an index of a vector, and since it results in `nan` its use is undefined (or platform dependent, since it will be converted to a `size_t`).

I ran some tests on a few platforms and apparently (take this with care, I'm not 100% sure) accessing a vector with `nan` results in accessing the element at `0` (that is, the cast to `size_t` results in `0`).

It turns out that in `Debug` configuration under Windows, the index evaluated using the `access` variable resulted in a very large number, not `nan`. As a consequence I got the out-of-bounds access fault from which I started the investigation.

Please note that if `access` is an arbitrarily large number (as in `Debug` mode under Windows), the following line

pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp
Line 239 in cd65105

that translates `access` to an index sometimes goes over the maximum value representable by the type and, depending on the system, may result in a `0`, thus masking the unwanted behaviour.

The fix I propose, assuming that the expected behaviour is to have `nan` features mapped to histogram bin `0`, is to change the following lines

pcl/registration/include/pcl/registration/impl/pyramid_feature_matching.hpp
Lines 264 to 265 in cd65105

to force the index to be `0` for `nan`s. By doing this the tests then run properly.

See further comments here #2399
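As an illustration of the idea of mapping `nan` to bin `0`, a guarded bin computation might look roughly like the sketch below. It is not the code from the PCL source or from the eventual fix: `binIndex` and its parameters (`range_min`, `bin_step`, `nr_bins`) are hypothetical names, and clamping out-of-range values to the first or last bin is an extra assumption made here so that infinities cannot reach the cast either.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not PCL's API: maps one feature component to a
// histogram bin index in [0, nr_bins - 1].
inline std::size_t binIndex(float value, float range_min, float bin_step,
                            std::size_t nr_bins) {
  assert(bin_step > 0.0f && nr_bins > 0);
  const float offset = std::floor((value - range_min) / bin_step);
  // NaN goes to bin 0, as the proposal assumes.
  if (std::isnan(offset))
    return 0;
  // Assumption of this sketch: clamp values outside the histogram range,
  // which also keeps -inf and +inf away from the integer cast.
  if (offset <= 0.0f)
    return 0;
  if (offset >= static_cast<float>(nr_bins))
    return nr_bins - 1;
  return static_cast<std::size_t>(offset);  // safe: 0 < offset < nr_bins
}
```

With this guard the cast only ever sees a finite, in-range value, so the result no longer depends on the platform or build configuration.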