Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] fix segfault caused by to few outliers and harden container usage #96

Merged
merged 2 commits into from
May 17, 2018

Conversation

hendrikmuhs
Copy link
Contributor

Do not re-weight outliers if there is just 1, preventing a crash downstream
and harden accumulators to prevent empty containers.

relates to #90
fixes #94

Note: backport to be done together with pending backport of #90

…stream

and harden accumulators to prevent empty containers.

fixes elastic#94
CTools::logisticFunction(static_cast<double>(i) / numberOutliers,
0.1, 1.0)};
CBasicStatistics::count(values[outliers[i].second]) *= weight;
if (numberOutliers > 1.0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes below are just formatings

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, a good idea to make this class more robust! I left a couple of suggestions for slightly rewording the errors. I don't think I need to review again, so going ahead and approving now.


//! Reset the number of statistics to gather to \p n.
void resize(std::size_t n) {
if (n == 0) {
LOG_ERROR(<< "Invalid resize of 0 for statistics accumulator");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this reads slightly better as Invalid resize to 0 for order statistics accumulator.

: TImpl{std::vector<T>(n, T{}), less} {}
: TImpl{std::vector<T>(std::max(n, std::size_t(1)), T{}), less} {
if (n == 0) {
LOG_ERROR(<< "Invalid size of 0 for statistics accumulator");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd make this Invalid size of 0 for order statistics accumulator, since there are other types of statistics accumulators.

@hendrikmuhs hendrikmuhs merged commit 18ebdd6 into elastic:master May 17, 2018
tveasey pushed a commit that referenced this pull request May 22, 2018
#96)

Do not re-weight outliers if there is just 1, preventing a crash downstream
and harden accumulators to prevent empty containers.

fixes #94
@sophiec20 sophiec20 added :ml and removed :ml labels Jun 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] Autodetect fatal error in gallery 302
3 participants