Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: LOF with QuantileFilter raises IndexError #1330

Merged
merged 8 commits into from
Oct 10, 2023

Conversation

MarekWadinger
Copy link
Contributor

This pull request resolves the issue #1329 Error combining LocalOutlierFactor with AnomalyFilter

river/anomaly/lof.py Outdated Show resolved Hide resolved
@MarekWadinger
Copy link
Contributor Author

Hey Max,

thank you, you are absolutely right with the default score of 0. I looked at it from the perspective of two-tailed test but that's not how QuantileFIlter and ThresholdFIlter work here.

I fixed the failing test due to stateless score_one fixture, related to issue #1331. Added some quick tests if some new way of handling the statelessness would be developed and tried to run some tests from other scorers (kept the quick one).
Interestingly, the ROC AUC score for LOF on CreditCard is very low (while being quite sluggish).

PS. Sorry for my clumsy Pull requests, still learning how to contribute properly, though the guidelines here helped a lot.

@MaxHalford
Copy link
Member

Very cool that you added a test, I appreciate it.

You still have a little doctest issue to fix, probably a normal consequence of your change.

PS. Sorry for my clumsy Pull requests, still learning how to contribute properly, though the guidelines here helped a lot.

Don't apologize, you're doing very well ;)

@MarekWadinger
Copy link
Contributor Author

Hmm, there's something weird going on with the failed test. It reports that the expected value is the one prior to this commit f33e3ca. However, there's already a new value.

Am I missing something?

________________ [doctest] river.anomaly.lof.LocalOutlierFactor ________________
[gw1] darwin -- Python 3.11.5 /Users/runner/.venv/bin/python3.11
214     >>> for x, _ in datasets.CreditCard().take(200):
215     ...     lof.learn_one(x)
216 
217     >>> lof.learn_many(cc_df[201:401])
218 
219     >>> scores = []
220     >>> for x in cc_df[0][401:406]:
221     ...     scores.append(lof.score_one(x))
222 
223     >>> [round(score, 3) for score in scores]
Expected:
    [1.802, 1.937, 1.567, 1.181, 1.28]
Got:
    [1.802, 1.936, 1.566, 1.181, 1.272]

/Users/runner/work/river/river/river/anomaly/lof.py:223: DocTestFailure

@smastelini
Copy link
Member

Hi @MarekWadinger. Did you update the docstring test? As you fixed bugs in the code, it expected that the outputs will change. So you will need to update that accordingly :)

@MaxHalford
Copy link
Member

You're right @MarekWadinger, there's a bug in CI whereby the old code is being used to run the tests. I will take a look tonight to fix this. In the meantime, let's merge your PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants