Implementation of the StandardAbsoluteDeviation (SAD) anomaly detection algorithm #1357

Merged
merged 12 commits into from
Nov 2, 2023

hoanganhngo610
Contributor

This PR contains the implementation of the Standard Absolute Deviation (SAD) anomaly detection algorithm.

This implementation is adapted from the one in the PySAD (Python Streaming Anomaly Detection) package, which computes the anomaly score by dividing a point's deviation from the mean/median by the standard deviation of all points seen so far in the data stream. The idea comes from the paper by Hochenbaum et al. (2017).
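To make the scoring rule above concrete, here is a minimal sketch of the arithmetic. It is not the code in this PR or in PySAD: the function name `sad_score` is hypothetical, only the mean variant is shown, and using the population standard deviation and returning 0.0 when the deviation is undefined are assumptions made for illustration.

```python
import statistics


def sad_score(history, x):
    """Score x against previously seen values as |x - mean| / std.

    Hypothetical helper: population std and the 0.0 fallbacks are assumptions.
    """
    if len(history) < 2:
        return 0.0  # not enough points to estimate a spread
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    if std == 0:
        return 0.0  # all past values identical; avoid division by zero
    return abs(x - mean) / std


history = [10.0, 12.0, 11.0, 9.0, 13.0]  # mean 11, population std sqrt(2)
print(sad_score(history, 11.0))  # a point at the mean scores 0.0
print(sad_score(history, 30.0))  # a distant point scores 19 / sqrt(2)
```

A larger score means the point lies more standard deviations away from the centre of the values seen so far.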

Although the algorithm is fairly simple, it gives the anomaly submodule of River a solid competitor for benchmarking work.

There is no unit test for this algorithm, since the results have already been compared locally with the PySAD implementation and were identical. Adding unit tests upstream would require installing another package, which seems unnecessary.

@hoanganhngo610
Contributor Author

hoanganhngo610 commented Oct 31, 2023

@MaxHalford Would you mind having a look through this PR? Two problems are making the tests fail:

  • mypy: I believe these errors do not come from the files that I modified.
  • test_estimators tests: I believe I have added SAD to the list of estimators that are expected to be ignored during testing, but somehow it is still tested. I do not encounter these errors when running pytest locally.

Thank you so much in advance!

__all__ = ["StandardAbsoluteDeviation"]


class StandardAbsoluteDeviation(anomaly.base.AnomalyDetector):
Member

The issue with this method is that it is univariate. You're trying to fit it into the multivariate anomaly detection framework, which is not the right way.

What you should do is take inspiration from anomaly.GaussianScorer. It's also a univariate anomaly detection method.

Let me know if this isn't clear
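As an illustration of the univariate framing suggested above, the sketch below scores a single value `y` and ignores the feature dict `x`. The class name `UnivariateSADScorer` is hypothetical, the `learn_one(x, y)` / `score_one(x, y)` shape is an assumption about what such a scorer could look like rather than River's actual code, and the running mean and variance are maintained with Welford's update.

```python
import math


class UnivariateSADScorer:
    """Hypothetical univariate scorer: |y - mean| / std over the values seen so far."""

    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations (Welford)

    def learn_one(self, x, y):
        # x (the features) is ignored; only the single value y is tracked.
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (y - self.mean)
        return self

    def score_one(self, x, y):
        if self.n < 2:
            return 0.0  # assumption: no score until a spread can be estimated
        std = math.sqrt(self.m2 / self.n)  # population std, an assumption
        if std == 0:
            return 0.0
        return abs(y - self.mean) / std


scorer = UnivariateSADScorer()
for value in [10.0, 12.0, 11.0, 9.0, 13.0]:
    scorer.learn_one(None, value)
print(scorer.score_one(None, 30.0))
```

Keeping only the count, mean, and squared-deviation sum means the scorer uses constant memory regardless of stream length.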

Contributor Author

@MaxHalford I think I understand the idea, and I have implemented it in my upcoming commits. I would appreciate it if you could have a look and check whether the new approach is appropriate!

Member

@MaxHalford MaxHalford left a comment

Cool! Can you add a note to unreleased.md?

@hoanganhngo610
Contributor Author

hoanganhngo610 commented Nov 2, 2023

@MaxHalford Thank you so much for approving the changes! The entry has also been added to the UNRELEASED.md file.

@hoanganhngo610 hoanganhngo610 merged commit bc1b7cb into main Nov 2, 2023
9 of 11 checks passed
@hoanganhngo610 hoanganhngo610 deleted the sad-implementation branch November 2, 2023 08:46
2 participants