55 outlier detection #105

Merged
merged 20 commits into from
Mar 2, 2018

Maximophone
Contributor

No description provided.

Contributor

@ukclivecox ukclivecox left a comment

Can we link to the notebook describing the outlier detector from the top-level README, in a new "components" section?

@@ -25,7 +25,7 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
-"collapsed": true
+"collapsed": false
Contributor

The proto compile/copy step used in the other notebooks seems to be missing here. Running the notebook gives this error:
ImportError Traceback (most recent call last)
in ()
1 import requests
2 from requests.auth import HTTPBasicAuth
----> 3 from proto import prediction_pb2
4 from proto import prediction_pb2_grpc
5 import grpc

ImportError: cannot import name prediction_pb2

Contributor Author

@Maximophone Maximophone Mar 1, 2018

Did you build the protos locally first, using the Makefile in notebooks?

Contributor

I always add to the notebooks:

!cp ../proto/prediction.proto ./proto
!python -m grpc.tools.protoc -I. --python_out=. --grpc_python_out=. ./proto/prediction.proto

so they are self-contained.
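The same two steps can also be scripted from a regular Python cell. This is only a sketch: the helper names `protoc_command` and `ensure_stubs` are hypothetical, and the module name is copied from the command above (recent `grpcio-tools` releases expose the compiler as `grpc_tools.protoc`, so the `module` argument may need changing).

```python
import shutil
import subprocess
import sys
from pathlib import Path


def protoc_command(proto_file="./proto/prediction.proto", out_dir=".",
                   module="grpc.tools.protoc"):
    """Build the argv that compiles a .proto file into Python gRPC stubs.

    `module` defaults to the name used in the notebook command; newer
    grpcio-tools versions use "grpc_tools.protoc" instead.
    """
    return [
        sys.executable, "-m", module,
        f"-I{out_dir}",
        f"--python_out={out_dir}",
        f"--grpc_python_out={out_dir}",
        proto_file,
    ]


def ensure_stubs(src="../proto/prediction.proto", dst_dir="./proto"):
    """Copy the proto next to the notebook, then compile it (hypothetical helper)."""
    Path(dst_dir).mkdir(exist_ok=True)
    shutil.copy(src, dst_dir)
    subprocess.run(protoc_command(), check=True)
```

Calling `ensure_stubs()` at the top of a notebook would fail loudly (via `check=True`) if the compiler is not installed, instead of surfacing later as an `ImportError`.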

Contributor Author

Added

@@ -569,7 +569,7 @@
"* Two models\n",
"\n",
"The outlier detector is a special kind of transformer that will populate a tag in the response metadata with the outlier score it has calculated. \n",
-"We use the docker image seldonio/mock_outlier_detector:1.0 for the outlier detector.\n",
+"We use the docker image seldonio/outlier_mahalanobis:0.2 for the outlier detector.\n",
Contributor

Is it worth adding some explanation of the values returned from this test? Are the outlier scores meant to be useful here?

Contributor Author

No. Since the features sent are meaningless, the scores can't really be interpreted.

Contributor Author

Actually, if you always send the same 2 points, as is the case with the rest_request, you will always see an outlier score of 0.

"source": [
"The output of the algorithm (outlier score) is a measure of distance from the center of the features distribution (Mahalanobis distance). The algorithm is online, which means that it starts without knowledge about the distribution of the features and learns as requests arrive. Consequently you should expect the output to be bad at the start and to improve over time. \n",
"\n",
"The output being a real positive number, we leave it to the user to decide on a threshold for when a point will be consider to be an outlier.\n",
Contributor

typo - considered
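
For readers of this thread, a minimal sketch of the score described in the notebook text above: the Mahalanobis distance of each point from the center of the distribution, followed by a user-chosen threshold. The function name `mahalanobis_scores`, the toy data and the threshold value `3.0` are assumptions for illustration, not taken from the detector's code.

```python
import numpy as np


def mahalanobis_scores(X, mean, cov):
    """Mahalanobis distance of each row of X from `mean` under covariance `cov`."""
    diff = np.asarray(X, dtype=float) - mean
    # Solve cov @ z = diff.T rather than forming the inverse explicitly.
    z = np.linalg.solve(cov, diff.T)
    return np.sqrt(np.einsum("ij,ji->i", diff, z))


# With an identity covariance the score reduces to the Euclidean distance.
points = np.array([[3.0, 4.0], [0.0, 0.0]])
scores = mahalanobis_scores(points, mean=np.zeros(2), cov=np.eye(2))  # 5.0 and 0.0

threshold = 3.0  # hypothetical value; the notebook leaves this choice to the user
is_outlier = scores > threshold  # True for the first point, False for the second
```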

"As observations arrive, the algorithm will:\n",
"- Keep track and update the mean and sample covariance matrix of the dataset\n",
"- Apply a principal component analysis using these moments and project the new observations on the first 3 principal components (default value, can be changed)\n",
"- Compute the Mahalanobis distance from this projections to the projected mean\n"
Contributor

typo - projection
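
The three steps listed in the notebook text above (running mean and covariance, PCA projection, Mahalanobis distance in the projected space) could look roughly like the sketch below. The class name `OnlineMahalanobis`, the Welford-style update, the `eps` guard against near-zero eigenvalues, and the choice to return 0 before two points have been seen are all assumptions; the actual implementation in the PR may differ.

```python
import numpy as np


class OnlineMahalanobis:
    """Hypothetical sketch of an online PCA + Mahalanobis outlier scorer."""

    def __init__(self, n_components=3, eps=1e-12):
        self.n_components = n_components
        self.eps = eps
        self.n = 0
        self.mean = None
        self.m2 = None  # running sum of outer products of deviations

    def update(self, x):
        """Fold one observation into the running mean and scatter matrix."""
        x = np.asarray(x, dtype=float)
        if self.mean is None:
            self.mean = np.zeros_like(x)
            self.m2 = np.zeros((x.size, x.size))
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += np.outer(delta, x - self.mean)

    def score(self, x):
        """Mahalanobis distance of x from the mean in the top principal components."""
        if self.n < 2:
            return 0.0  # no covariance estimate yet
        cov = self.m2 / (self.n - 1)  # sample covariance
        eigvals, eigvecs = np.linalg.eigh(cov)
        top = np.argsort(eigvals)[::-1][: self.n_components]
        keep = top[eigvals[top] > self.eps]  # skip degenerate directions
        proj = (np.asarray(x, dtype=float) - self.mean) @ eigvecs[:, keep]
        return float(np.sqrt(np.sum(proj ** 2 / eigvals[keep])))
```

Because the statistics start empty, early scores from such a detector are unreliable, which matches the notebook's remark that the output improves as requests arrive.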

"cell_type": "markdown",
"metadata": {},
"source": [
"To compute the outlier score of each point in the new batch, we need the inverse of the covariance matrix of all the points up to this one. This means inverting $b$ matrices. We made this operation faster by leveraging the fast that each covariance matrix is a rank one update of the previous one. \n",
Contributor

typo - "the fast"
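
The rank-one trick mentioned in the notebook text above is commonly done with the Sherman-Morrison identity, $(A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}$. The sketch below assumes that is the approach used here; the function names are hypothetical, and it works on the scatter matrix (sum of outer products of deviations from the mean), from which the covariance follows by dividing by the point count minus one.

```python
import numpy as np


def sherman_morrison(a_inv, u, v):
    """Inverse of (A + outer(u, v)), given a_inv = inverse of A."""
    a_inv_u = a_inv @ u
    v_a_inv = v @ a_inv
    return a_inv - np.outer(a_inv_u, v_a_inv) / (1.0 + v @ a_inv_u)


def scatter_inverse_after(s_inv, mean, n, x):
    """Inverse scatter matrix after adding point x to n earlier points.

    Adding x changes the scatter matrix by (n / (n + 1)) * outer(delta, delta),
    with delta = x - mean, so its inverse can be updated without a full inversion.
    """
    delta = np.asarray(x, dtype=float) - mean
    return sherman_morrison(s_inv, delta * n / (n + 1), delta)
```

Each update costs a few matrix-vector products instead of a full matrix inversion, which is what makes scoring every point in a batch of $b$ observations affordable.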

@@ -1,5 +1,5 @@
IMAGE_NAME=docker.io/seldonio/core-python-wrapper
-IMAGE_VERSION=0.7
+IMAGE_VERSION=0.8
Contributor

Have we updated all docs to version 0.8?

Contributor Author

Not yet. These changes only affect someone who wants to build and wrap an outlier detector, but this isn't documented anywhere at the moment.

@ukclivecox ukclivecox merged commit f6ad032 into SeldonIO:master Mar 2, 2018
This pull request was closed.
2 participants