[ML] Regression Rescorer #52059
Conversation
Pinging @elastic/ml-core (:ml)
retest this please
Force-pushed from 331bd8f to fcc8b7e
I like the simple implementation. Seems like a natural place to put inference support :)
modelProvider.getTrainedModel(modelId, true, ActionListener.wrap(trainedModel -> {
    LocalModel model = new LocalModel(
        modelId,
        trainedModel.ensureParsedDefinition(ctx.getXContentRegistry()).getModelDefinition(),
Does a re-scorer run only on the coordinating node or down on the shards?
If it is down on the shards, it might be good not to inflate the definition until it is used; that way the query definition is smaller when serialized across the wire.
I am not sure how the computation cost of inflating it once and serializing it compares with inflating it X times, once on each shard.
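The lazy-inflation idea above could be sketched roughly as follows. This is an illustrative sketch only, not the actual Elasticsearch API: `LazyModelHolder`, the `Supplier`-based inflater, and the `Object` model type are all hypothetical stand-ins for the real compressed-definition handling.

```java
import java.util.function.Supplier;

// Hypothetical sketch (not the actual Elasticsearch API): keep the compressed
// model definition for the wire format and inflate it only on first use at
// the shard, so the serialized query stays small.
class LazyModelHolder {
    private final byte[] compressedDefinition; // what would travel across the wire
    private final Supplier<Object> inflater;   // stands in for gunzip + JSON parse
    private volatile Object inflated;          // parsed model, built on demand

    LazyModelHolder(byte[] compressedDefinition, Supplier<Object> inflater) {
        this.compressedDefinition = compressedDefinition;
        this.inflater = inflater;
    }

    Object get() {
        // Double-checked locking: each holder inflates the definition at most once.
        Object local = inflated;
        if (local == null) {
            synchronized (this) {
                local = inflated;
                if (local == null) {
                    inflated = local = inflater.get();
                }
            }
        }
        return local;
    }
}
```

With this shape, a shard that never rescores never pays the gunzip-and-parse cost, while a shard that rescores many hits pays it once.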
👍 The rescorer only runs on the shards. The coordinating node receives the scores from each shard and is responsible for choosing which documents to return, but doesn't actually do any rescoring.
Caching is another consideration. I am not sure if we can cache the inflated object on the shard so that we know not to inflate it and just use the cached object.
The cost of writing the inflated model on the wire might not be THAT bad if you take into consideration all nodes having to gunzip + parse the JSON every time.
As in, on each rescore (so once per shard per query), we load the model again?
Well, there is always room for improvement. The caching framework used by the inference ingest processor can be utilised and should be fairly easy to incorporate.
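The per-node cache idea could look roughly like the sketch below. The names (`InflatedModelCache`, `getOrLoad`) are hypothetical; the real caching framework used by the inference ingest processor differs, and this only shows the shape of the idea: key the parsed model by its model id so repeated queries reuse it instead of re-inflating.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: a per-node cache of inflated models keyed by model id,
// so repeated queries reuse the parsed definition rather than re-inflating it.
class InflatedModelCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    Object getOrLoad(String modelId, Function<String, Object> loader) {
        // computeIfAbsent invokes the loader at most once per key,
        // even under concurrent access from multiple search threads.
        return cache.computeIfAbsent(modelId, loader);
    }
}
```

A real implementation would also need eviction and invalidation when a model is updated or deleted, which is exactly the kind of detail the existing caching framework already handles.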
It is loaded once per query. Not once per shard, since @davidkyle is loading it up in the coordinating node.
Right, and then it's serialized and sent to each shard? I'll wait to hear back from the Elasticsearch team about caching. As a PoC, this seems totally fine to me, and beyond that we can iterate on caching strategies.
run elasticsearch-ci/2
run elasticsearch-ci/2
run elasticsearch-ci/2
Adds a rescorer utilising the regression models currently used in the Inference Processor; in fact, the configuration shares many similarities. The rescorer loads the specified model (`model_id`) during rewrite, then for each document extracts the fields required by the model and rescores the hit according to the regression result. The final score is computed from the search score and model score using the same `score_mode` operations as query rescore. As with all prototypes, most of the work is left to do:
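The `score_mode` combination described above can be sketched as below. The value names match the ones query rescore exposes (`total`, `multiply`, `avg`, `max`, `min`), but the helper itself is an illustrative sketch, not the PR's implementation.

```java
// Hypothetical sketch of combining the original search score with the model's
// regression score, using the same score_mode operations as query rescore.
enum ScoreMode { TOTAL, MULTIPLY, AVG, MAX, MIN }

final class ScoreCombiner {
    static float combine(float queryScore, float modelScore, ScoreMode mode) {
        switch (mode) {
            case TOTAL:    return queryScore + modelScore;        // sum of both scores
            case MULTIPLY: return queryScore * modelScore;        // product of both scores
            case AVG:      return (queryScore + modelScore) / 2f; // arithmetic mean
            case MAX:      return Math.max(queryScore, modelScore);
            case MIN:      return Math.min(queryScore, modelScore);
            default:       throw new AssertionError("unknown score_mode");
        }
    }
}
```

For example, with `score_mode: total`, a hit with search score 1.2 and model prediction 0.8 would end up with a final score of 2.0.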