-
Notifications
You must be signed in to change notification settings - Fork 863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto detect backend metrics not defined in metrics configuration #2769
Auto detect backend metrics not defined in metrics configuration #2769
Conversation
frontend/server/src/main/java/org/pytorch/serve/metrics/MetricCache.java
Outdated
Show resolved
Hide resolved
for (Dimension dimension : parsedMetric.getDimensions()) { | ||
dimensionNames.add(dimension.getName()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why need recopy to a new list? is the return value of getDimensions immutable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, why parsedMetric.getDimensions().add("Hostname") can not work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsedMetric
is an instance of Metric. parsedMetric.getDimensions()
returns a list of Dimension objects.
- When registering a metric to cache, we need to provide a list of dimension names. So, we need to extract the dimension names from the Dimension objects.
- When updating a metric, we need to provide a list of dimension values. So, we need to extract the dimension values from the Dimension objects.
Therefore the new lists are being created above.
Also, the Metric class API returns references to private objects which should ideally be immutable. So, I believe it is better to create a new list like we are doing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it takes time to deepcopy. Metric object will be GC eventually if it is just used once at here and not referenced any more. The shallow copy of dimension name is fine and faster if Metric is only used at here for building IMetrics and it will not be GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added getDimensionNames
and getDimensionValues
utility functions to Metric class.
for (Dimension dimension : parsedMetric.getDimensions()) { | ||
dimensionNames.add(dimension.getName()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, why parsedMetric.getDimensions().add("Hostname") can not work?
List<String> dimensionValues = new ArrayList<String>(); | ||
for (Dimension dimension : parsedMetric.getDimensions()) { | ||
dimensionValues.add(dimension.getValue()); | ||
} | ||
// Hostname is added as a dimension by default to backend metrics | ||
dimensionValues.add(parsedMetric.getHostName()); | ||
|
||
this.metricCache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added details in this comment: https://github.com/pytorch/serve/pull/2769/files#r1399626654
04724d2
to
e18dc55
Compare
ts/model_service_worker.py
Outdated
# Backwards Compatibility with releases <=0.6.0 | ||
# Request ID is not set for model load requests | ||
# TODO: UUID serves as a temporary request ID for model load requests | ||
self.metrics_cache.set_request_ids(str(uuid.uuid4())) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is just initialize metrics_cache in service, not real model load metrics. uuid should be set within func load. So far there is no model loading metrics as i know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved logic for request id
assignment for model load requests from model_service_worker.py
to model_loader.py
.
That's correct, we don't have any default metrics that we emit during model load time, but a custom handler can emit metrics in the initialize
method. If the request id is not set, it can cause the metric to only have Level:Error
as the default dimension that gets added instead of ModelName:<model_name>
and Level:Model
. Described the issue here: #2772
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting uuid for model loading should be at func load, not here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I've already moved it to the load method: https://github.com/pytorch/serve/pull/2769/files#diff-bd9cbbce71a9f862d4edafb574d20bd4dcdf5ae8f80bf2aa8ad3b73e80fb5807R93-R99
this.hostName = hostName; | ||
this.dimensions = Arrays.asList(dimensions); | ||
this.setDimensions(Arrays.asList(dimensions)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why original "this.dimensions = Arrays.asList(dimensions);" needs to be replaced with setDimensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because setDimensions
now takes care of setting up the values of dimensionNames
and dimensionsValues
as well to support the getDimensionNames
and getDimensionValues
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it. Thanks
…_worker.py to model_loader.py
Description
Enable auto detection of backend metrics that are not defined in the metrics configuration file.
To enable backwards compatibility with version
<=0.6.0
, set request ID for model load reqeusts so that metrics updated in theinitialize
method of the handler include bothModelName
andLevel
dimensions.Fixes: #2747, #2772
Type of change
Feature/Issue validation/testing
Note that
HandlerMethodTime
andExamplePercentMetric
are not defined in the metrics configuration file above.As can be seen above,
HandlerMethodTime
andExamplePercentMetric
are auto detected, registered to the prometheus client and updated.