-
Notifications
You must be signed in to change notification settings - Fork 863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support system_metrics_cmd in config.properties #3000
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This should support something like this, where the user passes this as an extra file.
torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler examples/image_classifier/mnist/mnist_handler.py --extra-files "examples/custom_system_metrics/metric_collector.py"
and in the config.properties
, I specify
system_metrics_cmd="metric_collector.py --gpu 0"
I tried this, currently it gives an error
can't open file '/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/"metric_collector.py': [Errno 2] No such file or directory
Q1: |
Case1: it is invalid to add system metrics in model archiver in TorchServe architecture. The reason is system metrics is global, not model specific. The scripts included in model archiver is invisible for frontend when TorchServe is started. Case2: By format definition in config.properties, the value |
Confirmed this to be working.
Will send a separate PR to show this can be used with a custom python script |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM to me.
nit: please edit the example as suggested to make it obvious
@@ -297,6 +297,7 @@ e.g. : To allow base URLs `https://s3.amazonaws.com/` and `https://torchserve.py | |||
* For security reason, `use_env_allowed_urls=true` is required in config.properties to read `allowed_urls` from environment variable. | |||
* `workflow_store` : Path of workflow store directory. Defaults to model store directory. | |||
* `disable_system_metrics` : Disable collection of system metrics when set to "true". Default value is "false". | |||
* `system_metrics_cmd`: The customized system metrics python script name with arguments. For example:`ts/metrics/metric_collector.py --gpu 0`. Default: empty which means TorchServe collects system metrics via "ts/metrics/metric_collector.py --gpu $CUDA_VISIBLE_DEVICES". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest changing the example to system_metrics_cmd=ts/metrics/metric_collector.py --gpu 0
to disable GPU metrics
Description
Please read our CONTRIBUTING.md prior to creating your first pull request.
Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes #(issue)
Type of change
Please delete options that are not relevant.
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Checklist:
cc @namannandan