
Metrics helper classes implementation for C++ backend #1874

Merged

Conversation

@namannandan (Collaborator) commented Sep 23, 2022

This PR implements the following metrics helper classes in C++ for the TorchServe backend:

  • Units
  • Metric

Testing

  • Each of the classes has associated unit tests

Reference design

@namannandan marked this pull request as ready for review September 23, 2022 22:13
@namannandan changed the title from "Metrics helper classes implementation for C++ bakend" to "Metrics helper classes implementation for C++ backend" Sep 23, 2022
@@ -2,6 +2,7 @@ set(TS_BACKENDS_SRC_DIR "${torchserve_cpp_SOURCE_DIR}/src/backends")
set(TS_BACKENDS_CORE_SRC_DIR "${TS_BACKENDS_SRC_DIR}/core")
set(TS_BACKENDS_PROTOCOL_SRC_DIR "${TS_BACKENDS_SRC_DIR}/protocol")
set(TS_BACKENDS_PROCESS_SRC_DIR "${TS_BACKENDS_SRC_DIR}/process")
set(TS_BACKENDS_METRICS_SRC_DIR "${TS_BACKENDS_SRC_DIR}/metrics")
Collaborator

metrics should be in src/utils, not src/backends

class Dimension {
 public:
  Dimension(const std::string& name, const std::string& value);
  std::string ToString() const;
Collaborator

A dimension is a map between a name and a list of values, e.g. dimension name batch_size with values 1, 2, 4, 8.

Comment on lines 8 to 14
template <typename ValueType>
Metric<ValueType>::Metric(
    const std::string& name, const MetricType& type, const ValueType& value,
    const std::string& unit, const std::vector<Dimension>& dimensions,
    const std::string& request_id, const bool& is_updated) :
    name{name}, type{type}, value{value}, unit{Units::GetUnitMapping(unit)},
    dimensions{dimensions}, request_id{request_id}, is_updated{is_updated} {}
Collaborator

Please take a look at the Prometheus client API.

Comment on lines 27 to 38
list(APPEND TS_BACKENDS_TORCH_SCRIPTED_SOURCE_FILES ${TS_BACKENDS_TORCH_SCRIPTED_SRC_DIR}/torch_scripted_backend.cc)
list(APPEND TS_BACKENDS_TORCH_SCRIPTED_SOURCE_FILES ${TS_BACKENDS_TORCH_SCRIPTED_SRC_DIR}/handler/base_handler.cc)
add_library(ts_backends_torch_scripted SHARED ${TS_BACKENDS_TORCH_SCRIPTED_SOURCE_FILES})
target_include_directories(ts_backends_torch_scripted PUBLIC
target_include_directories(ts_backends_torch_scripted PUBLIC
${TS_BACKENDS_TORCH_SCRIPTED_SRC_DIR} ${TS_BACKENDS_TORCH_SCRIPTED_SRC_DIR}/handler ${TORCH_INCLUDE_DIRS})
target_link_libraries(ts_backends_torch_scripted PUBLIC ts_utils ts_backends_core ${TORCH_LIBRARIES})
target_link_libraries(ts_backends_torch_scripted PUBLIC ts_utils ts_backends_core ${TORCH_LIBRARIES})
install(TARGETS ts_backends_torch_scripted DESTINATION ${torchserve_cpp_SOURCE_DIR}/_build/libs)

# build library ts_backend_torch_deploy
#set(TS_BACKENDS_TORCH_DEPLOY_SOURCE_FILES "")
#add_library(ts_backends_torch_deploy SHARED ${TS_BACKENDS_TORCH_DEPLOY_SOURCE_FILES})
#target_include_directories(ts_backends_torch_deploy PUBLIC ${TS_BACKENDS_TORCH_DEPLOY_SRC_DIR})
Collaborator

There are no changes for this CMakeLists.txt. Please rebase your code to remove this file from this PR.

Collaborator Author

I have rebased my dev branch and removed all code changes to this file, but since I had touched this file in a previous commit, the pre-commit hook to remove trailing whitespaces is still running on this file.

target_include_directories(ts_utils PRIVATE ${Boost_INCLUDE_DIRS})
target_link_libraries(ts_utils ${FOLLY_LIBRARIES} ${CMAKE_DL_LIBS} ${Boost_LIBRARIES})
Collaborator

Why is Boost needed here?

Collaborator Author

Good catch. Boost is not required in this PR, so I'll remove it. I had initially included it to fetch the hostname in a platform-independent manner while generating the metric log. Will consider including it if required in a subsequent PR when implementing the STDOUTMetricEmitter class.

Collaborator Author

On second thought, Boost is required in this PR since I've included the implementation for the Emitter.

Comment on lines 9 to 16
  Dimension(const std::string& name, const std::string& value);
  const std::string GetName() const;
  const std::string GetValue() const;

 private:
  const std::string name;
  const std::string value;
};
Collaborator

please address my previous comments.

Collaborator Author

I do agree with your comment that a Dimension has a name and can take on multiple values. The intention of defining this class was to use it in the context of a particular metric instance. For example, if we have a metric that tracks the inference latency for a fastrcnn model with batch size 2, then the dimensions for that specific metric instance could be something like {"model_name": "fastrcnn", "batch_size": "2"}. Example usage in test case. Also, I believe that we will not know all the possible dimension values at initialization time, since we don't enforce that in the metrics configuration. Please let me know your thoughts on this.
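For illustration, a minimal sketch of such per-instance dimensions, assuming a Dimension class shaped like the declaration quoted above (this is not the actual test code):

#include <iostream>
#include <string>
#include <vector>

// Assumed shape of the Dimension class discussed in this thread.
class Dimension {
 public:
  Dimension(const std::string& name, const std::string& value)
      : name{name}, value{value} {}
  std::string ToString() const { return name + ":" + value; }

 private:
  const std::string name;
  const std::string value;
};

int main() {
  // Dimensions for one specific metric instance, e.g. inference latency of a
  // fastrcnn model served with batch size 2.
  const std::vector<Dimension> dimensions{{"model_name", "fastrcnn"},
                                          {"batch_size", "2"}};
  for (const auto& dimension : dimensions) {
    std::cout << dimension.ToString() << std::endl;
  }
  return 0;
}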

@lxning (Collaborator) Oct 7, 2022

It is impossible to know all the possible values of a dimension. As an API, the data structure should be able to reflect this use case and also save memory space. In concept, a metric includes:

  • metric name
  • Vector dimension names
  • ConcurrentHashMap<Vector, Sample> // Sample is a concrete metric data point such as a counter sample point.

So, is the Dimension class a must-have class?

Collaborator Author

That's true, for storing dimension names and values within a Metric object, the Dimension object may not be absolutely necessary. I was also considering its use in the context of supporting the MetricCache API, i.e. Metric addMetric(metricName, unit, dimensions, metricType); and Metric getMetric(metricName, dimensions);. In both of these cases, the dimensions argument will need to be a vector of <dimension_name:single_dimension_value> tuples. So here, it may be clearer to the user if we express the dimensions input argument as type std::vector<Dimension>. What do you think?
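A rough sketch of those two signatures with std::vector<Dimension> as the dimensions argument (names and stand-in types here are assumptions for discussion, not the final API):

#include <string>
#include <vector>

// Stand-in declarations so the sketch is self-contained; the real definitions
// live in this PR and follow-up PRs.
enum class MetricType { COUNTER, GAUGE, HISTOGRAM };
class Dimension { /* single <name, value> pair, as in this PR */ };
class Metric { /* metric attributes and values */ };

class MetricCache {
 public:
  Metric AddMetric(const std::string& metric_name, const std::string& unit,
                   const std::vector<Dimension>& dimensions,
                   const MetricType& metric_type);
  Metric GetMetric(const std::string& metric_name,
                   const std::vector<Dimension>& dimensions);
};

Here each Dimension carries a single <name, value> pair, so a std::vector<Dimension> expresses one concrete dimension-value combination.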

const std::string request_id;
};

class Metric {
Collaborator

As we synced up last week, we need to first define the TorchServe Metric API so that we are able to extend it to different implementations such as frontend Prometheus and backend log metrics.

Here the code is tied to the backend; it would be difficult to later extend it to the frontend. So please define the metric API first, and then implement the backend metrics.

@namannandan (Collaborator Author) Oct 6, 2022

I believe we are on the same page. The class I'm defining here is going to be used to store the metric attributes (name, unit, type, dimensions and values) for a single instance of a particular metric. The plan is to add a MetricCache class which will hold several of these Metric objects. The modularity to emit metrics either as a log, to Prometheus, or to a different metric framework in the future will come from the MetricEmitter handle in the MetricCache instance. For example, the backend MetricCache instance will use the STDOUTMetricEmitter and the frontend MetricCache instance will use the PrometheusMetricEmitter. The plan is to implement the MetricCache and MetricEmitter classes in subsequent PRs. Please let me know your thoughts on this.
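A hedged sketch of that separation; the emitter class names come from the comment above, while the method names and members are assumptions for illustration:

#include <memory>
#include <string>
#include <utility>

class MetricEmitter {
 public:
  virtual ~MetricEmitter() = default;
  // Emit one metric record; the concrete emitter decides where it goes.
  virtual void Emit(const std::string& metric_record) = 0;
};

class STDOUTMetricEmitter : public MetricEmitter {
 public:
  void Emit(const std::string& /*metric_record*/) override {
    // Backend: write the metric as a log line.
  }
};

class PrometheusMetricEmitter : public MetricEmitter {
 public:
  void Emit(const std::string& /*metric_record*/) override {
    // Frontend: forward the metric to a Prometheus client registry.
  }
};

class MetricCache {
 public:
  explicit MetricCache(std::unique_ptr<MetricEmitter> emitter)
      : emitter{std::move(emitter)} {}

 private:
  std::unique_ptr<MetricEmitter> emitter;
};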

Collaborator

sounds good.

Collaborator Author

Updated implementation to follow the API defined in metrics redesign RFC: #1492

Comment on lines 45 to 46
const std::vector<Dimension> dimensions;
std::vector<MetricValue> values;
@lxning (Collaborator) Oct 7, 2022

The data structure can be simplified and also prepared for fast sample data point updates, e.g. with a hashmap. See comments.

Collaborator Author

Agree that it may be more efficient to use a hashmap to store the metric values, but a couple of quick questions on the suggested data structure ConcurrentHashMap<Vector, Sample> to be used inside the Metric object:

  1. Since Vector is mutable, is it valid to use it as a key in a hashmap?
  2. If we plan to store the Sample value associated with each possible combination of dimension values in a single Metric object, how would we support the Metric API void Update(value);? Every time Update is called, the caller will need to supply the dimension values as well, otherwise we would not know which Sample to update. Instead, if we store the value associated with a single metric with a specific combination of dimension values in a single Metric object, wouldn't that make the implementation simpler to support the MetricCache APIs Metric addMetric(metricName, unit, dimensions, metricType);, Metric getMetric(metricName, dimensions); and also the Metric API void Update(value);? Although the tradeoff here is that we will use more space by duplicating metric attributes across Metric objects. (A rough sketch of the keyed-samples alternative follows below.)
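For reference, a small sketch of the alternative described in point 2, where a single Metric object keys samples by the dimension-value combination (std::map is used here instead of a concurrent map, and all names are illustrative):

#include <map>
#include <string>
#include <vector>

// A concrete metric data point, e.g. a counter sample.
struct Sample {
  double value{0.0};
};

class MetricWithKeyedSamples {
 public:
  // The caller has to pass the dimension values on every update so we know
  // which Sample to modify -- the tradeoff raised in point 2 above.
  void Update(const std::vector<std::string>& dimension_values, double value) {
    samples[dimension_values].value += value;
  }

 private:
  // One sample per dimension-value combination, keyed by the value vector.
  std::map<std::vector<std::string>, Sample> samples;
};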


namespace torchserve {
class MetricCache {
public:
virtual Metric AddMetric(const std::string& name, const std::string& unit,
@namannandan (Collaborator Author) Oct 12, 2022

The LogMetricCache implementation for the backend, which will derive from MetricCache, will support the AddMetric and GetMetric API by:

  1. Storing allowed metric names and corresponding unit and dimension names obtained from the metrics configuration.
  2. Storing a map of type std::unordered_map<std::string, LogMetric>. Here, the key will be a string consisting of the metric name, unit, and the name and value of each dimension (a sketch of such a key follows below).
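For illustration, a sketch of how such a composite string key could be built; the exact key format and the helper name are assumptions, not the final implementation:

#include <map>
#include <string>

// Hypothetical helper: build the composite map key from the metric name, unit
// and the dimension name/value pairs.
std::string BuildMetricKey(
    const std::string& name, const std::string& unit,
    const std::map<std::string, std::string>& dimensions) {
  std::string key = name + "." + unit;
  for (const auto& [dim_name, dim_value] : dimensions) {
    key += "|" + dim_name + ":" + dim_value;
  }
  return key;
}

// Example: BuildMetricKey("HandlerTime", "Milliseconds",
//                         {{"level", "model"}, {"model_name", "mnist"}})
// returns "HandlerTime.Milliseconds|level:model|model_name:mnist"
// (std::map iterates its keys in sorted order).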

Collaborator

Are we going to have implementations other than LogMetricCache in the future?

Collaborator Author

Yes, we can have a PrometheusMetricCache which the frontend can use to store references to prometheus client metric objects.


namespace torchserve {
// Value entry for log metric consisting of <metric_value, timestamp, request_id>
typedef std::tuple<double, std::uint64_t, std::string> LogMetricValue;
Collaborator

nit: using LogMetricValue = std::tuple<double, std::uint64_t, std::string> is more modern?

Collaborator Author

Good point. Will update.

@namannandan (Collaborator Author)

Discussion on design updates and implementation details is tracked in this document.

const MetricType type;
const std::string name;
const std::string unit;
const std::set<std::string> dimension_names;
Collaborator

A metric's dimensions have an order. Is std::set able to keep the order given as input from metrics.yaml?

Collaborator Author

I believe std::set will maintain its elements in sorted order (ascending by default), which may not be the same order as in the metrics configuration file. This will not matter for the backend metrics implementation in this PR since it does not make any assumptions about the order in which the dimension names or values are provided. The AddOrUpdate method accepts a map with dimension names as keys and dimension values as values.

Do we want the implementation to use the metric dimension order specified in the configuration file?

Collaborator

The backend should also keep the same order as defined in the metrics configuration because backend metrics need to be injected into Prometheus metrics by the frontend.

TSLogMetric(const MetricType& type, const std::string& name, const std::string& unit,
const std::set<std::string>& dimension_names) :
IMetric(type, name, unit, dimension_names) {}
void AddOrUpdate(const std::map<std::string, std::string>& dimension_values, const double& value);
Collaborator

Please rename dimension_values to dimensions if "dimension_values" here means dimension name and dimension value pairs.

@namannandan (Collaborator Author) Oct 20, 2022

Yes, the argument here is a map with key as dimension name and value as dimension value.

Do we want this argument to be a vector of dimension values instead, with dimension name order assumed from configuration file?

Collaborator

The variable name should reflect its meaning. That's my comment.

Collaborator Author

Got it, makes sense. Also, considering the fact that we want to retain the order in which dimension names are specified in the metrics configuration, the dimension_values argument can be a vector of strings corresponding to the dimension names. Will update the implementation.
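A minimal sketch of what this could look like, with dimension values ordered to match the configured dimension names (class and member names are illustrative assumptions, not the PR's final code):

#include <string>
#include <utility>
#include <vector>

class TSLogMetricSketch {
 public:
  // dimension_names are taken in the order they appear in metrics.yaml.
  explicit TSLogMetricSketch(std::vector<std::string> dimension_names)
      : dimension_names{std::move(dimension_names)} {}

  // dimension_values[i] corresponds to dimension_names[i], e.g. with
  // dimension_names {"model_name", "level"}: AddOrUpdate({"mnist", "model"}, 1.5);
  void AddOrUpdate(const std::vector<std::string>& dimension_values, double value);

 private:
  const std::vector<std::string> dimension_names;
};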

const std::set<std::string>& dimension_names) :
IMetric(type, name, unit, dimension_names) {}
void AddOrUpdate(const std::map<std::string, std::string>& dimension_values, const double& value);
void AddOrUpdate(const std::map<std::string, std::string>& dimension_values,
Collaborator

Please rename dimension_values to dimensions if "dimension_values" here means dimension name and dimension value pairs.

const std::string& request_id, const double& value);

private:
void ValidateDimensionValues(const std::map<std::string, std::string>& dimension_values);
Collaborator

Please rename dimension_values to dimensions if "dimension_values" here means dimension name and dimension value pairs.

}

void TSLogMetric::ValidateMetricValue(const double& value) {
if(type == MetricType::COUNTER && value < 0.0) {
Collaborator

Why must the value be greater than or equal to 0?

@namannandan (Collaborator Author) Oct 20, 2022

A counter metric value can only increase or be reset to 0 on restart as per the prometheus metric definition: https://prometheus.io/docs/concepts/metric_types/#counter.
The prometheus client metric implementation also has this check: https://github.com/jupp0r/prometheus-cpp/blob/871e782d57fec2fba1f00e1c4ed44cc1fc4a0f66/core/src/counter.cc#L8

Do we want to include this check here or ignore it and let the frontend handle it?
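For reference, a minimal sketch of such a non-negativity check for COUNTER updates (the error handling here is illustrative; it is not the PR's actual code):

#include <stdexcept>

enum class MetricType { COUNTER, GAUGE, HISTOGRAM };

// Reject negative updates for counters: per the Prometheus definition, a
// counter may only increase (it is reset to 0 only on restart).
void ValidateCounterValue(const MetricType& type, const double& value) {
  if (type == MetricType::COUNTER && value < 0.0) {
    throw std::invalid_argument("Counter metric update value cannot be negative");
  }
}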

Collaborator

I see.

private:
void ValidateDimensionValues(const std::map<std::string, std::string>& dimension_values);
void ValidateMetricValue(const double& value);
void Emit(const std::map<std::string, std::string>& dimension_values, const double& value,
Collaborator

Please rename dimension_values to dimensions if "dimension_values" here means dimension name and dimension value pairs.

Comment on lines 76 to 87
std::string hostname = ::boost::asio::ip::host_name();
std::uint64_t timestamp = std::chrono::duration_cast<std::chrono::seconds>(
std::chrono::system_clock::now().time_since_epoch()).count();

if (request_id.empty()) {
TS_LOGF(INFO, "[METRICS]{}.{}:{}|#{}|#hostname:{},{}",
name, unit, value, dimension_values_string, hostname, timestamp);
}
else {
TS_LOGF(INFO, "[METRICS]{}.{}:{}|#{}|#hostname:{},{},{}",
name, unit, value, dimension_values_string, hostname, timestamp, request_id);
}
Collaborator

Which end (frontend or backend) uses this emit function? Only the frontend log includes hostname and timestamp; the backend does not.

Collaborator Author

This is intended to be used by the backend. I included the hostname and timestamp to align with the current Python implementation for backend metrics:

self.name, self.unit, self.value, dims, socket.gethostname(), int(time.time()))

which is invoked here:
logger.info("[METRICS]%s", str(met))

Do we want to exclude these in the C++ backend metrics implementation?

Collaborator

OK, then keep the same pattern as the existing Python backend.

Naman Nandan added 8 commits October 21, 2022 15:41

  • Refactor metrics helper classes:
    1) Move metrics helper classes from src/backends to src/utils
    2) Update Metric class to store a vector of values instead of a single value
  • Revert changes to the following CMakeLists files since no change is required as part of the metrics implementation:
    cpp/src/backends/CMakeLists.txt
    cpp/test/CMakeLists.txt
  • Refactor metrics helper classes to simplify backend metrics implementation by emitting logs when the metrics API is called instead of storing them until the completion of an inference request to flush the metrics
  • Infer dimension names order from config file and use the same order for dimension values argument in the metrics API. Fix clang-tidy warnings.
protected:
const std::string logfile_path{"test/resources/logging/test.log"};
const std::string logger_config_path_str{
"test/resources/logging/log_to_file.config"};
Collaborator

where is the log_to_file.config?

Collaborator Author

It is here: https://github.com/pytorch/serve/blob/cpp_backend/cpp/test/resources/logging/log_to_file.config
and is used for the logging unit tests. I've re-used it here.

Comment on lines 132 to 134
"^.*\\[METRICS\\]test_metric\\.Milliseconds\\:1\\.5\\|#level\\:model,"
"model_name\\:test_model\\"
"|#hostname\\:.*,[0-9]+$";
Collaborator

This is how the frontend parses the backend metrics log. Please follow the same regex.

@namannandan (Collaborator Author) Oct 31, 2022

Good point, will update the unit tests to use the same metric log regex as the frontend.
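As an illustration of that kind of check, a standalone snippet matching a metric log line against a pattern of the same shape as the test regex quoted above (the log line and timestamp are made-up values; this is not the frontend's actual parsing code):

#include <cassert>
#include <regex>
#include <string>

int main() {
  // Made-up log line in the backend metric format discussed in this thread.
  const std::string log_line =
      "[METRICS]test_metric.Milliseconds:1.5|#level:model,model_name:test_model"
      "|#hostname:test_host,1668984000";
  // Pattern of the same shape as the unit test regex quoted above.
  const std::regex pattern(
      R"(^.*\[METRICS\]test_metric\.Milliseconds:1\.5\|#level:model,model_name:test_model\|#hostname:.*,[0-9]+$)");
  assert(std::regex_match(log_line, pattern));
  return 0;
}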

namespace torchserve {
class TSLogMetricTest : public ::testing::Test {
protected:
const std::string logfile_path{"test/resources/logging/test.log"};
Collaborator

It is not good practice to write a log file into the test resources directory. The backend log is a stdio log.

Collaborator Author

Will update the log path to test/utils/metrics/metrics_test.log

Collaborator Author

Also, I considered the option of capturing stdout for the unit test, but it appears that gtest does not offer any public API for capturing test stdout, from what I could find. It does have an internal API for capturing stdout, which may not be advisable to use in test code:

  1. https://github.com/google/googletest/blob/191ca1f3a9262b90a586ae2c2e8c742c3d867801/googletest/src/gtest-port.cc#L1117
  2. https://stackoverflow.com/questions/3803465/how-to-capture-stdout-stderr-with-googletest

Therefore, I instead log to a file and then parse the file to validate the log output.
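A rough sketch of that approach in a gtest (the file path and regex are placeholders, not the PR's actual test):

#include <fstream>
#include <regex>
#include <sstream>

#include <gtest/gtest.h>

// Read the log file the metric was written to and validate it with a regex.
TEST(TSLogMetricSketchTest, EmitsExpectedLogLine) {
  std::ifstream logfile("metrics_test.log");
  ASSERT_TRUE(logfile.is_open());

  std::stringstream buffer;
  buffer << logfile.rdbuf();

  const std::regex pattern(R"(\[METRICS\].+\|#hostname:.+,[0-9]+)");
  EXPECT_TRUE(std::regex_search(buffer.str(), pattern));
}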

@namannandan (Collaborator Author)

Addressed review comments. Verified successful build and unit tests on Mac and Ubuntu. Merging PR.

@namannandan merged commit fb22b88 into pytorch:cpp_backend Nov 1, 2022
github-merge-queue bot pushed a commit that referenced this pull request Jan 18, 2024
* add workaround solution from nvidia

* add comments

* expand runtimeType

* add runtimeType in model config

* add unit test

* revert test/buildspec_gpu.yml

* update testng.xml

* update json files

* fmt

* fmt

* init cpp dir

* init code structure

* Init socket code and otf protocol

* add log api

* decouple backend and model worker; impl torchscript load model; build scripts [ci skip]

* delete src/CMakeLists.txt

* init model archive manifest loader

* add manifest and unit test

* integrate manifest into backend; add unit test

* update otf message internal structure; add inference request message

* update otfmessage function return [skip ci]

* add torch base handler

* support dynamic load shared lib

* disable install libtorch

* update utils/CMakeLists.txt

* add dynamic lib loader unit test

* [skip CI] update src/utils/CMakeLists.txt

* install kineto in build.sh

* [skip ci] add handler factory

* [skip ci] update inference request message

* vision handler inference impl.

* [skip ci] update torchscript backend api

* change model_path to model_dir [skip ci]

* [skip ci] torchscripted handler load model pass positive test

* [skip ci] fix dll test teardown

* [skip ci] add mnist inference positive test

* update torchscripted base handler postprocess

* [skip ci] add model instance status in backend

* [skip ci]add prediction  test for base and dll handler

* [skip ci] clean up

* add batch inference test

* [skip ci] add dll close

* [skip ci] file_system clean up

* [skip ci] add mnist scripted model pt file for testing

* [skip ci] torch_scripted/torch_scripted_backend_test refactor

* [skip ci] torch_scripted_backend_test refactor

* [skip ci] extend LoadPredict api

* [skip ci] add negative test in torch_scripted_backend_test

* explicit set ABI=1

* support different build step for linux and mac

* [skip ci] update folly installation

* add sudo for folly dep installation

* [skip ci] update libtorch cuda installation

* [skip ci] update sudo mac

* [skip ci] update cuda option flag

* [skip ci] set cuda compiler

* [skip ci] skip install kineto on linux

* [skip ci] fix cuda compile path

* add documentation

* update gcc version description

* add cpp log config file option

* add parameters

* update setup.py for package cpp

* set cpp libpath env

* add unit test

* [skip ci] install lib

* [skip ci] add cpp log config path for cpp backend start

* CPP OTFProtocol implementation for inference request and response (#1817)

* Add folly logging

* Adding model response serializer

* Slight refactor

* Adding test for otf protocol

* Address review comments

* Adding some logging tests

* Refactoring socket methods into its own class to enable mocking for testing

* otf protocol implementation for inference request and response

* rebase from #1814

* rebase from #1814

* refactor after PR#1814 rebase

* add unit tests for inference request and response otf protocol

* add cpp backend test for otf_protocol and handler

* Update logging flag to read log config file

* Address test review comments

* Remove response end from LoadModelResponse data struct

* Removing googletest since folly already has it

* Adding errno to socket send and receive failures

* address review comments

* refactor LoadRequest and Response OTF protocol to remove use of new and minor improvements

* address review comments

Co-authored-by: Aaqib Ansari <maaquib@gmail.com>

* update model archiver for cpp

* Bug fix in cpp integration (#1887)

* bug fixes in java - cpp integration

* revert arg changes

* add clang-tidy in build (#1896)

* replace strcpy with strncpy (#1898)

* add clang-tidy in build

* replace strcpy with strncpy

* [WIP] cpp backend with Java integ (#1890)

* Fix build script

* Fix socket issues

* Fix socket name truncation for uds

* Fix expected log lines format from frontend

* Removing some debug logs

* Address review comments

* Remove incorrect log line

* Fix inference issue

* Update filesystem import

* Fix path of default logging file

* Make build.sh executable

* add clang-tidy and clang-format for cpp backend lint (#1910)

* add clang-tidy in build

* replace strcpu with strncpy

* fix warnings for torchscripted backends

* add .clang-tidy and update CMakeLists

* add clang-format

* remove unused parameter in cpp basehandler (#1917)

* add clang-tidy in build

* replace strcpy with strncpy

* fix warnings for torchscripted backends

* add .clang-tidy and update CMakeLists

* add clang-format

* remove unused parameters in basehandler and update mnist handler

* remove libmnist_handler.dylib

* remove not necessary func softmax in mnist example handler

* fix clang-tidy warnings (#1915)

* CPP mnist_base postman test (#1907)

* add mnist base cpp postman integration test

* refactor based on #1917

* add response body validation

* disable grpc inference api test for cpp backend model

* fix typo

* install clang-format on linux (#1926)

* Add CI for cpp_backend branch (#1916)

* Create ci-cpu-cpp.yml

* Update ci-cpu-cpp.yml

* Update ci-cpu-cpp.yml

* Update ci-cpu-cpp.yml

* Metrics helper classes implementation for C++ backend (#1874)

* Metrics helper classes implementation
Dimension, Units and Metric

* Refactor metrics helper classes
1) Move metrics helper classes from src/backends to src/utils
2) Update Metric class to store a vector of values instead of a single value

* Fix metrics headers include guard to follow naming convention

* Refactor metrics implementation to follow the API described in the
metrics refactor RFC: #1492

* Revert changes to the following CMakeLists files since no change
is required as part of the metrics implementation:
cpp/src/backends/CMakeLists.txt
cpp/test/CMakeLists.txt

* Fix compiler warnings related to std::experimental::filesystem

* Refactor metrics helper classes to simplify backend metrics implementation
by emitting logs when the metrics API is called instead of storing them
until the completion of an inference request to flush the metrics

* Infer dimension names order from config file and use the same order
for dimension values argument in the metrics API.
Fix clang-tidy warnings.

* Refactor backend metrics unit tests to use same regex as frontend to parse metric logs

* install cpp via install_from_src (#1883)

* add clang-tidy in build

* replace strcpy with strncpy

* fix warnings for torchscripted backends

* add .clang-tidy and update CMakeLists

* add clang-format

* remove unused parameters in basehandler and update mnist handler

* remove libmnist_handler.dylib

* remove not necessary func softmax in mnist example handler

* feature install cpp from install_from_src

* add --install-dependencies in setup.py

* fix typo

* update MANIFEST.in and readme

* update readme

* code cleanup

* update readme

* update logging path

* fix backend worker started checking

* update readme

* Update README.md

* YAML metrics configuration handling for C++ backend (#1941)

* fix yaml_cpp installation in build script (#1996)

* fix yaml_cpp installation

* build request id strings for one batch

* Metrics cache implementation and integration with C++ backend (#1975)

* Metrics cache implementation for C++ backend

* Metrics cache integration with C++ backend

Co-authored-by: Naman Nandan <namannan@amazon.com>

* Revert "Metrics cache implementation and integration with C++ backend (#1975)" (#2011)

This reverts commit 3451bb7.

* Metrics cache implementation and integration with C++ backend (#2012)

* Metrics cache implementation for C++ backend

* Metrics cache integration with C++ backend

Co-authored-by: Naman Nandan <namannan@amazon.com>

* Fix lint error

* Fix lint error

* Fix model-archiver after cpp merge

* Adjust signature of workerLifeCycleMnist.startWorker in test

* Fix unit tests after merging master into cpp_backend

* Fix linting error

* Install dependencies for cpp backend

* Fix unit tests after cpp merge

* Fix formatting

* Move installation of cpp deps to ts_scripts/install_dependencies.py

* Build cpp backend for regression and sanity tests

* Fix formatting

* Fix typo

* Temp fix hanging after starting cpp worker

* Add pytest for cpp backend

* Roll back building of cpp backend in ci regression and sanity tests; install deps in cpp ci

* Fix formatting

* Remove mnist_custom_cpp.mar file from postman test as we do not build cpp backend for general regression test

* Remove cpp model archive in additional place

* Remove cpp build from setup.py

* Remove additional ref to build_cpp in setup.py

* fix code link

* Update README.md

* Update libtorch versions + move installation of cpp backend to build.sh

* Prepare cpp build workflow for merge into master

* Update cuda version in cpp/build.sh

* Remove reference to LDP

* Fix WorkerLifeCycleTest

* rm src/test/resources/config_test_cpp.properties

* Remove debug prints

* Skip cpp backend test if cpp backend is not available

---------

Co-authored-by: lxning <lninga@amazon.com>
Co-authored-by: Aaqib Ansari <maaquib@gmail.com>
Co-authored-by: lxning <23464292+lxning@users.noreply.github.com>
Co-authored-by: rohithkrn <rohith.nallamaddi@gmail.com>
Co-authored-by: Naman Nandan <namankt55@gmail.com>
Co-authored-by: Naman Nandan <namannan@amazon.com>
Co-authored-by: Geeta Chauhan <4461127+chauhang@users.noreply.github.com>