From bfeb476f9f55ee6de1c7dac15ead654feb099e58 Mon Sep 17 00:00:00 2001
From: Ryan Thompson
Date: Wed, 19 Oct 2022 18:44:11 +0000
Subject: [PATCH] Move Tensorflow Documentation (#23729)

* Moves tensorflow section next to sklearn and pytorch section of documentation

* moved keyed model handler data to be framework neutral

* fixed grammar typo
---
 .../sdks/python-machine-learning.md | 99 ++++++++++---------
 1 file changed, 55 insertions(+), 44 deletions(-)

diff --git a/website/www/site/content/en/documentation/sdks/python-machine-learning.md b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
index bcd430d0072d1..b899e51496422 100644
--- a/website/www/site/content/en/documentation/sdks/python-machine-learning.md
+++ b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
@@ -20,7 +20,9 @@ limitations under the License.

{{< button-pydoc path="apache_beam.ml.inference" class="RunInference" >}}

-You can use Apache Beam with the RunInference API to use machine learning (ML) models to do local and remote inference with batch and streaming pipelines. Starting with Apache Beam 2.40.0, PyTorch and Scikit-learn frameworks are supported. You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation.
+You can use Apache Beam with the RunInference API to run machine learning (ML) models for local and remote inference in batch and streaming pipelines. Starting with Apache Beam 2.40.0, the PyTorch and Scikit-learn frameworks are supported. TensorFlow models are supported through `tfx-bsl`.
+
+You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation.

## Why use the RunInference API?

@@ -62,6 +64,7 @@ from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerPandas
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerKeyedTensor
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
```

### Use pre-trained models

@@ -83,6 +86,57 @@ You need to provide a path to a file that contains the pickled Scikit-learn mode

`model_uri=<path_to_pickled_file>` and `model_file_type: <ModelFileType>`, where you can specify `ModelFileType.PICKLE` or `ModelFileType.JOBLIB`, depending on how the model was serialized.

#### TensorFlow

To use TensorFlow with the RunInference API, you need to do the following:

* Use `tfx_bsl` version 1.10.0 or later.
* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform.
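Taken on their own, the second and third steps in this list amount to the following sketch. It only restates lines from the full sample that follows; `/path/to/model` is a placeholder for a TensorFlow SavedModel directory:

```
from apache_beam.ml.inference.base import RunInference
from tfx_bsl.public.proto import model_spec_pb2
from tfx_bsl.public.beam.run_inference import CreateModelHandler

# Step two: point an InferenceSpecType at a SavedModel directory
# (the path is a placeholder) and build the handler from it.
saved_model_spec = model_spec_pb2.SavedModelSpec(model_path='/path/to/model')
inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
model_handler = CreateModelHandler(inference_spec_type)

# Step three: pass the handler to the RunInference transform.
inference_transform = RunInference(model_handler)
```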
A sample pipeline might look like the following example:

```
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from tensorflow_serving.apis import prediction_log_pb2
from tfx_bsl.public.proto import model_spec_pb2
from tfx_bsl.public.tfxio import TFExampleRecord
from tfx_bsl.public.beam.run_inference import CreateModelHandler

pipeline = beam.Pipeline()
tfexample_beam_record = TFExampleRecord(file_pattern='/path/to/examples')
saved_model_spec = model_spec_pb2.SavedModelSpec(model_path='/path/to/model')
inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
model_handler = CreateModelHandler(inference_spec_type)
with pipeline as p:
    _ = (p | tfexample_beam_record.RawRecordBeamSource()
           | RunInference(model_handler)
           | beam.Map(print)
        )
```

Note: A model handler that is created with `CreateModelHandler()` is always unkeyed.

### Keyed Model Handlers

To make a keyed model handler, wrap any unkeyed model handler in `KeyedModelHandler`. For example:

```
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.base import KeyedModelHandler

model_handler = <your unkeyed model handler>
keyed_model_handler = KeyedModelHandler(model_handler)

with pipeline as p:
   p | (
      RunInference(keyed_model_handler)
   )
```

If you are unsure if your data is keyed, you can also use `MaybeKeyedModelHandler`.

For more information, see [`KeyedModelHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.KeyedModelHandler).

### Use custom models

If you would like to use a model that isn't supported by one of the named frameworks, the RunInference API is designed to be flexible enough to let you use any custom machine learning model.
@@ -91,7 +145,6 @@ You only need to create your own `ModelHandler` or `KeyedModelHandler` with logi

A simple example can be found in [this notebook](https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_custom_inference.ipynb).
The `load_model` method shows how to load the model using the popular `spaCy` package, while `run_inference` shows how to run inference on a batch of examples.
-
### Use multiple models

You can also use the RunInference transform to add multiple inference models to your pipeline (a minimal sketch appears after the multi-language note below).
@@ -198,48 +251,6 @@ For detailed instructions explaining how to build and run a pipeline that uses M

The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
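Returning to the multiple-model note above, a minimal sketch of one pipeline applying two independent models follows. The `SklearnModelHandlerNumpy` handlers, the `.pkl` paths, and the tiny in-memory input are illustrative assumptions, not part of this patch:

```
import apache_beam as beam
import numpy

from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Two independently loaded models; both paths are placeholders.
model_a = SklearnModelHandlerNumpy(model_uri='/path/to/model_a.pkl')
model_b = SklearnModelHandlerNumpy(model_uri='/path/to/model_b.pkl')

with beam.Pipeline() as p:
    examples = p | 'Create' >> beam.Create([numpy.array([1.0, 2.0])])
    # The same PCollection feeds two RunInference transforms,
    # so every example is scored by both models.
    _ = examples | 'Run A' >> RunInference(model_a) | 'Print A' >> beam.Map(print)
    _ = examples | 'Run B' >> RunInference(model_b) | 'Print B' >> beam.Map(print)
```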
## Troubleshooting

If you run into problems with your pipeline or job, this section lists issues that you might encounter and provides suggestions for how to fix them.