Move Tensorflow Documentation (#23729)
* Moves tensorflow section next to sklearn and pytorch section of documentation

* moved keyed model handler data to be framework neutral

* fixed grammar typo
ryanthompson591 authored Oct 19, 2022
1 parent 437c015 commit bfeb476
Showing 1 changed file with 55 additions and 44 deletions.
@@ -20,7 +20,9 @@ limitations under the License.

{{< button-pydoc path="apache_beam.ml.inference" class="RunInference" >}}

You can use Apache Beam with the RunInference API to use machine learning (ML) models to do local and remote inference with batch and streaming pipelines. Starting with Apache Beam 2.40.0, the PyTorch and Scikit-learn frameworks are supported. TensorFlow models are supported through `tfx-bsl`.

You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation.
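
For example, a minimal batch pipeline with a Scikit-learn model handler might look like the following sketch (the model path and input values are illustrative placeholders):

```
import apache_beam as beam
import numpy
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Placeholder path; choosing a different handler selects a different framework.
model_handler = SklearnModelHandlerNumpy(model_uri='/path/to/model.pkl')

with beam.Pipeline() as p:
    predictions = (
        p
        | beam.Create([numpy.array([1.0, 2.0]), numpy.array([3.0, 4.0])])
        | RunInference(model_handler)
        | beam.Map(print))
```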

## Why use the RunInference API?

@@ -62,6 +64,7 @@
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerPandas
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerKeyedTensor
from tfx_bsl.public.beam.run_inference import CreateModelHandler
```
### Use pre-trained models

@@ -83,6 +86,57 @@
You need to provide a path to a file that contains the pickled Scikit-learn model. In your code, use
`model_uri=<path_to_pickled_file>` and `model_file_type: <ModelFileType>`, where you can specify
`ModelFileType.PICKLE` or `ModelFileType.JOBLIB`, depending on how the model was serialized.
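
For example (a sketch; the path is a placeholder), a handler for a joblib-serialized model could be configured like this:

```
from apache_beam.ml.inference.sklearn_inference import ModelFileType
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Use ModelFileType.PICKLE instead if the model was pickled.
model_handler = SklearnModelHandlerNumpy(
    model_uri='/path/to/model.joblib',
    model_file_type=ModelFileType.JOBLIB)
```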

#### TensorFlow

To use TensorFlow with the RunInference API, you need to do the following:

* Use `tfx_bsl` version 1.10.0 or later.
* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform.

A sample pipeline might look like the following example:

```
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from tensorflow_serving.apis import prediction_log_pb2  # output elements are PredictionLog protos
from tfx_bsl.public.proto import model_spec_pb2
from tfx_bsl.public.tfxio import TFExampleRecord
from tfx_bsl.public.beam.run_inference import CreateModelHandler

# Read serialized tf.Example records and point the handler at a SavedModel.
tfexample_beam_record = TFExampleRecord(file_pattern='/path/to/examples')
saved_model_spec = model_spec_pb2.SavedModelSpec(model_path='/path/to/model')
inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
model_handler = CreateModelHandler(inference_spec_type)

pipeline = beam.Pipeline()
with pipeline as p:
  _ = (p | tfexample_beam_record.RawRecordBeamSource()
         | RunInference(model_handler)
         | beam.Map(print))
```

Note: A model handler that is created with `CreateModelHandler()` is always unkeyed.

### Keyed Model Handlers

To make a keyed model handler, wrap any unkeyed model handler in a `KeyedModelHandler`. For example:

```
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.base import KeyedModelHandler

model_handler = <instantiate your unkeyed model handler>
keyed_model_handler = KeyedModelHandler(model_handler)

with pipeline as p:
  _ = (p | <your pipeline steps producing (key, example) tuples>
         | RunInference(keyed_model_handler))
```

If you are unsure if your data is keyed, you can also use `MaybeKeyedModelHandler`.
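
For example, a minimal sketch (assuming `model_handler` is any of the unkeyed handlers above):

```
from apache_beam.ml.inference.base import MaybeKeyedModelHandler

# Accepts either bare examples or (key, example) tuples.
maybe_keyed_model_handler = MaybeKeyedModelHandler(model_handler)
```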

For more information, see [`KeyedModelHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.KeyedModelHandler).

### Use custom models

If you would like to use a model that isn't covered by one of the supported frameworks, the RunInference API is flexible enough to let you use any custom machine learning model.
Expand All @@ -91,7 +145,6 @@ You only need to create your own `ModelHandler` or `KeyedModelHandler` with logi
A simple example can be found in [this notebook](https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_custom_inference.ipynb).
The `load_model` method shows how to load the model using the popular `spaCy` package, while `run_inference` shows how to run inference on a batch of examples.
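
As a rough sketch of the same idea (the class name, model choice, and return values here are illustrative assumptions, not the notebook's exact code):

```
import spacy
from apache_beam.ml.inference.base import ModelHandler

class SpacyModelHandler(ModelHandler):
    """Hypothetical handler that runs a spaCy pipeline on batches of strings."""

    def __init__(self, model_name='en_core_web_sm'):
        self._model_name = model_name

    def load_model(self):
        # Called once per worker; the returned model is reused across batches.
        return spacy.load(self._model_name)

    def run_inference(self, batch, model, inference_args=None):
        # Return one prediction (a spaCy Doc) per input element.
        return [model(text) for text in batch]
```

An instance of this handler is then passed to `RunInference` just like the built-in handlers.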

### Use multiple models

You can also use the RunInference transform to add multiple inference models to your pipeline.
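
For instance, one common shape (a sketch; the handlers and paths are placeholders) fans the same input through two separate RunInference transforms:

```
import apache_beam as beam
import numpy
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Hypothetical handlers; any supported model handlers work here.
model_handler_a = SklearnModelHandlerNumpy(model_uri='/path/to/model_a.pkl')
model_handler_b = SklearnModelHandlerNumpy(model_uri='/path/to/model_b.pkl')

with beam.Pipeline() as p:
    data = p | beam.Create([numpy.array([1.0, 2.0]), numpy.array([3.0, 4.0])])
    # The same PCollection fans out to two independent model transforms.
    predictions_a = data | 'RunInferenceA' >> RunInference(model_handler_a)
    predictions_b = data | 'RunInferenceB' >> RunInference(model_handler_b)
```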
@@ -198,48 +251,6 @@
For detailed instructions explaining how to build and run a pipeline that uses ML models, see the [Example RunInference API pipelines](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/inference) on GitHub.

The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines). For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).

## Troubleshooting

If you run into problems with your pipeline or job, this section lists issues that you might encounter and provides suggestions for how to fix them.