[Docs] Fix :book: Doctest (GPU) (#36383)
#36206 replaced `:options: +MOCK` with `:skipif: True` in `batch_inference.rst`. As a result, doctest is failing: https://buildkite.com/ray-project/oss-ci-build-pr/builds/25117#0188a1b9-31f9-47df-89cb-f508a90685d9. 

This PR reverts the change.


Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
bveeramani authored Jun 13, 2023
1 parent d0e42e9 commit c77ea54
76 changes: 38 additions & 38 deletions doc/source/data/batch_inference.rst
@@ -5,7 +5,7 @@ End-to-end: Offline Batch Inference

.. tip::

`Get in touch <https://forms.gle/sGX7PQhheBGL6yxQ6>`_ to get help using Ray Data, the industry's fastest and cheapest solution for offline batch inference.

Offline batch inference is a process for generating model predictions on a fixed set of input data. Ray Data offers an efficient and scalable solution for batch inference, providing faster execution and cost-effectiveness for deep learning applications.

@@ -27,7 +27,7 @@ To start, install Ray Data:
Using Ray Data for offline inference involves four basic steps:

- **Step 1:** Load your data into a Ray Dataset. Ray Data supports many different data sources and formats. For more details, see :ref:`Loading Data <loading_data>`.
- **Step 2:** Define a Python class to load the pre-trained model.
- **Step 3:** Transform your dataset using the pre-trained model by calling :meth:`ds.map_batches() <ray.data.Dataset.map_batches>`. For more details, see :ref:`Transforming Data <transforming_data>`.
- **Step 4:** Get the final predictions by either iterating through the output or saving the results. For more details, see the :ref:`Iterating over data <iterating-over-data>` and :ref:`Saving data <saving-data>` user guides.
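
Taken together, the four steps look roughly like the following sketch. This is a minimal, hypothetical example: ``EchoPredictor`` and its lambda "model" are stand-ins for a real pre-trained model, which the tabs below supply using HuggingFace, PyTorch, and TensorFlow.

.. code-block:: python

    from typing import Dict

    import numpy as np
    import ray

    # Step 1: Load in-memory data into a Ray Dataset.
    ds = ray.data.from_numpy(np.asarray(["Complete this", "for me"]))

    # Step 2: Define a class that loads the model once per actor.
    class EchoPredictor:
        def __init__(self):
            # A real implementation would load pre-trained weights here.
            self.model = lambda data: data

        def __call__(self, batch: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
            batch["output"] = self.model(batch["data"])
            return batch

    # Step 3: Transform the dataset with the predictor, using a pool of 2 actors.
    predictions = ds.map_batches(
        EchoPredictor, compute=ray.data.ActorPoolStrategy(size=2)
    )

    # Step 4: Iterate over (or save) the predictions.
    predictions.show(limit=1)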

@@ -37,14 +37,14 @@ For how to configure batch inference, see :ref:`the configuration guide<batch_in
.. tabs::

.. group-tab:: HuggingFace

.. testcode::

from typing import Dict
import numpy as np

import ray

# Step 1: Create a Ray Dataset from in-memory Numpy arrays.
# You can also create a Ray Dataset from many other sources and file
# formats.
@@ -77,12 +77,12 @@ For how to configure batch inference, see :ref:`the configuration guide<batch_in
predictions = ds.map_batches(HuggingFacePredictor, compute=scale)
# Step 4: Show one prediction output.
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'data': 'Complete this', 'output': 'Complete this information or purchase any item from this site.\n\nAll purchases are final and non-'}


.. group-tab:: PyTorch

@@ -129,7 +129,7 @@ For how to configure batch inference, see :ref:`the configuration guide<batch_in
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'output': array([0.5590901], dtype=float32)}

@@ -141,7 +141,7 @@ For how to configure batch inference, see :ref:`the configuration guide<batch_in
import numpy as np

import ray

# Step 1: Create a Ray Dataset from in-memory Numpy arrays.
# You can also create a Ray Dataset from many other sources and file
# formats.
@@ -174,15 +174,15 @@ For how to configure batch inference, see :ref:`the configuration guide<batch_in
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'output': array([0.625576], dtype=float32)}

.. _batch_inference_examples:

More examples
-------------
- :doc:`Image Classification Batch Inference with PyTorch ResNet18 </data/examples/pytorch_resnet_batch_prediction>`
- :doc:`Object Detection Batch Inference with PyTorch FasterRCNN_ResNet50 </data/examples/batch_inference_object_detection>`
- :doc:`Image Classification Batch Inference with Huggingface Vision Transformer </data/examples/huggingface_vit_batch_prediction>`

@@ -200,21 +200,21 @@ To use GPUs for inference, make the following changes to your code:

1. Update the class implementation to move the model and data to and from the GPU.
2. Specify `num_gpus=1` in the :meth:`ds.map_batches() <ray.data.Dataset.map_batches>` call to indicate that each actor should use 1 GPU.
3. Specify a `batch_size` for inference. For more details on how to configure the batch size, see `batch_inference_batch_size`_.

The rest is the same as the :ref:`Quickstart <batch_inference_quickstart>`.
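
Concretely, the three changes look like the following sketch for a PyTorch-style predictor. This is a condensed, hypothetical variant of the PyTorch tab below, with a two-layer model standing in for a real network:

.. code-block:: python

    from typing import Dict

    import numpy as np
    import torch
    import torch.nn as nn
    import ray

    ds = ray.data.from_numpy(np.ones((1, 100), dtype=np.float32))

    class TorchPredictor:
        def __init__(self):
            self.model = nn.Sequential(nn.Linear(100, 1), nn.Sigmoid())
            # Change 1a: move the model to the GPU after loading it.
            self.model.to("cuda")
            self.model.eval()

        def __call__(self, batch: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
            # Change 1b: move inputs to the GPU, and results back to the CPU.
            tensor = torch.as_tensor(batch["data"], device="cuda")
            with torch.inference_mode():
                return {"output": self.model(tensor).cpu().numpy()}

    predictions = ds.map_batches(
        TorchPredictor,
        num_gpus=1,  # Change 2: each actor uses 1 GPU.
        batch_size=1,  # Change 3: explicit inference batch size.
        compute=ray.data.ActorPoolStrategy(size=2),
    )
    predictions.show(limit=1)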

.. tabs::

.. group-tab:: HuggingFace

.. testcode::

from typing import Dict
import numpy as np

import ray

ds = ray.data.from_numpy(np.asarray(["Complete this", "for me"]))

class HuggingFacePredictor:
@@ -230,21 +230,21 @@ The rest is the same as the :ref:`Quickstart <batch_inference_quickstart>`.

# Use 2 actors, each actor using 1 GPU. 2 GPUs total.
predictions = ds.map_batches(
HuggingFacePredictor,
num_gpus=1,
# Specify the batch size for inference.
# Increase this for larger datasets.
batch_size=1,
# Set the ActorPool size to the number of GPUs in your cluster.
compute=ray.data.ActorPoolStrategy(size=2),
)
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'data': 'Complete this', 'output': 'Complete this poll. Which one do you think holds the most promise for you?\n\nThank you'}


.. group-tab:: PyTorch

@@ -277,18 +277,18 @@ The rest is the same as the :ref:`Quickstart <batch_inference_quickstart>`.

# Use 2 actors, each actor using 1 GPU. 2 GPUs total.
predictions = ds.map_batches(
TorchPredictor,
num_gpus=1,
# Specify the batch size for inference.
# Increase this for larger datasets.
batch_size=1,
# Set the ActorPool size to the number of GPUs in your cluster.
compute=ray.data.ActorPoolStrategy(size=2)
)
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'output': array([0.5590901], dtype=float32)}

@@ -302,7 +302,7 @@ The rest is the same as the :ref:`Quickstart <batch_inference_quickstart>`.
from tensorflow import keras

import ray

ds = ray.data.from_numpy(np.ones((1, 100)))

class TFPredictor:
@@ -320,18 +320,18 @@ The rest is the same as the :ref:`Quickstart <batch_inference_quickstart>`.

# Use 2 actors, each actor using 1 GPU. 2 GPUs total.
predictions = ds.map_batches(
TFPredictor,
num_gpus=1,
# Specify the batch size for inference.
# Increase this for larger datasets.
batch_size=1,
# Set the ActorPool size to the number of GPUs in your cluster.
compute=ray.data.ActorPoolStrategy(size=2)
)
predictions.show(limit=1)

.. testoutput::
- :skipif: True
+ :options: +MOCK

{'output': array([0.625576], dtype=float32)}

@@ -345,7 +345,7 @@ Configure the size of the input batch that is passed to ``__call__`` by setting
Increasing batch size results in faster execution because inference is a vectorized operation. For GPU inference, increasing batch size increases GPU utilization. Set the batch size as large as possible without running out of memory. If you encounter out-of-memory errors, decreasing ``batch_size`` may help.

.. testcode::

import numpy as np

import ray
@@ -355,7 +355,7 @@ Increasing batch size results in faster execution because inference is a vectori
def assert_batch(batch: Dict[str, np.ndarray]):
    # Each batch is a dict of column arrays; check the number of rows.
    assert len(batch["data"]) == 2
    return batch

# Specify that each input batch should be of size 2.
ds.map_batches(assert_batch, batch_size=2)
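
Note that ``map_batches`` is lazy, so ``assert_batch`` only runs once the dataset executes. As an illustrative follow-up, not part of the original example, you can force execution with ``materialize()``:

.. code-block:: python

    # Trigger execution so that assert_batch actually runs on every batch.
    ds.map_batches(assert_batch, batch_size=2).materialize()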

@@ -392,12 +392,12 @@ Suppose your cluster has 4 nodes, each with 16 CPUs. To limit to at most

.. testcode::
:skipif: True

from typing import Dict
import numpy as np

import ray

ds = ray.data.from_numpy(np.asarray(["Complete this", "for me"]))

class HuggingFacePredictor:
@@ -411,10 +411,10 @@ Suppose your cluster has 4 nodes, each with 16 CPUs. To limit to at most
return batch

predictions = ds.map_batches(
HuggingFacePredictor,
# Require 5 CPUs per actor (so at most 3 can fit per 16 CPU node).
num_cpus=5,
# 3 actors per node, with 4 nodes in the cluster means ActorPool size of 12.
compute=ray.data.ActorPoolStrategy(size=12)
)
predictions.show(limit=1)
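
The pool size follows from simple arithmetic: floor(16 / 5) = 3 five-CPU actors fit on each node, and 3 actors × 4 nodes = 12. A small, hypothetical helper makes the calculation explicit:

.. code-block:: python

    def actor_pool_size(num_nodes: int, cpus_per_node: int, cpus_per_actor: int) -> int:
        # Whole actors that fit on one node, times the number of nodes.
        return num_nodes * (cpus_per_node // cpus_per_actor)

    assert actor_pool_size(num_nodes=4, cpus_per_node=16, cpus_per_actor=5) == 12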
