Add "RequestFormatter" config class to work with different inference platforms #436

Closed
htappen opened this issue Jun 8, 2020 · 5 comments · Fixed by #749
Labels: enhancement


htappen commented Jun 8, 2020

Problem statement

Many model serving systems define a fixed signature for request bodies. Examples include TensorFlow Serving, KFServing, Seldon, and Google Cloud AI Platform.

Data scientists use these multi-framework systems to manage deployments of many different models, possibly written in different languages and frameworks. The platforms offer additional analytics on top of model serving, including skew detection, explanations, and A/B testing. These platforms need a well-structured signature both to standardize calls across different frameworks and to understand the input data. To simplify support for many frameworks, though, these platforms will simply pass the request body along to the underlying model server.
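
For reference, the request body these platforms forward is typically the TF Serving-style prediction format, which is also the shape the formatter sketched later in this issue expects (the payload below is illustrative):

```json
{
  "instances": [
    {"b64": "aGVsbG8="},
    {"b64": "d29ybGQ="}
  ]
}
```

The response is then expected to come back as a JSON object with a `predictions` key.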

Torchserve currently has no fixed request body signature. Data scientists are expected to define the MIME type they support and the structure of the request body. As a result, data scientists must manually write the code in their handler functions to accept inputs from one of the above platforms. This makes it challenging for an individual data scientist to integrate Torchserve with these platforms. Worse yet, none of the Torchserve examples or built-in handlers can accept requests from these platforms.

Proposal

In order to (a) enable Torchserve to function with multi-framework model platforms while (b) maintaining compatibility with the existing, flexible data plane, we should introduce a property to BaseHandler that controls the input and output format.

It's expected that end users DO NOT implement these formatters themselves, nor do they have to consume the functionality manually. Instead, they either pass the formatter object into get_default_handler:

handle = <my_class>.get_default_handler(request_formatter=TFServingFormatter())

...or request it in torch-model-archiver:

torch-model-archiver ... --handler image_classifier --request_formatter TFServingFormatter
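
For reference, a hypothetical sketch of what such a get_default_handler factory could return; the body below is an assumption for illustration only, since the issue proposes just the call signature:

```python
from abc import ABC

class BaseHandler(ABC):
    @classmethod
    def get_default_handler(cls, request_formatter=None):
        # Hypothetical sketch: build a handler instance that carries the
        # formatter and return a module-level entry point TorchServe can call.
        handler = cls()
        handler.request_formatter = request_formatter

        def handle(data, context):
            if not getattr(handler, "initialized", False):
                handler.initialize(context)
                handler.initialized = True
            return handler._handle(data, context)

        return handle
```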

The BaseHandler class should accept that config and add the parsing step into the default handling pipeline. Example:

```python
from abc import ABC

_singleton_handler = None

class BaseHandler(ABC):
    ...

    def _handle(self, data, context):
        if data is None:
            return None

        if self.request_formatter:
            data = self.request_formatter.parse_input(data)

        processed = self.preprocess(data)
        predictions = self.inference(processed)
        output = self.postprocess(predictions)

        if self.request_formatter:
            output = self.request_formatter.format_output(output)

        return output
```

We need to define an interface for these formatters. Their role is strictly to convert between an arbitrary wire format (e.g. JSON) and the format required by preprocess (a list of binary strings, dicts, or vectors).

```python
from abc import ABC, abstractmethod

class RequestFormatter(ABC):
    @abstractmethod
    def parse_input(self, data):
        pass

    @abstractmethod
    def format_output(self, data):
        pass
```

Of course, we'll need an implementation for compatibility with Seldon, KFServing and Cloud AI Platform:

```python
class TFServingFormatter(RequestFormatter):
    ...
```

Dependencies

The refactors outlined here make this a lot easier: #434

Implementation

Here's an example of the TFServingFormatter object. I will follow up with a complete file.

```python
import json
from base64 import b64decode
from itertools import chain

import numpy as np
import torch


class TFServingFormatter(RequestFormatter):
    def parse_input(self, data):
        lengths, batch = self._batch_from_json(data)
        self._lengths = lengths
        return batch

    def format_output(self, data):
        return self._batch_to_json(data, self._lengths)

    def _batch_from_json(self, data_rows):
        """
        Joins the instances of a batch of JSON objects
        """
        rows_per_request = [self._from_json(data_row) for data_row in data_rows]
        lengths = [len(rows) for rows in rows_per_request]
        full_batch = list(chain.from_iterable(rows_per_request))
        return (lengths, full_batch)

    def _from_json(self, data):
        """
        Extracts the data from the Cloud AI Platform object
        """
        rows = data['body']['instances']
        if isinstance(rows[0], dict):
            for row_i, row in enumerate(rows):
                if list(row.keys()) == ['b64']:
                    rows[row_i] = b64decode(row['b64'])
                else:
                    for col, col_value in row.items():
                        if isinstance(col_value, dict) and list(col_value.keys()) == ['b64']:
                            row[col] = b64decode(col_value['b64'])
        return rows

    def _batch_to_json(self, batch, lengths):
        """
        Splits the batched output into mini-batches and returns JSON
        """
        outputs = []
        cursor = 0
        for length in lengths:
            cursor_end = cursor + length
            mini_batch = batch[cursor:cursor_end]
            outputs.append(self._to_json(mini_batch))
            cursor = cursor_end
        return outputs

    def _to_json(self, output):
        """
        Converts the output of the model back into Cloud AI Platform compatible JSON
        """
        if isinstance(output, (np.ndarray, torch.Tensor)):
            output = output.tolist()
        return json.dumps({'predictions': output})
```
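
To make the intended round trip concrete, here is a minimal, hypothetical usage sketch; the sample payloads and prediction values are made up, but they follow the request shape that _from_json above handles:

```python
formatter = TFServingFormatter()

# A TorchServe batch of two requests, each carrying TF Serving-style JSON.
torchserve_batch = [
    {'body': {'instances': [[1.0, 2.0], [3.0, 4.0]]}},  # request 1: two instances
    {'body': {'instances': [[5.0, 6.0]]}},               # request 2: one instance
]

flat_inputs = formatter.parse_input(torchserve_batch)
# flat_inputs == [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]] -- ready for preprocess()

# Pretend the model produced one prediction per flattened instance...
model_outputs = [[0.9], [0.1], [0.5]]

# ...then format_output splits them back into one JSON response per request.
responses = formatter.format_output(model_outputs)
# responses == ['{"predictions": [[0.9], [0.1]]}', '{"predictions": [[0.5]]}']
```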

dhaniram-kshirsagar commented Jun 9, 2020

torch-model-archiver ... --handler image_classifier --request_formatter TFServingFormatter

I like the idea of including the formatter in the .mar. The torch-model-archiver utility supports an --extra-files flag; maybe we can reuse it with some standard module/class naming convention.

@dhaniram-kshirsagar

@htappen Instead of changing the base handler and adding a new method (i.e. get_default_handler) that accepts a formatter, can we add a new package named formatters? Besides the formatters themselves, it would contain a new default base handler class, say BaseHandlerWithFormatter (extending BaseHandler), which overrides the initialize method to set up the formatter and the handle method to apply pre- and post-formatting, like this:

```python
class BaseHandlerWithFormatter(BaseHandler):
    def __init__(self):
        super(BaseHandlerWithFormatter, self).__init__()
        self.request_formatter = None

    def initialize(self, context):
        super(BaseHandlerWithFormatter, self).initialize(context)
        # Initialize the formatter by using the context
        self.request_formatter = TFServingFormatter()

    def handle(self, data, context):
        if self.request_formatter:
            data = self.request_formatter.parse_input(data)

        output = super(BaseHandlerWithFormatter, self).handle(data, context)

        if self.request_formatter:
            output = self.request_formatter.format_output(output)

        return output
```


htappen commented Jun 18, 2020

Thanks for taking a look! Implementing the request formatters in a replacement base class for BaseHandler won't work because then classes derived from BaseHandler (e.g. all the built-ins) won't get the behavior.

However, another, similar approach that could work would be to have the RequestFormatter wrap a handler object instead of the other way around:

```python
from abc import ABC, abstractmethod

class RequestEnvelope(ABC):
    def __init__(self, handler):
        self._handler = handler

    def initialize(self, context):
        """
        For RequestEnvelope to be used like a Torch Handler, it needs to implement this method.
        Simply calls the wrapped handler's initialize.
        """
        self._handler.initialize(context)

    def handle(self, data, context):
        input_list = self.parse_input(data)
        results = self._handler.handle(input_list, context)
        output_list = self.format_output(results)

        return output_list

    @abstractmethod
    def parse_input(self, data):
        pass

    @abstractmethod
    def format_output(self, data):
        pass
```

To use this, someone would have to create a handler.py like

MyHandler = TFRequestEnvelope(ImageClassifier())

Something about that feels a bit fragile, but let me think some more.
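
For concreteness, a hypothetical TFRequestEnvelope along those lines might look like the sketch below; it reuses the instances/predictions handling from the formatter example above and skips b64 decoding for brevity:

```python
import json

# Hypothetical sketch only; not an actual TorchServe class.
class TFRequestEnvelope(RequestEnvelope):
    def parse_input(self, data):
        # Flatten {'body': {'instances': [...]}} requests into one list,
        # remembering how many instances each request contributed.
        per_request = [row['body']['instances'] for row in data]
        self._lengths = [len(instances) for instances in per_request]
        return [instance for instances in per_request for instance in instances]

    def format_output(self, data):
        # Split the flat prediction list back into one JSON response per request.
        responses, cursor = [], 0
        for length in self._lengths:
            responses.append(json.dumps({'predictions': data[cursor:cursor + length]}))
            cursor += length
        return responses
```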

maaquib modified the milestones: v0.2.0, v0.3.0 (Jul 23, 2020)
chauhang linked a pull request (Nov 25, 2020) that will close this issue
@chauhang

@htappen This should be resolved with the merge of PR #749. Please run the tests on the GCAIP side.


harshbafna commented Dec 10, 2020

#749 is now merged to master. Closing.
