Add "RequestFormatter" config class to work with different inference platforms #436

Closed
htappen opened this issue Jun 8, 2020 · 5 comments · Fixed by #749
Labels: enhancement


htappen commented Jun 8, 2020

Problem statement

Many model serving systems define a fixed signature for request bodies. Examples include TensorFlow Serving, KFServing, Seldon, and Google Cloud AI Platform.

Data scientists use these multi-framework systems to manage deployments of many different models, possibly written in different languages and frameworks. The platforms offer additional analytics on top of model serving, including skew detection, explanations, and A/B testing. These platforms need a well-structured signature both to standardize calls across different frameworks and to understand the input data. To simplify support for many frameworks, though, these platforms will simply pass the request body along to the underlying model server.
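
For reference, the request body these platforms forward is typically the TF Serving-style prediction format, which is also the shape the formatter sketched later in this issue expects (the payload below is illustrative):

```json
{
  "instances": [
    {"b64": "aGVsbG8="},
    {"b64": "d29ybGQ="}
  ]
}
```

The response is then expected to come back as a JSON object with a `predictions` key.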

Torchserve currently has no fixed request body signature. Data scientists are expected to define the MIME type they support and the structure of the request body. As a result, data scientists must manually write the code in their handler functions to accept inputs from one of the above platforms. This makes it challenging for an individual data scientist to integrate Torchserve with these platforms. Worse yet, none of the Torchserve examples or built-in handlers can accept requests from these platforms.

Proposal

In order to (a) enable Torchserve to function with multi-framework model platforms while (b) maintaining compatibility with the existing, flexible data plane, we should introduce a property to BaseHandler that controls the input and output format.

It's expected that end users DO NOT implement these formatters themselves, nor do they have to consume the functionality manually. Instead, they either pass the formatter object into get_default_handler:

handle = <my_class>.get_default_handler(request_formatter=TFServingFormatter())

...or request it in torch-model-archiver:

torch-model-archiver ... --handler image_classifier --request_formatter TFServingFormatter
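
For reference, a hypothetical sketch of what such a get_default_handler factory could return; the body below is an assumption for illustration only, since the issue proposes just the call signature:

```python
from abc import ABC

class BaseHandler(ABC):
    @classmethod
    def get_default_handler(cls, request_formatter=None):
        # Hypothetical sketch: build a handler instance that carries the
        # formatter and return a module-level entry point TorchServe can call.
        handler = cls()
        handler.request_formatter = request_formatter

        def handle(data, context):
            if not getattr(handler, "initialized", False):
                handler.initialize(context)
                handler.initialized = True
            return handler._handle(data, context)

        return handle
```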

The BaseHandler class should accept that config and add the parsing step into the default handling pipeline. Example:

```python
from abc import ABC

_singleton_handler = None

class BaseHandler(ABC):
    ...

    def _handle(self, data, context):
        if data is None:
            return None

        if self.request_formatter:
            data = self.request_formatter.parse_input(data)

        processed = self.preprocess(data)
        predictions = self.inference(processed)
        output = self.postprocess(predictions)

        if self.request_formatter:
            output = self.request_formatter.format_output(output)

        return output
```

We need to define an interface for these formatters. Their role is strictly to convert between an arbitrary wire format (e.g. JSON) and the format required by preprocess (a list of binary strings, dicts, or vectors).

```python
from abc import ABC, abstractmethod

class RequestFormatter(ABC):
    @abstractmethod
    def parse_input(self, data):
        pass

    @abstractmethod
    def format_output(self, data):
        pass
```

Of course, we'll need an implementation for compatibility with Seldon, KFServing and Cloud AI Platform:

```python
class TFServingFormatter(RequestFormatter):
    ...
```

Dependencies

The refactors outlined here make this a lot easier: #434

Implementation

Here's an example of the TFServingFormatter object. I will follow up with a complete file.

```python
import json
from base64 import b64decode
from itertools import chain

import numpy as np
import torch


class TFServingFormatter(RequestFormatter):
    def parse_input(self, data):
        lengths, batch = self._batch_from_json(data)
        self._lengths = lengths
        return batch

    def format_output(self, data):
        return self._batch_to_json(data, self._lengths)

    def _batch_from_json(self, data_rows):
        """
        Joins the instances of a batch of JSON objects
        """
        rows_per_request = [self._from_json(data_row) for data_row in data_rows]
        lengths = [len(rows) for rows in rows_per_request]
        full_batch = list(chain.from_iterable(rows_per_request))
        return (lengths, full_batch)

    def _from_json(self, data):
        """
        Extracts the data from the Cloud AI Platform object
        """
        rows = data['body']['instances']
        if isinstance(rows[0], dict):
            for row_i, row in enumerate(rows):
                if list(row.keys()) == ['b64']:
                    rows[row_i] = b64decode(row['b64'])
                else:
                    for col, col_value in row.items():
                        if isinstance(col_value, dict) and list(col_value.keys()) == ['b64']:
                            row[col] = b64decode(col_value['b64'])
        return rows

    def _batch_to_json(self, batch, lengths):
        """
        Splits the batched output into mini-batches and returns JSON
        """
        outputs = []
        cursor = 0
        for length in lengths:
            cursor_end = cursor + length
            mini_batch = batch[cursor:cursor_end]
            outputs.append(self._to_json(mini_batch))
            cursor = cursor_end
        return outputs

    def _to_json(self, output):
        """
        Converts the output of the model back into Cloud AI Platform compatible JSON
        """
        if isinstance(output, (np.ndarray, torch.Tensor)):
            output = output.tolist()
        return json.dumps({'predictions': output})
```
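
To make the intended round trip concrete, here is a minimal, hypothetical usage sketch; the sample payloads and prediction values are made up, but they follow the request shape that _from_json above handles:

```python
formatter = TFServingFormatter()

# A TorchServe batch of two requests, each carrying TF Serving-style JSON.
torchserve_batch = [
    {'body': {'instances': [[1.0, 2.0], [3.0, 4.0]]}},  # request 1: two instances
    {'body': {'instances': [[5.0, 6.0]]}},               # request 2: one instance
]

flat_inputs = formatter.parse_input(torchserve_batch)
# flat_inputs == [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]] -- ready for preprocess()

# Pretend the model produced one prediction per flattened instance...
model_outputs = [[0.9], [0.1], [0.5]]

# ...then format_output splits them back into one JSON response per request.
responses = formatter.format_output(model_outputs)
# responses == ['{"predictions": [[0.9], [0.1]]}', '{"predictions": [[0.5]]}']
```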

dhaniram-kshirsagar commented Jun 9, 2020

torch-model-archiver ... --handler image_classifier --request_formatter TFServingFormatter

I like the idea of including the formatter in the .mar. The torch-model-archiver utility supports an --extra-files flag; maybe we can reuse it with some standard module/class naming convention.

@dhaniram-kshirsagar

@htappen Instead of changing the base handler and adding a new method (i.e. get_default_handler) that accepts a formatter, can we add a new package named formatters? Besides the formatters themselves, it would contain a new default base handler class, say BaseHandlerWithFormatter (extending BaseHandler), which overrides the initialize method to set up the formatter and the handle method to apply pre- and post-formatting, like this:

```python
class BaseHandlerWithFormatter(BaseHandler):
    def __init__(self):
        super(BaseHandlerWithFormatter, self).__init__()
        self.request_formatter = None

    def initialize(self, context):
        super(BaseHandlerWithFormatter, self).initialize(context)
        # Initialize the formatter by using the context
        self.request_formatter = TFServingFormatter()

    def handle(self, data, context):
        if self.request_formatter:
            data = self.request_formatter.parse_input(data)

        output = super(BaseHandlerWithFormatter, self).handle(data, context)

        if self.request_formatter:
            output = self.request_formatter.format_output(output)

        return output
```


htappen commented Jun 18, 2020

Thanks for taking a look! Implementing the request formatters in a replacement base class for BaseHandler won't work because then classes derived from BaseHandler (e.g. all the built-ins) won't get the behavior.

However, another, similar approach that could work would be to have the RequestFormatter wrap a handler object instead of the other way around:

```python
from abc import ABC, abstractmethod

class RequestEnvelope(ABC):
    def __init__(self, handler):
        self._handler = handler

    def initialize(self, context):
        """
        For RequestEnvelope to be used like a Torch Handler, it needs to implement this method.
        Simply calls the wrapped handler's initialize.
        """
        self._handler.initialize(context)

    def handle(self, data, context):
        input_list = self.parse_input(data)
        results = self._handler.handle(input_list, context)
        output_list = self.format_output(results)

        return output_list

    @abstractmethod
    def parse_input(self, data):
        pass

    @abstractmethod
    def format_output(self, data):
        pass
```

To use this, someone would have to create a handler.py like

MyHandler = TFRequestEnvelope(ImageClassifier())

Something about that feels a bit fragile, but let me think some more.
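
For concreteness, a hypothetical TFRequestEnvelope along those lines might look like the sketch below; it reuses the instances/predictions handling from the formatter example above and skips b64 decoding for brevity:

```python
import json

# Hypothetical sketch only; not an actual TorchServe class.
class TFRequestEnvelope(RequestEnvelope):
    def parse_input(self, data):
        # Flatten {'body': {'instances': [...]}} requests into one list,
        # remembering how many instances each request contributed.
        per_request = [row['body']['instances'] for row in data]
        self._lengths = [len(instances) for instances in per_request]
        return [instance for instances in per_request for instance in instances]

    def format_output(self, data):
        # Split the flat prediction list back into one JSON response per request.
        responses, cursor = [], 0
        for length in self._lengths:
            responses.append(json.dumps({'predictions': data[cursor:cursor + length]}))
            cursor += length
        return responses
```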

maaquib modified the milestones: v0.2.0, v0.3.0 (Jul 23, 2020)
chauhang linked a pull request (Nov 25, 2020) that will close this issue
@chauhang

@htappen This should be resolved with the merge of PR #749. Please run the tests on the GCAIP side.


harshbafna commented Dec 10, 2020

#749 is now merged to master. Closing.
