
Add support for binary data extension protocol and FP16 datatype #3685

Merged
8 commits merged into kserve:master from support-fp16-datatype on Aug 24, 2024

Conversation

sivanantha321
Member

@sivanantha321 sivanantha321 commented May 13, 2024

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3643

https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_binary_data.md
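The binary tensor extension linked above packs the JSON inference header and the raw tensor bytes into a single HTTP body: the header length travels in the `Inference-Header-Content-Length` header, and each binary input declares its byte length in a `binary_data_size` parameter. A minimal sketch of building such a request for an FP16 input, using only the stdlib `struct` module (the tensor name `input-0` and the helper function are hypothetical, not part of this PR's API):

```python
import json
import struct


def build_binary_infer_request(values, shape):
    """Build an HTTP body for the v2 binary data extension: a JSON header
    followed immediately by raw little-endian FP16 tensor bytes."""
    # 'e' is IEEE 754 half precision: 2 bytes per FP16 element.
    raw = struct.pack(f"<{len(values)}e", *values)
    header = {
        "inputs": [
            {
                "name": "input-0",  # hypothetical tensor name
                "shape": shape,
                "datatype": "FP16",
                "parameters": {"binary_data_size": len(raw)},
            }
        ]
    }
    json_part = json.dumps(header).encode("utf-8")
    http_headers = {
        "Content-Type": "application/octet-stream",
        "Inference-Header-Content-Length": str(len(json_part)),
    }
    return http_headers, json_part + raw


# Example: a 1x4 FP16 tensor appended after the JSON header.
headers, body = build_binary_infer_request([6.8, 2.8, 4.8, 1.4], [1, 4])
```

The server splits the body at the advertised header length, so the four FP16 elements contribute exactly 8 trailing bytes here.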

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:


Re-running failed tests

  • /rerun-all - rerun all failed workflows.
  • /rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.


oss-prow-bot bot commented Jun 10, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sivanantha321
Once this PR has been reviewed and has the lgtm label, please assign yuzisun for approval by writing /assign @yuzisun in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sivanantha321 sivanantha321 force-pushed the support-fp16-datatype branch 2 times, most recently from 1035075 to 615e6b3 on June 10, 2024 20:06
@sivanantha321
Member Author

/rerun-all

@sivanantha321 sivanantha321 force-pushed the support-fp16-datatype branch 2 times, most recently from 252e322 to d14808d on June 10, 2024 21:38
@sivanantha321
Member Author

/rerun-all

@sivanantha321 sivanantha321 force-pushed the support-fp16-datatype branch 2 times, most recently from 03bef97 to 92a7b02 on June 12, 2024 08:45
@sivanantha321
Member Author

/rerun-all

1 similar comment
@sivanantha321
Member Author

/rerun-all

@sivanantha321 sivanantha321 marked this pull request as ready for review June 12, 2024 11:42
python/kserve/kserve/model.py (outdated; resolved)
Comment on lines 239 to +241
if isinstance(body, InferRequest):
return body, attributes
elif isinstance(body, InferenceRequest) or (
Member


I remember InferenceRequest was always converted to InferRequest before hitting here

@@ -174,7 +173,7 @@ def create_application(self) -> FastAPI:
r"/v2/models/{model_name}/infer",
v2_endpoints.infer,
methods=["POST"],
response_model=InferenceResponse,
response_model=None,
Member


The downside of this is that we lose the validation when returning back the InferenceResponse

Member Author


The response will be validated here

@@ -221,6 +223,32 @@ def test_sklearn_v2():
res = predict(service_name, "./data/iris_input_v2.json", protocol_version="v2")
assert res["outputs"][0]["data"] == [1, 1]

raw_res = predict(
service_name,
"./data/iris_input_v2_binary.json",
Member


How do we test sending inputs with binary data for REST ?

Member Author


Added in the custom transformer test, and also covered in the Python unit tests.
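On the response side, such a test can split the returned body at the length given by `Inference-Header-Content-Length` and decode the trailing bytes. A minimal sketch, assuming a single FP16 output (the helper name and output layout are illustrative, not the PR's actual test code):

```python
import json
import struct


def parse_binary_infer_response(headers, body):
    """Split a v2 binary-extension response into its JSON header and the
    decoded FP16 values of its first output (sketch; single-output case)."""
    json_len = int(headers["Inference-Header-Content-Length"])
    header = json.loads(body[:json_len])
    out = header["outputs"][0]
    size = out["parameters"]["binary_data_size"]
    raw = body[json_len:json_len + size]
    # 2 bytes per FP16 element; 'e' is IEEE 754 half precision.
    values = list(struct.unpack(f"<{size // 2}e", raw))
    return header, values
```

An assertion on the decoded values then plays the role of the usual `res["outputs"][0]["data"]` check in the JSON-only tests.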

@yuzisun
Member

yuzisun commented Jul 1, 2024

@sivanantha321 Need a rebase after merging the inference client PR

@sivanantha321 sivanantha321 force-pushed the support-fp16-datatype branch 3 times, most recently from 494d70f to d00e06c on July 8, 2024 08:42
@sivanantha321
Member Author

/rerun-all

3 similar comments
@sivanantha321
Member Author

/rerun-all

@sivanantha321
Member Author

/rerun-all

@sivanantha321
Member Author

/rerun-all

@sivanantha321
Member Author

/rerun-all

@sivanantha321
Member Author

/rerun-all

2 similar comments
@sivanantha321
Member Author

/rerun-all

@sivanantha321
Member Author

/rerun-all

@sivanantha321 sivanantha321 force-pushed the support-fp16-datatype branch 2 times, most recently from b353bb6 to 382ba9e on August 6, 2024 13:29
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
@sivanantha321
Member Author

/rerun-all

test/e2e/common/utils.py (outdated; resolved)
test/e2e/common/utils.py (outdated; resolved)
Comment on lines +296 to +297
# Fixme: Gets only the 1st element of the input
# inputs = get_predict_input(request)
Member


Suggested change
# Fixme: Gets only the 1st element of the input
# inputs = get_predict_input(request)

Comment on lines +329 to +330
# Fixme: Gets only the 1st element of the input
# inputs = get_predict_input(request)
Member


Suggested change
# Fixme: Gets only the 1st element of the input
# inputs = get_predict_input(request)

Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Member

@yuzisun yuzisun left a comment


/lgtm
/approve

@yuzisun yuzisun merged commit 69cdca5 into kserve:master Aug 24, 2024
57 checks passed

Successfully merging this pull request may close these issues.

Inference gRPC/Rest client to support FP16
2 participants