Support continuous batching in sequence batch streaming case #3160
Conversation
Left some comments. Please add continuous batching to the test to make sure this part works as well.
examples/stateful/sequence_continuous_batching/stateful_handler.py
frontend/server/src/main/java/org/pytorch/serve/util/messages/RequestInput.java
@Override
protected void pollInferJob() throws InterruptedException {
    // TBD: temporarily hard-code the continuous batch size as 2 * batchSize
    model.pollInferJob(jobs, model.getBatchSize() * 2 - jobs.size(), jobsQueue);
}
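The hard-coded factor means a worker can hold up to twice the configured batch size of jobs at once. A rough Python sketch of the polling arithmetic (the function and parameter names are illustrative, not the TorchServe API):

```python
import queue

def poll_infer_jobs(jobs, jobs_queue, batch_size):
    """Fill `jobs` with up to (2 * batch_size - len(jobs)) pending jobs,
    mirroring the hard-coded continuous batch size in the Java snippet."""
    capacity = 2 * batch_size - len(jobs)
    for _ in range(capacity):
        try:
            # Take jobs without blocking; stop as soon as the queue is empty.
            jobs.append(jobs_queue.get_nowait())
        except queue.Empty:
            break
    return jobs
```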
We should document how the jobs will be interleaved when they land in the handler and batch_size > 1. Is it [Job_1_1, Job_1_2, Job_2_1, Job_2_2] or [Job_1_1, Job_2_1, Job_1_2, Job_2_2]? Or will the order be scrambled, leaving the handler to sort it out?
No, it is not necessary for the backend handler to sort, because the backend handler's preprocessing is able to mark the previous running request as "cancel".
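A minimal sketch of what such preprocessing could look like, assuming requests carry a sequence id and arrive newest-last; the names (sequence_id, status, "cancel") are illustrative, not the actual stateful_handler.py API:

```python
CANCEL = "cancel"  # hypothetical marker, not a TorchServe constant

def preprocess(batch):
    """batch: list of dicts, each with a 'sequence_id' key.
    Marks any earlier request of a sequence as cancelled when a newer
    request of the same sequence is present in the batch."""
    latest = {}  # sequence_id -> index of the newest request seen so far
    for idx, req in enumerate(batch):
        seq = req["sequence_id"]
        if seq in latest:
            # An earlier request of this sequence is still in the batch:
            # mark it cancelled so only the newest one keeps running.
            batch[latest[seq]]["status"] = CANCEL
        latest[seq] = idx
    return batch
```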
@lxning So you mean the preprocessing in the handler will need to sort this out, as there can be multiple requests from multiple sequences in no specific pattern. In the end, the preprocessing in the handler will need to go through the batch and check whether, besides the previous request, another cancel request is in the batch as well, and then cancel/clean it up.
No, it is not necessary to sort. The frontend guarantees the order of the requests within a sequence and passes them to the backend. Currently the use case is synchronized between the client and TorchServe, and the continuous batch size is fixed at 2 for each sequence. This means a cancel command applies to all requests of this sequence at the backend. So this is based on "num_requests".
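The guarantee described above can be illustrated with a small sketch: per-sequence order is preserved even though sequences may interleave, and a cancel command fans out to every request of its sequence. The field names (sequence_id, request_id, cancelled) are illustrative, not the TorchServe wire format:

```python
def per_sequence_order_preserved(batch):
    """Check the invariant the frontend guarantees: within each sequence,
    request ids appear in non-decreasing order (sequences may interleave)."""
    last = {}  # sequence_id -> last request_id seen
    for req in batch:
        seq, rid = req["sequence_id"], req["request_id"]
        if seq in last and rid < last[seq]:
            return False
        last[seq] = rid
    return True

def cancel_sequence(batch, seq):
    """A cancel command applies to all requests of that sequence in the batch."""
    for req in batch:
        if req["sequence_id"] == seq:
            req["cancelled"] = True
    return batch
```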
frontend/server/src/main/java/org/pytorch/serve/wlm/WorkLoadManager.java
test/pytest/test_example_stateful_sequence_continuous_batching_http.py
Please see comments.
LGTM
Description
Please read our CONTRIBUTING.md prior to creating your first pull request.
Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes #(issue)
Type of change
Please delete options that are not relevant.
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Checklist: