
Support continuous batching in sequence batch streaming case #3160

Merged: 34 commits merged into master, Jun 3, 2024

Conversation

@lxning (Collaborator) commented May 24, 2024

Description

This PR adds support for continuous batching in the sequence batch streaming case.

Type of change

  • New feature (non-breaking change which adds functionality)

Feature/Issue validation/testing

Unit test run to verify the change; it can be reproduced with the command shown below.

 pytest test_example_stateful_sequence_continuous_batching_http.py
============================== test session starts ===============================
platform linux -- Python 3.10.14, pytest-7.3.1, pluggy-1.4.0
rootdir: /home/ubuntu/serve
plugins: cov-4.1.0, mock-3.12.0
collected 3 items

test_example_stateful_sequence_continuous_batching_http.py ...             [100%]

=============================== 3 passed in 19.81s ===============================

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@lxning lxning added the enhancement New feature or request label May 24, 2024
@lxning lxning added this to the v0.12.0 milestone May 24, 2024
@lxning lxning requested a review from mreso May 24, 2024 20:44
@lxning lxning self-assigned this May 24, 2024
@lxning lxning changed the title from "[WIP]Support continuous batching in sequence batch streaming case" to "Support continuous batching in sequence batch streaming case" May 29, 2024
@mreso (Collaborator) left a comment:

Left some comments. Please add continuous batching to the test to make sure this part works as well.

@Override
protected void pollInferJob() throws InterruptedException {
// TBD: temporarily hard-code the continuous batch size as 2 * batchSize
model.pollInferJob(jobs, model.getBatchSize() * 2 - jobs.size(), jobsQueue);
@mreso (Collaborator) commented:

We should document how the jobs will be interleaved when they land in the handler and batch_size > 1. Is it [Job_1_1, Job_1_2, Job_2_1, Job_2_2] or [Job_1_1, Job_2_1, Job_1_2, Job_2_2]? Or will it be scrambled, so that the handler needs to sort this out?

@lxning (Collaborator, Author) replied:

No, it is not necessary for the backend handler to sort, because the backend handler's preprocessing is able to mark the previously running request as "cancel".

@mreso (Collaborator) replied:

@lxning So you mean the preprocessing in the handler will need to sort this out, as there can be multiple requests from multiple sequences in no specific pattern. In the end, the preprocessing in the handler will need to go through the batch and check whether, besides the previous request, another cancel request is in the batch as well, and then cancel/clean it up.

@lxning (Collaborator, Author) commented Jun 2, 2024:

No, it is not necessary to sort. The frontend guarantees the order of the requests within a sequence and passes them to the backend. Currently the use case is synchronized between the client and TorchServe, and the continuous batch size is fixed at 2 for each sequence. This means a cancel command applies to all requests of that sequence at the backend, so the logic here is based on "num_requests".
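
To make the discussion above concrete, here is a minimal, self-contained Python sketch of how a backend handler's preprocessing could honor a cancel command without sorting the batch. This is not the handler code from this PR; the request keys sequence_id, command, and data are hypothetical placeholders, and the sketch only assumes what the comments above state: the frontend preserves per-sequence request order, and a cancel applies to every request of that sequence.

# Hedged sketch only: the dict keys are hypothetical, not TorchServe's real request format.
from typing import Dict, List

def preprocess(batch: List[Dict]) -> List[Dict]:
    """Drop every request belonging to a sequence that sent a cancel command in this batch."""
    cancelled = {req["sequence_id"] for req in batch if req.get("command") == "cancel"}
    # No sorting is needed: the frontend already preserves order within a sequence,
    # and a cancel command applies to all of that sequence's in-flight requests.
    return [req for req in batch if req["sequence_id"] not in cancelled]

if __name__ == "__main__":
    # With the continuous batch size fixed at 2 per sequence, a batch can interleave
    # requests from two sequences, but order within each sequence is preserved.
    batch = [
        {"sequence_id": "seq-1", "command": "infer", "data": "a"},
        {"sequence_id": "seq-2", "command": "cancel"},
        {"sequence_id": "seq-1", "command": "infer", "data": "b"},
        {"sequence_id": "seq-2", "command": "infer", "data": "c"},
    ]
    print(preprocess(batch))  # only the seq-1 requests remain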

@mreso (Collaborator) left a comment:

Please see comments.

@mreso (Collaborator) left a comment:

LGTM

@mreso mreso added this pull request to the merge queue Jun 3, 2024
@mreso mreso removed this pull request from the merge queue due to a manual request Jun 3, 2024
@mreso mreso enabled auto-merge June 3, 2024 18:13
@mreso mreso added this pull request to the merge queue Jun 3, 2024
Merged via the queue into master with commit 0c820c7 Jun 3, 2024
12 checks passed
Labels: enhancement (New feature or request)
Project status: Done
2 participants