Refactor inf2 streamer handler #3035

Merged
merged 29 commits into from
May 1, 2024

@lxning lxning commented Mar 21, 2024

Description

Please read our CONTRIBUTING.md prior to creating your first pull request.

Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes #(issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test on notebook: examples/large_models/inferentia2/llama2/streamer/inf2-llama-2-micro-batching.ipynb
python examples/large_models/utils/test_llm_streaming_response.py -m llama-2-70b --demo-streaming --prompt-text "Today the weather is really nice and I am planning on "
payload={'prompt': 'Today the weather is really nice and I am planning on ', 'max_new_tokens': 64}
, output=
Today the weather is really nice and I am planning on 200km ride. I am going to ride to the north of the city and then to the west. I am going to ride to the city of Krasnogorsk.
I am going to ride to the city of Kras
Tasks are completed
  • Test on notebook: examples/large_models/inferentia2/llama2/continuous_batching/inf2-llama-2-continuous-batching.ipynb
python examples/large_models/utils/test_llm_streaming_response.py -m llama-2-70b -o 50 -t 2 -n 4 --prompt-text "Today the weather is really nice and I am planning on " --prompt-randomize
Tasks are completed
payload={'prompt': 'n j Today the weather is really nice and I am planning on ', 'max_new_tokens': 52}
, output=n j Today the weather is really nice and I am planning on 100% of the time.
I'm not sure if I'm going to be able to make it to the meetup. I'm going to be in the area, but I'm not sure if I'll be able

payload={'prompt': 't d e s k h e h y o z m a f e f x Today the weather is really nice and I am planning on ', 'max_new_tokens': 67}
, output=t d e s k h e h y o z m a f e f x Today the weather is really nice and I am planning on 100% going to the beach. I am going to go to the beach with my friends and we are going to have a lot of fun. I am going to go to the beach with my friends and we are going to have a lot of fun. I am going to go to the beach with my friends and we are

payload={'prompt': 'Today the weather is really nice and I am planning on ', 'max_new_tokens': 50}
, output=Today the weather is really nice and I am planning on 200km ride. I am going to ride to the north of the city and then to the west. I am going to ride to the city of Krasnogorsk.
I am going to ride to the city of Kras

payload={'prompt': 'c r y g g g v Today the weather is really nice and I am planning on ', 'max_new_tokens': 57}
, output=c r y g g g v Today the weather is really nice and I am planning on 100% going to the beach. I am going to take my camera and take some pictures of the beach and the ocean. I am going to take some pictures of the sunset and the sunrise. I am going to take some pictures of the waves and the sand

payload={'prompt': 'z m w n r Today the weather is really nice and I am planning on ', 'max_new_tokens': 55}
, output=z m w n r Today the weather is really nice and I am planning on 100% going to the beach. I am going to go to the beach with my friends and we are going to have a lot of fun. I am going to go to the beach with my friends and we are going to have a lot of fun. I am

payload={'prompt': 's e d t p d w Today the weather is really nice and I am planning on ', 'max_new_tokens': 57}
, output=s e d t p d w Today the weather is really nice and I am planning on 100% going to the beach. I am going to go to the beach with my friends and we are going to have a lot of fun. I am going to go to the beach with my friends and we are going to have a lot of fun. I am going to

payload={'prompt': 'u x t o s o t Today the weather is really nice and I am planning on ', 'max_new_tokens': 57}
, output=u x t o s o t Today the weather is really nice and I am planning on 100% enjoying it. I am going to go to the beach and then to the pool. I am going to have a great time with my friends and I am going to enjoy the sun. I am going to have a great time and I am going to enjoy the

payload={'prompt': 'q p f z v z t u o e f d g Today the weather is really nice and I am planning on ', 'max_new_tokens': 63}
, output=q p f z v z t u o e f d g Today the weather is really nice and I am planning on 1. (go) to the park with my friends. We are going to2. (play) football and then we are going to3. (have) a picnic. I am going to4. (bring) some sandwiches and some fruit. I am also going to
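
The logged payloads suggest that --prompt-randomize prepends a random number of space-separated lowercase letters to the prompt and raises max_new_tokens by the same count (for example, two letters with -o 50 gives 52). The snippet below is a minimal sketch of that behavior, inferred only from the output above and not copied from test_llm_streaming_response.py; build_payload and max_prefix are hypothetical names, and the upper bound of 20 is an assumption.

```python
import random
import string


def build_payload(prompt, max_new_tokens, randomize=False, max_prefix=20, rng=None):
    """Build one request payload, optionally adding a random letter prefix.

    Hypothetical helper: the behavior is inferred from the logged payloads,
    and max_prefix is an assumed upper bound on the number of letters.
    """
    if rng is None:
        rng = random.Random()
    # Number of random letters to prepend (0 when randomization is off).
    n_extra = rng.randint(0, max_prefix) if randomize else 0
    # Each letter is followed by a single space, e.g. "n j ".
    prefix = "".join(rng.choice(string.ascii_lowercase) + " " for _ in range(n_extra))
    # The token budget grows by one per prepended letter, as in the logs.
    return {"prompt": prefix + prompt, "max_new_tokens": max_new_tokens + n_extra}


if __name__ == "__main__":
    base = "Today the weather is really nice and I am planning on "
    for _ in range(3):
        print(build_payload(base, 50, randomize=True))
```

With -t 2 -n 4, the run above produced eight such payloads, which matches two threads sending four requests each.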

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@lxning lxning self-assigned this Apr 16, 2024
@lxning lxning added the enhancement, example, and large-model labels Apr 16, 2024
@lxning lxning added this to the v0.10.1 milestone Apr 16, 2024
@lxning lxning requested a review from mreso April 16, 2024 18:09

@mreso mreso left a comment

LGTM, but let's move the neuronx handlers into the example folder; see comments.

@lxning lxning enabled auto-merge May 1, 2024 18:45

@mreso mreso left a comment

LGTM

@lxning lxning added this pull request to the merge queue May 1, 2024
Merged via the queue into master with commit f74f85c May 1, 2024
10 of 12 checks passed
@mreso mreso mentioned this pull request May 8, 2024
10 tasks
Status: Done
2 participants