Set up tgi environment values with the ones used to build the model #529
Conversation
Force-pushed e84cfcc to 07910bd
note: the associated generated image is available for testing here :)
This looks good to me, but I am a bit worried some configurations might not work.
Could you add integration tests under https://github.com/huggingface/optimum-neuron/tree/main/text-generation-inference/integration-tests ?
I also need to add a GitHub workflow to build the image and run the integration tests (make tgi_docker_test).
Done, both: tgi_implicit_env.py and the workflow.
Actually I removed the workflow: the integration test test_gpt2.py cannot work for the local_neuron variant. Reason: a directory is filled with data here:
then this directory is expected to be shared with the docker container, here:
The problem is that this cannot work if the tests are run within a container in a docker-in-docker (dind) environment: the volume filled in the first container won't be available on the host, and thus won't end up in the second container (hence tgi will launch with an empty dir). So either we remove the local_neuron variant, or we find a way to share the volume between the container running pytest and the one spawned by pytest.
Force-pushed 7aca4a7 to 5bf0f49
I had to deactivate/remove all tests related to aws-neuron/gpt2-neuronx-bs4-seqlen1024 because of the neuronx-cc upgrade to v2.13.xxx
Need this to work around the model static params, so the docker entrypoint can adapt the tgi environment to the specified model. This will make usage of the image easier: default params (e.g. not specifying anything) should be enough for most models.
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
…del within an image built on the fly
Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com>
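The idea described above (deriving TGI environment defaults from the model's static compilation params) could look roughly like this. This is a hedged sketch, not the PR's actual code: the keys under the "neuron" section follow the config optimum-neuron exports at compile time, but the exact env-var mapping and helper names here are assumptions.

```python
import json
import os


def tgi_env_from_model_config(config_path):
    """Read the neuron compilation params stored in the model's config.json
    and return matching TGI environment defaults (sketch; mapping assumed)."""
    with open(config_path) as f:
        config = json.load(f)
    neuron = config.get("neuron", {})
    env = {}
    if "batch_size" in neuron:
        env["MAX_BATCH_SIZE"] = str(neuron["batch_size"])
    if "sequence_length" in neuron:
        env["MAX_TOTAL_TOKENS"] = str(neuron["sequence_length"])
    if "num_cores" in neuron:
        env["HF_NUM_CORES"] = str(neuron["num_cores"])
    return env


def apply_defaults(env):
    # only set values the user did not already provide explicitly,
    # so explicit configuration still wins over the implicit defaults
    for key, value in env.items():
        os.environ.setdefault(key, value)
```

An entrypoint would call these before launching the tgi server, so that running the image with no env vars at all picks up the values the model was compiled with.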
LGTM, thanks. Before merging, could you:
- bump dev version,
- use a single workflow for TGI,
- simplify the implicit env test a bit to just check that you received a response to a single request.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thank you very much for the pull request!
@@ -77,6 +80,9 @@ RUN VERBOSE=1 BUILDDIR=/pyserver/build PROTODIR=/pyserver/proto VERSION=${VERSIO
# Neuron base image (used for deployment)
FROM base AS neuron

ARG VERSION
Do we really need to repeat VERSION multiple times?
Yep, I wasn't even able to fully explain it: without it, the docker build process failed on the GitHub CI, as you can see here:
https://github.com/huggingface/optimum-neuron/actions/runs/8552046964/job/23432354170
My guess is it's due to a buildkit version... not 100% sure though.
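For what it's worth, Docker's documented ARG scoping would explain the repetition: an ARG declared before the first FROM is only visible to FROM lines, and every build stage starts with a fresh ARG scope, so each stage that uses ${VERSION} has to re-declare it. A minimal illustration (stage names here are made up, not the PR's Dockerfile):

```dockerfile
ARG VERSION=0.0.0        # global scope: usable only in FROM lines below

FROM ubuntu:22.04 AS builder
ARG VERSION              # re-declare to bring VERSION into this stage's scope
RUN echo "building ${VERSION}"

FROM ubuntu:22.04 AS neuron
ARG VERSION              # without this line, ${VERSION} expands to empty here
LABEL version="${VERSION}"
```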
Side note: I bumped the version to 0.0.22.dev0. This will temporarily break integration tests (as there are no compatible cached models for the CI yet: gpt2 compiled with neuronx-cc 2.13.66.0+6dfecc895 on 1 or 2 cores)