Extend TGI integration tests #561
Conversation
This improves log readability.
LGTM, just left some minor questions to better understand. Thanks for enhancing the tests!
"""Expose a pre-trained neuron model

The fixture first makes sure the following model artifacts are present on the hub:
- exported neuron model under optimum/neuron-testing-<version>-<name>,
Why do you want to both cache it under the optimum/neuron-testing-cache repo and store it in a repo optimum/neuron-testing-<version>-<name>, instead of just the cache repo?
Because I want to test both fetching a neuron model from the hub and exporting it "on-the-fly" from the cached artifacts.
For each exposed model, the local directory is maintained for the duration of the
test session and cleaned up afterwards.
The hub model artifacts are never cleaned up and persist across sessions.
They must be cleaned up manually when the optimum-neuron version changes.
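The session-lifetime behaviour described in the docstring could be sketched with a plain session-scoped pytest fixture. This is a minimal sketch under assumptions: the fixture name, the directory prefix, and the use of tempfile are illustrative, not the actual test suite's implementation.

```python
import shutil
import tempfile

import pytest


@pytest.fixture(scope="session")
def neuron_model_path():
    """Local directory kept alive for the whole test session, removed afterwards.

    Sketch only: the real fixture would also place the exported neuron model
    artifacts inside this directory before yielding it to the tests.
    """
    local_dir = tempfile.mkdtemp(prefix="neuron-model-")
    yield local_dir
    # Teardown runs once, after the last test of the session.
    shutil.rmtree(local_dir, ignore_errors=True)
```

Note that only the local directory is removed at teardown; the artifacts pushed to the hub persist across sessions, which is why they need manual cleanup when the optimum-neuron version changes.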
btw do we also need to clean the cache in the official neuron cache repo?
The official cache is not affected by these files. It is, however, sometimes affected when running tests locally: for instance, the registry is updated for dev versions.
I usually remove the registry files afterwards.
We could also remove older compiler trees.
They must be cleaned up manually when the optimum-neuron version changes.
"""
config_name = request.param
Is it "model_name" / "model_arch"?
It is a free-form string used as the key in the model_configurations dictionary (here gpt2, llama).
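The answer above could be illustrated with a hedged sketch of such a parametrized session fixture. Only the gpt2/llama keys come from the discussion; the fixture name and the dictionary values are assumptions for illustration.

```python
import pytest

# Free-form config names (the dictionary keys) select a model configuration.
# The values shown here are illustrative placeholders, not the suite's real ones.
MODEL_CONFIGURATIONS = {
    "gpt2": {"model_id": "gpt2"},
    "llama": {"model_id": "llama"},
}


@pytest.fixture(scope="session", params=MODEL_CONFIGURATIONS.keys())
def neuron_model_config(request):
    # request.param is the free-form key ("gpt2" or "llama"),
    # not a model_name or model_arch field.
    config_name = request.param
    yield MODEL_CONFIGURATIONS[config_name]
```

Each test that uses the fixture then runs once per key, which is how the test matrix extends from llama alone to both configurations.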
What does this PR do?
This extends the TGI integration tests by running all tests not only on the llama model configuration but also on gpt2.