
Extend TGI integration tests #561

Merged
merged 3 commits into main from extend_docker_tests on Apr 12, 2024

Conversation

@dacorvo (Collaborator) commented Apr 10, 2024

What does this PR do?

This extends the TGI integration tests by running all tests not only on llama, but also on gpt2 model configurations.
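A minimal sketch of what such a parametrized session fixture could look like with pytest. The configuration names match the PR discussion (gpt2, llama), but the dictionary contents and fixture name here are illustrative assumptions, not the actual values from the optimum-neuron test suite:

```python
import pytest

# Hypothetical model configurations: keys are free-form names, values hold
# the export parameters each TGI integration test should run with.
MODEL_CONFIGURATIONS = {
    "gpt2": {"model_id": "gpt2", "batch_size": 4, "sequence_length": 1024},
    "llama": {"model_id": "llama-2-7b", "batch_size": 4, "sequence_length": 2048},
}

@pytest.fixture(scope="session", params=MODEL_CONFIGURATIONS.keys())
def neuron_model_config(request):
    """Expose each model configuration in turn to the TGI integration tests."""
    config_name = request.param
    config = {"name": config_name, **MODEL_CONFIGURATIONS[config_name]}
    yield config
```

With `params=` on the fixture, every test that depends on `neuron_model_config` runs once per configuration, which is how a single test body covers both gpt2 and llama.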

@dacorvo force-pushed the extend_docker_tests branch 2 times, most recently from 60f7870 to 70cb0d0 on April 10, 2024 15:49
This improves log readability.
@dacorvo force-pushed the extend_docker_tests branch 2 times, most recently from 44986bf to 4f24388 on April 11, 2024 07:27
@dacorvo marked this pull request as ready for review on April 11, 2024 11:48
@JingyaHuang (Collaborator) left a comment

LGTM, just left some minor questions to better understand. Thanks for enhancing the tests!

"""Expose a pre-trained neuron model

The fixture first makes sure the following model artifacts are present on the hub:
- exported neuron model under optimum/neuron-testing-<version>-<name>,
Collaborator

Why do you want to both cache it under the optimum/neuron-testing-cache repo and store it in a repo optimum/neuron-testing-<version>-<name>, instead of using just the cache repo?

Collaborator Author

Because I want to test both fetching a neuron model from the hub and exporting it "on-the-fly" from the cached artifacts.
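The two code paths described above can be sketched as follows. This is an illustrative assumption of the test structure, not the actual fixture code; `resolve_model` and the `source` values are hypothetical names:

```python
# Hypothetical sketch of the two serving paths under test: fetching a
# pre-exported neuron model from the hub, versus exporting the model
# on-the-fly from cached compilation artifacts.

def resolve_model(source: str, hub_model_id: str, export_locally):
    """Return the model to serve for the requested source.

    source == "hub": fetch the pre-exported model from a hub repo such as
    optimum/neuron-testing-<version>-<name>.
    source == "cache": re-export locally, reusing compiled artifacts from
    the neuron cache repo.
    """
    if source == "hub":
        return hub_model_id
    elif source == "cache":
        # The export is simulated by a callable here; the real test would
        # run the optimum-neuron export with the cache enabled.
        return export_locally()
    raise ValueError(f"unknown model source: {source}")
```

Parametrizing the fixture over both sources would exercise each model configuration twice, once per path.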

For each exposed model, the local directory is maintained for the duration of the
test session and cleaned up afterwards.
The hub model artifacts are never cleaned up and persist across sessions.
They must be cleaned up manually when the optimum-neuron version changes.
Collaborator

btw do we also need to clean the cache in the official neuron cache repo?

Collaborator Author

The official cache is not affected by these files. It is, however, sometimes affected when running tests locally: the registry is updated for dev versions, for instance.
I usually remove the registry files afterwards.
We could also remove older compiler trees.

They must be cleaned up manually when the optimum-neuron version changes.

"""
config_name = request.param
Collaborator

Should it be "model_name" / "model_arch"?

Collaborator Author

It is a free-form string used as the key in the model_configurations dictionary (here, gpt2 and llama).

@dacorvo merged commit 9232f67 into main on Apr 12, 2024
1 check passed
@dacorvo deleted the extend_docker_tests branch on April 12, 2024 05:47