
Extend TGI integration tests #561

Merged
merged 3 commits into main from extend_docker_tests on Apr 12, 2024

Conversation

@dacorvo (Collaborator) commented Apr 10, 2024

What does this PR do?

This extends the TGI integration tests by running all tests not only on llama, but also on gpt2 model configurations.
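A minimal sketch of what such a parametrized session fixture could look like with pytest. The configuration names match the PR discussion (gpt2, llama), but the dictionary contents and fixture name here are illustrative assumptions, not the actual values from the optimum-neuron test suite:

```python
import pytest

# Hypothetical model configurations: keys are free-form names, values hold
# the export parameters each TGI integration test should run with.
MODEL_CONFIGURATIONS = {
    "gpt2": {"model_id": "gpt2", "batch_size": 4, "sequence_length": 1024},
    "llama": {"model_id": "llama-2-7b", "batch_size": 4, "sequence_length": 2048},
}

@pytest.fixture(scope="session", params=MODEL_CONFIGURATIONS.keys())
def neuron_model_config(request):
    """Expose each model configuration in turn to the TGI integration tests."""
    config_name = request.param
    config = {"name": config_name, **MODEL_CONFIGURATIONS[config_name]}
    yield config
```

With `params=` on the fixture, every test that depends on `neuron_model_config` runs once per configuration, which is how a single test body covers both gpt2 and llama.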

@dacorvo force-pushed the extend_docker_tests branch 2 times, most recently from 60f7870 to 70cb0d0 on April 10, 2024 15:49
This improves log readability.
@dacorvo force-pushed the extend_docker_tests branch 2 times, most recently from 44986bf to 4f24388 on April 11, 2024 07:27
@dacorvo marked this pull request as ready for review on April 11, 2024 11:48
@JingyaHuang (Collaborator) left a comment

LGTM, just left some minor questions to better understand. Thanks for enhancing the tests!

"""Expose a pre-trained neuron model

The fixture first makes sure the following model artifacts are present on the hub:
- exported neuron model under optimum/neuron-testing-<version>-<name>,
Collaborator

Why do you want to both cache it under the optimum/neuron-testing-cache repo and store it in a repo optimum/neuron-testing-<version>-<name>, instead of using just the cache repo?

Collaborator Author

Because I want to test both fetching a neuron model from the hub and exporting it "on-the-fly" from the cached artifacts.
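The two code paths described above can be sketched as follows. This is an illustrative assumption of the test structure, not the actual fixture code; `resolve_model` and the `source` values are hypothetical names:

```python
# Hypothetical sketch of the two serving paths under test: fetching a
# pre-exported neuron model from the hub, versus exporting the model
# on-the-fly from cached compilation artifacts.

def resolve_model(source: str, hub_model_id: str, export_locally):
    """Return the model to serve for the requested source.

    source == "hub": fetch the pre-exported model from a hub repo such as
    optimum/neuron-testing-<version>-<name>.
    source == "cache": re-export locally, reusing compiled artifacts from
    the neuron cache repo.
    """
    if source == "hub":
        return hub_model_id
    elif source == "cache":
        # The export is simulated by a callable here; the real test would
        # run the optimum-neuron export with the cache enabled.
        return export_locally()
    raise ValueError(f"unknown model source: {source}")
```

Parametrizing the fixture over both sources would exercise each model configuration twice, once per path.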

For each exposed model, the local directory is maintained for the duration of the
test session and cleaned up afterwards.
The hub model artifacts are never cleaned up and persist across sessions.
They must be cleaned up manually when the optimum-neuron version changes.
Collaborator

btw do we also need to clean the cache in the official neuron cache repo?

Collaborator Author

The official cache is not affected by these files. It is, however, sometimes affected when running tests locally: the registry is updated for dev versions, for instance.
I usually remove the registry files afterwards.
We could also remove older compiler trees.

They must be cleaned up manually when the optimum-neuron version changes.

"""
config_name = request.param
Collaborator

Should it be "model_name" / "model_arch"?

Collaborator Author

It is a free-form string used as the key in the model_configurations dictionary (here, gpt2 and llama).

@dacorvo merged commit 9232f67 into main on Apr 12, 2024
1 check passed
@dacorvo deleted the extend_docker_tests branch on April 12, 2024 05:47