
Add neuronx cache registry #442

Merged · 9 commits · Jan 26, 2024

Conversation

@dacorvo (Collaborator) commented Jan 25, 2024

This extends the neuronx caching for decoder models to store the cached neuron configurations under a specific registry folder for each neuronxcc compiler version.

aws-neuron/optimum-neuron-cache/                                                
└── neuronxcc-2.12.54.0+f631c2365
    └── registry
       ├── hf-internal-testing
       │   └── tiny-random-gpt2
       │       └── 8019d93dd8eda6d8da82.json
       └── mistralai
            └── Mistral-7B-Instruct-v0.1
                └── b12f53f65aebf02fdfa9.json
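The layout above can be sketched in code. The following is a minimal, hypothetical illustration of how an entry could be written under that registry structure; the hashed-filename scheme and the helper names (`registry_entry_path`, `store_entry`) are assumptions for illustration, not the actual optimum-neuron implementation.

```python
# Hypothetical sketch of storing a cached neuron configuration under the
# registry layout: <root>/neuronxcc-<version>/registry/<org>/<model>/<hash>.json
# The 20-hex-char hash of the config used as the filename is an assumption.
import hashlib
import json
from pathlib import Path


def registry_entry_path(cache_root: Path, compiler_version: str,
                        model_id: str, config: dict) -> Path:
    # Derive a stable filename from the serialized configuration.
    digest = hashlib.sha256(
        json.dumps(config, sort_keys=True).encode()
    ).hexdigest()[:20]
    return (cache_root / f"neuronxcc-{compiler_version}" / "registry"
            / model_id / f"{digest}.json")


def store_entry(cache_root: Path, compiler_version: str,
                model_id: str, config: dict) -> Path:
    # Create the per-compiler-version registry folders and write the entry.
    path = registry_entry_path(cache_root, compiler_version, model_id, config)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return path
```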

This also adds a get_hub_cached_entries helper and a CLI command to look up cached configurations for a specific model_id.

The output looks like this:

$ optimum-cli neuron cache lookup mistralai/Mistral-7B-Instruct-v0.1
*** 1 entrie(s) found in cache for mistralai/Mistral-7B-Instruct-v0.1 ***


task: text-generation
batch_size: 1
num_cores: 2
auto_cast_type: bf16
sequence_length: 2048
compiler_type: neuronx-cc
compiler_version: 2.12.54.0+f631c2365
checkpoint_id: mistralai/Mistral-7B-Instruct-v0.1
checkpoint_revision: 9ab9e76e2b09f9f29ea2d56aa5bd139e4445c59e
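A lookup over that registry can be sketched as follows. This is a hypothetical, self-contained illustration of what a helper like get_hub_cached_entries could do over the on-disk layout (scan every compiler-version folder for JSON entries under the given model_id); the function name `lookup_cached_entries` and the local-filesystem scan are assumptions, since the real helper queries the Hub cache repository.

```python
# Hypothetical sketch: collect all cached configuration entries for a model_id
# by globbing the registry layout shown earlier in the PR description.
import json
from pathlib import Path


def lookup_cached_entries(cache_root: Path, model_id: str) -> list:
    # Match entries across all neuronxcc compiler-version folders.
    pattern = f"neuronxcc-*/registry/{model_id}/*.json"
    return [json.loads(f.read_text())
            for f in sorted(cache_root.glob(pattern))]
```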

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines +609 to +611
preprocess_logits_for_metrics=(
preprocess_logits_for_metrics if training_args.do_eval and not is_torch_tpu_available() else None
),
Member:
Is it just formatting?
Anyway, those examples are automatically generated from the Transformers library, so this change will be overridden when we synchronize with the next release.

Collaborator (Author):
If I don't apply them, the check-code-quality check fails. I could exclude the examples from it, though...

Member:
No worries, I just wanted to check if the change was only about styling. After reading the rest of the PR I understood it was automatic reformatting.

optimum/neuron/utils/hub_neuronx_cache.py (review thread resolved)
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
@michaelbenayoun (Member) left a comment:

LGTM thanks!

@dacorvo dacorvo merged commit 0f7bf4a into main Jan 26, 2024
8 checks passed
@dacorvo dacorvo deleted the cache_registry branch January 26, 2024 15:43
3 participants