Save output model to output_dir #1430

Open · wants to merge 12 commits into main
Conversation

@xiaoyu-work (Contributor)

Describe your changes

Save output model to output_dir

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

olive/engine/engine.py (outdated review thread, resolved)
jambayk previously approved these changes on Nov 20, 2024.
@jambayk dismissed their stale review on Nov 20, 2024 at 01:27:

some additional questions

@@ -416,7 +416,7 @@ def save_output_model(config: Dict, output_model_dir: Union[str, Path]):

     This assumes a single accelerator workflow.
     """
-    run_output_path = Path(config["output_dir"]) / "output_model"
+    run_output_path = Path(config["output_dir"])
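For context, a minimal illustration of where the run output lands before and after this one-line change (the output_dir value is illustrative, not from the repo):

from pathlib import Path

config = {"output_dir": "/tmp/olive_run"}  # illustrative value

# before: the run output was nested one level deeper
old_run_output_path = Path(config["output_dir"]) / "output_model"  # /tmp/olive_run/output_model

# after this change: the run output is written directly into output_dir
new_run_output_path = Path(config["output_dir"])  # /tmp/olive_run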
@jambayk (Contributor) commented on Nov 20, 2024:

Btw, without this output_model nesting, the CLI output would now also have the footprint and other files copied over, even though they mean nothing to a CLI user and are messy and confusing. I think we might need an option to disable saving these files, like you mentioned once before.

@jambayk (Contributor) commented on Nov 20, 2024:

Or make saving the extra files opt-in instead of opt-out; most users only care about the final output model.

@xiaoyu-work (Contributor, Author) commented:

Good call! I don't think users need those files. We should only copy model files here.

for resource_key, resource_path in all_resources.items():
    src_path = Path(resource_path.get_path()).resolve()
    if src_path.is_dir():
        hardlink_copy_dir(src_path, output_model_dir / src_path)
@jambayk (Contributor) commented on Nov 21, 2024:

I don't think this is correct. output_model_dir / src_path equals src_path, since src_path is a fully resolved (absolute) path. You probably wanted src_path.name, like in line 448?
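For reference, a minimal sketch of the pathlib behavior being pointed out and the suggested fix; the paths are illustrative:

from pathlib import Path

output_model_dir = Path("/out")
src_path = Path("/cache/run_1/model")  # already resolved to an absolute path in the PR code

# Joining with an absolute path discards the left operand, so nothing
# actually lands under output_model_dir:
print(output_model_dir / src_path)       # /cache/run_1/model  (== src_path)

# Suggested fix: join only the final path component, as done around line 448,
# i.e. hardlink_copy_dir(src_path, output_model_dir / src_path.name) in the PR:
print(output_model_dir / src_path.name)  # /out/model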

@xiaoyu-work (Contributor, Author) commented:

Yeah true! I forgot to update this!

@jambayk (Contributor) commented:

Please also check how the outputs of a graph capture run with --use_ort_genai and --use_model_builder look. The additional_files attribute might make this copy the additional files into the output directory, even though they refer to files already in the model subdir.

@jambayk (Contributor) commented:

Like I described in the previous comment, I think it will be easier to just make saving the footprint, etc. opt-in in the workflow config. That way, even for the CLI, we can use the final output dir directly and not need to do any of this copying and path updating.

@xiaoyu-work (Contributor, Author) commented:

My intention is to copy the additional files to the output folder, since they are also part of the output model.

@jambayk (Contributor) commented:

I think the additional files get copied twice here: once as part of the model_path resource copy into the model subdir, and then again individually, directly into the output_dir itself.
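A hypothetical illustration of the two destinations being described, assuming an additional file that lives inside the model_path directory (all names are illustrative):

from pathlib import Path

output_dir = Path("/out")
model_path = Path("/cache/run_1/model")          # the model_path resource (a directory)
extra = model_path / "genai_config.json"         # an additional_files entry, illustrative name

# 1) copied once as part of the model_path resource copy into the model subdir
first_copy = output_dir / "model" / extra.name   # /out/model/genai_config.json

# 2) then copied again individually, directly into output_dir
second_copy = output_dir / extra.name            # /out/genai_config.json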

@xiaoyu-work (Contributor, Author) commented:

Are all additional files stored in the model path? I vaguely remember they could be anywhere, depending on the pass that created them. Was this logic updated at some point?

@jambayk (Contributor) commented on Nov 22, 2024:

They are normally stored in model_path. Some passes, like ModelBuilder with metadata, save them in a different folder, but that was only because we weren't sure whether we should copy/hardlink the existing model files. Both the pass carry-forward (https://github.com/microsoft/Olive/blob/main/olive/passes/olive_pass.py#L274) and the cache model save have always saved into model_path:

# we only have additional files for onnx models so saving to "model" is safe

Since the output of a workflow goes through the cache model save, the additional files are already in the model_path resource for onnx models. This is different for composite models, where they are saved in output_dir.

So I think it's less hacky to just save the CLI output directly in output_path and opt out of saving the other footprints, etc. No need for temp directories or copying.
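To make the opt-in idea concrete, a hypothetical sketch, not Olive's actual implementation: the function name, flag, and parameters are illustrative, and shutil stands in for the hardlink-copy helpers used in the PR. The final model goes straight into the output directory, and extra artifacts (footprints, run history) are only written when explicitly requested.

import shutil
from pathlib import Path

def save_cli_output(model_path, output_dir, artifact_paths=(), save_artifacts=False):
    """Hypothetical sketch: copy only the final model by default; extra artifacts are opt-in."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    model_path = Path(model_path)
    dst = output_dir / model_path.name
    if model_path.is_dir():
        shutil.copytree(model_path, dst, dirs_exist_ok=True)  # hardlink_copy_dir in Olive
    else:
        shutil.copy2(model_path, dst)

    if save_artifacts:  # opt-in, as suggested above; footprints etc. are skipped by default
        for artifact in artifact_paths:
            shutil.copy2(artifact, output_dir / Path(artifact).name)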
