Save output model to output_dir #1430

Open · wants to merge 12 commits into main
Conversation

@xiaoyu-work (Contributor)

Describe your changes

Save output model to output_dir

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

olive/engine/engine.py (outdated review thread, resolved)
jambayk previously approved these changes on Nov 20, 2024.
@jambayk dismissed their stale review on Nov 20, 2024 at 01:27:

some additional questions

@@ -416,7 +416,7 @@ def save_output_model(config: Dict, output_model_dir: Union[str, Path]):

     This assumes a single accelerator workflow.
     """
-    run_output_path = Path(config["output_dir"]) / "output_model"
+    run_output_path = Path(config["output_dir"])
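For context, a minimal illustration of where the run output lands before and after this one-line change (the output_dir value is illustrative, not from the repo):

from pathlib import Path

config = {"output_dir": "/tmp/olive_run"}  # illustrative value

# before: the run output was nested one level deeper
old_run_output_path = Path(config["output_dir"]) / "output_model"  # /tmp/olive_run/output_model

# after this change: the run output is written directly into output_dir
new_run_output_path = Path(config["output_dir"])  # /tmp/olive_run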
@jambayk (Contributor) commented on Nov 20, 2024:

Btw, without this output_model nesting, the CLI output would now also have the footprint and other files copied over, even though they mean nothing to a CLI user and are messy and confusing. I think we might need an option to disable saving these files, like you mentioned once before.

@jambayk (Contributor) commented on Nov 20, 2024:

Or make saving the extra files opt-in instead of opt-out; most users only care about the final output model.

@xiaoyu-work (Contributor, Author) commented:

Good call! I don't think users need those files. We should only copy model files here.

for resource_key, resource_path in all_resources.items():
    src_path = Path(resource_path.get_path()).resolve()
    if src_path.is_dir():
        hardlink_copy_dir(src_path, output_model_dir / src_path)
@jambayk (Contributor) commented on Nov 21, 2024:

I don't think this is correct. output_model_dir / src_path equals src_path, since src_path is a fully resolved (absolute) path. You probably wanted src_path.name, like in line 448?
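For reference, a minimal sketch of the pathlib behavior being pointed out and the suggested fix; the paths are illustrative:

from pathlib import Path

output_model_dir = Path("/out")
src_path = Path("/cache/run_1/model")  # already resolved to an absolute path in the PR code

# Joining with an absolute path discards the left operand, so nothing
# actually lands under output_model_dir:
print(output_model_dir / src_path)       # /cache/run_1/model  (== src_path)

# Suggested fix: join only the final path component, as done around line 448,
# i.e. hardlink_copy_dir(src_path, output_model_dir / src_path.name) in the PR:
print(output_model_dir / src_path.name)  # /out/model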

@xiaoyu-work (Contributor, Author) commented:

Yeah true! I forgot to update this!

@jambayk (Contributor) commented:

Please also check how the outputs of a graph capture run with --use_ort_genai and --use_model_builder look. The additional_files attribute might make this copy the additional files into the output directory, even though they refer to files already in the model subdir.

@jambayk (Contributor) commented:

Like I described in the previous comment, I think it will be easier to just make saving the footprint, etc. opt-in in the workflow config. That way, even for the CLI, we can use the final output dir directly and not need to do any of this copying and path updating.

@xiaoyu-work (Contributor, Author) commented:

My intention is to copy the additional files to the output folder, since they are also part of the output model.

@jambayk (Contributor) commented:

I think the additional files get copied twice here: once as part of the model_path resource copy into the model subdir, and then again individually, directly into the output_dir itself.
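A hypothetical illustration of the two destinations being described, assuming an additional file that lives inside the model_path directory (all names are illustrative):

from pathlib import Path

output_dir = Path("/out")
model_path = Path("/cache/run_1/model")          # the model_path resource (a directory)
extra = model_path / "genai_config.json"         # an additional_files entry, illustrative name

# 1) copied once as part of the model_path resource copy into the model subdir
first_copy = output_dir / "model" / extra.name   # /out/model/genai_config.json

# 2) then copied again individually, directly into output_dir
second_copy = output_dir / extra.name            # /out/genai_config.json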

@xiaoyu-work (Contributor, Author) commented:

Are all additional files stored in the model path? I vaguely remember they could be anywhere, depending on the pass that created them. Was this logic updated at some point?

@jambayk (Contributor) commented on Nov 22, 2024:

They are normally stored in model_path. Some passes, like ModelBuilder with metadata, save them in a different folder, but that was only because we weren't sure whether we should copy/hardlink the existing model files. Both the pass carry-forward (https://github.com/microsoft/Olive/blob/main/olive/passes/olive_pass.py#L274) and the cache model save have always saved into model_path:

# we only have additional files for onnx models so saving to "model" is safe

Since the output of a workflow goes through the cache model save, the additional files are already in the model_path resource for onnx models. This is different for composite models, where they are saved in output_dir.

So I think it's less hacky to just save the CLI output directly in output_path and opt out of saving the other footprints, etc. No need for temp directories or copying.
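To make the opt-in idea concrete, a hypothetical sketch, not Olive's actual implementation: the function name, flag, and parameters are illustrative, and shutil stands in for the hardlink-copy helpers used in the PR. The final model goes straight into the output directory, and extra artifacts (footprints, run history) are only written when explicitly requested.

import shutil
from pathlib import Path

def save_cli_output(model_path, output_dir, artifact_paths=(), save_artifacts=False):
    """Hypothetical sketch: copy only the final model by default; extra artifacts are opt-in."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    model_path = Path(model_path)
    dst = output_dir / model_path.name
    if model_path.is_dir():
        shutil.copytree(model_path, dst, dirs_exist_ok=True)  # hardlink_copy_dir in Olive
    else:
        shutil.copy2(model_path, dst)

    if save_artifacts:  # opt-in, as suggested above; footprints etc. are skipped by default
        for artifact in artifact_paths:
            shutil.copy2(artifact, output_dir / Path(artifact).name)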
