mergekit-mega: compound merging using multiple yaml documents in a single merge config #72
Will use multiple merge configurations as separate documents in a YAML file and run them in the correct order.
This is great! I'll look this over soon and give you some comments - think there are a few minor tweaks that would clean things up a bit. Excited to get this in. You can ignore |
Added a basic check for circular dependencies and, at henky's suggestion, decided on using names for the individual merges and then specifying the working directory/out_path as a CLI arg.
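For readers following along, a circular-dependency check of this kind can be sketched as a depth-first search over the named merges. This is a minimal illustration, not the code from this PR: the function name `find_cycle` and the `deps` mapping (merge name to the list of model names it consumes) are hypothetical.

```python
def find_cycle(deps):
    """Return a list of merge names that form a cycle, or None if there is none.

    `deps` is a hypothetical mapping: merge name -> names of the models it uses.
    Names that are not keys of `deps` are treated as external models.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in deps}
    stack = []  # merges on the current DFS path

    def visit(name):
        state[name] = IN_PROGRESS
        stack.append(name)
        for dep in deps[name]:
            if dep not in deps:
                continue  # external model, not another merge in the file
            if state[dep] == IN_PROGRESS:
                # dep is already on the current path, so we have looped back
                return stack[stack.index(dep):] + [dep]
            if state[dep] == UNSEEN:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[name] = DONE
        return None

    for name in deps:
        if state[name] == UNSEEN:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

Returning the offending path rather than a bare boolean makes it easier to report which merges need untangling.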
R1702 not fixed due to the inherently nested nature of the config format
E1120 not fixed due to arguments handled by click
R0912 not fixed due to being introduced by click
Looks pretty much good to me! I had one nitpick, and it looks like isort wants a change or two. Those aside I'm happy to merge this in.
When #67 makes its way in there's infrastructure to do exactly the kind of graph wrangling you wrote in here, which could make things cleaner - but that's a ways off.
Thanks for the PR!
… reuse for slices and models syntax
Alright, added some docstrings (mostly to please pylint), ran isort, and moved the check for whether a merge is dependent out to a separate function, so it should be good to merge!
Actually just realized that you can't use other models from the merge as a base model! 😅
Looks great! Thanks for putting this together.
This adds a new script, `mergekit-mega`, that takes a YAML file with multiple merge configs specified as individual YAML documents and merges them in the proper order depending on interdependencies.

A couple of quirks that should probably be ironed out before merging:

- Currently there's no special handling for circular dependencies, and I'm unsure how it would break.
- I don't know if the `out_path` in the YAML file is ideal, or if it should instead be a `name`, and the out_dir gets specified on the command line.
- The Hugging Face repo regex probably doesn't cover all the edge cases in its current form.
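To illustrate what "the proper order depending on interdependencies" means, here is a rough sketch using the standard library's `graphlib.TopologicalSorter` (Python 3.9+). It is not the implementation in this PR. It assumes each YAML document has already been parsed into a dict with a `name` key and a `models` list whose entries carry a `model` key; that simplified shape, and the function name `merge_order`, are assumptions made for the example.

```python
from graphlib import TopologicalSorter


def merge_order(documents):
    """Return merge names ordered so that every merge comes after its inputs.

    `documents` is a list of already-parsed config dicts. Only the assumed
    "name" and "models" keys are inspected in this sketch.
    """
    names = {doc["name"] for doc in documents}
    graph = {}
    for doc in documents:
        inputs = {entry["model"] for entry in doc.get("models", [])}
        # Keep only inputs produced by another document in the same file;
        # anything else is treated as an external model.
        graph[doc["name"]] = inputs & names
    # static_order() yields predecessors first and raises graphlib.CycleError
    # if the documents depend on each other in a loop.
    return list(TopologicalSorter(graph).static_order())


documents = [
    {"name": "final", "models": [{"model": "stage-one"}, {"model": "org/model-c"}]},
    {"name": "stage-one", "models": [{"model": "org/model-a"}, {"model": "org/model-b"}]},
]
print(merge_order(documents))  # ['stage-one', 'final']
```

A fuller version would also need to look at any other place a config can reference a model, which this sketch does not attempt.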