make it easier to adjust dropout when loading gemma models #1620

Open
josharian opened this issue May 5, 2024 · 1 comment

Labels: Gemma (Gemma model specific issues), type:feature (New feature or request)

@josharian

Is your feature request related to a problem? Please describe.

I'm fine tuning gemma models. I'd like to be able to:

  • add dropout to models loaded with from_preset
  • remove dropout from saved models when loading them (not before saving; otherwise I can't use the same checkpoint both to continue training and to run inference)

There is no way to do either of these without reaching into the model and fiddling with its layers. And going from no dropout to some dropout (or vice versa) is particularly challenging, because the relevant layers need to be created or destroyed.
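
For concreteness, the fiddling looks something like this sketch (it reaches through `_flatten_layers`, a private Keras traversal helper, and can only re-rate Dropout layers the config already built):

```python
import keras
import keras_nlp

# The current workaround: reach into internals and re-rate layers.
model = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

# _flatten_layers (private) yields every nested layer. Overwriting
# .rate only affects Dropout layers that were actually built; it
# cannot add dropout where the config created none.
for layer in model.backbone._flatten_layers():
    if isinstance(layer, keras.layers.Dropout):
        layer.rate = 0.1
```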

My current "easy" workaround is to manually edit the config on disk. (For from_preset, this means messing with the kaggle cache. For saved models, this means unzipping and then rezipping a .keras file.) This works but is, ummm...not something I'm proud of.

Describe the solution you'd like

One option would be an API that lets me override config values, such as dropout, at model load time (be that from_preset or keras.saving.load_model).

Another would be a set_dropout API that recursively sets dropout values and adds or removes dropout layers as appropriate.
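
For illustration, a rough shape for that API plus the workflow it would enable. `set_dropout` is hypothetical; this stub only re-rates existing Dropout layers (rate 0.0 makes them no-ops) rather than creating or destroying them, which is the part that would need real library support:

```python
import keras
import keras_nlp

def set_dropout(model, rate):
    # Hypothetical API sketch, not part of keras_nlp. A real version
    # would also create/remove Dropout layers; this stub only
    # re-rates the ones that already exist.
    for layer in model._flatten_layers():
        if isinstance(layer, keras.layers.Dropout):
            layer.rate = rate

# The workflow this enables: one checkpoint for both phases.
model = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
set_dropout(model, 0.1)                 # dropout on, fine-tune
model.save("finetuned.keras")

reloaded = keras.saving.load_model("finetuned.keras")
set_dropout(reloaded, 0.0)              # same checkpoint, now infer
```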

Additional context

This issue is meant more in the spirit of an experience report than a feature request. I suspect that adjusting dropout is (or should be?) a common desire, so it may be worth considering across all models, not just Gemma, and not just for my use case.

@mattdangerw
Member

This should be doable just by allowing an extra argument in the from_preset constructor to set the dropout. I will take a look.
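
If that lands, usage would presumably look something like this (the `dropout` kwarg here is the proposal, not a shipped argument):

```python
import keras_nlp

# Proposed: override the preset's dropout at construction time.
model = keras_nlp.models.GemmaCausalLM.from_preset(
    "gemma_2b_en",
    dropout=0.1,
)
```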
