Issue instantiating a keras_nlp.models.Backbone from a model preset of Hugging Face handles #1574
Comments
I believe this is because you are attempting to load a [...] I think there are two actionable things here:
@Wauplin @SamanehSaadat cc'ing for thoughts.
I see two places in the code where we could add a check. Either when we download from the Hub (check the [...] Agree on the longer-term goal as well, but I'll defer the topic to @Rocketknight1, who's more knowledgeable on the [...]
@mattdangerw I think a short-term check that raises a sensible error makes sense as the first step. Longer-term, conversion should be possible - once we detect a transformers checkpoint in KerasNLP, as long as KerasNLP already has support for that architecture, we could just have a mapping for config attributes, tokenizer vocab and layer names to convert the checkpoint. In theory, simple. In practice, I'm sure there are lots of painful edge cases to worry about, but I suspect we'd get most of the benefit just from supporting a few of the most popular architectures (e.g. Gemma/Llama/Mistral/Mixtral), and maybe that wouldn't be so bad!
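To make the "mapping for config attributes" idea concrete, here is a rough sketch for a BERT-style config. The key names on both sides are assumptions based on the public `bert-base-uncased` config.json and the KerasNLP BERT backbone arguments, and `HF_TO_KERAS_BERT` and `convert_config` are hypothetical helpers, not existing KerasNLP API. Tokenizer vocab and layer-name mapping are not covered.

```python
# Hypothetical mapping: transformers-style BERT config key -> Keras-style
# backbone argument. Both sides are assumptions for illustration only.
HF_TO_KERAS_BERT = {
    "vocab_size": "vocabulary_size",
    "num_hidden_layers": "num_layers",
    "num_attention_heads": "num_heads",
    "hidden_size": "hidden_dim",
    "intermediate_size": "intermediate_dim",
    "hidden_dropout_prob": "dropout",
    "max_position_embeddings": "max_sequence_length",
    "type_vocab_size": "num_segments",
}


def convert_config(hf_config: dict, mapping: dict) -> dict:
    """Translate a transformers-style config dict into backbone kwargs.

    Keys that are not listed in `mapping` are dropped; keys listed in
    `mapping` but absent from `hf_config` raise a KeyError so that a
    partial conversion never goes unnoticed.
    """
    missing = [key for key in mapping if key not in hf_config]
    if missing:
        raise KeyError(f"Config is missing expected keys: {missing}")
    return {keras_key: hf_config[hf_key] for hf_key, keras_key in mapping.items()}


# Values below mirror the kind of fields found in a BERT config.json.
hf_config = {
    "model_type": "bert",
    "vocab_size": 30522,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "hidden_size": 768,
    "intermediate_size": 3072,
    "hidden_dropout_prob": 0.1,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
}
backbone_kwargs = convert_config(hf_config, HF_TO_KERAS_BERT)
```

The resulting dict could in principle be passed as keyword arguments to the matching backbone constructor, but that step depends on the target class accepting exactly these names, which would need to be verified per architecture.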
@Wauplin I agree that we should add a check when loading the model to make sure both local and remote presets are covered. I believe the check should be done at the beginning of our [...]
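As a rough illustration of the short-term check discussed above, the sketch below inspects an already-parsed config.json and raises a readable error when it looks like a transformers checkpoint. The function name `check_preset_config` and both key sets are assumptions made for this example; the actual check in KerasNLP may look at different fields and live in a different place.

```python
# Assumption: keys that commonly appear in a transformers-style config.json.
TRANSFORMERS_KEYS = {"model_type", "architectures", "transformers_version"}
# Assumption: keys a Keras-serialized preset config is expected to carry.
KERAS_KEYS = {"class_name", "config"}


def check_preset_config(config: dict, preset: str) -> None:
    """Raise a ValueError if `config` does not look like a Keras preset.

    Works on the parsed config only, so the same check applies whether the
    file came from a local directory or was downloaded from the Hub.
    """
    if KERAS_KEYS.issubset(config):
        return  # Looks like a Keras preset; nothing to report.
    if TRANSFORMERS_KEYS.intersection(config):
        raise ValueError(
            f"Preset '{preset}' looks like a transformers checkpoint, "
            "not a Keras preset, so it cannot be loaded directly."
        )
    raise ValueError(
        f"Preset '{preset}' has a config.json without the expected keys "
        f"{sorted(KERAS_KEYS)}."
    )
```

For example, a config containing `"model_type": "bert"` and `"architectures": ["BertForMaskedLM"]` would hit the first error branch instead of failing later with a missing-key error.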
The long-term approach has been addressed in #1662 so I'll close this issue.
Describe the bug
I am unable to instantiate a keras_nlp.models.Backbone from a model preset of Hugging Face handles, and get the following error:
To Reproduce
https://colab.research.google.com/drive/1sL3dEM8ZOLCbQ5RM1P6usrrzfqk5aTbQ?usp=sharing
Expected behavior
This line should probably be changed?
https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/utils/preset_utils.py#L411
E.g., config.json, from HF: https://huggingface.co/google-bert/bert-base-uncased/resolve/main/config.json
Would you like to help us fix it?