
Improve EncoderDecoderModel docs #16135

Open
patrickvonplaten opened this issue Mar 14, 2022 · 19 comments · May be fixed by #34323

@patrickvonplaten
Contributor

First good issue

There have been quite a few issues/questions about how to use the Encoder-Decoder model, e.g. #4483 and #15479. The main reason for this is that the model docs are quite outdated, and we could use a nice how-to guide.

So I think we have two action items here:

  1. Improve https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models a.k.a.: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx

We should mention here:
a) How to create a model? We should show how to use from_encoder_decoder_pretrained(...) and then how to save the model.
b) How to fine-tune this model? We should mention that this model can then be fine-tuned just like any other encoder-decoder model (Bart, T5, ...).
c) Put a big warning that the config values have to be correctly set, and show how to set them; e.g. read #15479.

This should be EncoderDecoderModel-specific text and be very concise and short; something along the lines of the sketch below could work.
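
A minimal sketch, covering (a), (b), and (c) in one go; the BERT checkpoints and the toy summarization strings are placeholders, not prescribed choices:

```python
from transformers import BertTokenizer, EncoderDecoderModel

# (a) Warm-start a seq2seq model from two pretrained checkpoints;
# "bert-base-uncased" on both sides is just one possible pairing.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# (c) These config values are NOT set automatically; training and
# generation error out or misbehave without them.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# (a) Save and reload like any other model.
model.save_pretrained("bert2bert")
model = EncoderDecoderModel.from_pretrained("bert2bert")

# (b) Fine-tune exactly like any other encoder-decoder model:
# passing `labels` returns the language-modeling loss.
inputs = tokenizer("a long source document to summarize", return_tensors="pt")
labels = tokenizer("a short summary", return_tensors="pt").input_ids
loss = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=labels,
).loss
loss.backward()
```

The only EncoderDecoderModel-specific step is setting the token-id config values by hand; everything after that is the standard seq2seq training loop.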

In a second step, we should then write a how-to guide that includes many more details.

More than happy to help someone tackle this first good issue!

@patrickvonplaten changed the title from "EncoderDecoderModel docs" to "Improve EncoderDecoderModel docs" on Mar 14, 2022
@silvererudite
Contributor

Hi! I would love to contribute to this.

@patrickvonplaten
Contributor Author

Awesome! Would you like to open a PR and give it a try? :-) I think it would be great if we could put some example code on how to create an EncoderDecoderModel in this model doc: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx which will then be displayed here: https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models :-)

Let me know if you have any questions! Happy to help :-)

@silvererudite
Contributor

Yes, definitely! I'll open a PR shortly and ask for help when I'm stuck. Thanks a lot!

@Threepointone4
Contributor

Hi @patrickvonplaten, I would love to contribute to this.

@Threepointone4
Contributor

@patrickvonplaten , I have created the fork and added some docs.

So I think we have two action items here:

  1. Improve https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models a.k.a.: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx

We should mention here:
a) How to create a model? We should show how to use from_encoder_decoder_pretrained(...) and then how to save the model.
b) How to fine-tune this model? We should mention that this model can then be fine-tuned just like any other encoder-decoder model (Bart, T5, ...).

I have added some documentation; let me know what you think about it.

c) Put a big warning that the config values have to be correctly set, and show how to set them; e.g. read #15479.

I didn't get a chance to go through this; I will try to cover it this week.
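
From a first look, I think it boils down to something like this rough sketch (the BERT checkpoints are an assumption on my side; #15479 hit this at generation time):

```python
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Without these, generate() cannot know which token starts decoding,
# which one pads a batch, and which one ends a sequence.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

inputs = tokenizer("some input text", return_tensors="pt")
outputs = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

(Before fine-tuning, the generated text is of course nonsense; the point is only that generation runs once the config is set.)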

In a second step, we should then write a how-to guide that includes many more details.

I have added a Colab notebook with a detailed explanation of the encoder-decoder model and how to train it. Does that help with this?

@patrickvonplaten
Contributor Author

Hey @Threepointone4, that's great!

Could you maybe open a PR for:

We should mention here:
a) How to create a model? We should show how to use from_encoder_decoder_pretrained(...) and then how to save the model.
b) How to fine-tune this model? We should mention that this model can then be fine-tuned just like any other encoder-decoder model (Bart, T5, ...).

? :-)

@Threepointone4
Contributor

@patrickvonplaten I have created the PR and made the changes based on my understanding. Please let me know if further changes are required.

@Winterflower

Hello all, I'm very much a beginner in this space, so please excuse the potentially stupid question. I have been experimenting with rolling out my own encoder-decoder combinations for use with the VisionEncoderDecoder class, as described in the docs here:

The VisionEncoderDecoderModel can be used to initialize an image-to-text model with any pretrained Transformer-based vision model as the encoder ...

but I keep running into this error message:

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:2 for open-end generation.

Based on reading the docs, I am not entirely sure whether I need to specifically fine-tune an encoder-decoder combination on the image-to-text downstream task (and the error message above is due to that), or whether I can just use pre-trained configurations without fine-tuning.
Perhaps I could open a PR with some docs suggestions?
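
For reference, this is roughly my setup with the token ids set explicitly; ViT + GPT-2 is just the pairing I happened to try, and setting pad_token_id by hand does silence the warning:

```python
import torch
from transformers import AutoTokenizer, VisionEncoderDecoderModel

# Any pretrained vision encoder + text decoder pairing should work;
# ViT + GPT-2 is simply the combination I experimented with.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "gpt2"
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 defines no pad token, hence the warning; pointing pad_token_id
# at eos_token_id (the fallback the warning mentions) makes it explicit.
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.eos_token_id

pixel_values = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
generated_ids = model.generate(pixel_values)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```

Even with the warning gone the output is gibberish, which makes me suspect the randomly initialized cross-attention weights are the real reason fine-tuning on the downstream task is needed.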

@patrickvonplaten
Contributor Author

Hey @Winterflower,

Could you please try to use the forum instead for such questions: https://discuss.huggingface.co/ ? :-) Thank you!

@anishlukk123

Is this issue still open? I would like to take it up and help solve this problem.

@ghost

ghost commented May 19, 2023

I would like to contribute

@SHUBHAPRIYA95

Hi, if this issue is still open I would love to contribute.

@rajveer43
Contributor

Hi @patrickvonplaten, is this still open? I want to work on it!

@patrickvonplaten
Contributor Author

Sure, maybe you can browse https://huggingface.co/docs/transformers/v4.31.0/en/model_doc/encoder-decoder#overview and check if there is anything we can improve.

@rajveer43
Contributor

Thanks, will check.

@riiyaa24
Contributor

Hello, I would like to contribute to this issue. Could you assign it to me?

@mhdirnjbr

Hello @patrickvonplaten!

This is my very first time contributing to an open source project, inspired by my participation in the Hugging Face event in Paris and the insightful conversations I had with the project maintainers.

As a final-year graduate student in Math and AI, I am eager to explore opportunities to collaborate on this issue. I would greatly appreciate it if you could provide more information on how I can get involved.

Thank you in advance.

@lappemic

It feels like this issue was already addressed by PR #17815; should it be closed?

@Ryukijano
Contributor

I would love to contribute to this!

Ryukijano added a commit to Ryukijano/transformers that referenced this issue Oct 22, 2024
Fixes huggingface#16135

Improve the `EncoderDecoderModel` documentation.

* Add example code to create an `EncoderDecoderModel` using `from_encoder_decoder_pretrained`.
* Add instructions on how to save the model.
* Add instructions on how to fine-tune the model.
* Add a warning about correctly setting configuration values.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/huggingface/transformers/issues/16135?shareId=XXXX-XXXX-XXXX-XXXX).
@Ryukijano linked a pull request on Oct 22, 2024 that will close this issue