
Doctest longformer #16441

Merged: 12 commits merged into huggingface:main on May 17, 2022

Conversation

@KMFODA (Contributor) commented Mar 28, 2022

What does this PR do?

Adds Longformer (PyTorch) to the doctests.

@patrickvonplaten - LongformerForMaskedLM and LongformerForQuestionAnswering have @replace_return_docstrings instead of @add_code_sample_docstrings. Should I replace them with @add_code_sample_docstrings so that I can test for expected_output & expected_loss?
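For context: @replace_return_docstrings only injects the "Returns:" section into the docstring, while @add_code_sample_docstrings generates the whole usage example, which is what the doctest runner checks against expected_output and expected_loss. A rough sketch of the two patterns, based on the kwargs visible in this PR's diff (the import path, the simplified class bodies, and the expected values are assumptions and may differ across transformers versions):

from transformers.utils import add_code_sample_docstrings, replace_return_docstrings
from transformers.models.longformer.modeling_longformer import (
    LongformerMaskedLMOutput,
    LongformerSequenceClassifierOutput,
)

_CHECKPOINT_FOR_DOC = "allenai/longformer-base-4096"
_CONFIG_FOR_DOC = "LongformerConfig"


class LongformerForMaskedLM:
    # Fills in only the "Returns:" section; the code example stays hand-written,
    # so there is nothing for the doctest runner to assert on.
    @replace_return_docstrings(output_type=LongformerMaskedLMOutput, config_class=_CONFIG_FOR_DOC)
    def forward(self, input_ids=None, **kwargs):
        """Sketch of a forward pass.

        Returns:
        """


class LongformerForSequenceClassification:
    # Generates the code example itself, so the doctest can check
    # expected_output / expected_loss (the values below are hypothetical).
    @add_code_sample_docstrings(
        checkpoint=_CHECKPOINT_FOR_DOC,
        output_type=LongformerSequenceClassifierOutput,
        config_class=_CONFIG_FOR_DOC,
        expected_output="'LABEL_0'",  # hypothetical
        expected_loss=0.1,  # hypothetical
    )
    def forward(self, input_ids=None, **kwargs):
        """Sketch of a forward pass.

        Returns:
        """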

@HuggingFaceDocBuilderDev commented Mar 28, 2022

The documentation is not available anymore as the PR was closed or merged.

@LysandreJik (Member):

Hey @KMFODA, thanks for your PR! Could you rebase/merge on main so that the CI passes? Thank you!

@KMFODA (Contributor, Author) commented Mar 28, 2022

Apologies @LysandreJik, I had merged with master, not main. Fixed now :).

@@ -1407,6 +1407,28 @@ def _set_gradient_checkpointing(self, module, value=False):
module.gradient_checkpointing = value


LONGFORMER_GENERATION_DOCSTRING = r"""
@patrickvonplaten (Contributor):

This example looks super cool - could you replace it with what's currently written for LongformerForMaskedLM?

Collaborator:

Agree with @patrickvonplaten

A few more comments:

  • There's no more need for LONGFORMER_GENERATION_DOCSTRING; move the docstring example to LongformerForMaskedLM directly.

  • I think it would be great to add a comment before the line containing * 300, say Let's try a very long input. (see the sketch below)
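For reference, a minimal sketch of what the moved LongformerForMaskedLM example could look like (assuming the allenai/longformer-base-4096 checkpoint; the padding sentence is illustrative, and the actual top-5 predictions would be pinned down by running the doctest):

>>> from transformers import AutoTokenizer, LongformerForMaskedLM

>>> tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
>>> model = LongformerForMaskedLM.from_pretrained("allenai/longformer-base-4096")

>>> # Let's try a very long input.
>>> TXT = "My friends are <mask> but they eat too many carbs. " + "This sentence just pads the input. " * 300
>>> input_ids = tokenizer([TXT], return_tensors="pt")["input_ids"]
>>> logits = model(input_ids).logits

>>> # decode the top-5 candidates for the masked position
>>> masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
>>> probs = logits[0, masked_index].softmax(dim=0)
>>> values, predictions = probs.topk(5)
>>> tokenizer.decode(predictions)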

@patrickvonplaten requested a review from ydshieh, April 6, 2022 10:44
output_type=LongformerSequenceClassifierOutput,
config_class=_CONFIG_FOR_DOC,
expected_output=[1, 2],
Collaborator:

@KMFODA Is this the expected output? I think it should be something like ORIGINAL, PLAGIARISM, no?
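(For illustration: the doctest compares against the decoded label string taken from the model config's id2label mapping, not against raw class indices. A hypothetical sketch using the labels suggested above:)

>>> import torch

>>> # hypothetical logits from a 2-label plagiarism-detection head
>>> logits = torch.tensor([[2.1, -1.3]])
>>> id2label = {0: "ORIGINAL", 1: "PLAGIARISM"}
>>> id2label[logits.argmax(-1).item()]
'ORIGINAL'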

huggingface deleted a comment from the github-actions bot, May 2, 2022
@patrickvonplaten (Contributor):

@KMFODA would it be OK if we went into the PR to finish it, in case you don't have enough time? :-)

@KMFODA (Contributor, Author) commented May 4, 2022

Hey @patrickvonplaten, apologies for the delay. I've managed to work on the proposed changes. Let me know if anything else is missing.

@patrickvonplaten (Contributor):

Great! Looks good to me actually :-) @ydshieh could you take a look maybe?

@ydshieh (Collaborator) left a comment:

Hi @KMFODA, thank you for working on this! I left 3 comments about checkpoints, 2 of them about the heads and their random initialization.

Comment on lines 2325 to 2329
checkpoint=_CHECKPOINT_FOR_DOC,
output_type=TFLongformerSequenceClassifierOutput,
config_class=_CONFIG_FOR_DOC,
expected_output="LABEL_0",
expected_loss=0.58,
@ydshieh (Collaborator):

I think the checkpoint _CHECKPOINT_FOR_DOC, which is allenai/longformer-base-4096, doesn't contain a sequence classification head, so the head is randomly initialized. The result would be random, no?

@patrickvonplaten (Contributor):

This is correct!
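For illustration, the symptom is visible at load time: the encoder weights are found in the checkpoint, but the classification head is created fresh, so transformers logs a newly-initialized-weights warning and the predicted label can change between runs (a sketch; the warning wording is paraphrased):

from transformers import TFLongformerForSequenceClassification

# allenai/longformer-base-4096 only ships the pretrained encoder. The
# sequence-classification head has no weights in the checkpoint, so it is
# randomly initialized and a warning along the lines of "Some weights ...
# were newly initialized ..." is logged. Any expected_output pinned against
# this model would therefore be flaky.
model = TFLongformerForSequenceClassification.from_pretrained("allenai/longformer-base-4096")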

Comment on lines 2565 to 2569
checkpoint=_CHECKPOINT_FOR_DOC,
output_type=TFLongformerTokenClassifierOutput,
config_class=_CONFIG_FOR_DOC,
expected_output="['LABEL_0', 'LABEL_0', 'LABEL_0', 'LABEL_0', 'LABEL_0', 'LABEL_0', 'LABEL_1', 'LABEL_1', 'LABEL_0', 'LABEL_0', 'LABEL_0', 'LABEL_0']",
expected_loss=0.62,
@ydshieh (Collaborator):

Same comment as I gave for the sequence classification model.

@ydshieh (Collaborator) commented May 4, 2022

@patrickvonplaten could you also check whether my comment about the randomly initialized heads is valid, please? I haven't run the examples myself though. I can double-check tomorrow.

Update: I ran it, and the 2 tests failed as I expected. Depending on @KMFODA's preference, maybe we can merge this PR with only the PT file (and we work on the TF part internally)? To address this situation, either converting the PT checkpoint to TF or using a tiny model checkpoint is required.

@ydshieh (Collaborator) commented May 5, 2022

@KMFODA

If you are willing to deal with the TensorFlow version, could you try to use the TF checkpoint hf-internal-testing/tiny-random-longformer for TFLongformerForSequenceClassification and TFLongformerForTokenClassification?

Otherwise, it's totally OK to revert the changes in src/transformers/models/longformer/modeling_tf_longformer.py + remove src/transformers/models/longformer/modeling_tf_longformer.py from utils/documentation_tests.txt, and we can merge this PR :-). Thank you!
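Concretely, the first option would just swap the checkpoint passed to the decorator, along these lines (a sketch mirroring the diff hunks above; the expected values are hypothetical and would need to be re-derived by actually running the doctest against the tiny checkpoint):

# inside TFLongformerForSequenceClassification, replacing _CHECKPOINT_FOR_DOC:
@add_code_sample_docstrings(
    checkpoint="hf-internal-testing/tiny-random-longformer",
    output_type=TFLongformerSequenceClassifierOutput,
    config_class=_CONFIG_FOR_DOC,
    expected_output="'LABEL_1'",  # hypothetical
    expected_loss=0.69,  # hypothetical
)
def call(self, input_ids=None, **kwargs):
    ...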

@KMFODA (Contributor, Author) commented May 11, 2022

Hey @ydshieh, thanks for the comments. Yes, sure, I've switched those 3 checkpoints now. I'm having issues with TF on my new M1 Mac, so I can't test these right now, unfortunately. I thought I'd commit these changes now in case you're in a rush, but I'll try to test them as soon as I get my TF library working.

@ydshieh (Collaborator) commented May 12, 2022

Hi @KMFODA, no rush on our side :-). But if you have trouble with TF on Mac, we can run it and update the expected values; just let us know. Thank you!

@ydshieh (Collaborator) left a comment:

@KMFODA Thank you for all the effort and patience in addressing all the comments!

This PR is ready to merge (after fixing code quality); just waiting for @patrickvonplaten to give the final approval 💯.

(I ran the tests twice: all passed.)

@patrickvonplaten (Contributor) left a comment:

Good to merge for me once the make quality test passes :-)

@KMFODA (Contributor, Author) commented May 17, 2022

No problem at all. Thanks for all the helpful comments. Changes made :).

@patrickvonplaten (Contributor):

Thanks a mille for all your work here, @KMFODA!

@patrickvonplaten merged commit 38ddab1 into huggingface:main on May 17, 2022
ArthurZucker added a commit to ArthurZucker/transformers that referenced this pull request May 20, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add initial doctring changes

* make fixup

* Add TF doc changes

* fix seq classifier output

* fix quality errors

* t

* swithc head to random init

* Fix expected outputs

* Update src/transformers/models/longformer/modeling_longformer.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>