
NER - pl example #3180

Merged
merged 9 commits into from
Mar 10, 2020

Conversation

shubhamagarwal92
Contributor

@shubhamagarwal92 shubhamagarwal92 commented Mar 8, 2020

  1. Solves most of the issues raised in NER: some issues in PyTorch Lightning example #3159
  2. Streamlines shell script pipeline
  3. pl logs related changes
  4. added in Readme

@srush
Contributor

srush commented Mar 9, 2020

This looks great to me. Thanks @shubhamagarwal92.

You just need to run the black command; I believe it's `make style`, which fixes the formatting. Otherwise it looks good.

@shubhamagarwal92
Contributor Author

@srush I ran `make style`, which changed a lot of files; however, the check_code_quality check is still failing!
Do you want me to revert the last two commits and manually change 'logs' to "logs"?

BTW, `pip install -e ".[dev]"` is failing on both Mac and Linux for tensorflow, and on sentencepiece for Mac. I had to manually install the black, isort, and flake8 packages. Python 3.8.2 in a conda env.

@LysandreJik
Member

BTW, pip install -e ".[dev]" is failing on both mac and linux for tensorflow and on sentencepiece for mac.

That would be because TensorFlow only supports python 3.5-3.7, unfortunately.

@shubhamagarwal92
Contributor Author

BTW, pip install -e ".[dev]" is failing on both mac and linux for tensorflow and on sentencepiece for mac.

That would be because TensorFlow only supports python 3.5-3.7, unfortunately.

@LysandreJik Thanks. Installs on ubuntu with python 3.6.

However, on mac:

conda create -n transformers_dev python=3.6 -y
conda activate transformers_dev
pip install -e ".[dev]"


Failed to build tokenizers
ERROR: Could not build wheels for tokenizers which use PEP 517 and cannot be installed directly

Mac specs:

Python 3.6.10 | packaged by conda-forge | (default, Mar  5 2020, 09:56:10) 
[GCC Clang 9.0.1 ] on darwin
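For context (my note, not from the thread): the tokenizers package is a Rust extension, and when pip finds no prebuilt wheel for the platform it builds from source via PEP 517, which needs a Rust toolchain. A quick diagnostic sketch, assuming a missing compiler is the cause:

```shell
# tokenizers builds native Rust code via PEP 517 when no wheel matches,
# so a missing Rust toolchain is one common cause of this build failure.
if command -v rustc >/dev/null 2>&1; then
  echo "rust toolchain found: $(rustc --version)"
else
  echo "no rust toolchain; install one via https://rustup.rs and retry pip install"
fi
```

If rustc is present and the build still fails, the cause is something else (e.g. an unsupported compiler version).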

@stefan-it
Collaborator

Hi @shubhamagarwal92 ,

thanks for that PR and fixing the issues 👍

I just ran the run_pl.sh script; training and final testing run without errors now. However, during training, precision, recall and F1 are always 0 -> the final output on the test set shows:

TEST RESULTS
{'val_loss': tensor(7.0679), 'precision': 0.0, 'recall': 0.0, 'f1': 0}
----------------------------------------------------------------------------------------------------
Testing: 200it [00:07, 27.98it/s]

Last lines of the prediction output:

der I-OTHderiv
Bibliothek I-OTHderiv
berufen I-OTHderiv
wurde I-OTHderiv
, I-OTHderiv
verließ I-OTHderiv
Gardthausen I-OTHderiv
den I-OTHderiv
Bibliotheksdienst I-OTHderiv
. I-OTHderiv
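An aside on why collapsed predictions like the above score exactly zero: the NER metrics here are span-based (the example uses seqeval), so a run of identical wrong tags produces a single predicted span that matches no gold entity. A minimal pure-Python sketch of span-level F1 (simplified; not the example's actual code):

```python
def spans(tags):
    """Extract (label, start, end) entity spans from IOB tags.
    Simplified: any contiguous B-/I- run of one label counts as a span."""
    out, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        t_label = tag[2:] if tag != "O" else None
        if t_label != label or tag.startswith("B-"):
            if label is not None:
                out.append((label, start, i))
            start, label = i, t_label
    return out

def f1(gold, pred):
    """Span-level F1: a prediction counts only if label AND boundaries match."""
    g, p = set(spans(gold)), set(spans(pred))
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = ["O", "B-PER", "I-PER", "O", "B-LOC"]
pred = ["I-OTHderiv"] * 5           # degenerate output, as in the log above
print(f1(gold, pred))               # 0.0: no predicted span matches any gold span
```

So per-token accuracy can be nonzero while span-level precision/recall/F1 are all exactly 0.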

@shubhamagarwal92
Contributor Author

shubhamagarwal92 commented Mar 9, 2020

I just ran the run_pl.sh script, training and final testing run without errors now. However, during training precision, recall and F1 are always 0 -> final output on the test set shows:

TEST RESULTS
{'val_loss': tensor(7.0679), 'precision': 0.0, 'recall': 0.0, 'f1': 0}
----------------------------------------------------------------------------------------------------
Testing: 200it [00:07, 27.98it/s]

Thanks for reporting this. Could you please verify the version of pytorch-lightning? For me it is
pytorch-lightning==0.7.1 and transformers==2.5.1, and the results are as reported in the README.

Also, could you please check whether the results in ${OUTPUT_DIR}/test_results.txt mentioned here
are also 0?

It works for me as:

[Screenshot, 2020-03-09: test_results showing non-zero precision, recall and F1]

@stefan-it
Collaborator

Hi,

I'm using the same versions of both pytorch-lightning and transformers 😂

Output of test_results is:

$ cat germeval-model/test_results.txt 
f1 = 0
precision = 0.0
recall = 0.0
val_loss = tensor(9.4173)

But I'm going to test it on another machine :)

@shubhamagarwal92
Contributor Author

shubhamagarwal92 commented Mar 9, 2020

But I'm going to test it on another machine :)

I am also attaching my environment file, generated via `pip freeze > requirements.txt`:
requirements.txt

Please let me know if this doesn't work.

@srush
Contributor

srush commented Mar 9, 2020

@shubhamagarwal92 I think somehow you have the wrong version of our style checks installed.

Can you try running under this command?

sudo pip install git+git://github.com/timothycrosley/isort.git@e63ae06ec7d70b06df9e528357650281a3d3ec22#egg=isort
sudo pip install .[tf,torch,quality]

@LysandreJik we have to fix this, it is really confusing...

@stefan-it would love to see your log as well. Could you also try `rm cached*`? I think maybe your feature cache got messed up.

@codecov-io

Codecov Report

Merging #3180 into master will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3180      +/-   ##
==========================================
+ Coverage   78.15%   78.16%   +0.01%     
==========================================
  Files          98       98              
  Lines       16641    16641              
==========================================
+ Hits        13006    13008       +2     
+ Misses       3635     3633       -2
Impacted Files Coverage Δ
src/transformers/modeling_utils.py 94.72% <0%> (+0.31%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e03129a...9f949d3. Read the comment docs.

@shubhamagarwal92
Contributor Author

shubhamagarwal92 commented Mar 9, 2020

sudo pip install git+git://github.com/timothycrosley/isort.git@e63ae06ec7d70b06df9e528357650281a3d3ec22#egg=isort
sudo pip install .[tf,torch,quality]

@srush I reverted the last 3 style related commits, force-pushed and added a small commit to pass all the checks. Please merge if everything is fine.

Also, for isort, I guess you meant using git+https://

pip install git+https://github.com/timothycrosley/isort.git@e63ae06ec7d70b06df9e528357650281a3d3ec22#egg=isort

This link is also wrong in contributing.md. Could you please also state the Python version in the md?

This command, `pip install .[tf,torch,quality]`, is still failing on Mac, as mentioned in my previous comment:

Failed to build tokenizers
ERROR: Could not build wheels for tokenizers which use PEP 517 and cannot be installed directly

@srush
Contributor

srush commented Mar 10, 2020

Thanks @shubhamagarwal92. Sorry for the annoyance.

@LysandreJik this lgtm.

@LysandreJik LysandreJik merged commit 5ca356a into huggingface:master Mar 10, 2020
@shubhamagarwal92
Contributor Author

@srush Happy to help! :)

Thanks for approving the PR!

@stefan-it
Collaborator

stefan-it commented Mar 10, 2020

@shashwath94 I think I've found the reason for the bad evaluation results: I'm using apex and the --fp16 parameter in the run_pl.sh script!

Do you have any idea why it is not working with half precision? 🤔
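One plausible mechanism (my assumption, not confirmed in this thread): float16 has a far narrower range than float32 (largest finite value 65504), so unscaled losses or logits can overflow to inf or underflow to zero, after which the predictions and metrics degenerate. NumPy illustrates the range limits:

```python
import numpy as np

# float16 range limits that can bite during unscaled fp16 training
print(np.float16(65504.0))   # largest finite float16 value
print(np.float16(70000.0))   # overflows to inf
print(np.float16(1e-8))      # underflows to 0.0 (smallest subnormal ~6e-8)
```

This is why apex-style mixed precision relies on loss scaling rather than casting everything to fp16.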

@srush
Contributor

srush commented Mar 10, 2020

Ah, I will check with pytorch-lightning. It is plausible we are not integrating correctly with them.

@shubhamagarwal92
Contributor Author

@srush While you are at it, could you please check the status of my PR in pl as well?

Lightning-AI/pytorch-lightning#1094

Basically, I was observing a memory leak on GPU 0 if another GPU id (e.g. [1]) was provided when running the NER example.

AFAIK, the solution is torch.cuda.set_device()
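The idea behind that fix, sketched as a hypothetical helper (`pick_device` is illustrative, not code from the PR): resolve the requested device before any CUDA allocation, and call `torch.cuda.set_device()` on it, so that libraries defaulting to `cuda:0` never create a context (and leak memory) there.

```python
def pick_device(gpu_ids):
    """Map a list of requested GPU ids to a device string.
    In real code you would then call torch.cuda.set_device(gpu_ids[0])
    before allocating any tensors, so no CUDA context lands on GPU 0."""
    return f"cuda:{gpu_ids[0]}" if gpu_ids else "cpu"

print(pick_device([1]))   # cuda:1
print(pick_device([]))    # cpu
```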

jplu pushed a commit to jplu/transformers that referenced this pull request Mar 25, 2020
* 1. seqeval required by ner pl example. install from examples/requirements. 2. unrecognized arguments: save_steps

* pl checkpoint callback filenotfound error: make directory and pass

* huggingface#3159 pl checkpoint path difference

* 1. Updated Readme for pl 2. pl script now also correctly displays logs 3. pass gpu ids compared to number of gpus

* Updated results in readme

* 1. updated readme 2. removing deprecated pl methods 3. finalizing scripts

* comment length check

* using deprecated validation_end for stable results

* style related changes