
Issue with to_fp16() #70

Open
Manas-Embold opened this issue Nov 25, 2020 · 7 comments

Comments

@Manas-Embold

Manas-Embold commented Nov 25, 2020

Hi Max,

I trained the 344M model using gpt2-simple (the dataset was Java code, for auto code completion) and saved the checkpoint.
I then converted the model to PyTorch using:

! cd '/content/checkpoint' && transformers-cli convert --model_type gpt2 --tf_checkpoint '/content/checkpoint/run1/' --pytorch_dump_output '/content/checkpoint/run1/pytorch' --config '/content/checkpoint/run1/hparams.json'

When I load the model normally:

from aitextgen import aitextgen
config = '/content/checkpoint/run1/pytorch/config.json'
ai = aitextgen(model="/content/checkpoint/run1/pytorch/pytorch_model.bin", config=config)

There are no issues and I can generate easily:

ai.generate(n=1, prompt="system.out.", max_length=100)

OUTPUT:
system.out.println( + id);

However, since I want to convert this to fp16 for faster inference, I converted the model to fp16 as follows:

from aitextgen import aitextgen
config = '/content/checkpoint/run1/pytorch/config.json'
ai = aitextgen(model="/content/checkpoint/run1/pytorch/pytorch_model.bin", config=config, to_gpu=True, to_fp16=True)

When I call generate now, it outputs English instead of Java:

ai.generate(n=1, prompt="system.out.", max_length=100)

OUTPUT:
system.out. loc character decidedally healthy ultimately points belie mass nearly regidedot price clicklike make TodayocaInd unlike journal Norretene links Good void et attackalsAnSD 54giving sing high Assassatelyhus Y humansware concerned connectionsSt� was believesligmartacing Geteworkamedann·aultrict dep2013� daughtermentructure couldentiallyrolloth confrontted Archbi suitiffge beaut Ed industward Sony* thereileOMrugateg rented Birminghamvironment underinceeg Windows intense static
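
One way to narrow this down is to check what dtype and device the weights actually ended up on after loading. A minimal diagnostic sketch, assuming aitextgen exposes the underlying Transformers model as ai.model:

import torch

# Inspect the first few parameters after loading with to_gpu=True, to_fp16=True
for name, param in list(ai.model.named_parameters())[:3]:
    print(name, param.dtype, param.device)

# torch.float16 on a cuda device is what the flags should produce;
# torch.float32 or a cpu device would point at a loading problem instead.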

@Manas-Embold
Author

Any thoughts on where I am going wrong in the conversion?
I think that after conversion it is loading the default GPT-2 English language model instead of my GPT-2 model trained on Java code.

@Manas-Embold
Author

When I use to_gpu=True and to_fp16=True for loading, I get English as output.
When I use just to_fp16=True and skip to_gpu=True, I get proper Java output.

This looks strange.
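
For comparison, the converted checkpoint can be loaded and run in fp16 with plain Transformers/PyTorch, bypassing aitextgen's flags entirely. A rough sketch, assuming the checkpoint directory from the earlier comments and the stock GPT-2 tokenizer (which gpt2-simple uses); if this produces Java, the problem is in the to_fp16/to_gpu combination rather than in the checkpoint itself:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Directory containing pytorch_model.bin and config.json from the conversion step
model = GPT2LMHeadModel.from_pretrained("/content/checkpoint/run1/pytorch")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# Convert to half precision and move to the GPU manually
model = model.half().to("cuda").eval()

inputs = tokenizer("system.out.", return_tensors="pt").to("cuda")
with torch.no_grad():
    output = model.generate(inputs["input_ids"], max_length=100, do_sample=True)
print(tokenizer.decode(output[0]))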

@minimaxir
Owner

to_fp16() is sort of a beta feature and not fully tested. Ideally the ONNX support that I intend to add will handle this better.

However, that output is just weird in that it's pseudorandom as opposed to fully random, which may imply a different issue in the pipeline.

@junkgear

Alright, thanks for reviewing!

@minimaxir
Owner

minimaxir commented Dec 1, 2020

Tested: yes, it's random output. I assume something changed upstream in Transformers, so I might have to remove it (there also doesn't seem to be a speed increase anymore). I will add a warning for now.

minimaxir added a commit that referenced this issue Dec 1, 2020
* Update dependencies

* Fix deprecation warning

* Fix param name

* Update minimum versions

* Remove TorchScript refs

* version bump

* Bump PL version

* dev Dockerfile

* Update dependencies

* Transformer 4 fix

* Fix transformer 4 import for TF conversion

* Fix model training for lightning 1.0.0

* Add back GPU memory printing

* TPU fixes

* Assert descriptions

* Generation tweaks (remove pad message)

* Ignore .DS_Store

* Handle generation by prompt more canonically

* Set 20 for refresh default to avoid Colab warning

* Fix gen warning while training

* Fix model loading from config + generation

* FP16 warning (#70)

* Fix tokenizer for latest tokenizers

* Set default learning rate to 1e-3

* Set CPU config to match tokenizer default
@briansemrau

briansemrau commented Jan 27, 2021

I'm able to use fp16 with sensible outputs if I use:

import torch

with torch.cuda.amp.autocast():
    ai.generate(...)

Interestingly, I seem to be getting slower generation using fp16 on an RTX 2060, though the halved memory usage is a plus.
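
Put together with the loading code from earlier in the thread, the workaround would look roughly like this (a sketch rather than a tested recommendation; the paths are the ones from the original report):

import torch
from aitextgen import aitextgen

config = '/content/checkpoint/run1/pytorch/config.json'
ai = aitextgen(model="/content/checkpoint/run1/pytorch/pytorch_model.bin",
               config=config, to_gpu=True, to_fp16=True)

# Run generation under autocast so fp16-sensitive ops can fall back to fp32
with torch.cuda.amp.autocast():
    ai.generate(n=1, prompt="system.out.", max_length=100)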

@jonnyplatt

I was really puzzled by this: I found to_fp16 was generating sensible, normal content on Google Colab despite the warning messages, but was totally bizarre in production. It turned out the PyTorch versions were different: Colab was on torch 1.8.1 and CUDA 11.1, while my server was on torch 1.7 and CUDA 11.0.
Once I upgraded the libraries on my server, I found fp16 generation was working correctly again, so it may be worth updating the warning for people on older PyTorch versions?
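
A quick way to check which versions are in play before chasing fp16 output issues (standard PyTorch attributes):

import torch

print(torch.__version__)              # 1.8.1 worked in the report above, 1.7 did not
print(torch.version.cuda)             # CUDA version the wheel was built against
print(torch.cuda.get_device_name(0))  # which GPU is being used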
