Issue with to_fp16() #70
Comments
Any thoughts on where I am going wrong in the conversion?
> When I use to_gpu=True and to_fp16=True for loading, I get English as output.

This looks strange.
However, that output is just weird in that it's pseudorandom as opposed to fully random, which may imply a different issue in the pipeline.
Alright, thanks for reviewing!
Tested: yes, it's random output. I assume something changed in Transformers upstream, so I might have to remove it (also there doesn't seem to be a speed increase anymore). Will add a warning for now.
* Update dependencies
* Fix depreciation warning
* Fix param name
* Update minimum versions
* Remove TorchScript refs
* version bump
* Bump PL version
* dev Dockerfile
* Update dependencies
* Transformer 4 fix
* Fix transformer 4 import for TF conversion
* Fix model training for lighting 1.0.0
* Add back GPU memory printing
* TPU fixes
* Assert descriptions
* Generation tweaks (remove pad message)
* Ignore .DS_Store
* Handle generation by prompt more canonically
* Set 20 for refresh default to avoid Colab warning
* Fix gen warning while training
* Fix model loading from config + generation
* FP16 warning (#70)
* Fix tokenizer for latest tokenizers
* Set default learning rate to 1e-3
* Set CPU config to match tokenizer default
I'm able to use fp16 with sensible outputs if I use:

```python
with torch.cuda.amp.autocast():
    ai.generate(...)
```

Interestingly, I seem to be getting slower generation using fp16 on an RTX 2060, though the halved memory usage is a plus.
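A fuller sketch of that workaround, assuming the model has already been converted to PyTorch (the folder name, prompt, and generation arguments below are placeholders, not taken from the original comment):

```python
import torch
from aitextgen import aitextgen

# Load the converted PyTorch model onto the GPU (placeholder folder name).
ai = aitextgen(model_folder="trained_model", to_gpu=True)

# Run generation inside autocast so supported ops execute in FP16.
with torch.cuda.amp.autocast():
    ai.generate(prompt="system.out.", max_length=100)
```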
I was really puzzled by this: I found to_fp16 was generating sensible, normal content on Google Colab despite the warning messages, but was totally bizarre in production. It turned out the PyTorch versions were different: Colab was on torch 1.8.1 with CUDA 11.1, while my server was on torch 1.7 with CUDA 11.0.
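For comparing two environments like that, a quick check of the installed versions (nothing project-specific assumed here):

```python
import torch

# PyTorch build and the CUDA version it was compiled against.
print(torch.__version__)
print(torch.version.cuda)
```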
Hi Max,
I trained the 344M model using gpt-2-simple (the dataset was Java code, for auto code completion) and saved the checkpoint.
I converted the model to PyTorch using:
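(The exact snippet isn't preserved in this copy. Purely as an illustration, a conversion using the GPT-2 conversion utility bundled with transformers might look like the sketch below; the paths are placeholders and the module path differs between transformers versions.)

```python
# Hypothetical reconstruction, not the original command: convert a TensorFlow
# GPT-2 checkpoint (as saved by gpt-2-simple) into a PyTorch model directory.
# The module path shown is the transformers 4.x layout.
from transformers.models.gpt2.convert_gpt2_original_tf_checkpoint_to_pytorch import (
    convert_gpt2_checkpoint_to_pytorch,
)

convert_gpt2_checkpoint_to_pytorch(
    gpt2_checkpoint_path="checkpoint/run1",          # placeholder checkpoint dir
    gpt2_config_file="checkpoint/run1/config.json",  # placeholder GPT2Config-style JSON
    pytorch_dump_folder_path="pytorch_model",        # placeholder output folder
)
```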
When I load the model normally, there are no issues and I can generate easily:
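(Again, the original code isn't preserved; a minimal sketch of the load-and-generate step, with a placeholder folder name and prompt:)

```python
from aitextgen import aitextgen

# Load the converted PyTorch model onto the GPU (placeholder folder name).
ai = aitextgen(model_folder="pytorch_model", to_gpu=True)

# Generate a completion from a Java-style prompt (placeholder prompt).
ai.generate(prompt="system.out.", max_length=50)
```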
OUTPUT:
system.out.println( + id);
However, since I want to convert this to FP16 for faster inference, I converted the model to FP16 as follows:
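(The original snippet isn't shown; based on the to_gpu=True / to_fp16=True flags mentioned earlier in the thread, the FP16 load presumably looked roughly like this, with a placeholder folder name:)

```python
from aitextgen import aitextgen

# Load the model onto the GPU and cast its weights to FP16 (placeholder folder name).
ai = aitextgen(model_folder="pytorch_model", to_gpu=True, to_fp16=True)
```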
When I call generate now, it outputs English instead of Java.
OUTPUT:
system.out. loc character decidedally healthy ultimately points belie mass nearly regidedot price clicklike make TodayocaInd unlike journal Norretene links Good void et attackalsAnSD 54giving sing high Assassatelyhus Y humansware concerned connectionsSt� was believesligmartacing Geteworkamedann·aultrict dep2013� daughtermentructure couldentiallyrolloth confrontted Archbi suitiffge beaut Ed industward Sony* thereileOMrugateg rented Birminghamvironment underinceeg Windows intense static