[Educational purpose] Why is OpenNMT-py fast? #552
Comments
Compared with other frameworks on GitHub, maybe only the engineers who built it can explain the tricks and reasoning. Calling @srush
I was called :D So our main aim is simplicity, not speed. That being said, there are a couple of optimizations that matter:
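One optimization commonly used in seq2seq frameworks (not necessarily the exact ones meant above) is bucketing: sorting examples by length before batching, so sequences in a batch have similar lengths and little of the padded tensor is wasted on padding tokens. A minimal sketch of the idea, with hypothetical helper names rather than OpenNMT-py's actual code:

```python
import random

def bucket_batches(lengths, batch_size):
    """Group example indices into batches after sorting by length,
    so each batch contains sequences of similar length."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

def padding_fraction(lengths, batches):
    """Fraction of the padded batch tensors that is padding tokens."""
    total = padded = 0
    for b in batches:
        longest = max(lengths[i] for i in b)
        padded += sum(longest - lengths[i] for i in b)
        total += longest * len(b)
    return padded / total

random.seed(0)
lengths = [random.randint(5, 50) for _ in range(10000)]
# Naive batching: take examples in corpus order.
naive = [list(range(i, min(i + 64, len(lengths))))
         for i in range(0, len(lengths), 64)]
bucketed = bucket_batches(lengths, 64)
print(padding_fraction(lengths, naive), padding_fraction(lengths, bucketed))
```

With random lengths, the bucketed batches waste far less compute on padding than corpus-order batches, which translates directly into faster iterations on the GPU.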
Awesome to hear you are working on GEC; it's a neat problem. Cheers!
@srush Thanks for the reply! It's helpful!
@howardyclo can I ask what GEC is?
Grammatical Error Correction. Feel free to just use our code. It is pretty modular, and it can be more fun to develop with others. We will likely add some GEC-specific features as well. One of our students works on that.
@srush
Hello, recently I implemented seq2seq for practice and educational purposes.
Here is my code.
I also compared the performance to OpenNMT-py, and found that this library is more GPU-memory efficient and its training iterations are a lot faster. When running the following model:
and training on my grammatical error correction corpus (2,443,191 sentence pairs), OpenNMT-py only takes ~1 hour to complete an epoch (~76,000 iterations), while my code takes ~6 hours to complete an epoch.
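For scale, the numbers quoted above work out to roughly a batch size of 32 sentence pairs and about a 6x throughput gap:

```python
# Back-of-envelope arithmetic on the figures reported above.
pairs = 2_443_191          # sentence pairs per epoch
iters = 76_000             # iterations per epoch in OpenNMT-py

print(round(pairs / iters, 1))       # implied sentence pairs per batch
print(round(pairs / 3600))           # OpenNMT-py pairs/sec at ~1 h/epoch
print(round(pairs / (6 * 3600)))     # custom code pairs/sec at ~6 h/epoch
```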
I am wondering: what important optimizations should I apply, compared to the OpenNMT-py codebase? When I tried OpenNMT-py I didn't specify shard_size, and I couldn't figure out why OpenNMT-py is fast. What key scripts should I be aware of? Appreciated.
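For readers wondering about shard_size: in OpenNMT-py's preprocessing it splits the corpus into chunks that are processed one at a time, so the whole dataset never has to sit in memory at once. A minimal sketch of that general idea (hypothetical code, not OpenNMT-py's implementation):

```python
def iter_shards(path, shard_size):
    """Yield lists of at most shard_size lines from a corpus file,
    so only one shard is held in memory at any moment."""
    shard = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            shard.append(line.rstrip("\n"))
            if len(shard) == shard_size:
                yield shard
                shard = []
    if shard:  # final partial shard
        yield shard

# Each shard can then be numericalized, batched, and saved
# (or fed to training) before the next one is read.
```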