Is the loss computation in UnigramTrainer correct? #628
When computing `logsum_alt`, the frequency of a removed piece is re-assigned to its alternatives:

sentencepiece/src/unigram_model_trainer.cc, lines 389 to 394 at ba7e11a

But the code uses `alternatives.size()`, which, if I'm not mistaken, is always equal to `sentencepieces.size()`. Don't we want to multiply by the number of alternatives for this particular sentencepiece, i.e., `alternatives[i].size()`? @taku910?
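The embedded snippet from unigram_model_trainer.cc was not captured above. The sketch below reconstructs the computation in question from the identifiers quoted in the issue (`logsum_alt`, `freq`, `alternatives`, `sentencepieces`); the function wrapper, types, and surrounding context are assumptions, not the library's actual code.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Sketch: recompute the log of the total frequency mass after removing
// sentencepieces[i] and re-assigning its frequency to the pieces of its
// alternative segmentation:
//   new_sum = sum - freq[i] + freq[i] * k  =  sum + freq[i] * (k - 1)
// where k is the number of pieces that piece i is re-segmented into.
float LogsumAfterRemoval(
    double sum, const std::vector<double>& freq,
    const std::vector<std::vector<std::string>>& alternatives,
    std::size_t i) {
  // The point of the issue: the code used k = alternatives.size(), but
  // `alternatives` holds one entry per sentencepiece, so that value is
  // always sentencepieces.size(). The per-piece count is what's wanted:
  const std::size_t k = alternatives[i].size();
  return static_cast<float>(std::log(sum + freq[i] * (k - 1)));
}
```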
Comments

@taku910 I ran into the issue when training a Transformer-XL model on various configurations of WikiText-103 using sentencepiece: …

I looked at the code and @mbollmann seems to be correct that …

Sorry for the late response. Yes, the computation was incorrect. Going to be fixed in the next release.
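To see why the distinction matters, here is a toy calculation (all numbers invented, not taken from the issue): with a vocabulary of thousands of pieces, `alternatives.size()` inflates the re-assigned frequency mass by orders of magnitude compared with a typical per-piece alternative count of two or three.

```cpp
#include <cmath>
#include <cstdio>

int main() {
  const double sum = 1000.0;    // current total frequency mass (invented)
  const double freq_i = 10.0;   // frequency of the piece being removed (invented)
  const double vocab = 8000.0;  // stands in for alternatives.size() (invented)
  const double num_alt = 2.0;   // stands in for alternatives[i].size() (invented)

  // Buggy: every removal inflates the sum as if the piece's frequency were
  // re-assigned once per vocabulary entry.
  const double logsum_bug = std::log(sum + freq_i * (vocab - 1.0));
  // Intended: the sum grows only by the mass re-assigned to piece i's
  // actual alternatives.
  const double logsum_fix = std::log(sum + freq_i * (num_alt - 1.0));

  std::printf("buggy:    %f\nintended: %f\n", logsum_bug, logsum_fix);
  // Prints roughly 11.302 vs. 6.918, so the candidate losses used to rank
  // pieces for pruning are computed on very different scales.
}
```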