-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Dy2stat] Add word2vec as unittest #24944
Conversation
test=develop
Thanks for your contribution! |
❌This PR is not created using PR's template. You can refer to this Demo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the rest.
indices = indices[np.argsort(-flat[indices])] | ||
for i in indices: # Remove the input words | ||
print('for word %s, the similar word is %s' % | ||
(query_token, str(id2word_dict[i]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that get_similar_tokens
is not used in the unittest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied code so I didn't care much about this. The get_similar_tokens is used in the original code but it now cannot run because of API compatibility issue. I will remove it in this unittest.
|
||
|
||
#for _, batch in zip(range(10), build_batch(dataset, 128, 3)): | ||
# print(batch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is at the original code. I commented it so that we don't print much info. I will remove line 208-209.
|
||
for epoch in range(epoch_num): | ||
|
||
#random.shuffle(dataset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this commented code necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the commented code is in original code. I commented because the random.shuffle can cause the input to dygraph/static be different thus our check can fail.
test=develop
PR types
Others
PR changes
Others
Describe
Add word2vec as ProgramTranslator unit test。
The word2vec model is referred from https://github.com/PaddlePaddle/models/blob/856c428fe0f1f451f8a22acee50c2e3af33c834e/dygraph/word2vec/word2vec.py#L238