This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

Fix getNN in python bindings to avoid 'utf-8' codec can't decode error. (#967)

Summary:
This [earlier commit](e13484b) fixed issue #715 by casting all strings to Python strings. However, that fix was not applied to getNN, and I was seeing the same error when querying nearest neighbors for Japanese. This commit applies castToPythonString to the getNN function as well.
Pull Request resolved: #967

Reviewed By: EdouardGrave

Differential Revision: D19287807

Pulled By: Celebio

fbshipit-source-id: 31fb8b4d643848f3f22381ac06f2443eb70c0009
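
For context, the failure described in the summary can be reproduced in plain Python without fastText. The sketch below is only an illustration: `decode_token` is a hypothetical helper, not part of the bindings, and the truncated byte string is made-up data standing in for a vocabulary entry that is not valid UTF-8. It assumes the `onUnicodeError` argument behaves like the `errors` argument of `bytes.decode`.

```python
# Bytes for a Japanese word, cut in the middle of a multi-byte character
# to simulate a token that is not valid UTF-8 (illustrative data only).
raw = "日本語".encode("utf-8")[:-1]


def decode_token(data: bytes, on_unicode_error: str = "strict") -> str:
    """Hypothetical helper: decode raw token bytes as UTF-8.

    Invalid sequences are handled according to `on_unicode_error`,
    which is passed straight through as the codec error handler.
    """
    return data.decode("utf-8", errors=on_unicode_error)


# With the default "strict" handler the decode fails, which is the
# "'utf-8' codec can't decode" error mentioned in the commit title.
try:
    decode_token(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for the broken sequence; "ignore" drops it.
replaced = decode_token(raw, "replace")
ignored = decode_token(raw, "ignore")
```

With a non-strict handler the caller gets a usable string back instead of an exception, at the cost of losing the undecodable bytes.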
DeepLearning VM authored and facebook-github-bot committed Mar 25, 2020
1 parent 2d453cd commit 3b25b87
Showing 1 changed file with 14 additions and 2 deletions.
16 changes: 14 additions & 2 deletions python/fasttext_module/fasttext/pybind/fasttext_pybind.cc
@@ -427,8 +427,20 @@ PYBIND11_MODULE(fasttext_pybind, m) {
               const std::string word) { m.getWordVector(vec, word); })
       .def(
           "getNN",
-          [](fasttext::FastText& m, const std::string& word, int32_t k) {
-            return m.getNN(word, k);
+          [](fasttext::FastText& m, const std::string& word, int32_t k,
+             const char* onUnicodeError) {
+            std::vector<std::pair<float, std::string>> score_words = m.getNN(
+                word, k);
+            std::vector<std::pair<float, py::str>> output_list;
+            for (uint32_t i = 0; i < score_words.size(); i++) {
+              float score = score_words[i].first;
+              py::str word = castToPythonString(
+                  score_words[i].second, onUnicodeError);
+              std::pair<float, py::str> sw_pair = std::make_pair(score, word);
+              output_list.push_back(sw_pair);
+            }
+
+            return output_list;
           })
       .def(
           "getAnalogies",
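
The loop added to the getNN lambda converts each neighbor word individually rather than letting the whole result be converted implicitly. A rough pure-Python analogue of that loop is sketched below. `convert_neighbors` and the sample data are hypothetical and exist only for illustration. The sketch assumes the C++ strings behave like raw bytes and that castToPythonString decodes them as UTF-8 with the given error handler.

```python
from typing import List, Tuple


def convert_neighbors(
    score_words: List[Tuple[float, bytes]],
    on_unicode_error: str = "strict",
) -> List[Tuple[float, str]]:
    """Hypothetical analogue of the new getNN lambda body.

    Each (score, raw word) pair is rebuilt with the word decoded
    explicitly, so the caller chooses how invalid bytes are handled.
    """
    output_list = []
    for score, raw_word in score_words:
        word = raw_word.decode("utf-8", errors=on_unicode_error)
        output_list.append((score, word))
    return output_list


# Made-up neighbor list: the second word is truncated mid-character.
neighbors = [
    (0.9, "東京".encode("utf-8")),
    (0.8, "大阪".encode("utf-8")[:-1]),
]
result = convert_neighbors(neighbors, "replace")
```

Scores pass through unchanged. Only the string half of each pair goes through the decoder, which mirrors how the diff rebuilds each pair from `score_words[i].first` and the converted word.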
