Move load and save word2vec_format out of word2vec class to KeyedVectors #1107

Merged: 62 commits, Jan 27, 2017
Commits
55a4fc9
updated refactor
Aug 18, 2016
e916f7e
commit missed file
Aug 18, 2016
e5416ed
docstring added
Aug 18, 2016
e64766b
more refactoring
Aug 19, 2016
c34cf37
add missing docstring
Aug 19, 2016
c9b31f9
fix docstring format
Aug 19, 2016
a0329af
clearer docstring
droudy Aug 19, 2016
0c0e2fa
minor typo in word2vec wmdistance
jayantj Sep 2, 2016
cdefeb0
pyemd error in keyedvecs
jayantj Sep 8, 2016
1aec5a2
relative import of keyedvecs from word2vec fails
jayantj Sep 8, 2016
e7368a3
bug in init_sims in word2vec
jayantj Sep 8, 2016
fe283c2
property descriptors for syn0, syn0norm, index2word, vocab - fixes bu…
jayantj Sep 8, 2016
9b36bc4
tests for loading older word2vec models
jayantj Sep 9, 2016
dfe1893
backwards compatibility for loading older models
jayantj Sep 9, 2016
4a03f20
test for syn0norm not saved to file
jayantj Sep 9, 2016
09b6ebe
syn0norm not saved to file for KeyedVectors
jayantj Sep 9, 2016
7df4138
tests and fix for accuracy
jayantj Sep 9, 2016
4c54d9b
minor bug in finalized vocab check
jayantj Sep 9, 2016
a28f9f1
warnings for direct syn0/syn0norm access
jayantj Sep 9, 2016
bf1182e
fixes use of most_similar in accuracy
jayantj Sep 10, 2016
5a6b97b
changes logging level to ERROR in word2vec tests
jayantj Sep 10, 2016
cfb2e1c
renames kv to wv in word2vec
jayantj Sep 12, 2016
b002765
minor bugs with checking existence of syn0
jayantj Sep 12, 2016
27c0a14
replaces syn0 and syn0norm with wv.syn0 and wv.syn0norm in tests and …
jayantj Sep 12, 2016
81f8cbb
adds changelog
jayantj Sep 12, 2016
aa7e632
initial fastText wrapper class
jayantj Aug 29, 2016
c780b9b
fasttext load binary data + oov vectors
jayantj Aug 29, 2016
ccf5a47
tests for fasttext wrapper
jayantj Sep 9, 2016
708113b
reduced memory requirements for fasttext model
jayantj Sep 9, 2016
b7de266
annoy indexer tests for fasttext
jayantj Sep 12, 2016
4d3d251
adds changelog and documentation
jayantj Sep 12, 2016
f2d13ce
renames kv to wv in fasttext wrapper
jayantj Sep 12, 2016
3777423
refactors syn0 word vector lookup into method
jayantj Sep 12, 2016
6e20834
updates keyedvector load tests to use actual values
jayantj Dec 16, 2016
564ea0d
Merge branch 'develop' into fasttext
jayantj Dec 18, 2016
caeb275
updates word2vec load old models tests + test models
jayantj Dec 19, 2016
784ffbf
more fasttext wrapper tests
jayantj Dec 22, 2016
20fe6f2
refactoring of some fasttext and word2vec methods
jayantj Dec 22, 2016
3b9483b
refactors FastText to use subclass of KeyedVectors, updates tests
jayantj Dec 22, 2016
f5cdfb6
Merge branch 'develop' into fasttext
jayantj Dec 26, 2016
700dd26
changes setUp for fast text unittests to setUpClass to reduce time taken
jayantj Dec 26, 2016
d30ea56
adds normalized ngram vectors for fasttext model, tests
jayantj Dec 27, 2016
bb6e538
deletes training files after loading model, tests
jayantj Dec 27, 2016
c7a5d07
doesnt match with oov words, tests
jayantj Dec 27, 2016
734057b
more asserts while loading from fasttext model file, renames some var…
jayantj Dec 27, 2016
56d89e9
updates FastText __contains__ to return True for all words for which …
jayantj Dec 27, 2016
dc51096
updates docstrings, adds comments for fasttext wrapper and tests
jayantj Dec 27, 2016
bb48663
adds fasttext test models
jayantj Dec 27, 2016
b58dd53
changes setUpClass to setUp to allow python2.6 compatibility
jayantj Jan 3, 2017
461a6b4
updates word2vec test model files
jayantj Jan 4, 2017
9137090
python2.6 compatibility for fasttext tests
jayantj Jan 4, 2017
e5ae899
Revert "updates keyedvector load tests to use actual values"
jayantj Jan 4, 2017
b98b40f
Merge branch 'develop' into fasttext
jayantj Jan 4, 2017
5eb8f75
replaces all instances of vocab and syn0 being accessed directly thro…
jayantj Jan 4, 2017
27bec7b
adds fasttext tutorial notebook
jayantj Jan 6, 2017
ef0e1e2
minor doc updates
jayantj Jan 6, 2017
ab07ef9
removes direct vocab access in FastText
jayantj Jan 6, 2017
2f37b04
suppresses numpy overflow warning while computing fasttext hash
jayantj Jan 6, 2017
5653632
load_word2vec_format returns KeyedVector, minor refactoring
jayantj Jan 5, 2017
2425478
Move load and save word2vec_format to KeyedVectors
tmylk Jan 24, 2017
21c2099
Fix save_word2vec_format in kv
tmylk Jan 24, 2017
95da5b3
Fix merge artifacts in test_wor2vec
tmylk Jan 24, 2017
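
The shape of the refactoring named in the title and commits above can be pictured with a minimal, self-contained sketch. Everything in it is hypothetical: `ToyKeyedVectors` and `ToyWord2Vec` are stand-ins invented for illustration, not gensim classes, and the only assumption carried over is the plain-text word2vec layout (a header line with vocabulary size and dimensionality, then one word and its components per line). It shows format I/O living on the keyed-vectors object, which the model holds as `wv`.

```python
import io


class ToyKeyedVectors:
    """Hypothetical stand-in: owns the word -> vector mapping and its file I/O."""

    def __init__(self):
        self.index2word = []   # words in insertion order
        self.vectors = {}      # word -> list of floats

    def add(self, word, vector):
        self.index2word.append(word)
        self.vectors[word] = list(vector)

    def save_word2vec_format(self, stream):
        """Write '<count> <dim>' followed by one 'word v1 v2 ...' line per word."""
        dim = len(self.vectors[self.index2word[0]]) if self.index2word else 0
        stream.write("%d %d\n" % (len(self.index2word), dim))
        for word in self.index2word:
            values = " ".join(repr(x) for x in self.vectors[word])
            stream.write("%s %s\n" % (word, values))

    @classmethod
    def load_word2vec_format(cls, stream):
        """Read the same layout back and return a new instance."""
        count, dim = (int(x) for x in stream.readline().split())
        kv = cls()
        for _ in range(count):
            parts = stream.readline().rstrip("\n").split(" ")
            word, values = parts[0], [float(x) for x in parts[1:]]
            if len(values) != dim:
                raise ValueError("expected %d components for %r" % (dim, word))
            kv.add(word, values)
        return kv


class ToyWord2Vec:
    """Hypothetical model: training state would live here, vectors live in `wv`."""

    def __init__(self):
        self.wv = ToyKeyedVectors()


model = ToyWord2Vec()
model.wv.add("king", [0.5, -1.25])
model.wv.add("queen", [0.25, 2.0])

buffer = io.StringIO()
model.wv.save_word2vec_format(buffer)   # I/O is called on the vectors, not the model
buffer.seek(0)
restored = ToyKeyedVectors.load_word2vec_format(buffer)
```

In this sketch, loading returns the vectors object rather than a full model, which mirrors the commit message "load_word2vec_format returns KeyedVector"; how gensim itself implements this is in the diff, not here.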
84 changes: 29 additions & 55 deletions docs/notebooks/FastText_Tutorial.ipynb
@@ -45,9 +45,7 @@
{
"cell_type": "code",
"execution_count": 1,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -99,9 +97,7 @@
{
"cell_type": "code",
"execution_count": 2,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -140,9 +136,7 @@
{
"cell_type": "code",
"execution_count": 3,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -176,9 +170,7 @@
{
"cell_type": "code",
"execution_count": 4,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -224,9 +216,7 @@
{
"cell_type": "code",
"execution_count": 5,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"ename": "KeyError",
@@ -258,9 +248,7 @@
{
"cell_type": "code",
"execution_count": 6,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -295,9 +283,7 @@
{
"cell_type": "code",
"execution_count": 7,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -314,8 +300,8 @@
]
},
"execution_count": 7,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -336,9 +322,7 @@
{
"cell_type": "code",
"execution_count": 8,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -356,8 +340,8 @@
]
},
"execution_count": 8,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -368,9 +352,7 @@
{
"cell_type": "code",
"execution_count": 9,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -379,8 +361,8 @@
]
},
"execution_count": 9,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -390,9 +372,7 @@
{
"cell_type": "code",
"execution_count": 10,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -401,8 +381,8 @@
]
},
"execution_count": 10,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -412,9 +392,7 @@
{
"cell_type": "code",
"execution_count": 11,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -432,8 +410,8 @@
]
},
"execution_count": 11,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -443,9 +421,7 @@
{
"cell_type": "code",
"execution_count": 13,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -539,8 +515,8 @@
]
},
"execution_count": 13,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -550,9 +526,7 @@
{
"cell_type": "code",
"execution_count": 15,
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
"outputs": [
{
"data": {
@@ -561,8 +535,8 @@
]
},
"execution_count": 15,
-"metadata": {},
-"output_type": "execute_result"
+"output_type": "execute_result",
+"metadata": {}
}
],
"source": [
@@ -592,7 +566,7 @@
"language_info": {
"codemirror_mode": {
"name": "ipython",
-"version": 2
+"version": 2.0
},
"file_extension": ".py",
"mimetype": "text/x-python",
@@ -604,4 +578,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
-}
+}
2 changes: 1 addition & 1 deletion gensim/models/doc2vec.py
@@ -53,7 +53,7 @@

from gensim.utils import call_on_class_only
from gensim import utils, matutils # utility fnc for pickling, common scipy operations etc
-from gensim.models.word2vec import Word2Vec, Vocab, train_cbow_pair, train_sg_pair, train_batch_sg
+from gensim.models.word2vec import Word2Vec, train_cbow_pair, train_sg_pair, train_batch_sg
from six.moves import xrange, zip
from six import string_types, integer_types, itervalues
