
Tests for the evaluate_word_pairs function #1061

Merged: 62 commits merged into piskvorky:develop on Dec 28, 2016

Conversation

akutuzov (Contributor)

Tests for evaluating a model against semantic similarity datasets (#1047).
Also fixes an error in the function call.

tmylk and others added 30 commits (November 5, 2015, 19:07). Merge conflicts were resolved in CHANGELOG.txt and gensim/models/word2vec.py.
akutuzov (Contributor, Author)

@tmylk the tests are ready.

tmylk (Contributor) left a comment


Thanks for the tests. An oov_ratio sanity test would be great.

pearson = correlation[0][0]   # Pearson correlation coefficient
spearman = correlation[1][0]  # Spearman rank correlation coefficient
self.assertTrue(0.1 < pearson < 1.0)
self.assertTrue(0.1 < spearman < 1.0)
Contributor

Could we please test for oov_ratio in correlation[2] too?

akutuzov (Contributor, Author)

Sure, done.
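
For reference, a minimal sketch of what such a sanity check could look like (the exact assertion added in the PR may differ); it assumes evaluate_word_pairs reports the share of out-of-vocabulary pairs as a percentage in correlation[2]:

oov = correlation[2]  # assumed: percentage of word pairs skipped as out-of-vocabulary
self.assertTrue(0.0 <= oov < 100.0)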

tmylk merged commit 88d032b into piskvorky:develop on Dec 28, 2016
tmylk (Contributor) commented on Dec 28, 2016

Thanks for the improvement!

tmylk (Contributor) commented on Dec 30, 2016

By the way, how is it better than using https://github.com/mfaruqui/eval-word-vectors?
@anmol01gulati, what code did you use to convert gensim word2vec vectors to that format? A short script for that would be useful.

akutuzov (Contributor, Author)

It's better in that this code works directly from Gensim :)
In fact, my code is simpler, as it uses SciPy functions for the Pearson and Spearman coefficients (eval-word-vectors implements Spearman from scratch). It also offers some useful options, such as case-(in)sensitivity and smart handling of OOV pairs.
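
For illustration, a minimal usage sketch of evaluate_word_pairs (the model file and the word-pair dataset path are placeholders; the return value is the same tuple indexed as correlation[0..2] in the tests above):

import gensim

# Load a pre-trained model (placeholder file name).
model = gensim.models.Word2Vec.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

# Evaluate against a tab-separated word-pair similarity file such as WordSim-353.
# The result is (pearson, spearman, oov_ratio): the first two entries are
# SciPy (statistic, p-value) tuples, the last is the share of OOV pairs.
pearson, spearman, oov_ratio = model.evaluate_word_pairs('wordsim353.tsv')
print(pearson[0], spearman[0], oov_ratio)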

jayantj pushed a commit to jayantj/gensim that referenced this pull request Jan 4, 2017
anmolgulati (Contributor)

I agree with @akutuzov. The code currently in gensim for the Pearson and Spearman coefficients is shorter. But I feel we could also include the whole set of evaluation datasets provided in https://github.com/mfaruqui/eval-word-vectors. It's just 205 KB and contains all the major gold standards, so it would be good to integrate them into gensim itself and have one method to evaluate word2vec models directly inside gensim. What do you think?

The script I used to convert word2vec vectors into the format expected by eval-word-vectors is actually quite small:

import gensim

# Load the pre-trained Google News vectors in word2vec binary format.
model = gensim.models.Word2Vec.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

# Collect the evaluation vocabulary (first token on each line).
words = [line.split()[0] for line in open(
    "eval-word-vectors/vocab.txt", 'r')]

# Write in-vocabulary vectors in the plain-text format expected by
# eval-word-vectors: one "word v1 v2 ... vn" entry per line.
# (Text mode 'w' is used, since strings are written.)
with open('output_vecs.txt', 'w') as f:
    for word in words:
        if word in model:
            word_vector = model[word]
            f.write("%s " % word)
            f.write(" ".join(str(x) for x in word_vector))
            f.write("\n")

akutuzov (Contributor, Author) commented on Jan 7, 2017

I am not sure it's a good idea to overload Gensim with various semantic similarity datasets included in the distribution.
Most people would use their own gold datasets anyway, either because they deal with non-English data or because their text preprocessing differs from the preprocessing in SimLex999 or WS353 (lemmatization/stemming, POS-tagging, etc.).
So I think it's better to leave WS353 as an example (and for testing), and maybe put a couple of links to other datasets in the documentation.

anmolgulati (Contributor)

Yeah, you are right. Sounds good.
