Commit

Expose sentence to vector fn
hailiang-wang committed Sep 21, 2018
1 parent 5a37ca5 commit 9979984
Showing 4 changed files with 29 additions and 1 deletion.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,14 @@
# 3.8
* Get the vector of a segmented sentence; the vector is composed in a bag-of-words (BoW) fashion


```
sentence: the sentence, i.e. its segmented words joined by spaces
ignore: whether to ignore OOV words; when False, a random vector is generated
```


# 3.7
* change import path of utils in word2vec.py to local path
* expose vector fn

9 changes: 9 additions & 0 deletions README.md
@@ -127,6 +127,15 @@ array([-2.412167 , 2.2628384 , -7.0214124 , 3.9381874 , 0.8219283 ,
dtype=float32)
```

### synonyms#sv(sentence, ignore=False)
Get the vector of a segmented sentence; the vector is composed in a bag-of-words (BoW) fashion

```
sentence: the sentence, i.e. its segmented words joined by spaces
ignore: whether to ignore OOV words; when False, a random vector is generated
```
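
The parameter semantics above can be illustrated with a small, self-contained sketch. It does not call the `synonyms` package: `TOY_VECTORS`, `DIM`, `toy_sv` and the `seed` parameter are hypothetical names made up for this illustration, and returning one vector per word is an assumption about what "BoW" means here, not a statement about the library's actual return value.

```python
import random

# Hypothetical toy embedding table (assumption for illustration only);
# the real package looks words up in a trained word2vec model.
TOY_VECTORS = {
    "北京": [1.0, 0.0, 0.0],
    "天气": [0.0, 1.0, 0.0],
}
DIM = 3  # toy dimensionality, chosen arbitrarily


def toy_sv(sentence, ignore=False, seed=0):
    """Sketch of the documented behaviour: one vector per space-separated word.

    sentence: segmented words joined by spaces
    ignore:   if True, OOV words are skipped; if False, a random vector
              stands in for each OOV word
    """
    rng = random.Random(seed)  # seeded only to keep the sketch reproducible
    vectors = []
    for word in sentence.split():
        if word in TOY_VECTORS:
            vectors.append(list(TOY_VECTORS[word]))
        elif not ignore:
            # OOV word and ignore=False: substitute a random vector
            vectors.append([rng.uniform(-1.0, 1.0) for _ in range(DIM)])
        # OOV word and ignore=True: contribute nothing
    return vectors


with_oov = toy_sv("北京 天气 很好")                  # "很好" is OOV in the toy table
without_oov = toy_sv("北京 天气 很好", ignore=True)
print(len(with_oov), len(without_oov))
```

In this sketch the only difference between the two calls is whether the out-of-vocabulary word contributes a (random) vector, which is the choice the `ignore` flag is documented to control.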


## PCA
Principal component analysis, taking “人脸” (human face) as an example:

2 changes: 1 addition & 1 deletion setup.py
@@ -13,7 +13,7 @@

setup(
name='synonyms',
- version='3.7.0',
+ version='3.8.0',
description='Chinese Synonyms for Natural Language Processing and Understanding',
long_description=LONGDOC,
author='Hai Liang Wang, Hu Ying Xi',
9 changes: 9 additions & 0 deletions synonyms/synonyms.py
@@ -206,6 +206,15 @@ def _levenshtein_distance(sentence1, sentence2):
# print("smoothing[%s| %s]: %s -> %s" % (sentence1, sentence2, d, s))
return s

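def sv(sentence, ignore=False):
    '''
    Get the vector of a segmented sentence; the vector is composed in a bag-of-words (BoW) fashion
    sentence: the sentence, i.e. its segmented words joined by spaces
    ignore: whether to ignore OOV words; when False, a random vector is generated
    '''
    return _get_wv(sentence, ignore = ignore)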


def v(word):
    '''
    Get the vector of a word; raises KeyError when the word is OOV

0 comments on commit 9979984
