messiest/word-embedding

Word Embeddings

Sample texts from Project Gutenberg:

  • Joseph Conrad's Heart of Darkness

  • Homer's Iliad

    Note: This package uses pre-trained word vectors and downloads them automatically if they aren't found. The download is 1.2GB and is stored in the .vector_caches/ directory.

For additional texts, I recommend checking out another repo of mine that will help you download a corpus of Wikipedia articles.

Files

word_search.py

Finds the words most similar to the given word, ranked by cosine similarity (equivalently, the lowest cosine distance).

Command line usage:

python3 word_search.py <word>
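To make the ranking concrete, here is a minimal sketch of a cosine-similarity search in plain Python. It is not the code in word_search.py: the function names and the three-dimensional toy vectors are hypothetical, and the real script works on the downloaded pre-trained vectors instead.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention used here: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def nearest_words(word, vectors, top_n=3):
    """Return up to top_n (word, score) pairs, most similar first."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy vectors, for illustration only.
TOY_VECTORS = {
    "river": [0.9, 0.1, 0.0],
    "stream": [0.8, 0.2, 0.1],
    "ship": [0.3, 0.9, 0.1],
    "sword": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for other, score in nearest_words("river", TOY_VECTORS):
        print(f"{other}\t{score:.3f}")
```

Sorting by similarity in descending order puts the closest words first; sorting ascending would return the least related words instead.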

analogy.py

Finds the word that best completes an analogy of the form:

  • word 1 : word 2 :: word 3 : ???

Command line usage:

python3 analogy.py <word 1> <word 2> <word 3>
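A common way to solve such analogies with word vectors is vector arithmetic: compute word 2 - word 1 + word 3 and pick the vocabulary word closest to the result. The sketch below assumes that approach, which may differ from what analogy.py does; `complete_analogy` and the two-dimensional toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def complete_analogy(word1, word2, word3, vectors):
    """Answer 'word1 : word2 :: word3 : ???' by vector arithmetic."""
    v1, v2, v3 = vectors[word1], vectors[word2], vectors[word3]
    # Target point: start at word3 and apply the offset from word1 to word2.
    target = [b - a + c for a, b, c in zip(v1, v2, v3)]
    # The three input words are excluded so they cannot be returned.
    candidates = [w for w in vectors if w not in (word1, word2, word3)]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))


# Hypothetical toy vectors, for illustration only.
TOY_VECTORS = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}

if __name__ == "__main__":
    print(complete_analogy("man", "woman", "king", TOY_VECTORS))
```

Excluding the input words matters in practice, because the target vector often lies closest to word 3 itself.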

learn_embedding.py

Learns word embedding vectors from the provided text.

Command line usage:

python3 learn_embedding.py <text>
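The README does not say which training method learn_embedding.py uses. As a rough illustration of deriving word vectors from raw text, the sketch below builds simple co-occurrence count vectors, a count-based stand-in rather than the script's actual method. The `tokenize` and `cooccurrence_vectors` helpers and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (token.strip(".,;:!?\"'()") for token in text.lower().split())
    return [token for token in stripped if token]


def cooccurrence_vectors(tokens, window=2):
    """Map each word to counts of the words seen within `window` positions."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    counts = defaultdict(Counter)
    for position, word in enumerate(tokens):
        start = max(0, position - window)
        end = min(len(tokens), position + window + 1)
        for neighbor_position in range(start, end):
            if neighbor_position != position:
                counts[word][tokens[neighbor_position]] += 1
    vectors = {}
    for word in vocab:
        row = [0] * len(vocab)  # one dimension per vocabulary word
        for neighbor, count in counts[word].items():
            row[index[neighbor]] = count
        vectors[word] = row
    return vocab, vectors


if __name__ == "__main__":
    sample = "The ship sailed. The ship sank."
    vocab, vectors = cooccurrence_vectors(tokenize(sample), window=1)
    print(vocab)
    for word in vocab:
        print(word, vectors[word])
```

Count vectors like these grow with the vocabulary size; trained embeddings compress the same kind of context information into a fixed, much smaller number of dimensions.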

About

Tools to use and train word embeddings.
