Keras supports using pretrained word embeddings in your models. In many cases it makes sense to freeze the pretrained word embeddings at training time, and Keras provides an easy option for this in its Embedding layer: setting the trainable argument to False (see the Keras FAQ).
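As a minimal sketch, assuming an embedding_matrix already filled from the GloVe vectors and the usual constants from the Keras example (num_words, EMBEDDING_DIM, MAX_SEQUENCE_LENGTH), freezing the layer looks roughly like this:

```python
from keras.layers import Embedding

# embedding_matrix: a (num_words, EMBEDDING_DIM) array built from the GloVe file
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)  # keep the pretrained vectors frozen
```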
However, by adding the Embedding layer to your model, you end up saving the word embeddings alongside your model. This is fine if you are dealing with only a couple of models. In production environments, however, you might have several models that all use the same frozen pretrained embeddings. In that case you would be duplicating the embeddings in every model, which increases storage on disk by orders of magnitude and leads to much higher RAM usage.
It is more efficient to share the embeddings across the models and to perform the mapping from words to vectors only once for all of them. This repository shows how this can be done, building on the Keras example that uses GloVe embeddings and the 20 Newsgroup dataset.
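One way to picture the idea (a hypothetical sketch, not the repository's exact code; embeddings_index and embed_sequences are illustrative names): the word-to-vector lookup is loaded once, outside any model, and every model only ever consumes the resulting dense tensors.

```python
import numpy as np

def embed_sequences(tokenized_texts, embeddings_index, max_len, embedding_dim):
    """Map lists of tokens to a (num_texts, max_len, embedding_dim) array."""
    data = np.zeros((len(tokenized_texts), max_len, embedding_dim), dtype="float32")
    for i, tokens in enumerate(tokenized_texts):
        for j, word in enumerate(tokens[:max_len]):
            vector = embeddings_index.get(word)
            if vector is not None:  # out-of-vocabulary words stay all-zero
                data[i, j] = vector
    return data
```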
The first file, pretrained_word_embeddings.py, is the original example from Keras. The second file, pretrained_external_word_embeddings.py, is the one where the embeddings are external to the model. The main changes are in how the data is loaded and in the first layer of the model.
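To give a feel for the change in the first layer, here is a sketch (not a verbatim excerpt from either file; the constants and embedding_layer follow the snippet above):

```python
from keras.layers import Input

# Original example: the model takes integer word indices and stores the
# embedding matrix inside an Embedding layer.
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)

# External-embedding variant: the model takes already-embedded sequences, so
# no Embedding layer (and no copy of the embedding matrix) is saved with it.
embedded_input = Input(shape=(MAX_SEQUENCE_LENGTH, EMBEDDING_DIM), dtype='float32')
```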
To run it yourself, open the files and adjust the directories GLOVE_DIR and TEXT_DATA_DIR to match your setup, along with the other parameters.
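For orientation, these are module-level constants near the top of the scripts; they look roughly like this (the values below are placeholders, so point them at your local copies of the data):

```python
GLOVE_DIR = '/path/to/glove.6B/'
TEXT_DATA_DIR = '/path/to/20_newsgroup/'
MAX_SEQUENCE_LENGTH = 1000
EMBEDDING_DIM = 100
```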
Then simply run:
python pretrained_external_word_embeddings.py
You can compare the two files with your favorite diff tool to see exactly what changed.
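For example:

diff -u pretrained_word_embeddings.py pretrained_external_word_embeddings.py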