next-bacon-prediction

This demo uses the Universal Sentence Encoder to build a next-word prediction service. Francis Bacon was an English essayist famous for his aphoristic style. A disregard for traditional essay convention noticeably marks his work. The model in this demo is trained on his essay Of Truth.

Process

The code in this repo is meant as a demo for how a similar service can be built. The model has not be tuned or evaluated. Similarly, there are no unit tests or deployment automation mechanisms. I'd be happy to provide more detail on either. Please reach out!

Training

The model was built in a cloud notebook using the 1999 most frequently occuring words. To construct the train set, the corpus was windowed over and randomly, phrases of of 5-15 words were samples. The label is simply the next word in each phrase.

The first layer of the model is the Universal Sentence Encoder which is hosted on TensorFlow Hub. We specifically use version 5 because previous ones have comparability issues with Keras' training and inference methods. It is also good to note that the embeding should be read as a hub.KerasLayer. Importing it as a regular tf module and building a custom Keras layer with a Lambda function can result in serialization errors, regardless of save API, when saving the model. tf and Keras operations still don't always play nice 😔.

The output of the model is a 2000 (1999+UKN) element matrix of next-word probabilities.

After the model was trained, the Keras save API was used to export the model as a Protobuf file and index files. The dictionary containing index2word mappings was exported as a JSON file.

Deployment

The model is hosted on an EC2 (at 3.235.192.240:8501) instance running the TensorFlow Serving image. tf Serving requires models to be versioned by directory. Here, v1 is the where our model lives. Within this directory, the following artifacts, which are generated through the Keras save API, are added.

|__ v1
    |__ saved_model.pb
    |__ assets (optional)
    |__ variables
        |__ variables.index
        |__ variables.data-00000-of-00001

Our variables.data-00000-of-00001 is around 600MB so it has been omitted from GitHub.

To request a prediction(s):

$ curl -d '{"instances": [<phrase1>, <phrase2>, ...]}' \ 
  -X POST http:/3.235.192.240:8501/v1/models/next-bacon-prediction:predict

# Returns => { "preidctions:" [0.2, 0.4, ...], [0.5,0.43, ...]}

For our example we would only like the next word, not the entire probability matrix. To do this we can set up a Flask API which contains the index2word mapping (see /api directory). I have provided an example Dockerfile to run it. It will ping the tf Serving endpoint then use index2word to convert the matrix to the most likely next word. Having a second service enables us to customize what we would like to return. The service can be easily extended to return the top 5 most likely next words. Unlike the tf serving API, it only supports 1 prediction per request.

ex:

import requests
requests.post("http://3.235.192.240:5000/api/get_next_word",
                        data="the world")
# Returns => {"next_word":"governs"}

NOTE: I've since torn down the server

Thanks for reading!

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
api		api
model_artifacts		model_artifacts
README.md		README.md
requirements.txt		requirements.txt
train.ipynb		train.ipynb
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

next-bacon-prediction

Process

Training

Deployment

About

Releases

Packages

Contributors 2

Languages

SunriseLong/next-bacon-prediction

Folders and files

Latest commit

History

Repository files navigation

next-bacon-prediction

Process

Training

Deployment

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages