Deploy a sentiment-analysis API in a few lines of code

This example illustrates how to use BudgetML to launch an API endpoint that runs on a preemptible Google Cloud Platform VM.

The endpoint is issued an SSL certificate via LetsEncrypt. The server automatically spins back up whenever GCP shuts the instance down. The endpoint is a full-fledged FastAPI server, complete with interactive Swagger docs.

This particular example uses the awesome HuggingFace library to deploy a simple sentiment analysis model based on BERT. It uses the HuggingFace pipeline convenience function to achieve this. Here is the simple Predictor class required for this (see predictor.py).

class Predictor:
    def load(self):
        from transformers import pipeline
        self.model = pipeline(task="sentiment-analysis")

    async def predict(self, request):
        # We know we are going to use the `predict_dict` method, so we use
        # the request.payload pattern
        req = request.payload
        return self.model(req["text"])[0]

Just override two functions and the API is ready. No frills.
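
If you want to convince yourself that the class behaves as expected before involving BudgetML at all, you can exercise it directly in a Python shell. The SimpleNamespace object below is only a stand-in for the request object that BudgetML passes to predict at runtime:

import asyncio
from types import SimpleNamespace

from predictor import Predictor

predictor = Predictor()
predictor.load()  # downloads the default HuggingFace sentiment-analysis pipeline

# Stand-in for the real request object; only the `payload` attribute is used.
request = SimpleNamespace(payload={"text": "BudgetML is so awesome. I love it!"})
print(asyncio.run(predictor.predict(request)))
# -> e.g. {'label': 'POSITIVE', 'score': 0.99...}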

Here is how you can try it for yourself.

Pre-requisites

Create a Google Cloud Project, and additionally follow the pre-requisite steps:

  • Enable billing.
  • Enable APIs:
    • Cloud Storage
    • Compute Engine API
    • Google Cloud Pub/Sub API
    • Google Cloud Functions
    • Cloud Build API
    • Cloud Scheduler API
    • Cloud Functions API
  • Create a service account with the roles/editor role.

For ease, install the gcloud CLI and use it as follows:

export GCP_PROJECT=<enter your Google Cloud Project name>
export SA_NAME="sa-name"
export SA_PATH="$(pwd)/sa.json"

# creating the actual Service Account
gcloud iam service-accounts create ${SA_NAME}

# creating a json-key for the new Service Account
gcloud iam service-accounts keys create ${SA_PATH} \
    --iam-account ${SA_NAME}@${GCP_PROJECT}.iam.gserviceaccount.com
    
# give permissions to the new Service Account
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
    --member=serviceAccount:${SA_NAME}@${GCP_PROJECT}.iam.gserviceaccount.com \
    --role "roles/editor" 

Export env variables

export DOMAIN='DOMAIN'
export SUBDOMAIN='SUBDOMAIN' 
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
export GCP_PROJECT='GCP_PROJECT'
export GCP_MACHINE_TYPE='GCP_MACHINE_TYPE'  # change to whatever machine type you like

Create a static IP (if not already created)

python create_ip.py

This will create a static IP address and print it on the terminal. Set it in the environment:

export IP_ADDRESS='IP_ADDRESS' # change to static IP generated from `create_ip.py`

Create the appropriate A record

In order to issue the SSL certificate via LetsEncrypt, you need to create an A record that binds the static IP address to the subdomain and domain provided. This is simple with most popular domain providers; there are guides available for Hostgator, Namecheap and GoDaddy. These are the values to set:

Name:  $SUBDOMAIN.$DOMAIN (e.g. model.mywebsite.com)
Value: $IP_ADDRESS (e.g. 35.137.10.45)

This might take a few minutes or even hours to propagate, so it is good to pause here for a bit (or try the local deploy while you wait).
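
One quick way to check whether the record has propagated is to resolve the hostname from Python and compare the result against the static IP:

import os
import socket

# Resolve $SUBDOMAIN.$DOMAIN and compare it with the static IP exported earlier.
hostname = f"{os.environ['SUBDOMAIN']}.{os.environ['DOMAIN']}"
resolved = socket.gethostbyname(hostname)
print(hostname, "->", resolved, "| matches static IP:", resolved == os.environ["IP_ADDRESS"])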

Deploy locally (optional)

It is often a good idea to deploy locally to make sure that the Predictor class is created properly.

Note that a Docker installation is required for this.

python deploy_local.py

This will pull the base API image, add your requirements (if any), and create a local container with the same logic as the main launch function. After it completes, a USERNAME and PASSWORD will be printed, and the API will be accessible at 0.0.0.0:8000/docs.
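
A quick way to confirm that the container is up is to request the docs page (assuming the interactive docs are served without authentication, which is FastAPI's default; a 200 response means the server is running):

import requests

# The Swagger UI should respond with HTTP 200 once the local container is running.
response = requests.get("http://0.0.0.0:8000/docs")
print(response.status_code)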

Deploy to endpoint

Once tested locally, you are ready to deploy to the cloud. Use:

python deploy.py

There will be a lot of output, but at the end a USERNAME and PASSWORD should show up (to be used in the next step). Here is what happened:

  • A Google Cloud Storage bucket is created in GCP_PROJECT, to which the Predictor class is uploaded.
  • A (preemptible) VM is launched with a startup script that, on startup, runs the docker-swag image and a custom API image based on the base budgetml image.
  • A Cloud Scheduler job is created that attempts to start the server every minute, to ensure minimum downtime.

The above three combined ensure the cheapest possible API endpoint with the lowest possible downtime.
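
For reference, the core of deploy.py boils down to handing the Predictor class and the environment variables above to BudgetML. The sketch below follows the launch API shown in the BudgetML README; the exact keyword arguments in this example's deploy.py may differ slightly:

import os

import budgetml
from predictor import Predictor

# Point BudgetML at the project and launch the endpoint on a preemptible VM.
client = budgetml.BudgetML(project=os.environ["GCP_PROJECT"])
client.launch(
    Predictor,
    domain=os.environ["DOMAIN"],
    subdomain=os.environ["SUBDOMAIN"],
    static_ip=os.environ["IP_ADDRESS"],
    machine_type=os.environ["GCP_MACHINE_TYPE"],
)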

Interact with API

Navigating to http://0.0.0.0:8000/docs (after deploy_local.py) or https://$SUBDOMAIN.$DOMAIN/docs (after deploy.py) will show a Swagger docs page generated by FastAPI. To log in, press the lock icon and enter the username and password generated above. Then you can play with the predict_dict/ endpoint by making a POST request with the following JSON body:

{
  "payload": {"text": "BudgetML is so awesome. I love it!"}
}

and the response should be something like:

{
  "label": "POSITIVE",
  "score": 0.999879777431488
}

Alternatively, you can use curl. First, get a token using the USERNAME and PASSWORD generated above.

curl -X POST "ENDPOINT/token" -H  "accept: application/json" -H  "Content-Type: application/x-www-form-urlencoded" -d "grant_type=&username=USERNAME&password=PASSWORD&scope=&client_id=&client_secret="

Then use the token generated from the above command to hit the API.

curl -X POST "ENDPOINT:8000/predict_dict/" -H  "accept: application/json" -H  "Authorization: Bearer TOKEN" -H  "Content-Type: application/json" -d "{\"payload\":{\"text\":\"BudgetML is so awesome. I love it!\"}}"

Again, the endpoint will be http://0.0.0.0:8000 for a local deployment and https://$SUBDOMAIN.$DOMAIN for the actual deployment.
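
The same two calls look roughly like this from Python with the requests library (replace ENDPOINT, USERNAME and PASSWORD with your own values; the token response follows the standard FastAPI OAuth2 password flow):

import requests

ENDPOINT = "https://model.mywebsite.com"  # or "http://0.0.0.0:8000" for a local deploy

# Exchange the generated credentials for a bearer token.
token = requests.post(
    f"{ENDPOINT}/token",
    data={"grant_type": "", "username": "USERNAME", "password": "PASSWORD"},
).json()["access_token"]

# Call the prediction endpoint with the token.
response = requests.post(
    f"{ENDPOINT}/predict_dict/",
    headers={"Authorization": f"Bearer {token}"},
    json={"payload": {"text": "BudgetML is so awesome. I love it!"}},
)
print(response.json())  # e.g. {'label': 'POSITIVE', 'score': 0.99...}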

See it in action

Here is a screenshot of what to expect in the Swagger docs!

[Screenshot of the Swagger UI]

Questions

If you have questions, please open up an issue on GitHub. That's the easiest way to consolidate all requests for the maintainer team.