A Flask-based API to convert a Google Sheet to JSON. Structurally, this application is based on this example, which itself is a restructuring of this example and its accompanying article.
For both local and remote environments, you'll need to make sure the application has access to any Google Sheets data that it needs to load. In version 4 of the Sheets API, this happens through Service Accounts.
If you are authenticating this application with the Sheets API for the first time, you'll need to create a new Google Cloud project. Start by following this guide from Google. When you've finished Google's steps, you should have a new project.
Our specific Google Sheets integration uses the Sheetfu library, which has an authentication guide to finish this process. The screenshots are not necessarily up to date with the names Google uses for things.
Between these resources, you should follow these steps to create and access the authentication credentials:
- Create a new Google Cloud Platform project.
- Enable the Sheets and Drive APIs in the APIs & Services section of the Google Cloud Platform settings.
- Create a Service Account in the IAM & Admin section of the Google Cloud Platform settings.
- Download the new JSON-formatted key for that Service Account. Click the Actions menu for that account and choose the Manage keys option to create and download the key. Only use this key for one environment.
This new Service account will have an automatically-created email address. For this application, that email address must have at least Viewer-level access on any Google Sheets that it needs to access. It's best to give it that level of access on the folder level.
If this user is new or it is being given new access, it can take a few minutes for the changes to propogate.
If the Service Account user already exists in the Google Cloud Platform, you can access it at https://console.cloud.google.com/home/dashboard?project=[application-name].
If it hasn't been, you'll need your Google account added. An Administrator can do that at the correct dashboard URL by clicking "Add People to this Project."
Follow these steps to access the authentication credentials:
- Once you have access to the project's dashboard, click "Go to project settings" in the Project info box.
- Click Service Accounts in the IAM & Admin section of the Google Cloud Platform settings.
- If there is more than one service account, find the correct one.
- Click the Actions menu for that account and choose the Manage keys option.
- Click Add Key, choose Create new key, and choose JSON as the Key type. Click the Create button and download the key for that Service Account. Only use this key for one environment.
Once you have downloaded the JSON file with the credentials, you will use the values from it in the .env
file or in the project's Heroku settings. See the sections of this readme that cover authentication for Google Sheets. Once you've authenticated successfully, you don't have to keep the JSON file around, unless you'd like to have a backup.
Enter the configuration values from the JSON key downloaded above into the .env
file's values for these fields:
SHEETFU_CONFIG_TYPE
SHEETFU_CONFIG_PROJECT_ID
SHEETFU_CONFIG_PRIVATE_KEY_ID
SHEETFU_CONFIG_PRIVATE_KEY
SHEETFU_CONFIG_CLIENT_EMAIL
SHEETFU_CONFIG_CLIENT_ID
SHEETFU_CONFIG_AUTH_URI
SHEETFU_CONFIG_TOKEN_URI
SHEETFU_CONFIG_AUTH_PROVIDER_URL
SHEETFU_CONFIG_CLIENT_CERT_URL
In the project's Heroku settings, enter the configuration values from the production-only JSON key downloaded above into the values for these fields:
SHEETFU_CONFIG_TYPE
SHEETFU_CONFIG_PROJECT_ID
SHEETFU_CONFIG_PRIVATE_KEY_ID
SHEETFU_CONFIG_PRIVATE_KEY
SHEETFU_CONFIG_CLIENT_EMAIL
SHEETFU_CONFIG_CLIENT_ID
SHEETFU_CONFIG_AUTH_URI
SHEETFU_CONFIG_TOKEN_URI
SHEETFU_CONFIG_AUTH_PROVIDER_URL
SHEETFU_CONFIG_CLIENT_CERT_URL
This API supports storing JSON in either the application's Redis instance (whether locally or on Heroku) or in an Amazon S3 bucket. If you do not enable S3 usage, Redis will be used by default.
Before running the application, you'll need to run a Redis server for caching data. One way to do this is with Homebrew.
- Run
brew update
thenbrew install redis
- If you want Redis to start automatically on login, run
brew services start redis
. To stop it, runbrew services stop redis
. If you want to run Redis in a terminal shell instead, you can runredis-server /usr/local/etc/redis.conf
instead of using brew service. - Test if Redis is running with the command
redis-cli ping
. Redis should respond with "PONG." - You shouldn't need a graphic interface for this project, but if you prefer to use one, Medis is free on the Mac App Store.
To run Redis on Heroku, installing the free Heroku Redis add-on should be sufficient.
Use the following fields in your .env
or in your Heroku settings.
CACHE_TYPE = "RedisCache"
CACHE_REDIS_HOST = "redis"
CACHE_REDIS_PORT = "6379"
(unless you are using a non-default port for Redis)CACHE_REDIS_DB = "0"
(unless you need a separate Redis database. Redis creates databases in numeric order, so you can use other numbers)CACHE_REDIS_URL = "redis://127.0.0.1:6379/0"
(make sure the:6379
matches your port value, and that/0
matches your Redis database number)CACHE_DEFAULT_TIMEOUT = "500"
To push JSON data from this application to Amazon AWS, we use the boto3
library.
To upload a file to S3, fill in these .env
values to match your S3 account. If S3 is enabled, URLs for your files will be https://s3.amazonaws.com/[your bucket]/[your folder]/[this spreadsheet's cachekey].json
.
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_BUCKET
AWS_FOLDER
USE_AWS_S3
use "true" or "false" for this; it will be converted to lowercase if you forget. "false" is the default value for this setting.
We use a basic setup of Flask-JWT-Extended to manage application access.
To generate API keys, run this code in a Python shell:
import secrets
secrets.token_hex(64)
Do this for each valid API key that an application can use to access this API.
Store a list of valid API keys in the .env
file or in Heroku configuration.
VALID_API_KEYS = '["key1", "key2"]'
Store a single secret key in the JWT_SECRET_KEY
field of the .env
file or in Heroku configuration.
There are a few general .env
variables not already discussed. You can see these settings, and their default values (or the kind of value they expect), in the repository's .env-example
file.
- Install
git
- Get the code:
git clone https://github.com/MinnPost/google-sheet-to-json-api.git
- Change the directory:
cd google-sheet-to-json-api
- Create a
.env
file based on the repository's.env-example
file in the root of your project. - Run
pipenv install
. - Run
pipenv shell
- Run
flask run --host=0.0.0.0
. This creates a basic endpoint server at http://0.0.0.0:5000.
This application should be deployed to Heroku. If you are creating a new Heroku application, clone this repository with git clone https://github.com/MinnPost/google-sheet-to-json-api.git
and follow Heroku's instructions to create a Heroku remote.
Currently, this application has three endpoints:
authorize
is used to get a token from a valid API key. The token is then required by the/parser/custom-overwrite/
endpoint. This endpoint acceptsPOST
requests./parser/
is the main endpoint. It acceptsGET
requests, and will return JSON of that Google sheet's data and cache it. If there is a customized JSON structure that has already been cached and has not expired, it will return that instead./parser/custom-overwrite/
receivesPOST
requests. It receives custom formatted JSON, caches it, and returns it. APOST
request requiresapplication/json
as theContent-Type
header. It's a good idea to useapplication/json
forGET
requests as well to make theAuthorization
header easier, but you don't necessarily have to do this.
api_key
is required in thePOST
body for requests toauthorize
. A request with a valid API key returns atoken
.token
is required as anAuthorization
header value on/parser/
and/parser/custom-overwrite/
requests. A request without a valid token in that header will fail. It can be passed to aGET
orPOST
endpoint.
To pass this header to an endpoint within Python, do this:
authorized_headers = {
"Authorization": f"Bearer {valid_token_value}"
}
Both URL endpoints support many of the same data parameters, but they get passed on the URL (in the case of a GET
endpoint) or in the post body (in the case of a POST
endpoint). Most parameters are usable by both endpoints, but if they are end-point specific that will be noted.
spreadsheet_id
is a required ID of a Google Sheet that the application user can access.worksheet_names
is an optional parameter of worksheet names, such asRaces|Candidates
. If it is left blank, the endpoint will load the first worksheet in the spreadsheet. If there are multiple worksheets provided, they will be sorted alphabetically to keep consistency.
output
is a full, customized JSON output of the modified Google Sheet data as it should be stored. It is only accepted on the/parser/custom-overwrite
endpoint as aPOST
parameter.
external_use_s3
is an optional parameter that takes a value of "true" or "false", and determines whether JSON data is saved to Amazon S3 instead of in Redis. A value for this parameter will override the default setting in the.env
file.
All of these parameters are specific to the Redis cache. They are all optional, but have default values.
bypass_cache
takes a value of "true" or "false". "false" is the default value. If the value is "true" the application will not check the Redis cache for any stored data, but will attempt to parse the Google Sheet directly and return its data.delete_cache
takes a value of "true" or "false". "false" is the default value. If the value is "true" the application will delete the cached data for the Google Sheet.cache_data
takes a value of "true" or "false". "true" is the default value. If the value is "false" the application will not store data for the Google Sheet in the cache.cache_timeout
takes the number of seconds that the cache should last before it expires. If present, this will override the default value from the.env
file. This value only has an effect on requests where data is actually being cached.