Python package to retrieve and process Google Docs revision history data.
Documentation website: https://harvard-vpal.github.io/gdocrevisions/docs
- Python 3
pip install gdocrevisions
- Create a Google service account and download a JSON credentials file for it.
- Share a Google Doc with the service account email (e.g. service-account@appspot.gserviceaccount.com)
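To find the email address to share the document with, one option is to read it from the credentials file. This is a minimal sketch, assuming the file is a standard service account key whose JSON includes a client_email field; the helper name service_account_email is hypothetical and not part of gdocrevisions.

```python
import json


def service_account_email(credentials_path):
    """Return the client_email stored in a service account JSON key file.

    Hypothetical helper for illustration; not part of gdocrevisions.
    """
    with open(credentials_path, encoding="utf-8") as f:
        info = json.load(f)
    # Assumption: the key file contains a "client_email" entry.
    return info["client_email"]


# Example call (requires a real key file on disk):
# service_account_email("my-credentials.json")
```

The returned address is the one to enter in the Google Docs sharing dialog.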
Code example demonstrating how to:
- generate credentials with the google-auth library
- load a document with the gdocrevisions GoogleDoc class
- inspect a few attributes of the GoogleDoc object instance
from google.oauth2 import service_account
import gdocrevisions
# The file id can be found in the URL
# e.g. https://docs.google.com/document/d/<FILE_ID>
FILE_ID = 'abcdefg12345'
# Specify the service account credentials file
CREDENTIAL_FILE = 'my-credentials.json'
SCOPE = ['https://www.googleapis.com/auth/drive']
credentials = service_account.Credentials.from_service_account_file(CREDENTIAL_FILE, scopes=SCOPE)
# Initialize a GoogleDoc object instance, which retrieves revision data
gdoc = gdocrevisions.GoogleDoc(FILE_ID, credentials)
# Doc and revision data is available in the object instance attributes
gdoc.metadata
gdoc.revisions
gdoc.revisions[0].operation
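If only the document URL is at hand, the file ID used above can be pulled out of it with a small helper. The sketch below assumes the URL has the /document/d/<FILE_ID> form shown in the example and that IDs contain only letters, digits, hyphens, and underscores; extract_file_id is a hypothetical name, not a gdocrevisions function.

```python
import re

# Assumption: the ID is the path segment right after /document/d/ and is made
# up of letters, digits, hyphens, and underscores.
_DOC_URL_PATTERN = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_file_id(url):
    """Return the file ID embedded in a Google Docs URL (hypothetical helper)."""
    match = _DOC_URL_PATTERN.search(url)
    if match is None:
        raise ValueError(f"No Google Docs file ID found in: {url!r}")
    return match.group(1)
```

The result can be passed as the first argument to gdocrevisions.GoogleDoc in place of a hard-coded FILE_ID.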
A Docker-based environment is specified for development and testing.
The environment variable GOOGLE_SERVICE_ACCOUNT_INFO must be populated in order to run tests. The variable should contain the text content of a Google service account file, for use with google.oauth2.service_account.Credentials.from_service_account_info. It can be defined in a .env file in the same directory as docker-compose.yml.
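As a rough illustration of how such a variable can be turned into credentials, the sketch below parses the JSON text and hands the resulting dictionary to from_service_account_info. The function names and the error handling are assumptions made for this example; the project's own test setup may do this differently.

```python
import json
import os

SCOPE = ["https://www.googleapis.com/auth/drive"]


def load_service_account_info(environ=os.environ):
    """Parse the JSON text stored in GOOGLE_SERVICE_ACCOUNT_INFO into a dict."""
    raw = environ.get("GOOGLE_SERVICE_ACCOUNT_INFO")
    if not raw:
        raise RuntimeError("GOOGLE_SERVICE_ACCOUNT_INFO is not set")
    return json.loads(raw)


def credentials_from_env(environ=os.environ):
    """Build service account credentials from the environment (hypothetical helper)."""
    # Imported here so the parsing helper above works without google-auth installed.
    from google.oauth2 import service_account

    info = load_service_account_info(environ)
    return service_account.Credentials.from_service_account_info(info, scopes=SCOPE)
```

Credentials built this way can be passed to gdocrevisions.GoogleDoc in the same way as credentials loaded from a file.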