Clip2Story is a prototype web application that transcribes news video clips, summarizes transcripts using OpenAI, and feeds summaries as the first draft of a story into a CMS.

associatedpress/local-ai-ksat


Clip2Story

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact

About The Project

Clip2Story is a prototype web application that transcribes news video clips, summarizes transcripts using OpenAI, and feeds summaries as the first draft of a story into a CMS.

This project was originally built for KSAT-TV in San Antonio, Texas. The Associated Press and Stanford University collaborated to develop this application as part of the Local News AI Initiative, funded by the John S. and James L. Knight Foundation, which aims to leverage AI for the benefit of local news.

The development team thanks the staff at KSAT-TV and Graham Media Group for proposing this project, and for their participation, feedback, and encouragement.

Project Objectives

  • Reduce Workloads: Decrease the burden on journalists of posting new stories to digital platforms.
  • Build New Capabilities: Allow journalists to experiment with generating text content from different types of videos.
  • Cost Effectiveness: Running the system should not incur expenses greater than the savings or revenue gains generated by using it.

How It Operates

Clip2Story functions through this process:

  1. Video Upload: The system accepts pre-edited video clips uploaded via the web application or via the Trint transcription service.
  2. Transcription: A transcript of the video is generated via a call to the Trint API.
  3. Approval of Transcript: After a transcript is completed, the system waits for a journalist to review the transcript and, if needed, edit it for accuracy. (The image above shows the dashboard.)
  4. Summarization: The validated transcript is summarized via an API call to OpenAI's GPT 3.5 Turbo model. (GPT prompts are controlled via the Django Administration interface; see image below.)
  5. Keywording: Relevant tags for the transcript are generated via an API call to OpenAI's GPT 3.5 Turbo model.
  6. Publication: The summary and keywords are uploaded to the Arc XP CMS via API as a draft story for review by a journalist.
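
The six steps above amount to a small state machine with one human gate: nothing is summarized until a journalist has approved the transcript. The sketch below is a minimal illustration of that flow, not the project's actual code. The `Stage`, `Clip`, and `advance` names are hypothetical, and the calls to Trint, OpenAI, and Arc XP are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical stage names mirroring steps 1-6 above."""
    UPLOADED = auto()
    TRANSCRIBED = auto()
    APPROVED = auto()
    SUMMARIZED = auto()
    KEYWORDED = auto()
    PUBLISHED = auto()


# Enum members iterate in definition order, which is the pipeline order.
PIPELINE = list(Stage)


@dataclass
class Clip:
    title: str
    stage: Stage = Stage.UPLOADED
    approved_by_journalist: bool = False


def advance(clip: Clip) -> Stage:
    """Move a clip one stage forward, stopping at the human review gate."""
    if clip.stage is Stage.PUBLISHED:
        return clip.stage  # final stage, nothing left to do
    next_stage = PIPELINE[PIPELINE.index(clip.stage) + 1]
    # Automated steps never run past a transcript no journalist has reviewed.
    if next_stage is Stage.APPROVED and not clip.approved_by_journalist:
        return clip.stage
    clip.stage = next_stage
    return clip.stage
```

Calling `advance` repeatedly on a new `Clip` leaves it at `Stage.TRANSCRIBED` until `approved_by_journalist` is set to `True`, which is the behavior step 3 describes.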

Built With

  • Python

Getting Started

For initial configuration of third-party apps, prompts, and user accounts, see the administrator documentation.

Production

This application was originally designed to be hosted on Google Cloud Platform. See GCP Deployment for the details. A job on Cloud Run runs every 30 minutes to check for and delete old videos. See Management Tasks and Jobs for details.
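
As a rough illustration of what such a cleanup job has to decide, the sketch below picks out videos older than a retention window. It is not the project's management command: the `find_expired` function, the 7-day `RETENTION` value, and the name-to-timestamp mapping are all assumptions made for the example. See Management Tasks and Jobs for the real behavior.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; the real value is defined by the project.
RETENTION = timedelta(days=7)


def find_expired(videos, now=None, retention=RETENTION):
    """Return the sorted names of videos uploaded longer ago than `retention`.

    `videos` maps a file name to its timezone-aware upload time.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    return sorted(name for name, uploaded_at in videos.items() if uploaded_at < cutoff)
```

A scheduled job would call a function like this on each run and then delete whatever it returns.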

Prerequisites

  • Django
  • Postgres
  • Trint API access
  • Arc XP API access
  • OpenAI API access
  • Google Cloud Platform (for production hosting)

Installation

Install Postgres.app.

Create a local version of the database. For this step, you may need to locate the createdb command on your computer. This will vary depending on the version of Postgres.app that you installed.

# First try plain old createdb

createdb summarizerdb

# If the above doesn't work, try locating the createdb command in
# Postgres.app folder. Below is an example if you're on Postgres.app version 15

/Applications/Postgres.app/Contents/Versions/15/bin/createdb summarizerdb

Grab a copy of the codebase and install Python requirements.

git clone git@github.com:associatedpress/local-ai-ksat.git
cd local-ai-ksat
pipenv install --dev

NOTE: All of the following commands should be executed on the command line, from the top-level local-ai-ksat/ directory, unless otherwise specified.

Set up a .env file to store secrets and other project-specific environment variables.

cp env.template .env

Add your database username to the .env file:

echo DJANGO_DB_USER=$(whoami) >> .env
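
For illustration, a Django settings module could consume this variable roughly as follows. This is a sketch under assumptions, not the project's actual settings file: the `build_databases` helper and the `DJANGO_DB_NAME` override are hypothetical, while `DJANGO_DB_USER` and the `summarizerdb` database name come from the steps above. Note that `pipenv shell` loads `.env` into the environment automatically.

```python
import os


def build_databases(env=os.environ):
    """Build a Django-style DATABASES setting from environment variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # DJANGO_DB_NAME is a hypothetical override; the default matches
            # the database created with `createdb summarizerdb`.
            "NAME": env.get("DJANGO_DB_NAME", "summarizerdb"),
            # Written to .env by the `echo DJANGO_DB_USER=...` step above.
            "USER": env["DJANGO_DB_USER"],
        }
    }
```

Reading `DJANGO_DB_USER` with `env[...]` rather than `env.get(...)` makes a missing variable fail immediately with a `KeyError` instead of surfacing later as a database connection error.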

Migrate your local database (this will create all tables, fields, etc.).

# If you haven't already done so, activate the virtual environment
# by running "pipenv shell" from the top-level "local-ai-ksat/" directory

# Then navigate to the clip2story/ directory and update the database
cd clip2story/
python manage.py migrate

Create a superuser for the Django admin:

python manage.py createsuperuser

IMPORTANT: Any time you install a new application dependency using pipenv install, you must regenerate the requirements.txt file used in the production deployment by running pipenv lock -r > requirements.txt. (Recent pipenv releases have removed the -r flag from pipenv lock; there, use pipenv requirements > requirements.txt instead.) Commit that update along with any code changes so that the new dependencies are available in production.

Usage

For day-to-day usage, use the commands below.

Note: you may also need to occasionally migrate your database, per the instructions above in Installation.

# Activate the virtual environment
cd local-ai-ksat/
pipenv shell

# Fire up the dev server
cd clip2story/
python manage.py runserver

Use your superuser credentials to log into:

Roadmap

At the end of the MVP development period, these were the features that we thought would be useful to have in the future:

  • Integration with a variety of other transcription services, especially OpenAI's Whisper model, which performed well in a separate project for Michigan Radio.
  • Integration with a variety of other content management systems.
  • Native UI management and help guides that do not use the Django interface
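
One way the first roadmap item could be approached is to put each transcription service behind a common interface, so that Trint, Whisper, or another service can be swapped in by name. The sketch below is only an illustration of that idea. Every name in it (`Transcriber`, `FakeTranscriber`, `register`, `transcribe_with`) is hypothetical, and no real service is called.

```python
from typing import Dict, Protocol


class Transcriber(Protocol):
    """Anything that can turn a video file into transcript text."""

    def transcribe(self, video_path: str) -> str:
        ...


class FakeTranscriber:
    """Stand-in backend used only to exercise the interface."""

    def __init__(self, canned_text: str) -> None:
        self.canned_text = canned_text

    def transcribe(self, video_path: str) -> str:
        # A real backend would upload the file and poll the service here.
        return self.canned_text


# Registry of available backends, keyed by a short name.
_BACKENDS: Dict[str, Transcriber] = {}


def register(name: str, backend: Transcriber) -> None:
    """Make a backend available under `name`."""
    _BACKENDS[name] = backend


def transcribe_with(name: str, video_path: str) -> str:
    """Transcribe `video_path` using the backend registered as `name`."""
    if name not in _BACKENDS:
        raise KeyError(f"No transcription backend registered as {name!r}")
    return _BACKENDS[name].transcribe(video_path)
```

With this shape, adding a new service means writing one class with a `transcribe` method and registering it; the rest of the pipeline does not need to change.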

Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the GNU General Public License. See LICENSE for more information.

Contact

The Associated Press does not provide technical support for this open-source application.

Serdar Tumgoren - @zstumgoren - tumgoren@stanford.edu

Project Link: https://github.com/associatedpress/local-ai-ksat

Original Developers

  • Ryan Leahy - @RyanLeahy - Gonzaga University
  • Ozge Terzioglu - @ozterz - Stanford University
  • Kalyn Epps - Stanford University
