Automated Interview Transcribe and Proofread

This script automates interview transcription, proofreading, and formatting of the transcript into .docx. It uses AWS Transcribe, S3, and the OpenAI API (GPT-3.5). It produces high-quality transcriptions, even from input files with poor sound quality.

Installation

Python 3.10+
AWS IAM credentials with access to S3 and AWS Transcribe
S3 bucket with public access, so that this script can upload to it and read its contents
OpenAI API key
See .envsample for the settings that are needed

  pip3 install -r requirements.txt

Rename .envsample to .env and fill in your keys and bucket name.
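To illustrate how the values in .env might be read and checked at startup, here is a minimal sketch that uses only the standard library. The variable names in `REQUIRED_KEYS` are assumptions for illustration; the real names are whatever .envsample lists, and the script itself may use a library such as python-dotenv instead.

```python
import os
from pathlib import Path

# Hypothetical variable names; check .envsample for the real ones.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "OPENAI_API_KEY",
)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read the .env file, fail early on missing keys, export the rest."""
    values = parse_env_text(Path(path).read_text(encoding="utf-8"))
    missing = missing_keys(values)
    if missing:
        raise RuntimeError(f"Missing settings in {path}: {', '.join(missing)}")
    os.environ.update(values)
    return values
```

Failing early on a missing key avoids discovering a misconfigured bucket or API key only after a long upload has finished.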

Usage/Examples

usage: process_transcripts.py [-h] input_folder s3_folder

positional arguments:
  input_folder  Input folder with .mp4 videos
  s3_folder     Output folder name in S3 bucket

Example:

python3 process_transcripts.py /Users/kvyb/Documents/Uitrial_Interviews testing_proofread
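The command-line interface shown above can be reproduced with `argparse`. This is a sketch of a parser that yields the same two positional arguments, not necessarily how process_transcripts.py defines them; the function name `build_parser` and the example path are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two positional arguments from the usage text."""
    parser = argparse.ArgumentParser(prog="process_transcripts.py")
    parser.add_argument("input_folder", help="Input folder with .mp4 videos")
    parser.add_argument("s3_folder", help="Output folder name in S3 bucket")
    return parser


# Parse an explicit argument list here; a real script would call
# parse_args() with no arguments so that it reads sys.argv instead.
args = build_parser().parse_args(["/path/to/interviews", "testing_proofread"])
print(args.input_folder, args.s3_folder)
```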

Features

Uploads .mp4 2-speaker video to S3 bucket
Transcribes speaker voices into text
Proofreads the transcribed text, improving quality
Formats the text into a .docx file
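The upload and transcription steps listed above could look roughly like the sketch below, which uses boto3's `upload_file` and `start_transcription_job` calls. The helper names, the `en-US` language code, the output key layout, and the file-name sanitizing are assumptions for illustration; the repository's modules may do this differently. Speaker labelling is capped at two speakers to match the 2-speaker interviews described here.

```python
import re
from pathlib import Path


def safe_stem(video_path: str) -> str:
    """File name without extension, limited to characters that
    Transcribe job names allow (letters, digits, '.', '_', '-')."""
    return re.sub(r"[^0-9a-zA-Z._-]", "_", Path(video_path).stem)


def build_transcription_request(bucket: str, s3_folder: str, video_path: str) -> dict:
    """Build keyword arguments for Transcribe's start_transcription_job."""
    stem = safe_stem(video_path)
    return {
        "TranscriptionJobName": stem,
        "LanguageCode": "en-US",  # assumption: English-language interviews
        "MediaFormat": "mp4",
        "Media": {"MediaFileUri": f"s3://{bucket}/{s3_folder}/{stem}.mp4"},
        "OutputBucketName": bucket,
        "OutputKey": f"{s3_folder}/{stem}.json",  # hypothetical layout
        # Ask Transcribe to label the two speakers.
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
    }


def upload_and_transcribe(bucket: str, s3_folder: str, video_path: str) -> str:
    """Upload one video to S3 and start a transcription job for it."""
    import boto3  # third-party; imported here so the helpers above stay dependency-free

    request = build_transcription_request(bucket, s3_folder, video_path)
    key = f"{s3_folder}/{safe_stem(video_path)}.mp4"
    boto3.client("s3").upload_file(video_path, bucket, key)
    boto3.client("transcribe").start_transcription_job(**request)
    return request["TranscriptionJobName"]
```

Keeping the request construction in a pure function makes it easy to inspect or unit-test the job parameters without AWS credentials.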

Feedback

If you have any feedback, please reach out to me.

Note:

Only supports .mp4
Still needs a quick manual proofread, since AWS Transcribe isn't perfect.
Costs approximately $0.08 in total per 60-minute interview.
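Because only .mp4 input is supported, a script like this would typically filter the input folder before uploading anything. The sketch below shows one way to do that; the function names are hypothetical, and treating the extension case-insensitively is an assumption rather than documented behaviour of process_transcripts.py.

```python
from pathlib import Path


def select_mp4(paths) -> list[Path]:
    """Keep only paths whose extension is .mp4 (case-insensitive), sorted."""
    return sorted(Path(p) for p in paths if Path(p).suffix.lower() == ".mp4")


def find_mp4_files(input_folder: str) -> list[Path]:
    """List the .mp4 files directly inside input_folder (no recursion)."""
    folder = Path(input_folder)
    return select_mp4(p for p in folder.iterdir() if p.is_file())
```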
