NeuralBlock (NB) is a neural network built using Keras/TensorFlow that detects in-video YouTube sponsorships. It supports two kinds of prediction: (1) whether a text excerpt is a sponsorship (spot) and (2) whether each word in a sequence is part of a sponsorship.
NB is loosely based on and inspired by this project. Unlike the aforementioned project, this project leverages the crowd-sourced labels provided by SponsorBlock.
Some examples of NB's predictions are provided in the examples/ directory. The code for the web application is also provided and can be run locally.
A video demo is available on YouTube.
- NeuralBlock extracts transcripts from YouTube with YouTubeTranscriptApi.
- The SponsorBlock community has already labeled sponsor segments with timestamps.
- The SponsorBlock timestamps are used to find the sections of the transcript that are sponsorships, thereby creating a training set.
- The sequence of text is tokenized using the top 10,000 words found in sponsorships. Note that using pre-trained fastText word embeddings does not yield better performance.
- A bidirectional LSTM RNN is trained.
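The labeling and tokenization steps above can be sketched in plain Python. This is a minimal illustration, not NeuralBlock's actual code: the snippet dictionaries with text, start, and duration keys, the (start, end) sponsor tuples, the reserved ids 0 and 1, and the helper names label_words, build_vocab, and encode are all assumptions made for the example, and the transcript text is invented.

```python
from collections import Counter

def label_words(snippets, sponsor_segments):
    """Return (words, labels) with one 0/1 label per word.

    Word-level timing is not available, so every word inherits the start
    time of its caption snippet (an assumption of this sketch).
    """
    words, labels = [], []
    for snip in snippets:
        in_sponsor = any(start <= snip["start"] < end
                         for start, end in sponsor_segments)
        for word in snip["text"].lower().split():
            words.append(word)
            labels.append(1 if in_sponsor else 0)
    return words, labels

def build_vocab(sponsor_words, max_words=10_000):
    """Map the most frequent sponsor words to integer ids.

    Ids 0 (padding) and 1 (unknown word) are reserved by this sketch.
    """
    counts = Counter(sponsor_words)
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}

def encode(words, vocab):
    """Replace each word by its id, or by 1 if it is not in the vocabulary."""
    return [vocab.get(w, 1) for w in words]

# Invented transcript and sponsor segment, for illustration only.
snippets = [
    {"text": "welcome back to the channel", "start": 0.0, "duration": 3.0},
    {"text": "this video is sponsored by ExampleVPN", "start": 3.0, "duration": 4.0},
    {"text": "use code DEMO for a discount", "start": 7.0, "duration": 3.0},
    {"text": "now on to the build", "start": 10.0, "duration": 3.0},
]
sponsor_segments = [(3.0, 10.0)]

words, labels = label_words(snippets, sponsor_segments)
sponsor_words = [w for w, y in zip(words, labels) if y == 1]
vocab = build_vocab(sponsor_words)
encoded = encode(words, vocab)
```

The encoded sequence and the label sequence have the same length, which is the shape a word-level sequence model such as a bidirectional LSTM would need for training.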
Somewhat outdated. To be updated later. The Dockerfile can be used.
The app/ directory contains a simple flask application that performs the primary functions of predict_stream.py and predict_timestamps.py, and presents the results in the browser.
- Install flask and other necessary libraries.
- Move the models from the data folder into app/models. There should be no subfolders.
- Run python app/application.py from a terminal.
- Go to localhost:5000 in a browser.
- Submit a valid video ID and click Submit.
The results should return in a few seconds. Note, if a good transcript cannot be extracted by YouTubeTranscriptApi, the app will fail.
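Because the app fails on bad input, it can help to check the video ID before submitting it. The sketch below is a hypothetical helper, not part of the app: it assumes that video IDs are 11 characters drawn from letters, digits, underscore, and hyphen, which matches the IDs in current use but is an observed convention rather than a documented guarantee. The ID in the usage lines is a made-up placeholder.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed ID shape: 11 characters from A-Z, a-z, 0-9, "_" and "-".
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")
_YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}

def extract_video_id(user_input):
    """Return a video ID from a bare ID or a watch URL, or None if none is found."""
    text = user_input.strip()
    if _VIDEO_ID.fullmatch(text):
        return text
    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host in _YOUTUBE_HOSTS:
        # Standard watch URLs carry the ID in the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [None])[0]
    elif host == "youtu.be":
        # Short links carry the ID as the path.
        candidate = parsed.path.lstrip("/")
    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None

bare = extract_video_id("aBcDeFgHiJk")
from_url = extract_video_id("https://www.youtube.com/watch?v=aBcDeFgHiJk")
```

A well-formed ID does not guarantee that a transcript exists, so the transcript download can still fail for the reasons noted above.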
Somewhat outdated. To be updated later.
- Install the Python libraries TensorFlow and YouTubeTranscriptApi.
- Update paths if necessary.
- Provide a video ID (vid). The network was trained on the database as of 3/3/20. Use a video published after that date to ensure that the network has not already seen it.
- Run predict_stream.py.
- Manually inspect the output stored in the variable df or results.
Note, overusing YouTubeTranscriptApi can get your IP banned.
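One way to reduce the number of requests is to cache transcripts and space out downloads. The wrapper below is a sketch under assumptions: ThrottledFetcher is a hypothetical name, the fetch argument stands for whatever function you use to download a transcript, and the 2-second default interval is an arbitrary choice, not a documented safe rate.

```python
import time

class ThrottledFetcher:
    """Wrap a transcript-fetching callable with a cache and a minimum delay."""

    def __init__(self, fetch, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch                # callable: video_id -> transcript
        self._min_interval = min_interval  # seconds between real requests
        self._clock = clock                # injectable for testing
        self._sleep = sleep                # injectable for testing
        self._last_call = None
        self._cache = {}

    def get(self, video_id):
        # Serve repeated requests from memory without touching the network.
        if video_id in self._cache:
            return self._cache[video_id]
        # Wait until at least min_interval has passed since the last request.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        result = self._fetch(video_id)
        self._cache[video_id] = result
        return result
```

The cache lives only in memory here; saving transcripts to disk would avoid repeat downloads across runs as well.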
- Better transcripts: NeuralBlock depends on being able to download the full closed captions. Some creators disallow auto-generated English captions, which makes it impossible for NB to predict on those videos. This could be resolved with existing speech-to-text projects such as Mozilla's DeepSpeech.
- More accurate labels: The labels are imperfect because we know only an approximate time for each word, not the moment it is spoken. For example, silent (visual-only) ads and very short ad segments are hard to account for.
- Incorporate video: Visual cues, such as scene cuts, are also valuable in identifying ads and can help with word-level prediction (2).
- Support for other languages: Only English is supported at this moment.
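To illustrate why the labels are only approximate, the sketch below shows one possible way to estimate per-word times from caption timing. It is not NeuralBlock's method: it assumes each caption snippet is a dictionary with text, start, and duration keys, and that speech is evenly paced within a snippet, which real captions do not guarantee. The function name estimate_word_times and the example snippet are made up.

```python
def estimate_word_times(snippet):
    """Spread a snippet's words evenly over its duration.

    Returns a list of (word, estimated_start_seconds) pairs.
    """
    words = snippet["text"].split()
    if not words:
        return []
    step = snippet["duration"] / len(words)
    return [(w, snippet["start"] + i * step) for i, w in enumerate(words)]

# Invented caption snippet, for illustration only.
example = {"text": "use code DEMO for a discount", "start": 7.0, "duration": 3.0}
timed = estimate_word_times(example)
```

Any error in these estimates carries over to the labels near the edges of a sponsor segment, and a silent, visual-only ad produces no words to label at all.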