subscraper

SubScraper is a simple Python script that allows you to download subtitles from OpenSubtitles.org. The script is designed to work with movies identified by their IMDb ID.

While OpenSubtitles has an API, it has many limitations, and using it often results in a "too many requests" error. Instead, I used a Selenium-based approach to retrieve the subtitles, which works around those API limitations.
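
To make the browser-driven approach more concrete, here is a minimal sketch of how a Selenium session could be configured for headless mode, incognito mode, and a custom download folder. It is not taken from subscraper.py: the helper names `chrome_settings` and `make_driver` are hypothetical, and the choice of Chrome (and its `--headless=new` flag) is an assumption.

```python
from pathlib import Path


def chrome_settings(headless=False, incognito=False, output_path="."):
    """Return (flags, prefs) for a Chrome session. Hypothetical helper."""
    flags = []
    if headless:
        flags.append("--headless=new")  # assumes a recent Chrome release
    if incognito:
        flags.append("--incognito")
    # Chrome expects an absolute path for its download directory.
    prefs = {
        "download.default_directory": str(Path(output_path).resolve()),
        "download.prompt_for_download": False,
    }
    return flags, prefs


def make_driver(headless=False, incognito=False, output_path="."):
    """Start Chrome through Selenium using the settings above."""
    # Imported lazily so chrome_settings() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    flags, prefs = chrome_settings(headless, incognito, output_path)
    options = Options()
    for flag in flags:
        options.add_argument(flag)
    options.add_experimental_option("prefs", prefs)
    return webdriver.Chrome(options=options)
```

Keeping the settings in a plain function separate from the driver start-up makes them easy to inspect without launching a browser.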

Getting Started

Clone the repo:

    git clone https://github.com/ccan23/subscraper.git

Navigate to the project directory:

    cd subscraper

Install the required libraries:

    pip install -r requirements.txt

Usage

Here are the available arguments for the script:

--imdb_id               Specify one or more IMDb IDs for the movies (prefixed with "tt" or fully numeric)
--subtitle_type         Filter by subtitle type (e.g., srt, sub)
--language              Filter by language (e.g., eng, spa)
--incognito             Launch the browser in incognito mode (private mode)
--headless              Launch the browser in headless mode (no graphical interface)
--output_path           Specify the folder where the subtitles are downloaded
--safe_downloading      Wait until the download completes before getting the next subtitle (only applies to bulk downloads)
--change_file_names     Rename the subtitle files to their IMDb IDs after the download completes
--save_process          Save which movies were downloaded in a JSON file
--save_process_path     Specify the path where the JSON file is saved
--reset_process         Reset the saved process
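
As an illustration of how these arguments fit together, the following sketch declares them with `argparse` and normalizes IMDb IDs so that both the "tt" form and the purely numeric form are accepted. It is an assumption about how such a parser could look, not the code in subscraper.py: the function names `imdb_id` and `build_parser` are hypothetical, and the defaults for `--output_path` and `--save_process_path` are guesses (the `srt` and `eng` defaults follow the examples in this README).

```python
import argparse


def imdb_id(value):
    """Accept 'tt0133093' or '0133093' and return the 'tt'-prefixed form."""
    digits = value[2:] if value.lower().startswith("tt") else value
    if not (digits.isascii() and digits.isdigit()):
        raise argparse.ArgumentTypeError(f"invalid IMDb ID: {value!r}")
    return "tt" + digits


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="subscraper.py")
    parser.add_argument("--imdb_id", nargs="+", type=imdb_id, default=[])
    parser.add_argument("--subtitle_type", default="srt")
    parser.add_argument("--language", default="eng")
    # Boolean switches: present means True, absent means False.
    for flag in ("incognito", "headless", "safe_downloading",
                 "change_file_names", "save_process", "reset_process"):
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--output_path", default=".")                   # assumed default
    parser.add_argument("--save_process_path", default="process.json")  # assumed default
    return parser
```

With this sketch, `build_parser().parse_args(["--imdb_id", "0133093"])` yields an `imdb_id` list containing `"tt0133093"`.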

Examples

Download The Matrix subtitles (default language: English, default format: .srt):

    python3 subscraper.py --imdb_id tt0133093

Download Spanish subtitles in .sub format:

    python3 subscraper.py --imdb_id tt0133093 --language spa --subtitle_type sub

Download subtitles for four movies, save the process to the process.json file, wait for each download to complete before getting the next subtitle, and rename the files to their IMDb IDs:

    python3 subscraper.py --save_process --safe_downloading --change_file_names --imdb_id tt0111161 tt0068646 tt15097216 tt0468569
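
For readers curious what renaming files and saving the process might involve, here is a small sketch under stated assumptions. The JSON layout (a mapping from IMDb ID to file name) and the helper names `load_process` and `record_download` are hypothetical; the real script may store its progress differently.

```python
import json
from pathlib import Path


def load_process(path):
    """Return the saved {imdb_id: filename} mapping, or {} if no file exists."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def record_download(path, imdb_id, subtitle_file):
    """Rename subtitle_file to <imdb_id><ext> and record it in the JSON file."""
    subtitle_file = Path(subtitle_file)
    # Keep the original extension, e.g. The.Matrix.srt -> tt0133093.srt
    renamed = subtitle_file.with_name(imdb_id + subtitle_file.suffix)
    subtitle_file.rename(renamed)
    process = load_process(path)
    process[imdb_id] = renamed.name
    Path(path).write_text(json.dumps(process, indent=2), encoding="utf-8")
    return renamed
```

A bulk run could call `load_process` first and skip any IMDb ID that is already recorded, so an interrupted run can resume where it stopped.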

License

MIT
