SLR scraper filenames #1440

Merged
merged 3 commits into from
Nov 5, 2023

vt-idiot
Contributor

The SLR scraper currently generates a filename for every possible resolution and, when the video is fisheye, for every possible fisheye projection as well. For 360 video it adds both MONO and TB. All of this can be determined from their API instead...

I'm not sure if it adds up to much, or how the DB compression works, but compare 5,847 bytes of "filenames" when every possible name is generated for a fisheye projection video with 580 bytes when only the names that can actually exist are used. There are 40,000-odd scenes on SLR if you scrape everything from every studio...

Gigantic wall of text
["SLR_SLR Originals_Notes of Affection_6400p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_6400p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_6400p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_6400p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_6400p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_4096p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_4096p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_4096p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_4096p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_4096p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3160p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_3160p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3160p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_3160p_37756_FISHEYE190_alpha.mp4","SLR_SLR 
Originals_Notes of Affection_3160p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3072p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_3072p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3072p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_3072p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3072p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3000p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_3000p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_3000p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_3000p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3000p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2900p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_2900p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2900p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_2900p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2900p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2700p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_2700p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2700p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_2700p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2700p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2650p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_2650p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2650p_37756_RF52_alpha.mp4","SLR_SLR 
Originals_Notes of Affection_2650p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2650p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1080p_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_1080p_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_1080p_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_1080p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_1080p_37756_VRCA220_alpha.mp4","SLR_SLR Originals_Notes of Affection_original_37756_MKX200_alpha.mp4","SLR_SLR Originals_Notes of Affection_original_37756_MKX220_alpha.mp4","SLR_SLR Originals_Notes of Affection_original_37756_RF52_alpha.mp4","SLR_SLR Originals_Notes of Affection_original_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_original_37756_VRCA220_alpha.mp4"]

vs.

["SLR_SLR Originals_Notes of Affection_original_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_4000p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3840p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_3360p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2880p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_2160p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_1920p_37756_FISHEYE190_alpha.mp4","SLR_SLR Originals_Notes of Affection_1440p_37756_FISHEYE190_alpha.mp4"]
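To make the arithmetic behind those two byte counts concrete, here is a minimal Go sketch that rebuilds both lists for this one scene and measures their JSON-encoded size. It is not the scraper's code: `allFilenames` and `jsonSize` are hypothetical helpers, and the resolution and projection lists are simply copied from the two dumps above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// allFilenames (hypothetical helper) builds one name per
// (resolution, projection) pair. The "_alpha" suffix is hardcoded because
// every name in the example scene above carries it.
func allFilenames(prefix, id string, resolutions, projections []string) []string {
	names := make([]string, 0, len(resolutions)*len(projections))
	for _, r := range resolutions {
		for _, p := range projections {
			names = append(names, fmt.Sprintf("%s_%s_%s_%s_alpha.mp4", prefix, r, id, p))
		}
	}
	return names
}

// jsonSize returns the length in bytes of the JSON-encoded list,
// or 0 if encoding fails.
func jsonSize(names []string) int {
	b, err := json.Marshal(names)
	if err != nil {
		return 0
	}
	return len(b)
}

func main() {
	prefix := "SLR_SLR Originals_Notes of Affection"

	// Every resolution and fisheye projection that appears in the "before" dump.
	every := []string{"6400p", "4096p", "4000p", "3840p", "3360p", "3160p",
		"3072p", "3000p", "2900p", "2880p", "2700p", "2650p", "2160p",
		"1920p", "1440p", "1080p", "original"}
	fisheye := []string{"MKX200", "MKX220", "RF52", "FISHEYE190", "VRCA220"}

	// Only the resolutions and the single projection in the "after" dump.
	actual := []string{"original", "4000p", "3840p", "3360p", "2880p",
		"2160p", "1920p", "1440p"}

	before := allFilenames(prefix, "37756", every, fisheye)
	after := allFilenames(prefix, "37756", actual, []string{"FISHEYE190"})

	fmt.Println(len(before), jsonSize(before)) // 85 5847
	fmt.Println(len(after), jsonSize(after))   // 8 580
}
```

17 resolutions times 5 projections gives 85 names, against 8 names once only the real encodings are kept, which is where the roughly tenfold difference comes from.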

Before:
image

After:
image

It can now also determine the correct projection:
image

For 360 video as well as other fisheye projections
image

image

The entire scraper might need a look (by someone more knowledgeable!) to see where else the API can be used instead; we're already fetching that JSON anyway, but then using colly for almost everything.

Attempt at rewriting the filename part of the scraper.
Uses the API to set only the relevant filenames.
Additional fix for trans scene ID collisions. The old filename method is still needed for trans scenes unless someone knows the API endpoint.
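As a rough illustration of the idea in these commits, the sketch below derives filenames from per-scene metadata rather than from every possible combination. Everything about the JSON shape is an assumption made for the example: `sceneMeta`, its field names, and `filenamesFromMeta` are hypothetical and do not reflect SLR's real API response or the merged scraper code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sceneMeta is a HYPOTHETICAL, simplified shape for per-scene metadata.
// The real API response is not shown in this PR and will differ.
type sceneMeta struct {
	ID          string   `json:"id"`
	Studio      string   `json:"studio"`
	Title       string   `json:"title"`
	Projection  string   `json:"projection"`  // e.g. "FISHEYE190"
	Resolutions []string `json:"resolutions"` // only encodings that exist
	Alpha       bool     `json:"alpha"`       // passthrough scenes get "_alpha"
}

// filenamesFromMeta (hypothetical helper) emits one filename per resolution
// listed in the metadata, using the single projection it reports.
func filenamesFromMeta(raw []byte) ([]string, error) {
	var m sceneMeta
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	suffix := ".mp4"
	if m.Alpha {
		suffix = "_alpha.mp4"
	}
	names := make([]string, 0, len(m.Resolutions))
	for _, r := range m.Resolutions {
		names = append(names, fmt.Sprintf("SLR_%s_%s_%s_%s_%s%s",
			m.Studio, m.Title, r, m.ID, m.Projection, suffix))
	}
	return names, nil
}

func main() {
	// Made-up sample payload in the hypothetical shape above.
	raw := []byte(`{
		"id": "37756",
		"studio": "SLR Originals",
		"title": "Notes of Affection",
		"projection": "FISHEYE190",
		"resolutions": ["original", "4000p"],
		"alpha": true
	}`)

	names, err := filenamesFromMeta(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

The point of the design is that the list length tracks the encodings the metadata reports, so a scene with two encodings yields two names instead of dozens. A scene type with no known API endpoint would still need the older exhaustive approach as a fallback.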
@crwxaj crwxaj merged commit 10bfaec into xbapps:master Nov 5, 2023
1 check passed