[YouTube] youtube-dl does not find videos listed on user/channel "shorts" subpage #31336

Closed
GigoMego opened this issue Nov 7, 2022 · 9 comments · Fixed by #31409

Comments

@GigoMego

GigoMego commented Nov 7, 2022

Hi

youtube-dl doesn't parse the videos on a user/channel "shorts" subpage:

https://www.youtube.com/c/Sonyakisa8TT/shorts

If the number of videos is large, the subpages are downloaded, but the output is "...Shorts: Downloading 0 videos".

Can it be fixed?

@dirkf
Contributor

dirkf commented Nov 7, 2022

Reproducible in git master.

The Shorts tab isn't currently extracted but as its structure appears to be very similar to the Videos tab, there should be a simple patch, like this:

--- old/youtube_dl/extractor/youtube.py
+++ new/youtube_dl/extractor/youtube.py
@@ -2680,7 +2680,10 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
 
     def _rich_grid_entries(self, contents):
         for content in contents:
-            video_renderer = try_get(content, lambda x: x['richItemRenderer']['content']['videoRenderer'], dict)
+            video_renderer = try_get(content,
+                (lambda x: x['richItemRenderer']['content']['videoRenderer'],
+                 lambda x: x['richItemRenderer']['content']['reelItemRenderer']),
+                dict)
             if video_renderer:
                 entry = self._video_entry(video_renderer)
                 if entry:

Then:

$ python -m youtube_dl --flat-playlist 'https://www.youtube.com/c/Sonyakisa8TT/shorts'
[youtube:tab] Sonyakisa8TT: Downloading webpage
[download] Downloading playlist: Sonyakisa8 TT - Shorts
[youtube:tab] Downloading page 1
[youtube:tab] Downloading page 2
[youtube:tab] Downloading page 3
[youtube:tab] playlist Sonyakisa8 TT - Shorts: Downloading 151 videos
[download] Downloading video 1 of 151
[download] Downloading video 2 of 151
[download] Downloading video 3 of 151
[download] Downloading video 4 of 151
[download] Downloading video 5 of 151
[download] Downloading video 6 of 151
...
[download] Downloading video 150 of 151
[download] Downloading video 151 of 151
[download] Finished downloading playlist: Sonyakisa8 TT - Shorts
$
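
To illustrate why the patch works, here is a minimal standalone sketch of the fallback lookup. The `try_get` below is a simplified stand-in written for this example, modelled on the helper in `youtube_dl/utils.py`, and the three sample items are made up: real YouTube responses carry many more fields. It only shows that passing a tuple of getters tries each path in turn and returns the first result of the expected type.

```python
def try_get(src, getter, expected_type=None):
    """Simplified stand-in for youtube_dl.utils.try_get (assumption: same contract)."""
    getters = getter if isinstance(getter, (list, tuple)) else [getter]
    for get in getters:
        try:
            value = get(src)
        except (AttributeError, KeyError, TypeError, IndexError):
            # This path does not exist in src: try the next getter.
            continue
        if expected_type is None or isinstance(value, expected_type):
            return value
    return None


# The two paths from the patch: Videos tab first, then Shorts tab.
RENDERER_GETTERS = (
    lambda x: x['richItemRenderer']['content']['videoRenderer'],
    lambda x: x['richItemRenderer']['content']['reelItemRenderer'],
)

# Hypothetical, heavily trimmed grid items (the ids are placeholders).
videos_tab_item = {
    'richItemRenderer': {'content': {'videoRenderer': {'videoId': 'example-video'}}}}
shorts_tab_item = {
    'richItemRenderer': {'content': {'reelItemRenderer': {'videoId': 'example-short'}}}}
unrelated_item = {'continuationItemRenderer': {}}

for item in (videos_tab_item, shorts_tab_item, unrelated_item):
    # Prints the matching renderer dict, or None when neither path exists.
    print(try_get(item, RENDERER_GETTERS, dict))
```

With only the first getter, the Shorts item would yield None, which matches the "Downloading 0 videos" symptom reported above.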

@dirkf dirkf changed the title youtube-dl does not find videos listed on user/channel "shorts" subpage [YouTube] youtube-dl does not find videos listed on user/channel "shorts" subpage Nov 7, 2022
@zhangeric-15
Contributor

Hello, I am currently a student at a university. For my software engineering class, our final project is to contribute to an open source project. My classmate and I are interested in attempting to fix this bug, but wanted to reach out to any maintainers of this repo to see if there's any additional information that could help us get started. We will be spending a couple weeks on this project and would appreciate any advice on how we can help.

Thank you!

@dirkf
Contributor

dirkf commented Nov 7, 2022

As you can see I've already provided a fix that works in this particular case. But that's just the start of the work that leads to a solution being committed to the repository.

If you're still interested, this is what I'd suggest:

  • familiarise yourselves with the program by reading the manual, especially the development section, trying various command options, and studying the source repository
  • as suggested in the documentation, fork the repo, clone the forked repo to your development environment and start a branch for the fix
  • review my patch, check it against the more sophisticated yt-dlp fork (which already handles this correctly) to see if it could be improved (eg, it looks like we could get an interim title along with the video_id of each Short -- maybe that would be possible for Videos too?)
  • create a new version of youtube_dl/extractor/youtube.py in your patch branch
  • test it with some further real-life examples
  • create a test-case for the fix in the _TESTS list of YoutubeTabIE and check it locally
  • once you're happy with the patch branch, push it to your GH repo and create a Pull Request against the master branch of the yt-dl repo: check the result of your test in the integration workflow that the project runs on each commit.
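
As a rough idea of what the test-case step could look like, here is a hypothetical entry in the dict style used by extractor `_TESTS` lists. The URL and playlist title are taken from this thread; the `note` text and the `playlist_mincount` value of 100 are assumptions chosen for illustration, and the playlist id is left out because it isn't given here. Treat it as a sketch, not the test that was eventually committed.

```python
# Hypothetical test entry; in the real extractor a dict like this would be
# added to the _TESTS list of YoutubeTabIE rather than defined at module level.
SHORTS_TAB_TEST = {
    'note': 'Shorts tab of a channel',  # assumed wording
    'url': 'https://www.youtube.com/c/Sonyakisa8TT/shorts',
    'info_dict': {
        # The playlist id is omitted: it is not stated in this thread.
        'title': 'Sonyakisa8 TT - Shorts',
    },
    # A lower bound rather than an exact count, since the channel's
    # number of Shorts can change over time (the value is an assumption).
    'playlist_mincount': 100,
}

print(sorted(SHORTS_TAB_TEST))
```

A minimum count is used instead of the 151 seen in the log so that the test would not break each time the channel adds or removes a Short.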

You should be aware that (for obvious reasons) the YouTube extractor module is the most complex of the yt-dl extractors, about twice the size of the next biggest site-specific extractor. The yt-dlp extractor is twice as big again!

Of course there are other open issues for which no solution yet exists that you could try to solve but what I've suggested is probably a good dose of software engineering.

@zhangeric-15
Contributor

zhangeric-15 commented Nov 8, 2022

Thank you so much for your detailed response and advice! I double-checked our project requirements, and unfortunately we have to find a bug/feature that hasn't been solved yet, so it looks like we cannot work on this bug. Do you have any suggestions for other open issues we could help out with?

Edit: Actually hold on, let me confirm with my course staff
Edit #2: We can take this bug!

notriddle added a commit to notriddle/youtube-dl that referenced this issue Nov 25, 2022
zhangeric-15 added a commit to zhangeric-15/youtube-dl that referenced this issue Nov 27, 2022
@zhangeric-15
Contributor

@dirkf
Could you clarify what you mean by "(eg, it looks like we could get an interim title along with the video_id of each Short -- maybe that would be possible for Videos too?)"? I thought the extractor for YouTube videos was already able to retrieve a video's id and title, unless I'm misunderstanding something. Thanks again for your help!

@dirkf
Contributor

dirkf commented Nov 28, 2022

The extraction of videos in a playlist has two steps:

  1. extract playlist as a list (-ish) of items
  2. extract each video in the playlist.

My suggestion was that the playlist data would allow a title to be proposed in the list of playlist items, whereas the final title isn't found until step 2.

Then the output of --flat-playlist --dump-json would include a title that wouldn't otherwise be available.
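
A minimal sketch of that idea, under stated assumptions: `interim_title` and `flat_entry` are hypothetical helpers invented for this example, the field names (`title.runs`, `title.simpleText`, `headline.simpleText`, `videoId`) are assumed shapes for the list-item renderers, and the sample data is made up. The entry dict is loosely modelled on the kind of URL result a playlist extractor returns in step 1, before any per-video extraction happens.

```python
import json


def interim_title(renderer):
    """Return a provisional title from a list-item renderer, or None."""
    # Assumed locations of the title text, tried in order.
    candidates = (
        lambda r: r['title']['runs'][0]['text'],
        lambda r: r['title']['simpleText'],
        lambda r: r['headline']['simpleText'],
    )
    for get in candidates:
        try:
            value = get(renderer)
        except (KeyError, IndexError, TypeError):
            continue
        if isinstance(value, str):
            return value
    return None


def flat_entry(renderer):
    """Build a step-1 playlist item without fetching the video page."""
    video_id = renderer.get('videoId')
    if not video_id:
        return None
    entry = {'_type': 'url', 'ie_key': 'Youtube', 'id': video_id, 'url': video_id}
    title = interim_title(renderer)
    if title:
        # Only a proposal: the final title comes from step 2.
        entry['title'] = title
    return entry


# Hypothetical Shorts item with a headline but no 'title' field.
sample = {'videoId': 'example-short', 'headline': {'simpleText': 'Example Short'}}
print(json.dumps(flat_entry(sample), sort_keys=True))
```

If no title can be found, the entry is still returned with just the id, so the flat listing degrades to its current behaviour rather than dropping the item.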

@zhangeric-15
Contributor

Ok, I think that makes a little more sense. Thanks!

zhangeric-15 added a commit to zhangeric-15/youtube-dl that referenced this issue Nov 30, 2022
@zyrup

zyrup commented Dec 13, 2022

Hi, I wanted to ask what the current status of this bug is. Two weeks have passed now; has anything been committed to a branch that could be used?

@zhangeric-15
Contributor

The fix is currently under review in a pull request: #31409
I'm not sure when the maintainers plan to merge it, though.

gaming-hacker added a commit to gaming-hacker/youtube-dl that referenced this issue Dec 18, 2022
* commit '308fc0bb9bb34a9908884bb72d87869098b39659':
  Linted?
  Spacing nits
  Updated youtube_dl/extractor/youtube.py to remove space on line 2210.
  Delete random3.txt
  Delete random2.txt
  Delete random.txt
  Added test case for YoutubetabIE.
  Removed print statements.
  Playlist data should include title of videos now ytdl-org#31336
  Added a test case to test downloading from shorts tab.
  Removed a comment. Reference ytdl-org#31336
  Added fix for youtube shorts.
  random3
  Added random2.txt
  random test
dirkf added a commit that referenced this issue Feb 2, 2023
…bpage. (#31409)

Resolves #31336

Co-authored-by: Jouni Järvinen <rautamiekka@users.noreply.github.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
@dirkf dirkf added the fixed label Feb 19, 2023
alxlive pushed a commit to alxlive/youtube-dl that referenced this issue Feb 27, 2023
…bpage. (ytdl-org#31409)

Resolves ytdl-org#31336

Co-authored-by: Jouni Järvinen <rautamiekka@users.noreply.github.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>