[bug] [reddit] Some v.redd.it links on User Profiles (possibly others) fail to download, 'NoneType' is not iterable error. #3258

Silent-Soldier · 2022-11-19T10:44:30Z

I recently ran across this bug while parsing a subreddit, but so far I can only reproduce it reliably with an NSFW video link on a user's profile. Elsewhere the issue is intermittent or does not occur at all, and I have no idea why.

Verbose output:

>gallery-dl --verbose "https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/"
2022-11-19 05:29:01 [gallery-dl][debug] Version 1.24.0-dev
2022-11-19 05:29:01 [gallery-dl][debug] Python 3.11.0 - Windows-10-10.0.19045-SP0
2022-11-19 05:29:01 [gallery-dl][debug] requests 2.28.1 - urllib3 1.26.12
2022-11-19 05:29:01 [gallery-dl][debug] Starting DownloadJob for 'https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/'
2022-11-19 05:29:03 [cookies][debug] Extracting cookies from C:\Users\*****\*****\*****\Mozilla\Firefox\Profiles\*****\cookies.sqlite
2022-11-19 05:29:03 [reddit][debug] Using RedditSubmissionExtractor for 'https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/'
2022-11-19 05:29:03 [urllib3.connectionpool][debug] Starting new HTTPS connection (1): oauth.reddit.com:443
2022-11-19 05:29:03 [urllib3.connectionpool][debug] https://oauth.reddit.com:443 "GET /comments/x8p3yf/.json?limit=0&raw_json=1 HTTP/1.1" 200 2307
2022-11-19 05:29:03 [reddit][debug] Using download archive '*****/gallery-dl/.archives/reddit.sqlite3'
2022-11-19 05:29:03 [postprocessor.metadata][debug] Using download archive '*****/gallery-dl/.archives/reddit-metadata.sqlite3'
2022-11-19 05:29:03 [postprocessor.ugoira][debug] using mkvmerge demuxer
2022-11-19 05:29:03 [reddit][debug] Active postprocessor modules: [ClassifyPP, MetadataPP, MtimePP, UgoiraPP]
2022-11-19 05:29:04 [downloader.ytdl][debug] [generic] ypr3fhcnzjm91: Downloading webpage
2022-11-19 05:29:05 [downloader.ytdl][debug] [redirect] Following redirect to https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/
2022-11-19 05:29:05 [downloader.ytdl][debug] [generic] eufrat: Downloading webpage
2022-11-19 05:29:05 [downloader.ytdl][warning] [generic] Falling back on generic information extractor
2022-11-19 05:29:06 [downloader.ytdl][debug] [generic] eufrat: Extracting information
2022-11-19 05:29:06 [downloader.ytdl][error] ERROR: Unsupported URL: https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/
2022-11-19 05:29:06 [reddit][error] An unexpected error occurred: TypeError - argument of type 'NoneType' is not iterable. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
2022-11-19 05:29:06 [reddit][debug]
Traceback (most recent call last):
  File "C:\Users\*****\*****\Roaming\Python\Python311\site-packages\gallery_dl\job.py", line 84, in run
    self.dispatch(msg)
  File "C:\Users\*****\*****\Roaming\Python\Python311\site-packages\gallery_dl\job.py", line 128, in dispatch
    self.handle_url(url, kwdict)
  File "C:\Users\*****\*****\Roaming\Python\Python311\site-packages\gallery_dl\job.py", line 248, in handle_url
    if not self.download(url):
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\*****\*****\Roaming\Python\Python311\site-packages\gallery_dl\job.py", line 380, in download
    return downloader.download(url, self.pathfmt)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*****\*****\Roaming\Python\Python311\site-packages\gallery_dl\downloader\ytdl.py", line 69, in download
    if "entries" in info_dict:
       ^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable
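
The last frame of the traceback shows a membership test running against a value that is None. A minimal sketch of that failure mode and of one possible guard follows; `has_entries` is a hypothetical helper written for illustration and is not part of gallery-dl or yt-dlp.

```python
def has_entries(info_dict):
    """Return True if a ytdl-style result dict contains an "entries" key.

    Hypothetical helper: it guards against info_dict being None, which is
    the value the membership test in the traceback received.
    """
    return info_dict is not None and "entries" in info_dict


# The unguarded membership test raises TypeError when the operand is None.
try:
    "entries" in None
except TypeError as exc:
    print("unguarded test raised:", type(exc).__name__)

# The guarded version returns a plain boolean instead of raising.
print("guarded, None:", has_entries(None))
print("guarded, dict:", has_entries({"entries": []}))
```

A guard like this would only turn the crash into an ordinary failed download; it would not make the video itself downloadable.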


mikf · 2022-11-19T11:03:29Z

Without cookies I only get a non-fatal error:

[urllib3.connectionpool][debug] https://oauth.reddit.com:443 "GET /comments/x8p3yf/.json?limit=0&raw_json=1 HTTP/1.1" 200 2360
[downloader.ytdl][debug] [generic] ypr3fhcnzjm91: Downloading webpage
[downloader.ytdl][debug] [redirect] Following redirect to https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/
[downloader.ytdl][debug] [generic] eufrat: Downloading webpage
[downloader.ytdl][warning] [generic] Falling back on generic information extractor
[downloader.ytdl][debug] [generic] eufrat: Extracting information
[downloader.ytdl][error] ERROR: Unsupported URL: https://www.reddit.com/user/69beautifulporn69/comments/x8p3yf/eufrat/
[download][error] Failed to download ytdl:https://v.redd.it/ypr3fhcnzjm91

InterruptSpeed · 2022-11-22T02:07:26Z

It looks like the reddit extractor hands off to yt-dlp because the JSON has is_video=true, but it passes along the JSON "url" key/value
"url" : "https://v.redd.it/ypr3fhcnzjm91"
rather than the correct key/value
"fallback_url" : "https://v.redd.it/ypr3fhcnzjm91/DASH_720.mp4?source=fallback"
found under the media -> reddit_video elements.

A proposed fix would be to check for the existence of fallback_url when the domain is v.redd.it and hand that value off to yt-dlp instead. I can work on that if it makes sense.

InterruptSpeed · 2022-11-22T02:19:22Z

Which is the more Pythonic fix?
a)

try:
  url = submission["media"]["reddit_video"]["fallback_url"]
except KeyError:
  pass

b)

if "media" in submission \
  and "reddit_video" in submission["media"] \
  and "fallback_url" in submission["media"]["reddit_video"]:
  url = submission["media"]["reddit_video"]["fallback_url"]

Either one would be inserted in the RedditExtractor items() method, right before the yield in the elif submission["is_video"]: block.

How can I test that the change doesn't break other scenarios? I can submit a pull request for the fix if we are on the right track.
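
One point relevant to choosing between a) and b): if "media" is present but holds None rather than a dict, a) raises a TypeError that `except KeyError` does not catch, and the `in` test in b) raises a TypeError as well. The sketch below is a variant of a) that catches both exceptions. `video_url` is a hypothetical helper name, and the sample submissions are assumed shapes for illustration, not real API responses.

```python
def video_url(submission):
    """Return the nested fallback_url of a submission dict, or None.

    Hypothetical helper for illustration; not the gallery-dl implementation.
    """
    try:
        return submission["media"]["reddit_video"]["fallback_url"]
    except (KeyError, TypeError):
        # KeyError: one of the keys is missing.
        # TypeError: "media" (or a nested value) is None or not a dict.
        return None


# Assumed example inputs covering the three cases.
with_video = {"media": {"reddit_video": {"fallback_url": "https://v.redd.it/ypr3fhcnzjm91/DASH_720.mp4?source=fallback"}}}
media_is_none = {"media": None}
no_media_key = {}

for sample in (with_video, media_is_none, no_media_key):
    print(video_url(sample))
```

Returning None leaves the caller free to keep the original "url" value when no nested URL is available.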

Silent-Soldier · 2022-11-25T02:26:15Z

I believe @InterruptSpeed may be partially correct on this. I've been experimenting with various solutions over the last few days, focusing mainly on cookies being the issue (based on the verbose output from gallery-dl and yt-dlp independently). With cookies removed altogether, the same behavior occurs when passing the URI to yt-dlp by itself.

The "fallback_url" appears to download correctly when passed to yt-dlp, though the audio is cut off or missing. I believe the URIs need to point to https://v.redd.it/ypr3fhcnzjm91/DASHPlaylist.mpd (higher quality) or https://v.redd.it/ypr3fhcnzjm91/HLSPlaylist.m3u8 (lower quality) instead?
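
As a rough illustration of the relationship between these URLs, the sketch below derives a playlist URL from a fallback_url by keeping the video ID segment and swapping the file name. `playlist_url` is a hypothetical helper, and the assumption that every fallback_url has the form /<video id>/<file> is based only on the single example in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def playlist_url(fallback_url, playlist="DASHPlaylist.mpd"):
    """Build a playlist URL from a v.redd.it fallback_url.

    Hypothetical helper: assumes the path looks like "/<video id>/<file>".
    """
    parts = urlsplit(fallback_url)
    # Keep only the first path segment, which is assumed to be the video ID.
    video_id = parts.path.strip("/").split("/")[0]
    # Drop the query string and fragment; replace the file name.
    return urlunsplit((parts.scheme, parts.netloc, f"/{video_id}/{playlist}", "", ""))


fallback = "https://v.redd.it/ypr3fhcnzjm91/DASH_720.mp4?source=fallback"
print(playlist_url(fallback))
print(playlist_url(fallback, playlist="HLSPlaylist.m3u8"))
```

Reading a ready-made playlist URL from the JSON, where one is provided, avoids this kind of string manipulation.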

* use fallback_url for reddit_video to fix issue 3258
* changed to dash_url to include audio
* update
  - use [] instead of .get
  - catch TypeErrors in case one of the elements is not a dict

Co-authored-by: InterruptSpeed <steven@docherty.ca>
Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>

mikf added the bug label Nov 19, 2022

InterruptSpeed mentioned this issue Nov 26, 2022

use dash_url for reddit_video to fix issue 3258 #3306

Merged

mikf closed this as completed in #3306 Nov 27, 2022
