[Twitter] Why doesn't it download this retweet? #1555

Closed
nisehime opened this issue May 14, 2021 · 6 comments

Comments

@nisehime

https://twitter.com/morino_ya/status/1392763691599237121 (NSFW)

gallery-dl.exe -v https://twitter.com/morino_ya/status/1392763691599237121
[gallery-dl][debug] Version 1.17.4
[gallery-dl][debug] Python 3.7.9 - Windows-8.1-6.3.9600-SP0
[gallery-dl][debug] requests 2.25.1 - urllib3 1.25.11
[gallery-dl][debug] Starting DownloadJob for 'https://twitter.com/morino_ya/status/1392763691599237121'
[twitter][debug] Using TwitterTweetExtractor for 'https://twitter.com/morino_ya/status/1392763691599237121'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): twitter.com:443
[urllib3.connectionpool][debug] https://twitter.com:443 "GET /i/api/2/timeline/conversation/1392763691599237121.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&count=100&ext=mediaStats%2ChighlightedLabel HTTP/1.1" 200 3825

The result is just empty. The direct link to the original tweet works fine, but when the retweet is downloaded from another user's timeline, it is ignored.

@mikf
Owner

mikf commented May 14, 2021

Because Twitter is very inconsistent with its results.
The data for tweet 1392763691599237121 does not contain any media entries (*), even though it should, and Retweets usually do contain them. It works with "retweets": "original", though.

(*) There should be an extended_entities entry here, but there isn't.

      "1392763691599237121": {
        "created_at": "Thu May 13 08:48:35 +0000 2021",
        "id_str": "1392763691599237121",
        "full_text": "RT @marunika: 【宣伝】新作読切です!一人暮らしを始めた男の娘の性欲が暴走していく主人公視点の漫画です。ファンザさんはモザイク、他白ヌキです。46頁500円です。\nDLsite→https://t.co/4pi74gWvF9\nFANZA→https://t.co/y…",
        "display_text_range": [
          0,
          140
        ],
        "entities": {
          "user_mentions": [
            {
              "screen_name": "marunika",
              "name": "かにまる🥷",
              "id_str": "129260683",
              "indices": [
                3,
                12
              ]
            }
          ],
          "urls": [
            {
              "url": "https://t.co/4pi74gWvF9",
              "expanded_url": "https://dlsite.jp/mawot/RJ327260/?utm_content=RJ327260",
              "display_url": "dlsite.jp/mawot/RJ327260…",
              "indices": [
                95,
                118
              ]
            }
          ]
        },
        "source": "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Web App</a>",
        "user_id_str": "1321392586066595842",
        "retweeted_status_id_str": "1392756582925049867",
        "retweet_count": 15,
        "favorite_count": 0,
        "reply_count": 0,
        "quote_count": 0,
        "conversation_id_str": "1392763691599237121",
        "possibly_sensitive_editable": true,
        "lang": "ja"
      }
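
To make the missing piece concrete, here is a minimal sketch of the check described above. It assumes media is listed under `extended_entities` → `media`, as the comment implies; the helper name `has_media`, the trimmed entries, and the example.org URL are hypothetical and not taken from gallery-dl.

```python
def has_media(tweet: dict) -> bool:
    """Return True if the tweet entry lists at least one media item."""
    return bool(tweet.get("extended_entities", {}).get("media"))


# Trimmed copy of the entry shown above: it has "entities",
# but no "extended_entities" key at all.
retweet = {
    "id_str": "1392763691599237121",
    "retweeted_status_id_str": "1392756582925049867",
    "entities": {"user_mentions": [], "urls": []},
}

# Hypothetical entry, included only to show the positive case.
with_media = {
    "id_str": "0",
    "extended_entities": {
        "media": [{"media_url_https": "https://example.org/placeholder.jpg"}]
    },
}

print(has_media(retweet))     # False
print(has_media(with_media))  # True
```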

@nisehime
Author

I see. Well, twMediaDownloader doesn't have this issue. As far as I can tell, it uses slightly different API URLs, and its responses contain the media links for the tweet.

@nisehime
Author

After setting "retweets": "original", I see that the request URL hasn't changed, nor are there any additional requests. Does it mean it gets the metadata from the same response?

@mikf
Owner

mikf commented May 14, 2021

twMediaDownloader uses the official Twitter API, gallery-dl only uses the "site-internal" API (what your browser uses while on Twitter). The official API needs a consumer_key and consumer_secret, and I wouldn't want to publish the credentials associated with my Twitter account. It's fine for sites like DeviantArt, but maybe not Twitter. If I do implement support for the official API, I'd want to at least wait until they're done with API v2. Also #980.

Does it mean it gets the metadata from the same response?

Yep, Twitter returns both the Retweet and the original Tweet (as well as potential replies). "retweets": true uses the Retweet entry, "retweets": "original" uses the original Tweet. You could argue that "original" should be the default, but that would break backwards compatibility (i.e. someone would complain if anything changed).
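
The selection between the two entries can be sketched roughly as follows. This is an illustration only, not gallery-dl's actual code: it assumes the response has been reduced to a dict of tweet entries keyed by ID (as in the excerpt above), and `select_entry`, the trimmed entries, and the example.org URL are hypothetical.

```python
RETWEET_ID = "1392763691599237121"
ORIGINAL_ID = "1392756582925049867"

# Assumed shape: one response holding both entries, keyed by tweet ID.
tweets = {
    RETWEET_ID: {
        "id_str": RETWEET_ID,
        "user_id_str": "1321392586066595842",
        "retweeted_status_id_str": ORIGINAL_ID,
    },
    ORIGINAL_ID: {
        "id_str": ORIGINAL_ID,
        # Hypothetical media entry; the real one is not shown in the thread.
        "extended_entities": {
            "media": [{"media_url_https": "https://example.org/placeholder.jpg"}]
        },
    },
}


def select_entry(tweets: dict, tweet_id: str, retweets=True) -> dict:
    """Pick the entry to read from, based on the "retweets" setting.

    True       -> the Retweet entry itself
    "original" -> the original Tweet the Retweet points to
    """
    tweet = tweets[tweet_id]
    original_id = tweet.get("retweeted_status_id_str")
    if retweets == "original" and original_id in tweets:
        return tweets[original_id]
    return tweet


print(select_entry(tweets, RETWEET_ID, True)["id_str"])        # 1392763691599237121
print(select_entry(tweets, RETWEET_ID, "original")["id_str"])  # 1392756582925049867
```

No second request is needed in either mode, because both entries come from the same response.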

@nisehime
Author

nisehime commented May 14, 2021

twMediaDownloader uses the official Twitter API

Wasn't it using it only to download videos? Unless it has changed recently.

You could argue that "original" should be the default

Not really. Setting it to "original" will make it impossible to save retweets to the folder of the user who is retweeting, won't it? Can't the program just check the original Tweet entry for media files if it hasn't found any in the Retweet entry?
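
The fallback suggested here could look something like the sketch below. It is not how gallery-dl behaves; it only illustrates the idea, assuming a dict of tweet entries keyed by ID and media listed under `extended_entities`. The function name `media_with_fallback` and the example.org URL are hypothetical.

```python
def media_with_fallback(tweets: dict, tweet_id: str):
    """Return (entry, media): keep the Retweet's own metadata, but borrow
    the media list from the original Tweet when the Retweet has none."""
    tweet = tweets[tweet_id]
    media = tweet.get("extended_entities", {}).get("media")
    if not media:
        # Look up the original Tweet in the same response, if present.
        original = tweets.get(tweet.get("retweeted_status_id_str"), {})
        media = original.get("extended_entities", {}).get("media", [])
    return tweet, media


tweets = {
    "1392763691599237121": {
        "id_str": "1392763691599237121",
        "user_id_str": "1321392586066595842",
        "retweeted_status_id_str": "1392756582925049867",
    },
    "1392756582925049867": {
        "id_str": "1392756582925049867",
        # Hypothetical media entry for illustration.
        "extended_entities": {
            "media": [{"media_url_https": "https://example.org/placeholder.jpg"}]
        },
    },
}

entry, media = media_with_fallback(tweets, "1392763691599237121")
print(entry["user_id_str"])  # 1321392586066595842 (the retweeting user)
print(len(media))            # 1
```

Because the returned entry is still the Retweet, files could be sorted under the retweeting user's folder while the media itself comes from the original Tweet.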

@mikf
Owner

mikf commented May 15, 2021

Wasn't it using it only to download videos? Unless it has changed recently.

It seems it is using the official API for regular timelines and https://api.twitter.com/2/timeline/media/ (same as gallery-dl) for media timelines.
