Use of V2 API #445

sugarcane29 · 2020-09-10T09:13:39Z

Is it possible to use this package with the Twitter V2 API?
I'm trying to do some historical search and it appears that the V2 API allows for it in the free version.

Ref: start_time and end_time in the api: https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-recent
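As a rough illustration of the two parameters referenced above, here is a minimal base R sketch that builds the query parameters for a recent-search request. The helper names `format_v2_time()` and `recent_search_params()` are hypothetical and are not part of rtweet; the sketch assumes the endpoint expects UTC timestamps in the `YYYY-MM-DDTHH:mm:ssZ` form.

```r
# Hypothetical helper (not part of rtweet): format a time as a UTC timestamp
# in the YYYY-MM-DDTHH:mm:ssZ form assumed for start_time and end_time.
format_v2_time <- function(x) {
  format(as.POSIXct(x, tz = "UTC"), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# Hypothetical helper: collect the query parameters for a recent-search call.
# Only the parameters that were actually supplied end up in the list.
recent_search_params <- function(query, start_time = NULL, end_time = NULL) {
  params <- list(query = query)
  if (!is.null(start_time)) params$start_time <- format_v2_time(start_time)
  if (!is.null(end_time)) params$end_time <- format_v2_time(end_time)
  params
}

params <- recent_search_params(
  "rstats",
  start_time = as.POSIXct("2020-09-05 00:00:00", tz = "UTC"),
  end_time = as.POSIXct("2020-09-09 12:30:00", tz = "UTC")
)
str(params)
```

The resulting list could be passed as the `query` argument of an HTTP client call; the request itself is left out here because it needs credentials.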

I tried changing the query parameters in the source and rebuilding the package, but it is throwing lots of errors.

Thanks.

JNavelski · 2020-09-11T18:11:50Z

Hi, I am also trying to use rtweet with the V2 API from Twitter (with a developer account), and for some reason I am having a difficult time getting historical data from my timeline. I would like to pull the last 30 days from my timeline, but I am still just getting the results I would get if I pulled without the developer account.

andypiper · 2020-09-12T02:44:17Z

There is no 30 day search or user timeline option in the Twitter API V2 yet but these are both coming soon. We are as excited as you are to see rtweet expand to support V2!

alexpghayes · 2021-03-03T20:44:22Z

I would be very happy to contribute v2 API functionality, especially now since it seems rtweet is back under active development. Can @mkearney, @llrs, and @hadley perhaps comment on the current governance strategy for rtweet? Are y'all the right people to talk to about this?

Also, it seems like the internals are undergoing a fairly extensive re-write at the moment, largely by @hadley. @hadley, do you plan on implementing v2 API infrastructure, or will you have to switch your attention to other projects soon? If not, is there any dev-facing documentation about the new internals to read to get oriented, or is the appropriate place to start in the code itself?

llrs · 2021-03-03T21:37:36Z

Good questions @alexpghayes. I tried reaching @mkearney via several methods (Twitter, GitHub, email) but got no responses over several months. I asked rOpenSci to let me maintain the package because I had some interest in new functionality and there were lots of issues and pending PRs. @sckott gave me permissions about 2 weeks ago (2021/02/15), but we haven't talked about governance strategy; if anyone wants to take over the maintenance from me, that is ok.

Initially I wanted to solve the issues and later on start making more profound changes without breaking changes (see #471). But since @hadley contributed a new internal process to call Twitter's API and has rewritten some functions, this will no longer be possible. He is now working (temporarily?) on other projects (see comments on #526 and #523).

About the API v2 itself I haven't read much about it yet. As far as I know the path to the endpoint is different and now you can select which fields you want to get when making GET requests. You'll probably need to start modifying the TWIT_method to accept v2 endpoints and work from there (TWIT_GET and the new user facing function). There is no documentation of the internals. Hope this helps

hadley · 2021-03-03T22:06:15Z

I looked into the v2 API very little, but I wondered if it might be better to pull the hard-coded version number out of TWIT_get() so it has to be specified in the call, e.g. TWIT_get(token, "1/tweets/search") (or whatever the endpoint is). If we go down that path, it would be good to decide now and fix all calls in a PR before starting more material work.
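To make the idea concrete, here is a minimal sketch of a URL builder that takes the API version as the first segment of the path instead of hard-coding it. `api_url()` is a hypothetical stand-in and not rtweet's actual TWIT_get(); the host and the two example paths are assumptions based on Twitter's public v1.1 and v2 endpoints.

```r
# Hypothetical stand-in for the URL-building part of an internal GET helper.
# The API version is the first path segment supplied by the caller, so the
# same helper can serve both v1.1 and v2 endpoints.
api_url <- function(path, host = "https://api.twitter.com") {
  stopifnot(is.character(path), length(path) == 1)
  version <- strsplit(path, "/", fixed = TRUE)[[1]][1]
  if (!version %in% c("1.1", "2")) {
    stop("`path` must start with an API version, \"1.1\" or \"2\".", call. = FALSE)
  }
  paste0(host, "/", path)
}

api_url("1.1/search/tweets.json")
api_url("2/tweets/search/recent")
```

Making the version part of every call is more verbose, but it keeps each call site explicit about which API generation it targets.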

alexpghayes · 2021-03-03T23:24:16Z

My impression from a day of playing with v2 is that it is dramatically better and well worth investing in internal infrastructure to support v2. From a research perspective, it enables a lot. However, the return objects also seem to be fundamentally different. I will read the v2 documentation more thoroughly tonight, but based on my current understanding we'd probably want entirely separate interfaces to the v2 endpoints and the v1.1 endpoints. I'm not sure how to approach this.

@hadley I can make a PR with a revision of TWIT_get() to support version specification and request a review when it's ready?

hadley · 2021-03-04T00:34:23Z

@alexpghayes yeah, that’d be great.

hadley · 2021-03-04T13:04:29Z

@alexpghayes once the PR is merged, I'd suggest making a list of 1.1 vs 2 endpoints to figure out where we need to add behaviour vs change existing behaviour. It'll probably be easiest to start by adding functions for capabilities that are not available in v1.1, in order to get a sense of the API and what additional parameters the functions will need. I'd suggest picking one function to implement (maybe #363) then doing a PR to make a solid foundation for future work.

alexpghayes · 2021-03-09T04:02:19Z

Two key resources appear to be the migration guide: https://developer.twitter.com/en/docs/twitter-api/migrate and the data dictionary: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/using-fields-and-expansions. In particular, the data returned by v2 looks very different from the data returned by v1. Presumably (but not definitively) we'll want to figure out a map from the new, more structured and nested JSON objects to a single line in a tibble for several different varieties of JSON object. Additionally, it looks like the API itself is moving from a "you get everything" model to an "explicitly request the data you want" model. There is then a whole new conceptual model for explicitly requesting the data you want with fancy arguments.

Based on the level of enthusiasm for https://github.com/cjbarrie/academictwitteR it seems like a good place to begin would be with the full archive search. I'll start on a PR this week.
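The "explicitly request the data you want" model described above can be sketched in base R as follows. `v2_field_params()` is a hypothetical helper, not an rtweet function; `expansions`, `tweet.fields` and `user.fields` are query parameters described in Twitter's fields-and-expansions guide, and the specific field names used below are examples only.

```r
# Hypothetical helper: turn R character vectors into the comma-separated
# query parameters that v2 endpoints use to select fields and expansions.
# Arguments left as NULL are dropped, so the API falls back to its defaults.
v2_field_params <- function(expansions = NULL, tweet_fields = NULL,
                            user_fields = NULL) {
  params <- list(
    expansions = expansions,
    tweet.fields = tweet_fields,
    user.fields = user_fields
  )
  params <- Filter(Negate(is.null), params)
  lapply(params, paste, collapse = ",")
}

p <- v2_field_params(
  expansions = "author_id",
  tweet_fields = c("created_at", "public_metrics")
)
str(p)
```

A wrapper function could expose these as ordinary R arguments with sensible defaults, which is one way to hide the "fancy arguments" from users who just want everything.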

alexpghayes · 2021-03-09T04:04:41Z

From the migration guide, the new API endpoints are:

Resource	Endpoint group	How it can be used	Implemented
Tweets	Tweets lookup	Returns information about a Tweet or group of Tweets.	[ ]
--	Recent search	Returns Tweets over the last seven to nine days that match your query criteria.	[ ]
--	Full-archive search	Query the complete archive of public Tweets created since the first Tweet in March 2006. This endpoint is currently only available with the Academic Research product track	[ ]
--	User Tweet timeline	Returns the Tweets composed by, or mentioning, a specified Twitter account.	[ ]
--	User mention timeline	Returns the Tweets mentioning a specified Twitter account.	[ ]
--	Filtered stream	Delivers Tweets which match your rules through a persistent HTTPS streaming connection.	[ ]
--	Sampled stream	Delivers about a 1% sample of all new public Tweets as they happen through a persistent HTTP streaming connection.	[ ]
--	Hide replies	Hides or unhides replies to Tweets that you or other authenticated users publish.	[ ]
Users	Users lookup	Returns the profile information for a given user with the newly added ability to specify fields to be returned.	[ ]
--	Follows lookup	Retrieve an account’s followers and who they are following using their user ID.	[ ]
--	Manage follows	Follow or unfollow users using their user ID.	[ ]

alexpghayes · 2021-03-14T21:59:58Z

I started on this at https://github.com/alexpghayes/rtweet/tree/v2 but the current bearer token interface is challenging enough to work with that I'm putting this on hold until #469 is resolved.

nikolassch · 2021-04-12T12:59:23Z

Dear All, thanks a lot for your work!!

Do you have a broad estimate when the usage of V2 API will be available? I would be particularly interested in searching by conversation_ids.

llrs · 2021-04-12T13:38:07Z

Probably this summer I'll make a sprint on rtweet (if it is not included by then, I would include v2 API). Then I would like to get some feedback and would leave some time before finally submitting to CRAN. The best estimate I have of a CRAN release with v2 support is before the end of the year.

Note that rtweet development version has a function to retrieve threads: tweet_threading.

If tweet_threading doesn't work and you want to have sooner support from rtweet, you can send pull requests and we can work on it. I think that @alexpghayes was also interested and started working on this.

alexpghayes · 2021-04-12T17:28:18Z

Now that #542 is merged v2 support is once again on my radar, but in classic academic fashion it is but one of many competing priorities at the moment.

sugarcane29 · 2021-04-12T18:41:48Z

I'm sorry if these are elementary questions; I'm only asking as I'd like to help in any way I can.

Last year, after getting a response that V2 wasn't supported, I made a few changes to the rtweet internals and I got it working. (I had just changed the URLs that were being called from within rtweet.)
I wasn't making complicated queries, but I didn't face any issues with search, trends (and I don't remember exactly, but maybe also tweet ids).
The results were the same with the Node package that I shifted to for V2.

I've gone through the thread a couple of times and also the various other issues mentioned here, but I couldn't get a few things.
Would be really nice if someone could clarify:

Will just changing the URLs not be enough?
Are we aiming for a more thorough overhaul?
Will there be any significant differences between this and the academictwitteR package?
How can I contribute? (I'm comfortable with R coding, but have never built any packages.) (If my earlier code, the one mentioned above, can be of help, I'll dig through my history to find it.)

Thanks.
Once again, apologies for asking some really basic questions.

hadley · 2021-04-12T19:16:02Z

@sugarcane29 it's not a one-to-one mapping from v1 to v2 because more has changed than just the url (e.g. you can now select which fields to include in the results). So this makes it an opportunity to revisit rtweet's API to make bigger changes. I think it would be a missed opportunity to just change the URLs and not reconsider the large API.

llrs · 2021-04-13T08:22:58Z

Hi @sugarcane29, many thanks for opening the issue and wanting to help. Help maintaining this package is always welcomed!

Adding to Hadley's comments: Changing the URL will not be enough because API v2 does not provide all the functionality of API v1 (yet) and when it does, it does not return the same data or work the same way.
Yes, after 2 years without any maintenance or improvement, there were many bugs, and rtweet is in the process of a big overhaul spearheaded by Hadley. Besides, code for API v1 might not work well for v2 for the reasons mentioned above.
I think that academictwitteR is meant to be a temporary solution until rtweet supports v2.
Perhaps you have one endpoint that you wish rtweet supported. You can contribute code with a pull request to add it. If it meets the overall design and quality of rtweet I will include the code in rtweet. However, due to recent work on rtweet, I don't think the changes you made last year would be up to date with the current code.

Hope this helps to clarify your doubts.

alexpghayes · 2021-11-16T01:21:37Z

Just in case it wasn't abundantly clear from me being totally AWOL, v2 support is not on any critical research path for me at the moment and I am unlikely to be able to find any significant time to develop to v2 in the near future -- so please don't hold back on implementing stuff if you're excited about making this happen and were holding off on the basis of my comments above!

billmclellan · 2022-03-08T19:06:28Z

just coming into this discussion - anyone taking this up?

llrs · 2022-03-08T21:04:48Z

@billmclellan I'm currently doing other things. But certainly this would be very much welcomed.
If you or anyone else want to start working on this I'll support and advise them to get APIv2 supported on rtweet.

billmclellan · 2022-03-12T16:24:20Z

I've not created a package before, but I've done lots of functions and scripts for my own analysis. Here's what I've started for myself using httr instead of rtweet, as suggested by Twitter for their v2 api. Am I on the right track?
Snippet:
get_user <- function(username, headers, params) {
  # Look up one or more users by username via the v2 users endpoint
  url_handle <- sprintf('https://api.twitter.com/2/users/by?usernames=%s', username)
  response <- httr::GET(url = url_handle,
                        httr::add_headers(.headers = headers),
                        query = params)
  obj <- httr::content(response, as = "text", encoding = "UTF-8")
  # Return the parsed result visibly instead of ending on an assignment
  tibble::as_tibble(jsonlite::fromJSON(obj, flatten = TRUE))
}
pragmantics.txt

llrs · 2022-03-12T16:43:52Z

@billmclellan This is a first step. The attached code is fine for a script but not for a package. The code to use twitter's API v2 should be inside the code of the package (you'll need to fork and modify/add files) and use internal functions of rtweet in order to be added (to ensure user consistency, for instance here you don't handle API rate limits or check users input).
Look how it is done on API v1 and try to emulate this for v2.
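The two gaps mentioned above (checking user input and handling rate limits) could look roughly like this in base R. Both helpers are hypothetical sketches rather than rtweet internals. They assume that usernames are 1 to 15 letters, digits or underscores, that the lookup accepts at most 100 usernames per request, and that a rate-limited response has status 429 and an `x-rate-limit-reset` header holding a Unix timestamp.

```r
# Hypothetical input check: validate usernames before building a request and
# return them as the comma-separated string the endpoint expects.
check_usernames <- function(usernames) {
  if (!is.character(usernames) || length(usernames) == 0 || anyNA(usernames)) {
    stop("`usernames` must be a non-empty character vector without NAs.",
         call. = FALSE)
  }
  usernames <- sub("^@", "", usernames)  # tolerate a leading "@"
  if (!all(grepl("^[A-Za-z0-9_]{1,15}$", usernames))) {
    stop("Usernames must be 1-15 letters, digits or underscores.", call. = FALSE)
  }
  if (length(usernames) > 100) {
    stop("At most 100 usernames per request are assumed here.", call. = FALSE)
  }
  paste(usernames, collapse = ",")
}

# Hypothetical rate-limit helper: given a response status and its headers
# (a plain named list in this sketch), return how many seconds to wait.
seconds_until_reset <- function(status, headers, now = Sys.time()) {
  if (status != 429) {
    return(0)
  }
  reset <- as.numeric(headers[["x-rate-limit-reset"]])
  max(0, reset - as.numeric(now))
}

check_usernames(c("@hadley", "llrs"))
```

In a package, the wait time would typically feed into a retry loop around the request, so that callers do not have to handle rate limiting themselves.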

billmclellan · 2022-03-14T19:22:04Z

thanks @llrs that's what I thought - thanks for confirming

brshallo · 2022-04-04T22:46:24Z

Link to Twitter's documentation on Getting started with R and v2 of the Twitter API (though note that expansion parameter needs to be changed to expansions or will get a 400 error). This looks like what @billmclellan may have used as a starting point.

JessicaGarson · 2022-08-04T20:50:44Z

@brshallo This was fixed a little while back on our end.

llrs · 2022-08-04T20:58:05Z

Thanks Jessica, for checking in and reporting the updates.

To everybody here: I hope in a couple of weeks to retake the development of rtweet (I've moved to the devel branch, if you don't see any change).

llrs · 2023-01-16T00:00:28Z

Update to all interested in this issue!

The latest rtweet release, 1.1.0, supports the streaming endpoints. The next release will support the archive and recent tweets, among others, because I can easily support all the endpoints requiring OAuth 2.0 App-Only (with the bearer token).

Unfortunately, I have some problems authenticating via OAuth 2.0 with PKCE (see this comment for a brief summary), which means no support yet for bookmarks (#344) or for other endpoints requiring it. I don't want to add support for endpoints that could work via OAuth 1.0; I expect its support might end soon, although I might find a way to support it easily. See this table of endpoints and authentications to know which other endpoints are affected by this roadblock.
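For readers unfamiliar with app-only authentication, the sketch below shows the general shape of a bearer-token header in R. It is not rtweet's own authentication mechanism; `bearer_header()` and the environment variable name `TWITTER_BEARER_TOKEN` are assumptions made for illustration.

```r
# Hypothetical helper: build the Authorization header for app-only requests.
# The token is read from an environment variable (name assumed here) so that
# it does not end up hard-coded in scripts.
bearer_header <- function(token = Sys.getenv("TWITTER_BEARER_TOKEN")) {
  if (!is.character(token) || length(token) != 1 || !nzchar(token)) {
    stop("A bearer token is required.", call. = FALSE)
  }
  c(Authorization = paste("Bearer", token))
}

# The named character vector can be handed to an HTTP client, for example:
# httr::GET(url, httr::add_headers(.headers = bearer_header()))
bearer_header("example-token")
```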

Some decisions I am facing in case someone wants to add their opinion:

Would you prefer to get all the data as the previous API did? By default, the new API's endpoints only provide the bare minimum information requested. Currently rtweet returns the minimal data with an easy way to get all the data via the fields and expansions. The API to set expansions and fields might change, as it isn't intuitive or idiomatic.
Do you prefer to mimic the old data or provide new output structures? The API output is different and more flexible; parsing it will be different too and is currently not performed. I will try to keep the new helpers (ids, rbind, entity, ...) working even if the new endpoints return a different output.
Do you have preferences for an endpoint? Currently there is support in the API v2 for the streaming endpoints because they stop working with API v1. Besides the bookmark endpoint, I will focus on supporting the compliance jobs, so that any user can check if they need to delete stored tweets and user info. But there are a lot more, see this guide for mappings between API versions. There is already support for searching the archive in the devel branch: if you are ready for a wild rodeo and have academic research access, you can use search_archive (name might change, see discussions in Add function to list muted accounts #480).

Thanks all for your patience.

erima2020 · 2023-01-20T10:58:29Z

Hello,
Thank you for this polling of preferences !

Would you prefer to get all the data as the previous API did? By default, the new API's endpoints only provide the bare minimum information requested. Currently rtweet returns the minimal data with an easy way to get all the data via the fields and expansions. The API to set expansions and fields might change, as it isn't intuitive or idiomatic.

I would prefer as an option (e.g., with argument complete = TRUE) to have the same fields as in the previous API, which would simplify code on my end, if that is possible, and maybe the minimum information as a default.

Do you prefer to mimic the old data or provide new output structures? The API output is different and more flexible; parsing it will be different too and is currently not performed. I will try to keep the new helpers (ids, rbind, entity, ...) working even if the new endpoints return a different output.

Ideally it would be good to have both options (the current and the more flexible structure)

Do you have preferences for an endpoint? Currently there is support in the API v2 for the streaming endpoints because they stop working with API v1. Besides the bookmark endpoint, I will focus on supporting the compliance jobs, so that any user can check if they need to delete stored tweets and user info. But there are a lot more, see this guide for mappings between API versions. There is already support for searching the archive in the devel branch: if you are ready for a wild rodeo and have academic research access, you can use search_archive (name might change, see discussions in Add function to list muted accounts #480).

In priority, I would like an update on the search_tweets endpoint. The timelines would also be of interest.
Best wishes,
Eric

llrs · 2023-01-20T13:18:40Z

Thanks for sharing your preferences @erima2020 .

There are the expansions and fields arguments to control this; users will need to set those to get all the data available. By default they only provide the minimal information requested. I'm thinking about how to make it easier and more intuitive: currently there is a way to get all the data and a way to get just the default. I might need to experiment a bit more with what is accepted by the API v2.

While I agree that maintaining the old output for old code is always nice (which I broke in the 1.0.2 release), the current parsing of the data to generate the output is slow (imho) and forces, for instance, a user interested in media to retrieve everything. Given that old code will have to be updated anyway, I think it is a good opportunity to provide a faster and better interface for the users (or completely abandon the package :/).

As discussed privately, search_tweets is not at risk of being deprecated by the API. But I expect it will be easy to support, and it might be available soon via the v2 in rtweet. I only mentioned those that were not available with API v1.1 (although I'm missing some, see API v2), as I will focus on at least maintaining current functionality via the v2.

llrs · 2023-04-27T22:30:59Z

Now (since version 1.1 for the streaming endpoints) it is possible to use the API v2.
You can retrieve data from all the endpoints, but rtweet currently only allows managing tweets (POST and DELETE), not likes, lists or similar actions.

sugarcane29 changed the title from "Hi" to "Use of V2 API" Sep 10, 2020

llrs mentioned this issue Feb 15, 2021

Update roadmap #471

Closed

llrs added the enhancement label Feb 17, 2021

llrs mentioned this issue Feb 28, 2021

Get_timeline() doesn't pull older tweets than 3200 tweets cap #278

Closed

hadley mentioned this issue Mar 4, 2021

Support for Academic Product Track? #468

Closed

llrs mentioned this issue Apr 7, 2021

Use of negation term not working in premium search #561

Closed

llrs added the API v2 label Dec 7, 2021

llrs mentioned this issue Jan 31, 2023

Error in check_token_v2(): ! A bearer token is needed for this endpoint. #760

Closed

llrs closed this as completed Apr 27, 2023
