
TCAT 4CAT Comparison

Dale Wahl edited this page Sep 29, 2022 · 32 revisions

Both DMI-TCAT and 4CAT offer a suite of analyses to run on datasets. It would be nice to offer parity between the two tools in this regard, so teaching materials can be adapted more easily and 4CAT can be a viable alternative to TCAT in some circumstances.

Tweet statistics and activity metrics

| TCAT Analysis | TCAT description | 4CAT equivalent |
| --- | --- | --- |
| Tweet stats | Contains the number of tweets, number of tweets with links, number of tweets with hashtags, number of tweets with mentions, number of retweets, and number of replies | ✔️ Twitter Statistics |
| User stats (overall) | Contains the min, max, average, Q1, median, Q3, and trimmed mean for: number of tweets per user, URLs per user, number of followers, number of friends, number of tweets, unique users per time interval | ✔️ Aggregated Statistics by Tweet Author |
| User stats (individual) | Lists users and their number of tweets, number of followers, number of friends, how many times they are listed, their UTC time offset, whether the user has a verified account and how many times they appear in the data set. | ✔️ Individual User Statistics |
| Hashtag frequency | Contains hashtag frequencies. | ✔️ Count Values with hashtags column |
| Hashtag-user activity | Lists hashtags, the number of tweets with that hashtag, the number of distinct users tweeting with that hashtag, the number of distinct mentions tweeted together with the hashtag, and the total number of mentions tweeted together with the hashtag. | ✔️ Hashtag Statistics |
| Twitter client (source) frequency | Contains source frequencies. | ✔️ Count Values with source column |
| Twitter client (source) stats (overall) | Contains the min, max, average, Q1, median, Q3, and trimmed mean for: number of tweets per source, URLs per source | ✔️ Aggregated Statistics by Source of Tweet |
| Twitter client (source) stats (individual) | Lists sources and their number of tweets, retweets, hashtags, URLs and mentions. | ✔️ Source Statistics |
| User visibility (mention frequency) | Lists usernames and the number of times they were mentioned by others. | ✔️ User Visibility |
| User activity (tweet frequency) | Lists usernames and the number of tweets posted. | ✔️ User Visibility |
| User activity + visibility (tweet+mention frequency) | Lists usernames with both tweet and mention counts. | ✔️ User Visibility |
| Identical tweet frequency | Contains tweets and the number of times they have been (re)tweeted identically. | ✔️ Identical Tweet Frequency |
| Word frequency | Contains words and the number of times they have been used. | Tokenise -> Vectorise -> Top vectors |
| Media frequency | Contains media URLs and the number of times they have been used. | ✔️ Count Values with urls column |
| Export table with potential gaps in your data | Exports a spreadsheet with all known data gaps in your current query, during which TCAT was not running or capturing data for this bin. | Not natively; there is a 4CAT datasource that can interact with a TCAT database and allows you to pull any table, including the Error/Gap tables |
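
To make the "User stats (overall)" row concrete, here is a minimal Python sketch of the summary statistics it lists, computed over the number of tweets per user. It is not the code of either TCAT or 4CAT. The function name `summarise`, the sample data, the 10% trim fraction and the choice of the "inclusive" quartile method are all assumptions made for illustration; the tools may define quartiles and the trimmed mean differently.

```python
from collections import Counter
from statistics import mean, median, quantiles


def summarise(values, trim=0.1):
    """Return min, max, mean, Q1, median, Q3 and a trimmed mean for a list of numbers.

    `trim` is the fraction of values dropped from EACH end before averaging
    (an assumption; the trim used by TCAT is not documented here).
    """
    ordered = sorted(values)
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    k = int(len(ordered) * trim)  # number of values to drop from each end
    trimmed = ordered[k:len(ordered) - k]
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": mean(ordered),
        "Q1": q1,
        "median": median(ordered),
        "Q3": q3,
        "trimmed_mean": mean(trimmed),
    }


# Hypothetical sample: one author name per tweet.
authors = ["ann", "bob", "ann", "cat", "ann", "bob", "dan", "eve", "eve", "ann"]
tweets_per_user = Counter(authors)  # author -> number of tweets
stats = summarise(list(tweets_per_user.values()))
print(stats)
```

The same function could be applied to any per-user or per-source count, such as URLs per user or tweets per source.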

Tweet exports

| TCAT Analysis | TCAT description | 4CAT equivalent |
| --- | --- | --- |
| Random set of tweets from selection | Contains 1000 randomly selected tweets and information about them (user, date created, ...). | ✔️ Random Sample |
| Export all tweets from selection | Contains all tweets and information about them (user, date created, ...). | ✔️ Download NDJSON or CSV |
| List each individual retweet | Lists all retweets (and all the tweet metadata, like follower_count) chronologically. | ✔️ Filter by value with is_retweet = yes |
| Only tweets with lat/lon | Contains only geo-located tweets. | ✔️ Filter by value with place_name or long_lat |
| Export tweet ids | Contains only the tweet ids from your selection. | Download CSV, extract id column |
| Export hashtag table (tweet id, hashtag) | Contains tweet ids from your selection and hashtags. | Download CSV, open in Excel and write a formula; or run Custom Network (directed) with id and hashtags, open the resulting gexf file in Excel as an XML file (click Yes/OK at the prompts), and use the source and target columns |
| Export mentions table (tweet id, user from id, user from name, user to id, user to name, mention, mention type) | Contains tweet ids from your selection, with mentions and the mention type. | ✔️ Mentions Export |
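
For the two rows without a native 4CAT equivalent (tweet ids and the hashtag table), a short script can stand in for the Excel steps. The sketch below assumes the downloaded CSV has an `id` column and a `hashtags` column holding comma-separated values; check the actual export for the column names and separator. The function name `hashtag_rows` and the sample data are hypothetical.

```python
import csv
import io


def hashtag_rows(csv_text, id_column="id", hashtag_column="hashtags", separator=","):
    """Yield one (tweet id, hashtag) pair per hashtag in each CSV row.

    Column names and the separator are assumptions about the export format.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        for tag in row[hashtag_column].split(separator):
            tag = tag.strip()
            if tag:  # skip tweets without hashtags
                yield row[id_column], tag


# Hypothetical three-tweet export; the second tweet has no hashtags.
sample = 'id,hashtags\n1,"dmi,tcat"\n2,\n3,4cat\n'

# "Export tweet ids": keep only the id column.
tweet_ids = [row["id"] for row in csv.DictReader(io.StringIO(sample))]

# "Export hashtag table": one row per (tweet id, hashtag).
pairs = list(hashtag_rows(sample))
print(tweet_ids)
print(pairs)
```

For a real file, the CSV text could be read with `open(path, newline="", encoding="utf-8").read()` and the pairs written back out with `csv.writer`.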

Networks

| TCAT Analysis | TCAT description | 4CAT equivalent |
| --- | --- | --- |
| Social graph by mentions | Produces a directed graph based on interactions between users. If a user mentions another, a directed link is created. The more often a user mentions another, the stronger the link ("link weight"). The "count" value contains the number of tweets for each user in the specified period. | ✔️ Custom Network with author and mentions |
| Social graph by in_reply_to_status_id | Produces a directed graph based on interactions between users. If a tweet was written in reply to another tweet, a directed link is created. | ✔️ Filter by value with is_reply = yes -> Custom Network with author and thread_id (for original tweet) or author and reply_to (for author of original tweet) |
| Co-hashtag graph | Produces an undirected graph based on co-word analysis of hashtags. If two hashtags appear in the same tweet, they are linked. The more often they appear together, the stronger the link ("link weight"). | ✔️ Co-tag network |
| Bipartite hashtag-user graph | Produces a bipartite graph based on co-occurrence of hashtags and users. If a user wrote a tweet with a certain hashtag, there will be a link between that user and the hashtag. The more often they appear together, the stronger the link ("link weight"). | ✔️ Bipartite Author-tag Network |
| Bipartite hashtag-mention graph | Produces a bipartite graph based on co-occurrence of hashtags and @mentions. If an @mention co-occurs in a tweet with a certain hashtag, there will be a link between that @mention and the hashtag. The more often they appear together, the stronger the link ("link weight"). | ✔️ Custom Network with hashtags and mentions |
| Bipartite hashtag-source graph | Produces a bipartite graph based on co-occurrence of hashtags and "sources" (the client a tweet was sent from is its source). If a hashtag is tweeted from a particular client, there will be a link between that client and the hashtag. The more often they appear together, the stronger the link ("link weight"). | ✔️ Custom Network with hashtag and source columns |
| Bipartite user-source graph | Produces a bipartite graph based on co-occurrence of users and "sources" (the client a tweet was sent from is its source). If a user tweets from a particular client, there will be a link between that client and the user. The more often they appear together, the stronger the link ("link weight"). | ✔️ Custom Network with author and source columns |
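
The link-weight logic described for the mention graph and the co-hashtag graph can be sketched in a few lines of Python. This only illustrates the descriptions above and is not how either tool implements its networks. The function names, the dictionary keys `author`, `mentions` and `hashtags`, and the sample tweets are assumptions.

```python
from collections import Counter
from itertools import combinations


def mention_edges(tweets):
    """Directed edges (author -> mentioned user); the count is the link weight."""
    edges = Counter()
    for tweet in tweets:
        for mentioned in tweet["mentions"]:
            edges[(tweet["author"], mentioned)] += 1
    return edges


def cohashtag_edges(tweets):
    """Undirected edges between hashtags that appear in the same tweet."""
    edges = Counter()
    for tweet in tweets:
        # Sort so that (a, b) and (b, a) are counted as the same undirected edge.
        for a, b in combinations(sorted(set(tweet["hashtags"])), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical tweets, already parsed into lists of mentions and hashtags.
tweets = [
    {"author": "ann", "mentions": ["bob"], "hashtags": ["tcat", "dmi"]},
    {"author": "ann", "mentions": ["bob", "cat"], "hashtags": ["dmi", "tcat", "4cat"]},
    {"author": "bob", "mentions": ["ann"], "hashtags": ["4cat"]},
    {"author": "cat", "mentions": [], "hashtags": []},
]

mentions = mention_edges(tweets)
cotags = cohashtag_edges(tweets)
tweet_counts = Counter(t["author"] for t in tweets)  # the per-user "count" value
print(mentions)
print(cotags)
```

The bipartite graphs in the table follow the same pattern as `mention_edges`, with the two node types taken from different columns (for example hashtag and source).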

Experimental

| TCAT Analysis | TCAT description | 4CAT equivalent |
| --- | --- | --- |
| Cascade | The cascade interface provides a ground level view of tweet activity by charting every single tweet in the current selection. User accounts are distributed vertically; tweets - shown as dots - are spread out horizontally over time. Lines indicate retweets. | ? |
| The Sankey Maker | Produces an alluvial diagram. | ? |
| Associational profile (hashtags) | Produces an associational profile as well as a time-encoded co-hashtag network. | ? |
| Modulation Sequencer (URL) | The tool allows one to qualitatively examine how a URL is shared on Twitter over time. See Moats and Borra (2018) for a full explanation. | ? |