twick is a command-line tool for fetching and storing tweets on short notice.
twick fetches tweets that match a given search query and stores them in any SQLAlchemy-supported database (SQLite, PostgreSQL, MySQL, and more).
Developed at BuzzFeed.
pip install twick
To authenticate its API requests, twick requires the standard set of Twitter credentials: API key, API secret, access token, and access token secret. (For instructions on how to obtain these credentials, read this StackOverflow answer or follow Dan Nguyen's guide.) You can supply them either via the --credentials command-line argument (as four space-separated strings) or by setting the following environment variables in your shell:
export TWICK_API_KEY="[replace me]"
export TWICK_API_SECRET="[replace me]"
export TWICK_ACCESS_TOKEN="[replace me]"
export TWICK_ACCESS_TOKEN_SECRET="[replace me]"
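As a rough illustration of the two options above, this Python sketch resolves the four credentials either from an explicit list (such as --credentials would supply) or from the TWICK_* environment variables. The helper name resolve_credentials, and the choice to let explicit values take precedence over the environment, are assumptions made for this example and are not taken from twick's source.

```python
import os

# Environment variable names, in the order the credentials are listed above.
ENV_VARS = (
    "TWICK_API_KEY",
    "TWICK_API_SECRET",
    "TWICK_ACCESS_TOKEN",
    "TWICK_ACCESS_TOKEN_SECRET",
)


def resolve_credentials(cli_values=None, environ=None):
    """Return the four credentials as a tuple (hypothetical helper).

    Assumption: explicitly supplied values win over environment variables.
    """
    environ = os.environ if environ is None else environ
    if cli_values:
        if len(cli_values) != len(ENV_VARS):
            raise ValueError("expected four credentials")
        return tuple(cli_values)
    # Fall back to the environment and report every missing variable at once.
    missing = [name for name in ENV_VARS if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(environ[name] for name in ENV_VARS)
```

Calling resolve_credentials() with no arguments reads the variables exported above; a missing variable raises an error that names it.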
twick has two subcommands:
- twick fetch polls for new tweets at a regular interval.
- twick backfill pulls earlier tweets, and stops when it can find no more.
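The difference between the two subcommands can be sketched in Python with the since_id and max_id cursors that Twitter's search API exposes. This is a minimal sketch under stated assumptions, not twick's actual code: the function names and the fake_search stand-in (which serves tweets with ids 1 to 5, two per page) are hypothetical.

```python
def backfill(search, query):
    """Walk backwards through older tweets until a page comes back empty."""
    collected, max_id = [], None
    while True:
        page = search(query, max_id=max_id)
        if not page:
            break
        collected.extend(page)
        # Next, ask for tweets strictly older than the oldest one seen so far.
        max_id = min(tweet["id"] for tweet in page) - 1
    return collected


def fetch_once(search, query, since_id=None):
    """One polling step: return tweets newer than since_id, plus the new cursor."""
    page = search(query, since_id=since_id)
    if page:
        since_id = max(tweet["id"] for tweet in page)
    return page, since_id


def fake_search(query, since_id=None, max_id=None, count=2, _ids=(1, 2, 3, 4, 5)):
    """Stand-in for the search API: newest first, at most `count` per page."""
    ids = [
        i for i in sorted(_ids, reverse=True)
        if (since_id is None or i > since_id)      # since_id is exclusive
        and (max_id is None or i <= max_id)        # max_id is inclusive
    ]
    return [{"id": i} for i in ids[:count]]
```

In this sketch, a polling loop would call fetch_once repeatedly, sleeping between calls and passing the returned cursor back in, while backfill runs once and returns when the search yields nothing older.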
Both subcommands store basic data on each tweet (id, text, created_at, user_name, screen_name, and user_location) and each API response (query, count, completed_in, max_id, since_id, refresh_url, next_results).
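To make the per-tweet fields concrete, here is a minimal Python sketch of how they could be excerpted from a tweet object as returned by Twitter's search API, where the user's name, screen name, and location are nested under a user key. The function name excerpt_tweet and the sample values are invented for illustration; twick's own extraction code may differ.

```python
def excerpt_tweet(tweet):
    """Flatten a tweet dict into the six stored fields (hypothetical helper)."""
    user = tweet.get("user", {})
    return {
        "id": tweet["id"],
        "text": tweet["text"],
        "created_at": tweet["created_at"],
        "user_name": user.get("name"),
        "screen_name": user.get("screen_name"),
        "user_location": user.get("location"),
    }


# Made-up sample in the shape of a search result, for demonstration only.
sample = {
    "id": 1,
    "text": "example text",
    "created_at": "Wed Mar 12 14:00:00 +0000 2014",
    "user": {"name": "Example User", "screen_name": "example", "location": "NYC"},
    "retweet_count": 0,  # extra fields are dropped unless the raw JSON is stored
}
row = excerpt_tweet(sample)
```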
Your search query will be the first argument after each subcommand. You can also supply any of these optional arguments:
- --db [connection string]: Any valid SQLAlchemy connection string, describing where to store your results. Default: sqlite:///twick.sqlite
- --throttle [num]: Wait [num] seconds between API requests. Defaults to 15 to stay under standard rate limits.
- --store-raw: Store raw tweet JSON, in addition to excerpted fields described above.
- --quiet: Silence logging.
- --credentials [api_key, api_secret, access_token, access_token_secret]: See "Setup" above.
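When choosing a --throttle value, it can help to work out how many requests a given delay produces per rate-limit window. The Python sketch below assumes a 15-minute window, which is how Twitter's standard API has expressed its limits; that window length and the helper name requests_per_window are assumptions, so compare the result against the limit that applies to your own credentials.

```python
WINDOW_SECONDS = 15 * 60  # assumed rate-limit window of 15 minutes


def requests_per_window(throttle_seconds, window_seconds=WINDOW_SECONDS):
    """Upper bound on requests sent per window when waiting between requests."""
    return window_seconds // throttle_seconds


for throttle in (5, 15, 60):
    print(throttle, requests_per_window(throttle))
# 5 180
# 15 60
# 60 15
```

The figure is an upper bound, since it ignores the time each request itself takes.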
twick fetch "harlem building collapse" --db sqlite:///tweets.db
twick fetch "drone from:buzzfeedben" --db sqlite:///ben-drone-tweets.sqlite --throttle 60
twick backfill "to:davidplotz pandas" --store-raw --throttle 5