- Install Python IDE, packages
- Understanding HTTP, interacting with API
- Set up API credentials
- Live coding to pull tweets
- What Elon Musk tweeted
- What kinds of tweets Taylor Swift liked
- Tweets related to ND Sports with hashtag #GoIrish
- What people are saying about Justice Breyer's retirement (time permitting)
Client Request - Server Response
Data pulls through the Twitter API follow the HTTP protocol. Knowing the response status codes is very useful for debugging when the API returns an error message.
- Successful response: 200 (OK)
- Client-side error responses: 401 (Unauthorized), 403 (Forbidden), 404 (Not Found)
- A response status code starting with 4 means the error is on the client side.
- Check the request details.
- Server-side error responses: 500 (Internal Server Error), 503 (Service Unavailable)
- A response status code starting with 5 means the error is on the server side (Twitter's servers).
- There is nothing you can do about it except wait and retry.
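The status-code ranges above can be turned into a small debugging helper. This is only an illustrative sketch: the function name `classify_status` and the returned messages are made up for this example and are not part of Tweepy or the Twitter API.

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to the side that most likely needs attention."""
    if 200 <= status_code < 300:
        return "success"
    if 400 <= status_code < 500:
        # 4xx: the request itself is the problem (credentials, permissions, URL).
        return "client error: check the request details"
    if 500 <= status_code < 600:
        # 5xx: the problem is on Twitter's side.
        return "server error: wait and retry"
    return "other"


for code in (200, 401, 403, 404, 500, 503):
    print(code, classify_status(code))
```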
Twitter Limits How Much and How Often You Can Request Data
- There is a cap of 3200 tweets when pulling a given user's timeline.
- Each request returns a maximum of 100 tweets; use a for loop and next_token to send sequential requests.
- To pull more than 3200 tweets from a user's timeline, use search_all_tweets and query operators.
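A minimal sketch of the for loop with next_token described above. It assumes `client` is an already authenticated `tweepy.Client` and `user_id` is a numeric user ID; the helper name `pull_timeline` and the `cap` parameter are invented for this example.

```python
def pull_timeline(client, user_id, cap=3200):
    """Collect up to `cap` tweets from a user's timeline, 100 per request."""
    tweets = []
    token = None  # the first request is sent without a pagination token
    for _ in range(cap // 100):  # 32 requests cover the 3200-tweet cap
        response = client.get_users_tweets(
            id=user_id, max_results=100, pagination_token=token
        )
        tweets.extend(response.data or [])  # data is None on an empty page
        # meta carries next_token while more pages are available
        token = (response.meta or {}).get("next_token")
        if token is None:
            break
    return tweets
```

The loop stops early as soon as the response no longer contains a next_token, so short timelines cost only as many requests as they have pages.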
- By default, the API only returns the id and text fields.
- Tweepy Response
  - data
  - errors
  - includes
    - users
    - tweets
  - meta
- Request additional fields
  - author_id
  - context_annotations
  - conversation_id
  - created_at
  - entities
  - in_reply_to_user_id
  - public_metrics
  - referenced_tweets
- Expansions
- Use expansions to request the data of referenced tweets. Referenced tweets are quoted tweets or the tweets being replied to.
- An example:
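A hedged sketch of requesting additional fields together with the referenced_tweets.id expansion. It assumes `client` is an authenticated `tweepy.Client`; the helper name `pull_with_referenced` and the particular choice of fields are illustrative, not the workshop's original example.

```python
TWEET_FIELDS = ["author_id", "created_at", "public_metrics", "referenced_tweets"]


def pull_with_referenced(client, user_id):
    """Return (tweet, referenced_tweet_or_None, relation_type) rows for one page."""
    response = client.get_users_tweets(
        id=user_id,
        max_results=100,
        tweet_fields=TWEET_FIELDS,
        expansions=["referenced_tweets.id"],
    )
    # Expanded objects arrive in response.includes, keyed by object type.
    includes = response.includes or {}
    by_id = {t.id: t for t in includes.get("tweets", [])}
    rows = []
    for tweet in response.data or []:
        refs = tweet.referenced_tweets or []
        if not refs:
            rows.append((tweet, None, None))  # an original tweet
        for ref in refs:
            # ref.type says how the tweet relates to the referenced one
            rows.append((tweet, by_id.get(ref.id), ref.type))
    return rows
```

A referenced tweet can be missing from includes (for example, if it was deleted), which is why the lookup uses `by_id.get` and may yield None.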
- The liked Tweets lookup allows you to get information about a user's liked Tweets.
- Use the get_liked_tweets function.
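A minimal sketch of a liked-tweets pull, assuming `client` is an authenticated `tweepy.Client`. The helper name `liked_tweet_texts` is invented here, and the username in the usage comment is an assumption for illustration.

```python
def liked_tweet_texts(client, username, max_results=100):
    """Return the text of recent tweets liked by `username`."""
    # get_liked_tweets needs the numeric user ID, so look it up first.
    user = client.get_user(username=username)
    if user.data is None:  # unknown or suspended account
        return []
    response = client.get_liked_tweets(id=user.data.id, max_results=max_results)
    return [tweet.text for tweet in response.data or []]


# Usage (requires real credentials; the handle is assumed):
# texts = liked_tweet_texts(client, "taylorswift13")
```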
- Use search_recent_tweets() (Essential or Elevated access) to pull data from the last 7 days.
- Or use search_all_tweets() (Academic Research access) to pull from the full Twitter archive, back to 2006.
- Simple query: hashtag only
- Building Query
  - keyword
  - "exact phrase match"
  - #
  - @
  - from:
  - to:
  - conversation_id:
  - is:retweet
  - is:quote
  - negation (prefix an operator with -)
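The operators above combine into a single query string, with spaces acting as AND. The sketch below assembles one; `build_query` and its parameters are hypothetical helpers for illustration, not part of Tweepy.

```python
def build_query(keywords=(), phrase=None, hashtag=None,
                from_user=None, exclude_retweets=False):
    """Join search operators into one query string (space means AND)."""
    parts = list(keywords)
    if phrase:
        parts.append(f'"{phrase}"')  # exact phrase match
    if hashtag:
        parts.append("#" + hashtag.lstrip("#"))
    if from_user:
        parts.append("from:" + from_user.lstrip("@"))
    if exclude_retweets:
        parts.append("-is:retweet")  # negation: leading minus sign
    return " ".join(parts)


print(build_query(hashtag="GoIrish", exclude_retweets=True))
# → #GoIrish -is:retweet
```

The resulting string would then be passed as the query argument, for example `client.search_recent_tweets(query=..., max_results=100)`, assuming `client` is an authenticated `tweepy.Client`.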
- Specify Periods
- Date and time format: ISO 8601/RFC 3339, 24-hour clock, UTC timezone.
  - YYYY-MM-DDTHH:mm:ssZ
  - 2022-01-31T00:00:01Z
  - 2022-02-01T23:59:59Z
- Note that the dates above are in UTC, so you will need to convert local time to UTC.
- e.g. ET 2022-01-31 00:00:01 -> UTC 2022-01-31T05:00:01Z
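The local-to-UTC conversion can be done with the standard library instead of by hand. This sketch assumes Python 3.9+ (for `zoneinfo`) and that "ET" means the America/New_York zone; the helper name `to_utc_timestamp` is made up for this example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc_timestamp(local_text, tz_name="America/New_York"):
    """Convert 'YYYY-MM-DD HH:MM:SS' local time to a UTC timestamp string."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=ZoneInfo(tz_name))  # attach the local zone
    # Daylight saving time is handled by the zone database.
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_utc_timestamp("2022-01-31 00:00:01"))
# → 2022-01-31T05:00:01Z
```

The resulting strings can be supplied as the start_time and end_time arguments of the search functions.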