Usage

For bash command line interface help:

$ python -m tweetscrape.twitter_scrape -h

Fetch User Profile Tweets

from tweetscrape.profile_tweets import TweetScrapperProfile 

tweet_scrapper = TweetScrapperProfile("5hirish", 40, 'twitter.csv', 'csv')
tweet_count, tweet_id, tweet_time, dump_path = tweet_scrapper.get_profile_tweets()
print("Extracted {0} tweets till {1} at {2}".format(tweet_count, tweet_time, dump_path))

The TweetScrapperProfile class scrapes the tweets using Twitter frontend APIs with XPATH queries. It requires four parameters, the Twitter username, the number of tweets to scrape (default tweets scraped are 40), the file path to dump the data and the data export format, which can be JSON or CSV. You can even add proxy to the scraper via request_proxies parameter. NOTE: To extract as much tweets as possible set the number of tweets to -1.

The get_profile_tweets() method returns the count of tweets, last extracted tweet id, last extracted tweet time and the file export path of extracted tweets.

id,type,time,author,author_id,re_tweeter,associated_tweet,text,links,hashtags,mentions,reply_count,favorite_count,retweet_count
993872079274508289,tweet,1525792543000,5hirish,428808036,,993872079274508289,"Built @twitter #scrapper inspired by @kennethreitz similar project. Does a bunch of other cool stuff like extracting user tweets with all meta-data, hastags, images, likes, etc. extracting tweets based on keyword or hastag search #python @Github https://github.com/5hirish/tweet_scrapper …pic.twitter.com/bXdnrWXNwr","['https://t.co/ID5hJ6InIu', 'https://t.co/bXdnrWXNwr']","['#scrapper', '#python']","['@Twitter', '@kennethreitz', '@github']",1,14,7
1141791578970894338,tweet,1561059300000,gracecondition,127701253,5hirish,1141791578970894338,everyone else using word2vec:king – man + woman = queenme using word2vec:fish + music = bassfish + friend = chumfish + hair = mulletfish + struggle = flounderoink - pig + bro = wassupyeti – snow + economics = homo economicushttps://graceavery.com/word2vec-fish-music-bass/ …,['https://t.co/UAiViuEnM2'],[],[],17,939,227
1141849459342610437,tweet,1561073100000,Reuters,1652541,5hirish,1141849459342610437,WATCH: Elon Musk gives #E3 audience a preview of gaming in Tesla carspic.twitter.com/u7rVedhDyW,['https://t.co/u7rVedhDyW'],"['#E3', '#E3']",[],3,49,18
1141812196453699584,tweet,1561064216000,xamat,9316452,5hirish,1141812196453699584,"The annoying pop-up about cookies on websites is basically teaching me to click ""ok"" on anything that gets in my way asap, which seems very dangerous and exactly the opposite of what is intended.",[],[],[],1,23,3
1141897990627446784,tweet,1561084671000,data_mike_j,1053368990695706624,5hirish,1141897990627446784,Check out my newest blog post where I build a graph visualization of the #MuellerReport using @spacy_io and #Python including paragraph recommendation engine.https://minimizeuncertainty.com/post/graph-visualization-of-the-mueller-report-with-spacy-and-pyvis/ …,['https://t.co/Q5GGKqmbYv'],"['#MuellerReport', '#Python']",['@spacy_io'],0,31,13
1142137189775507456,tweet,1561141700000,5hirish,428808036,,1142137187783213056,"Share your weekend goals here, could be anything, like reading a book, writing a blog post, preparing your favorite dessert or anything that will give you a positive feeling of #accomplishment for the coming week. Let's check back on Monday.",[],['#accomplishment'],[],1,0,0
1142137187783213056,tweet,1561141700000,5hirish,428808036,,1142137187783213056,"Weekend Goal: Convert my Flask app into a RESTful Flask API app template with Unit tests, Travis CI and Swagger docs. #python Repo: https://github.com/5hirish/flask-restful-template … (Contributors Welcome!)#weekendgoal #accountability",['https://t.co/m7isdCd6cc'],"['#python', '#weekendgoal', '#accountability']",[],1,3,1
1141840676394434560,tweet,1561071006000,naval,745273,5hirish,1141840676394434560,"Lasting novels don’t come from literature departments. Successful businesses don’t come from business schools. Scientific revolutions don’t come from research universities.Get your education, then get moving. Find the loners tinkering at the edge.",[],[],[],149,10188,2562
1141740790542213121,tweet,1561047191000,WSJ,3108351,5hirish,1141740790542213121,"Slack shares open at $38.50 in their trading debut, above $26 reference price and giving the company a valuation of about $23.2 billionhttps://on.wsj.com/2Xmkitn",['https://t.co/uo7yCGqSmC'],[],[],3,76,48
1141511813709717504,tweet,1560992599000,quocleix,989251872107085824,5hirish,1141511813709717504,"XLNet: a new pretraining method for NLP that significantly improves upon BERT on 20 tasks (e.g., SQuAD, GLUE, RACE)arxiv: https://arxiv.org/abs/1906.08237 github (code + pretrained models): https://github.com/zihangdai/xlnet with Zhilin Yang, @ZihangDai, Yiming Yang, Jaime Carbonell, @rsalakhupic.twitter.com/JboOekUVPQ","['https://t.co/C1tFMwZvyW', 'https://t.co/kI4jsVzT1u', 'https://t.co/JboOekUVPQ']",[],"['@ZihangDai', '@rsalakhu']",21,1763,715
1141736965311569920,tweet,1561046279000,justinkan,28917111,5hirish,1141736965311569920,"One of the most important skills I’ve built is the ability to sit with discomfort. Being able to be uncomfortable (bored, on the receiving end of anger, in pain) and not needing to escape has changed my happiness and my life. It would have seemed impossible to me 12 months ago.",[],[],[],38,1578,242
....

Fetch Advance Search Tweets

from tweetscrape.search_tweets import TweetScrapperSearch

tweet_scrapper = TweetScrapperSearch(search_from_accounts="@CNN", search_till_date="2016-04-01", search_since_date="2015-11-01", num_tweets=40, tweet_dump_path='twitter.csv', tweet_dump_format='csv')
tweet_count, tweet_id, tweet_time, dump_path = tweet_scrapper.get_search_tweets()
print("Extracted {0} tweets till {1} at {2}".format(tweet_count, tweet_time, dump_path))

The TweetScrapperSearch class scrapes the tweets based on the hashtags or a search term or any of the advanced filters of Twitter search Twitter Search Advance. It requires multiple parameters, based on the filters you might want to search with. Based on these filters a query is constructed and searched on Twitter. It also requires the number of tweets to scrape, the file path to dump the data and the data export format, which can be JSON or CSV. You can even add proxy to the scraper via request_proxies parameter.

The get_search_tweets() method returns the count of tweets, last extracted tweet id, last extracted tweet time and the file export path of extracted tweets.

Following are the supported search filters (parameters) for TweetScrapperSearch class.

search_all: Search all of these words. eg. Avengers Infinity War
search_exact: Search this exact phrase. eg. Avengers
search_any: Search any of these words. eg. SpiderMan Thor Hulk
search_excludes: Search excluding these words. eg. AntMan
search_hashtags: Search these hashtags. eg. #avengers #endgame
search_from_accounts: Search tweets from these accounts. eg. @marvel @avengers
search_to_accounts: Search tweets to these accounts. eg. @marvel
search_mentions: Search tweets mentioning these accounts. eg. @avengers
search_near_place: Search tweets near this place. eg. New York
search_till_date: Search tweets until this date: YYYY-MM-DD eg. 2019-01-01
search_since_date: Search tweets since this date: YYYY-MM-DD. eg. 2018-11-01
language: Search tweets in language from language codes. eg. 'en' for English

Fetch User Tweet thread Tweets

from tweetscrape.conversation_tweets import TweetScrapperConversation

tweet_scrapper = TweetScrapperConversation("ewarren", 1146415363460141057, 40, 'twitter.csv', 'csv')
tweet_count, tweet_id, tweet_time, dump_path = tweet_scrapper.get_thread_tweets()
print("Extracted {0} tweets till {1} at {2}".format(tweet_count, tweet_time, dump_path))

The TweetScrapperConversation class scrapes the tweets from a tweet tread or conversation. It requires two parameters, the Twitter username of the user who tweeted the original tweet and the id of the tweet. It also requires the number of tweets to scrape, the file path to dump the data and the data export format, which can be JSON or CSV. You can even add proxy to the scraper via request_proxies parameter.

The get_thread_tweets() method returns the count of tweets, last extracted tweet id, last extracted tweet time and the file export path of extracted tweets.

Fetch User stats

from tweetscrape.users_scrape import TweetScrapperUser

ts = TweetScrapperUser("5hirish")
user_info = ts.get_profile_info()

The TweetScrapperUser class scrapes the stats of the user from Twitter. It requires one parameter, the Twitter username of the user. You can even add proxy to the scraper via request_proxies parameter.

The get_profile_info() method returns the JSON with user information.

{
 "username": "@5hirish", 
 "name": "Shirish Kadam", 
 "bio": "Building @alleviate_hq #SaaS #NLProc  #MachineLearning #DataScience\nAutomating Automation\nhttp://5hirish.com",
 "location": "Bengaluru, India", 
 "location_id": "1b8680cd52a711cb", 
 "url": "http://www.shirishkadam.com", 
 "tweets": "2619", 
 "following": "694", 
 "followers": "243", 
 "favorites": "8112"
}

Extracted Tweets data model

Method	Description
`get_tweet_id()`	Fetches the item unique identifier
`get_tweet_type()`	The type of the item eg. `tweet`
`get_tweet_author()`	Twitter username of the author
`get_tweet_author_id()`	Unique identifier of the author
`get_tweet_time_ms()`	Tweet time in milliseconds
`get_tweet_text()`	Original tweet text
`get_tweet_links()`	Extracted external links from the tweet
`get_tweet_hashtags()`	Extracted hashtags from the tweet
`get_tweet_mentions()`	Mentioned Twitter users in the tweet
`get_tweet_replies_count()`	Total count of replies on the tweet
`get_tweet_favorite_count()`	Total count of favorites on the tweet
`get_tweet_retweet_count()`	Total count of retweets of the tweet

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

USAGE.md

USAGE.md

Usage

Fetch User Profile Tweets

Fetch Advance Search Tweets

Fetch User Tweet thread Tweets

Fetch User stats

Extracted Tweets data model

Files

USAGE.md

Latest commit

History

USAGE.md

File metadata and controls

Usage

Fetch User Profile Tweets

Fetch Advance Search Tweets

Fetch User Tweet thread Tweets

Fetch User stats

Extracted Tweets data model