Comparing Twitter data and survey data: a topic modeling and sentiment analysis approach

by Felipe Alamos and Bhargavi Ganesh

Goal of study

The goal of this study is to analyze whether Twitter data is a good proxy for public opinion and, in particular, whether tweets reveal the same information as traditional surveys. A full report of the project can be found in the file 'Is Twitter a Proxy for Public Opinion_Alamos-Ganesh.pdf'.

Files in repo

download_tweets folder

download_tweets.py: Python script that downloads tweets containing words from a given array of keywords. To call it, pass the arguments max_tweets, keyword_1, keyword_2, etc. Ex: python3 download_tweets.py 100 '#environment' environment (quote any keyword that starts with #, because most shells treat an unquoted # as the start of a comment).
config.yaml: configuration file specifying additional parameters for downloading tweets (countries, dates and search radius).
downloaded_tweets folder: contains the .csv files with the downloaded tweets.
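
The call convention described above (max_tweets first, then one or more keywords) could be parsed roughly as follows. This is a minimal sketch and not the code in download_tweets.py: the function name parse_args and its validation rules are assumptions made for illustration.

```python
def parse_args(argv):
    """Split command-line arguments into (max_tweets, keywords).

    Hypothetical helper: argv is expected to look like
    ["100", "#environment", "environment"], i.e. sys.argv[1:].
    """
    if len(argv) < 2:
        raise ValueError(
            "usage: download_tweets.py max_tweets keyword_1 [keyword_2 ...]"
        )
    max_tweets = int(argv[0])  # raises ValueError if not an integer
    if max_tweets <= 0:
        raise ValueError("max_tweets must be a positive integer")
    keywords = list(argv[1:])  # every remaining argument is a keyword
    return max_tweets, keywords


# Equivalent to: python3 download_tweets.py 100 '#environment' environment
print(parse_args(["100", "#environment", "environment"]))
# prints (100, ['#environment', 'environment'])
```

In a real script the list would come from sys.argv[1:]. If the hashtag is left unquoted, the shell drops it and everything after it, so only 100 would reach the script.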

data_cleaning folder

data_cleaning_environment.R and data_cleaning_climate_change.R are two R scripts that read the raw .csv files of downloaded tweets, clean them and create new files with the corpus.
corpus_df_#environment-usa.csv and corpus_df_#climatechange-usa.csv are the two files generated by the previous scripts. The former will be used for topic modelling and the latter for sentiment analysis.
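
To give an idea of what cleaning a tweet usually involves, here is a minimal sketch in Python. It is not a port of the R scripts above: the specific steps (lowercasing, removing URLs and mentions, dropping the # sign, stripping non-letters) and the function name clean_tweet are assumptions about typical tweet preprocessing, and the actual scripts may do more or less.

```python
import re

# Patterns for typical tweet noise (assumed, not taken from the R scripts).
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Return a lowercased tweet without URLs, mentions or punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)         # drop links
    text = MENTION_RE.sub(" ", text)     # drop @user mentions
    text = text.replace("#", " ")        # keep the hashtag word, drop the sign
    text = NON_LETTER_RE.sub(" ", text)  # drop digits and punctuation
    return " ".join(text.split())        # collapse repeated whitespace


print(clean_tweet("Save the #Environment! https://t.co/abc @user"))
# prints save the environment
```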

topic_modeling folder

topic_modeling.R: R script that runs the topic modelling analysis on the tweet corpora.
plots folder: contains plots and images from the topic modelling.

sentiment_analysis folder

sentiment_analysis.R: main script that runs the sentiment analysis on the tweets.
vader_final.py: Python script that performs sentiment analysis with the VADER lexicon.
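
As an illustration of VADER-based scoring, the sketch below assumes the vaderSentiment package (pip install vaderSentiment); vader_final.py may rely on a different implementation. The helper names label_from_compound and score_tweets are hypothetical, and the ±0.05 cut-offs are the conventional thresholds suggested in the VADER documentation, not necessarily the ones used in this project.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_tweets(tweets):
    """Return (text, compound, label) for each tweet.

    The import is done here so that label_from_compound can be used
    even when the vaderSentiment package is not installed.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    results = []
    for text in tweets:
        # polarity_scores returns a dict with neg, neu, pos and compound
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((text, compound, label_from_compound(compound)))
    return results
```

Calling score_tweets(["I love clean air", "Pollution is terrible"]) would return one tuple per tweet; the exact compound values depend on the installed lexicon version.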

eda_survey folder

eda_survey.R: script that plots a basic exploratory data analysis (EDA) of the survey responses, for both the environmental issues and the climate change questions.

Setup

Run pip install GetOldTweets3 to install the library used to download tweets.

References

We used this example as a general guideline to conduct topic modelling on twitter data.

Repository: fhalamos/topics-sentiments-twitter-data