message-translation

An assistive writing tool to analyze linguistic and cultural variation across communities

Environment Setup

Create and activate a conda environment, then install the dependencies:

conda create -n message python=3.8
conda activate message
pip install -r requirements.txt

Dataset

Follow the instructions from the public BLM Twitter dataset to download the tweets, using our filtered tweet IDs to generate a smaller dataset of ~200K pro-BLM tweets and ~100K anti-BLM tweets. The preprocessing code and data are here. Then move the dataset to ./data/blm_alm/raw/ so that the directory contains the following two files: pro_blm_200k.txt and anti_blm_100k.txt.
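
Before running the analysis, it can help to confirm that both files are in place. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes each file stores one tweet per line, which you should verify against the preprocessing code.

```python
from pathlib import Path
from typing import Dict

# Directory and file names come from the Dataset instructions above.
RAW_DIR = Path("./data/blm_alm/raw")
EXPECTED_FILES = ("pro_blm_200k.txt", "anti_blm_100k.txt")


def count_nonempty_lines(path: Path) -> int:
    """Count non-blank lines (assumption: one tweet per line)."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def check_dataset(raw_dir: Path = RAW_DIR) -> Dict[str, int]:
    """Return {file name: non-blank line count}; raise if a file is missing."""
    missing = [name for name in EXPECTED_FILES if not (raw_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {raw_dir}: {', '.join(missing)}")
    return {name: count_nonempty_lines(raw_dir / name) for name in EXPECTED_FILES}
```

Calling check_dataset() from the repository root should report counts near 200K and 100K if the download and filtering worked as described. Counts may be lower if some tweets are no longer available.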

Semantic Shift Analysis

cd semantic_shift
# download BERTweet to your local machine
python download_bertweet.py
sh ./bash_scripts/compute_semantic_shifts.sh

Check the notebook to see the analysis.
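
For readers new to the idea, one common way to score the semantic shift of a word between two communities is the cosine distance between the averaged contextual embeddings of its usages in each community. The sketch below illustrates only that general idea. It is an assumption for illustration and not necessarily the measure that compute_semantic_shifts.sh computes; the function names are hypothetical, and toy vectors stand in for BERTweet embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one usage vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def prototype_shift(usages_a: List[Vector], usages_b: List[Vector]) -> float:
    """Shift score: distance between the mean usage vectors of two corpora."""
    return cosine_distance(mean_vector(usages_a), mean_vector(usages_b))


# Toy example: hypothetical 2-d "embeddings" of one word in two communities.
pro_usages = [[1.0, 0.0], [0.9, 0.1]]
anti_usages = [[0.1, 0.9], [0.0, 1.0]]
score = prototype_shift(pro_usages, anti_usages)
print(f"shift score: {score:.3f}")
```

A score near 0 suggests the word is used similarly in both communities, while larger scores suggest diverging usage.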

Cultural and Ideological Analysis

cd ideology-alignment
sh train_script.sh

Check the notebook to see the analysis.

Acknowledgement

This repository builds on UiO-UvA at SemEval-2020 Task 1 and Aligning Multidimensional Worldviews and Discovering Ideological Differences.
