- Sentimental Big Brother
- Table of contents
- Description
- Second round
- First Round
- How to run as a module
- Contributors
In 2021, according to the Democracy Index published by the Economist Intelligence Unit, France was ranked as a flawed democracy.
Our democracies have been going digital for several years, and a growing share of the public debate now plays out on social networks. During election periods, televised debates are supervised by ARCOM (ex-CSA), but debates on social networks still escape clear oversight, notably because there are no metrics characterizing the issues that run through them.
As citizens and as students in artificial intelligence, we feel the need to put at the service of our democracy tools that help decipher part of the political debate now taking place on Twitter.
To this end, we are currently studying the sentiment of the Twittersphere toward the different candidates over time.
Thank you for your attention to our work. Any comments and help are welcome.
The order of the candidates is random.
poetry run python -m src --argument
poetry run python -m src data --download aclImdb
Tweets can be downloaded from Twitter; a candidate must be mentioned:
poetry run python -m src data --download twitter --mention [candidat]
[candidat] must be one of ["Pecresse", "Zemmour", "Dupont-Aignan", "Melenchon", "Le Pen", "Lassalle", "Hidalgo", "Macron", "Jadot", "Roussel", "Arthaud", "Poutou"]
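The restriction on the candidate name can be expressed with `argparse` choices. The following is a minimal sketch, not the project's actual parser: `build_parser` is a hypothetical helper, and only the flag names and the candidate list are taken from this README.

```python
import argparse

# Candidate names copied from the list above.
CANDIDATES = [
    "Pecresse", "Zemmour", "Dupont-Aignan", "Melenchon", "Le Pen", "Lassalle",
    "Hidalgo", "Macron", "Jadot", "Roussel", "Arthaud", "Poutou",
]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring `data --download twitter --mention ...`."""
    parser = argparse.ArgumentParser(prog="src")
    subparsers = parser.add_subparsers(dest="command", required=True)
    data = subparsers.add_parser("data")
    data.add_argument("--download", choices=["aclImdb", "twitter"], required=True)
    # argparse rejects any value that is not in CANDIDATES.
    data.add_argument("--mention", choices=CANDIDATES)
    return parser


args = build_parser().parse_args(
    ["data", "--download", "twitter", "--mention", "Le Pen"]
)
print(args.mention)
```

On the command line, a name containing a space such as "Le Pen" would need to be quoted so that the shell passes it as a single argument.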
Several more parameters are available:
- --text: text to search for in the tweets, e.g. --text retraite
- --start_time: date from which to start collecting tweets (must follow the format YYYY-mm-DD HH:MM; HH and MM are optional)
- --end_time: date until which to collect tweets (must follow the format YYYY-mm-DD HH:MM; HH and MM are optional)
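The date format accepted by `--start_time` and `--end_time` could be parsed along the following lines. This is a sketch under assumptions: `parse_time` is a hypothetical helper, and treating a missing hour or minute as zero is a guess about the intended behaviour, not something this README states.

```python
from datetime import datetime

# Accepted layouts, from most to least specific.
_FORMATS = ("%Y-%m-%d %H:%M", "%Y-%m-%d %H", "%Y-%m-%d")


def parse_time(value: str) -> datetime:
    """Parse 'YYYY-mm-DD', 'YYYY-mm-DD HH' or 'YYYY-mm-DD HH:MM'."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next, less specific layout
    raise ValueError(f"expected YYYY-mm-DD [HH[:MM]], got {value!r}")


print(parse_time("2022-03-01"))
print(parse_time("2022-03-01 08:30"))
```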
The datasets collected from Twitter are saved to the file data/raw/[candidat]/twitter_{mention}_{start_time}_{end_time}.csv.
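As an illustration of the naming scheme, the output path can be assembled with `pathlib`. `raw_tweet_path` is a hypothetical helper; it assumes that the directory name and `{mention}` are the same candidate name and that the time values are inserted verbatim, which the project may handle differently.

```python
from pathlib import Path


def raw_tweet_path(mention: str, start_time: str, end_time: str) -> Path:
    """Build data/raw/<mention>/twitter_<mention>_<start>_<end>.csv."""
    # Assumption: the time strings are used as given, without reformatting.
    filename = f"twitter_{mention}_{start_time}_{end_time}.csv"
    return Path("data/raw") / mention / filename


print(raw_tweet_path("Macron", "2022-03-01", "2022-03-02").as_posix())
```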
The following command applies a model to a given .csv file, or recursively to all .csv files in a directory. The path is relative to the data/raw directory.
poetry run python -m src features --model [model_name] --data [path_relative_to_data_raw]
The model name must be one of ["random", "naive_bayes", "twitter-xlm-roberta-base-sentiment"]; the default is "twitter-xlm-roberta-base-sentiment".
The output of the model is added as new columns and saved to a .csv file with the same path and name, but relative to the data/processed directory.
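The mapping from data/raw to data/processed described above can be sketched as follows. `iter_csv_pairs` is a hypothetical helper, not the project's code; it only shows how a file or directory path relative to data/raw could be expanded recursively and mirrored under data/processed.

```python
from pathlib import Path
from typing import Iterator, Tuple

RAW_DIR = Path("data/raw")
PROCESSED_DIR = Path("data/processed")


def iter_csv_pairs(
    relative: str,
    raw_dir: Path = RAW_DIR,
    processed_dir: Path = PROCESSED_DIR,
) -> Iterator[Tuple[Path, Path]]:
    """Yield (input, output) paths for one .csv file or a whole directory."""
    source = raw_dir / relative
    if source.is_file():
        files = [source]
    else:
        # Recurse into sub-directories and keep only .csv files.
        files = sorted(source.rglob("*.csv"))
    for csv_file in files:
        # Same relative path and name, rooted at the processed directory.
        yield csv_file, processed_dir / csv_file.relative_to(raw_dir)
```

A caller would typically create `output.parent` with `mkdir(parents=True, exist_ok=True)` before writing each processed file.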