tweet-sentiment

ESC403 Introduction to Data Science Project


Description

In this project, we analyse a Kaggle dataset of more than 27,000 tweets, each labelled with a sentiment (negative, neutral, or positive), and explore several methods for sentiment analysis.

Motivation

Through this project we hope to gain experience with Natural Language Processing and GitHub, as well as practise what we have learned in the course.

Results

| Method | Accuracy |
| --- | --- |
| Random forest (bootstrap=False) | 0.7005 |
| DistilBERT (3 epochs) | 0.7845 |
| DistilBERT (2 epochs) | 0.7890 |
| BERT | 0.7903 |
| Tom | 0.6415 |
| Jessica | |
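
The rows named after people presumably report human-level accuracy, which the Tasks section says was measured with a short script. That script is not shown here, so the following is only a minimal sketch of how such a test could work: it shows tweets to a person, collects their guesses, and scores them against the dataset labels. The function names (`accuracy`, `human_accuracy`), the `ask` parameter, and the demo tweets are all assumptions made for illustration.

```python
import random

LABELS = ("negative", "neutral", "positive")


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one label")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def human_accuracy(samples, ask=input, n=None, seed=0):
    """Show tweets to a person, collect their guesses, and score them.

    samples: list of (tweet_text, true_label) pairs.
    ask:     function that takes a prompt and returns the typed answer
             (defaults to input; can be replaced for testing).
    n:       number of tweets to show; None means all of them.
    """
    rng = random.Random(seed)
    chosen = list(samples)
    rng.shuffle(chosen)  # avoid showing tweets in dataset order
    if n is not None:
        chosen = chosen[:n]

    guesses = []
    for text, _ in chosen:
        answer = ""
        # Keep asking until the answer is one of the three labels.
        while answer not in LABELS:
            answer = ask(f"{text}\n[{'/'.join(LABELS)}]: ").strip().lower()
        guesses.append(answer)

    return accuracy([label for _, label in chosen], guesses)


if __name__ == "__main__":
    # Hypothetical tweets for demonstration only; not taken from the dataset.
    demo = [
        ("what a lovely morning", "positive"),
        ("on the bus", "neutral"),
        ("my phone broke again", "negative"),
    ]
    print(f"Human accuracy: {human_accuracy(demo):.4f}")
```

Passing `ask` as a parameter keeps the scoring logic testable without a person at the keyboard. A real run would sample from the same test split used for the models, so that the human and model accuracies are comparable.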

Tasks:

Tom - Finish BERT (possibly add another model, if time left), write a short script to test human-level accuracy - DONE

Plot all methods in one graph and create the slides (possibly discuss over Skype first). Maybe one slide could be a pros/cons table of all models.
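
As a quick stand-in for the planned graph, the accuracies from the Results table can be compared in a plain-text bar chart. This is only a sketch using the Python standard library, not the project's plotting code; a plotting library would be the natural choice for the slides. The `RESULTS` dictionary and the `text_bar_chart` helper are illustrative names, and Jessica is omitted because no accuracy is reported for her.

```python
# Accuracies copied from the Results table in this README.
RESULTS = {
    "Random forest (bootstrap=False)": 0.7005,
    "DistilBERT (3 epochs)": 0.7845,
    "DistilBERT (2 epochs)": 0.7890,
    "BERT": 0.7903,
    "Tom": 0.6415,
}


def text_bar_chart(results, width=40):
    """Return one line per method, best first, with a bar scaled to `width`."""
    name_width = max(len(name) for name in results)
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for name, acc in ranked:
        bar = "#" * round(acc * width)  # accuracy 1.0 would fill the full width
        lines.append(f"{name:<{name_width}}  {bar} {acc:.4f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_bar_chart(RESULTS))
```

Sorting by accuracy puts the strongest method on the first line, which makes the gap between the models and the human baseline easy to see.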