Included a README file to ensure that anyone interacting with the project can quickly understand and effectively use the project.
# Irony/Sarcasm detection on Twitter data

This project aims to detect irony and sarcasm in tweets using various machine learning models. Irony and sarcasm are forms of speech in which the intended meaning differs from the literal meaning, which makes detecting them a challenging task for natural language processing.
## Goal

The aim of this project is to develop and evaluate models capable of accurately detecting irony and sarcasm in tweets. This includes comparing the performance of traditional machine learning models to identify the most effective approach.
## Methodology

We combine exploratory data analysis (EDA) with machine learning algorithms to find patterns and correlations in the tweets. Key steps include data cleaning, feature engineering, and visualization.
## Data Preprocessing

Data preprocessing steps include:

1. Stop words removal
2. Lemmatization/Stemming
3. Vectorization using TF-IDF
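The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the example tweets are made up, `preprocess` and `simple_stem` are hypothetical helpers, and `simple_stem` is only a crude suffix stripper standing in for a real stemmer or lemmatizer. The stop word list and `TfidfVectorizer` come from scikit-learn.

```python
import re

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer


def simple_stem(token: str) -> str:
    """Crude suffix stripping; a hypothetical stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "ly", "s"):
        # Only strip when at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet: str) -> str:
    # Lowercase and keep alphabetic tokens only (drops digits, punctuation, and emoji).
    tokens = re.findall(r"[a-z]+", tweet.lower())
    # Step 1: stop words removal, using scikit-learn's built-in English list.
    tokens = [t for t in tokens if t not in ENGLISH_STOP_WORDS]
    # Step 2: stemming (simplified, see simple_stem above).
    return " ".join(simple_stem(t) for t in tokens)


# Made-up example tweets, for illustration only.
tweets = [
    "Oh great, another Monday morning meeting!",
    "I really enjoyed the concert last night.",
]

cleaned = [preprocess(t) for t in tweets]

# Step 3: TF-IDF vectorization; X is a sparse matrix with one row per tweet.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(cleaned)
```

In practice the stemming step would more likely use an established stemmer or lemmatizer; the simplified version here only keeps the sketch dependency-free beyond scikit-learn.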
## Models Utilized

1. Logistic Regression
2. Random Forest Regressor
3. Multinomial Naive Bayes
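As a rough sketch of how these three model families could be trained on TF-IDF features with scikit-learn, consider the example below. The texts and labels are hypothetical toy data, and the hyperparameters are arbitrary. It also assumes `RandomForestClassifier` for the random forest, since the labels here are categorical; the project itself may have used a different estimator or configuration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 1 = ironic/sarcastic, 0 = literal.
texts = [
    "Oh great, my flight is delayed again, just perfect",
    "I love being stuck in traffic for two hours",
    "Wonderful, the printer broke right before the deadline",
    "Nothing better than a dead phone battery on a long trip",
    "The weather is sunny and warm today",
    "The concert last night was really good",
    "Our team won the match this weekend",
    "The new cafe downtown serves tasty coffee",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Assumed settings; none of these values come from the project.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Multinomial Naive Bayes": MultinomialNB(),
}

fitted = {}
for name, model in models.items():
    # Each pipeline vectorizes the raw text with TF-IDF, then fits the classifier.
    pipeline = make_pipeline(TfidfVectorizer(stop_words="english"), model)
    pipeline.fit(texts, labels)
    fitted[name] = pipeline

# Predict the label of an unseen (made-up) tweet with every fitted model.
new_tweet = ["Wow, I just love waiting in line for hours"]
predictions = {name: int(p.predict(new_tweet)[0]) for name, p in fitted.items()}
```

Wrapping the vectorizer and the classifier in one pipeline keeps the TF-IDF vocabulary tied to the training data, so the same transformation is applied when predicting on new tweets.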
## Libraries Used

1. numpy: For efficient numerical operations
2. pandas: For data manipulation and analysis
3. seaborn: For visually appealing statistical graphics
4. matplotlib: For comprehensive data visualization
5. scikit-learn (sklearn): For implementing machine learning algorithms
## Results

Accuracy of each model:

1. Logistic Regression: 97%
2. Random Forest Regressor: 97%
3. Multinomial Naive Bayes: 95%
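Assuming the percentages above are classification accuracy, that is, the share of tweets whose predicted label matches the true label, the metric can be computed as in the short sketch below. The label lists are made up for illustration and do not reproduce the reported figures.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels (1 = ironic/sarcastic, 0 = literal).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

# scikit-learn's accuracy: fraction of positions where the labels agree.
accuracy = accuracy_score(y_true, y_pred)

# The same quantity computed by hand: correct predictions / total predictions.
manual = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here seven of the eight predictions match, so both `accuracy` and `manual` equal 0.875.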
## Conclusion

In our experiments, the Logistic Regression and Random Forest Regressor models achieved the highest predictive accuracy for irony/sarcasm detection (97% each), ahead of Multinomial Naive Bayes (95%).