This repository contains a homework assignment that we implemented for the AI course. It focuses on a Naive Bayes classifier for sentiment classification with three labels: Positive, Negative, and Neutral. The assignment included training, evaluation, and test subsets of the dataset.
This homework assignment aimed to help students understand and implement a Naive Bayes classifier for sentiment analysis. The task was to classify text data into three sentiment labels: Positive, Negative, and Neutral. We provided students with the training and evaluation subsets of the dataset and expected them to implement a Naive Bayes classifier with Laplace smoothing.
The dataset was divided into three subsets:
- Train Set: Used for training the model.
- Eval Set: Used for evaluating the model during development.
- Test Set: Used by TAs to evaluate the final submissions after the deadline.
The homework PDF we provided explained the structure of a Naive Bayes binary classifier. Students were required to extend this structure to handle three labels. The key components included:
- Tokenization: Breaking down text into individual words or tokens.
- Probability Calculation: Calculating the probability of each word given a class.
- Classification: Using the calculated probabilities to classify new text data.
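The three components above can be sketched roughly as follows. This is a minimal illustration, not the reference solution: the function names (`tokenize`, `train`, `predict`), the regex-based tokenizer, the choice to skip out-of-vocabulary tokens, the add-one default (`alpha=1.0`), and the toy sentences are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(examples, alpha=1.0):
    """Collect counts from (text, label) pairs; alpha=1.0 is add-one smoothing."""
    doc_counts = Counter()              # number of documents per label
    word_counts = defaultdict(Counter)  # token counts per label
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return {"doc_counts": doc_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    scores = {}
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + alpha * vocab_size
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok in model["vocab"]:  # assumption: ignore unseen tokens
                score += math.log((counts[tok] + alpha) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data, invented for illustration only (not from the course dataset).
examples = [
    ("I love this movie", "Positive"),
    ("great and wonderful film", "Positive"),
    ("I hate this movie", "Negative"),
    ("terrible and awful film", "Negative"),
    ("the movie was released today", "Neutral"),
    ("the film runs two hours", "Neutral"),
]
model = train(examples)
print(predict(model, "love wonderful"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and nothing in the sketch is specific to two labels, so the same loop covers Positive, Negative, and Neutral.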
Students needed to adapt the Naive Bayes classifier to handle three sentiment labels:
- Positive
- Negative
- Neutral
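One way to write the resulting three-class decision rule is shown below. The notation is an assumption for illustration and is not taken from the homework PDF: $w_1, \dots, w_n$ are the tokens of the input text, $V$ is the training vocabulary, $\mathrm{count}(w, c)$ is the number of times token $w$ appears in training texts labeled $c$, and add-one smoothing is assumed.

$$
\hat{c} = \arg\max_{c \in \{\text{Positive},\, \text{Negative},\, \text{Neutral}\}} \left[ \log P(c) + \sum_{i=1}^{n} \log \frac{\mathrm{count}(w_i, c) + 1}{\sum_{w \in V} \mathrm{count}(w, c) + |V|} \right]
$$

Compared with the binary case, the only structural change is that the maximum is taken over three class scores instead of comparing two.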
After the submission deadline, we evaluated the submitted models on the Test Set.