Finetuning-BERT-For-classification-of-text

Dataset: classification of Quora questions as sincere or insincere

An insincere question is defined as a question intended to make a statement rather than look for helpful answers. Some characteristics that can signify that a question is insincere:

Has a non-neutral tone
Has an exaggerated tone to underscore a point about a group of people
Is rhetorical and meant to imply a statement about a group of people
Is disparaging or inflammatory
Suggests a discriminatory idea against a protected class of people, or seeks confirmation of a stereotype
Makes disparaging attacks/insults against a specific person or group of people
Based on an outlandish premise about a group of people
Disparages against a characteristic that is not fixable and not measurable
Isn't grounded in reality
Based on false information, or contains absurd assumptions
Uses sexual content (incest, bestiality, pedophilia) for shock value, and not to seek genuine answers

The training data includes the question that was asked, and whether it was identified as insincere (target = 1). The ground-truth labels contain some amount of noise: they are not guaranteed to be perfect.

Data fields

qid - unique question identifier
question_text - Quora question text
target - a question labeled "insincere" has a value of 1, otherwise 0
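As a minimal sketch of how these three fields can be read, the snippet below parses a CSV with the standard library. The function name `load_questions` and the two sample rows are hypothetical and only illustrate the layout; they are not taken from the actual dataset.

```python
import csv
import io


def load_questions(fileobj):
    """Read rows with the qid, question_text and target columns.

    Returns a list of (qid, question_text, target) tuples, with target
    converted to an int (1 = insincere, 0 = sincere).
    """
    reader = csv.DictReader(fileobj)
    rows = []
    for record in reader:
        rows.append(
            (record["qid"], record["question_text"], int(record["target"]))
        )
    return rows


# Hypothetical rows, made up for illustration only.
SAMPLE_CSV = (
    "qid,question_text,target\n"
    'a1,"How do I learn Python?",0\n'
    'b2,"Why is everyone from group X so Y?",1\n'
)

rows = load_questions(io.StringIO(SAMPLE_CSV))
```

The same function works on a real file by passing `open(path, newline="", encoding="utf-8")` instead of the in-memory string.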

size of the dataset

The dataset contains around 13 lakh (1.3 million) questions, of which about 93% are sincere and about 7% are insincere.

0=>sincere
1=>insincere
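With a split this uneven, it can help to check the label shares before training. The sketch below computes them and also derives inverse-frequency class weights, one common way to compensate for imbalance. Whether this project uses class weighting is not stated here, so treat the weighting as an assumption; the function names are hypothetical and the 93/7 label list is synthetic.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of examples belonging to each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count).

    Rare classes get larger weights, so their errors count for more
    in a weighted loss.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


# Synthetic labels mirroring the approximate 93% / 7% split.
labels = [0] * 93 + [1] * 7
shares = class_shares(labels)
weights = inverse_frequency_weights(labels)
```

Under this scheme the insincere class receives roughly 13 times the weight of the sincere class.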

sample of dataset

model

This project builds a classifier on top of BERT, a transformer-based model that encodes text bidirectionally and so captures contextual relationships in language. The BERT encoder is combined with Artificial Neural Network (ANN) layers and dropout regularization to improve its predictions and reduce overfitting.
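The combination of a BERT encoder with dense layers and dropout could look roughly like the PyTorch sketch below. It is not the repository's actual architecture: the class name `BertClassifier`, the layer width, the dropout rate, the use of the pooled output, and the `bert-base-uncased` checkpoint are all assumptions made for illustration.

```python
import torch
from torch import nn
from transformers import BertModel


class BertClassifier(nn.Module):
    """BERT encoder followed by a small dense head with dropout."""

    def __init__(self, encoder, hidden_units=128, dropout=0.2):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(encoder.config.hidden_size, hidden_units),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_units, 1),  # one logit: insincere vs. sincere
        )

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        )
        # pooler_output has shape (batch, hidden_size)
        return self.head(outputs.pooler_output).squeeze(-1)


def build_classifier(pretrained_name="bert-base-uncased"):
    """Load pre-trained weights (assumed checkpoint) and attach the head."""
    return BertClassifier(BertModel.from_pretrained(pretrained_name))
```

The model returns raw logits, so it would typically be trained with `torch.nn.BCEWithLogitsLoss` and passed through a sigmoid at prediction time.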

We also use transfer learning and fine-tuning. Transfer learning reuses the knowledge captured by a pre-trained model such as BERT for our specific task, which shortens training and can improve performance. Fine-tuning goes a step further: the parameters of the pre-trained BERT model are themselves updated on our labeled data, so the encoder adapts to the sincere/insincere classification task.
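The difference between the two modes can be sketched in PyTorch as follows: freezing the encoder trains only the new layers on top of fixed pre-trained features, while unfreezing it fine-tunes the whole network. The snippet assumes a model exposing `encoder` and `head` attributes (hypothetical names); `TinyModel` is a stand-in so the example runs without downloading BERT, and the learning rates are illustrative assumptions rather than the project's settings.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Stand-in for a pre-trained encoder plus a new classification head."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 4)  # plays the role of BERT
        self.head = nn.Linear(4, 1)     # plays the role of the ANN layers


def set_encoder_trainable(model, trainable):
    """Freeze (False) or unfreeze (True) every encoder parameter."""
    for param in model.encoder.parameters():
        param.requires_grad = trainable


def make_optimizer(model, encoder_lr=2e-5, head_lr=1e-3):
    """Use a small step size for pre-trained weights, a larger one for the head."""
    return torch.optim.AdamW(
        [
            {"params": model.encoder.parameters(), "lr": encoder_lr},
            {"params": model.head.parameters(), "lr": head_lr},
        ]
    )


model = TinyModel()
set_encoder_trainable(model, False)  # feature extraction: only the head learns
set_encoder_trainable(model, True)   # fine-tuning: encoder weights update too
optimizer = make_optimizer(model)
```

A small encoder learning rate is a common choice when fine-tuning, since large updates can overwrite what the pre-trained model already learned.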

model architecture

demo of application

Krish-2505/Finetuning-BERT-For-classification-of-text
