NLP-unique-words-feature-engineering

An initial feature engineering notebook for a Kaggle student essay scoring competition suggested, somewhat surprisingly, a negative correlation between the number of unique words and essay score. This notebook shows that misspelled words contributed significantly to that result. It also outlines how using a fixed number of words can prevent multicollinearity with other features.
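
The two ideas can be illustrated with a minimal sketch. It is not the notebook's code: the function names, the toy `vocabulary` set standing in for a real spell-check dictionary, and the `window` parameter are all hypothetical. The sketch also assumes that the collinear feature is something like total essay length, and that "a fixed number of words" means counting unique words only within the first `window` tokens of each essay.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def unique_word_features(text, vocabulary, window=50):
    """Return unique-word counts for one essay.

    `vocabulary` is a set of correctly spelled words (hypothetical stand-in
    for a dictionary); `window` is the fixed number of leading words used
    for the length-independent count.
    """
    tokens = tokenize(text)
    unique_all = set(tokens)
    # Any unique token missing from the vocabulary is treated as misspelled.
    misspelled = {t for t in unique_all if t not in vocabulary}
    return {
        "n_words": len(tokens),
        "n_unique": len(unique_all),
        "n_unique_misspelled": len(misspelled),
        "n_unique_correct": len(unique_all) - len(misspelled),
        # Counting over a fixed window means longer essays do not
        # automatically produce a larger value.
        "n_unique_in_window": len(set(tokens[:window])),
    }


# Toy example: each misspelling adds a "new" unique word.
vocabulary = {"the", "cat", "sat", "on", "mat"}
essay = "The cat sat on the mat. The catt satt on the matt."
features = unique_word_features(essay, vocabulary, window=6)
print(features)
```

In this toy essay the misspellings inflate the raw unique-word count, which is one way a weaker essay could end up with more unique words than a stronger one. Separating `n_unique_correct` from `n_unique_misspelled` makes that effect visible as its own feature.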

The notebook can be found here.