Sentence Outlier Detection
This R presentation summarizes an experiment that tests two methods for detecting anomalous sentences in a text:
- Stahel-Donoho Estimator Distance
- The PCout method
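
As a rough illustration of the first method, the Stahel-Donoho outlyingness of an observation is its largest robust z-score (distance from the median, divided by the MAD) over all one-dimensional projections of the data. The sketch below approximates that supremum with random directions in base R. The function name `sd_outlyingness`, the parameters `n_dir` and `seed`, and the toy data are assumptions made for this example; they are not taken from the experiment.

```r
# Approximate Stahel-Donoho outlyingness (illustrative sketch, not the
# experiment's code): for each row of X, take the largest value of
# |a'x - median(a'X)| / MAD(a'X) over a set of random unit directions a.
sd_outlyingness <- function(X, n_dir = 500, seed = 1) {
  X <- as.matrix(X)
  set.seed(seed)                      # note: resets the global RNG state
  out <- rep(0, nrow(X))
  for (i in seq_len(n_dir)) {
    a <- rnorm(ncol(X))
    a <- a / sqrt(sum(a^2))           # random unit direction
    proj <- as.vector(X %*% a)        # project every observation onto a
    s <- mad(proj)                    # robust scale of the projection
    if (s > 0) {
      out <- pmax(out, abs(proj - median(proj)) / s)
    }
  }
  out
}

# Toy data (hypothetical): 100 bivariate normal points plus one planted outlier.
set.seed(42)
X <- rbind(matrix(rnorm(200), ncol = 2), c(8, 8))
scores <- sd_outlyingness(X)
which.max(scores)                     # index of the most outlying row
```

Rows with a large score are candidate outliers. Any cutoff applied to the scores would be a separate choice that this sketch does not make.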
Each sentence is characterized by several metrics, such as word count, syllable count, and parse tree depth.
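
A minimal sketch of how such per-sentence metrics could be collected into a data frame is shown below. The helpers `count_syllables` and `sentence_metrics` are hypothetical names, the syllable count is a crude vowel-group heuristic, and parse tree depth is left out because it needs an external parser. The mean word length column is an extra illustrative metric, not one named above.

```r
# Rough syllable estimate: count groups of consecutive vowels (heuristic only).
count_syllables <- function(word) {
  groups <- gregexpr("[aeiouy]+", tolower(word))[[1]]
  n <- sum(groups > 0)                # gregexpr returns -1 when nothing matches
  max(n, 1L)                          # every word counts as at least one syllable
}

# Build one row of metrics per sentence.
sentence_metrics <- function(sentences) {
  rows <- lapply(sentences, function(s) {
    words <- unlist(strsplit(s, "[^A-Za-z']+"))
    words <- words[nzchar(words)]     # drop empty tokens
    data.frame(
      n_words = length(words),
      n_syllables = sum(vapply(words, count_syllables, integer(1))),
      mean_word_length = mean(nchar(words))
    )
  })
  do.call(rbind, rows)
}

sentences <- c("The cat sat on the mat.", "Rain fell.")
metrics <- sentence_metrics(sentences)
print(metrics)
```

The resulting numeric table, with one row per sentence, is the kind of input a multivariate outlier detection method operates on.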