Update README.md

aldengolab · Feb 26, 2017 · ba25225
1 parent c94b0de
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -20,7 +20,7 @@ This project is using a [dataset published by Signal Media](http://research.sign
 
 From the raw article text, we generate the following features:
 
-1. Vectorized bigram Term Frequency-Inverse Document Frequency, with preprocessing to strip out named entities (people, places etc.) and replace them with anonymous placeholders (e.g. "Donald Trump" --> "<PERSON>"). We use Spacy for tokenization and entity recognition, and SkLearn for TFIDF vectorization.
+1. Vectorized bigram Term Frequency-Inverse Document Frequency, with preprocessing to strip out named entities (people, places etc.) and replace them with anonymous placeholders (e.g. "Donald Trump" --> "-PERSON-"). We use Spacy for tokenization and entity recognition, and SkLearn for TFIDF vectorization.
 2. Normalized frequency of parsed syntacical dependencies. Again, we use Spacy for parsing and SkLearn for vectorization. Here is an [excellent interactive visualization](https://demos.explosion.ai/displacy/) of Spacy's dependency parser.
 
 ## Pipeline
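
The placeholder substitution described in feature 1 can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the `anonymize` helper, the `Span` type, and the hard-coded character offsets are all hypothetical, standing in for whatever entity spans the Spacy pipeline returns. It uses the `-LABEL-` placeholder form that this commit introduces in the README.

```python
from typing import List, Tuple

# (start, end, label) character offsets. Hypothetical stand-in for the
# entity spans a named-entity recognizer would produce.
Span = Tuple[int, int, str]


def anonymize(text: str, spans: List[Span]) -> str:
    """Replace each entity span with a -LABEL- placeholder.

    Assumes the spans do not overlap.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])  # text before the entity
        pieces.append(f"-{label}-")        # anonymous placeholder
        cursor = end
    pieces.append(text[cursor:])           # trailing text after the last entity
    return "".join(pieces)


text = "Donald Trump spoke in Chicago."
# Offsets written by hand for this example; labels are illustrative.
spans = [(0, 12, "PERSON"), (22, 29, "GPE")]
anonymized = anonymize(text, spans)
print(anonymized)
```

The anonymized text would then be passed to the TF-IDF vectorizer in place of the raw article text, so that bigrams are built over placeholders rather than specific names.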