This program aims to help judicial experts detect suspicious messages. It takes as a parameter an XML file containing all the messages and outputs a new XML file containing those same messages, sorted from the most suspicious to the least suspicious. Suspicious messages are detected using French natural language processing.
First install Python 3 (see the Python 3 installation instructions).
Then install pip (see the pip installation instructions).
Then, from the root directory of this repo, run:
> pip install -r requirements.txt
> python main/main.py
A folder named results, containing the result files, will be created. It is recommended to open those XML files in your browser for better visualization.
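As a rough illustration of the kind of output described above, here is a minimal sketch that reorders messages in an XML document from most to least suspicious. The element name message, the score attribute, and the function name sort_messages_by_score are assumptions made for this example; the actual schema and scoring used by this program may differ.

```python
import xml.etree.ElementTree as ET


def sort_messages_by_score(xml_text: str) -> str:
    """Return the XML with <message> children sorted by descending score.

    Assumption: each message is a direct child of the root element and
    carries a numeric "score" attribute (hypothetical schema).
    """
    root = ET.fromstring(xml_text)
    messages = list(root.findall("message"))
    # Highest score (most suspicious) first; a missing score counts as 0.
    messages.sort(key=lambda m: float(m.get("score", "0")), reverse=True)
    # Detach the messages, then re-attach them in sorted order.
    for message in messages:
        root.remove(message)
    for message in messages:
        root.append(message)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input, not taken from the real data.
SAMPLE = (
    "<messages>"
    '<message score="0.1">Bonjour, on se voit demain ?</message>'
    '<message score="0.9">Apporte le colis ce soir, sans te faire voir.</message>'
    '<message score="0.5">Rappelle-moi vite.</message>'
    "</messages>"
)

if __name__ == "__main__":
    print(sort_messages_by_score(SAMPLE))
```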
- To improve the antispam filter, add spam messages to french_antispam/list_spams.txt or normal messages to french_antispam/list_hams.txt. Once that is done, run python french_antispam/model_init.py to save the new model.
- To improve the SMS -> French translator, add new words: the misspelled word goes in sms_dico/sms.py and the correctly spelled word goes in sms_dico/sms_traduction.py (each word must be added on the same line in both files).
- To improve the SMS rating, modify the rating of a word or add a new word in custom_textblob/textblob_fr/fr-sentiment.xml (be aware that all the words in this file have been stemmed, so stem any new word before adding it).
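The rule that each SMS word and its correction must sit on the same line in both files suggests that entries are paired by position. The sketch below illustrates that idea only: it assumes the two word lists are plain Python lists, and the names build_sms_dictionary and translate are hypothetical. The real layout of sms_dico/sms.py and sms_dico/sms_traduction.py is not shown here and may differ.

```python
import re


def build_sms_dictionary(sms_words, french_words):
    """Pair each SMS spelling with the correction at the same position."""
    if len(sms_words) != len(french_words):
        # A length mismatch means at least one entry has no counterpart.
        raise ValueError(
            f"{len(sms_words)} SMS words but {len(french_words)} translations"
        )
    return dict(zip(sms_words, french_words))


def translate(message, dictionary):
    """Replace every known SMS word; leave unknown words untouched."""
    def replace(match):
        word = match.group(0)
        return dictionary.get(word.lower(), word)

    return re.sub(r"\w+", replace, message)


# Hypothetical entries: index i in one list corresponds to index i in the other.
SMS_WORDS = ["slt", "bjr", "dm1"]
FRENCH_WORDS = ["salut", "bonjour", "demain"]

if __name__ == "__main__":
    dictionary = build_sms_dictionary(SMS_WORDS, FRENCH_WORDS)
    print(translate("slt, a dm1 !", dictionary))
```

Checking the lengths up front catches the most common mistake, which is adding a word to one file but not the other.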
- Ouassim BEN MOSBAH
- Clément BOULY