Using TextFlint to verify the robustness of a specific model is as simple as running the following command:
$ textflint --dataset input_file --config config.json
where input_file is the input file in CSV or JSON format, and config.json is a configuration file with generation and target model options.
Each line of input_file contains exactly one sample as a JSON object. Take the input file for the SA task as an example:
{"x": "Titanic is my favorite movie.", "y": "pos", "sample_id": 0}
{"x": "I don't like the actor Tim Hill", "y": "neg", "sample_id": 1}
Note that the input format differs between tasks; please refer to this tutorial for details.
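As a minimal sketch of how such an input file could be produced from Python, assuming the one-JSON-object-per-line layout shown in the SA example. The helper names `write_samples` and `read_samples` and the file name `input_file.json` are hypothetical and not part of TextFlint.

```python
import json


def write_samples(samples, path):
    """Write one JSON object per line, matching the SA example layout."""
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")


def read_samples(path):
    """Read the file back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


samples = [
    {"x": "Titanic is my favorite movie.", "y": "pos", "sample_id": 0},
    {"x": "I don't like the actor Tim Hill", "y": "neg", "sample_id": 1},
]

# "input_file.json" is a hypothetical file name used only for this sketch.
write_samples(samples, "input_file.json")
loaded = read_samples("input_file.json")
```

Reading the file back is a cheap way to confirm that every line parses on its own before passing the file to the command line tool.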
config.json is a configuration file with generation and target model options. Take the configuration for the TextCNN model on the SA task as an example:
{
  "task": "SA",
  "out_dir": "./DATA/",
  "trans_methods": [
    "Ocr",
    ["InsertAdv", "SwapNamedEnt"]
  ],
  "trans_config": {
    "Ocr": {"trans_p": 0.3}
  }
}
task is the name of the target task.
out_dir is the directory where each generated sample and its corresponding original sample are saved.
flint_model is the path of the Python file that holds the FlintModel instance.
Note that flint_model is not required for transformation or subpopulation. You can remove this option if you are not familiar with FlintModel.
trans_methods specifies the transformation methods. For example, "Ocr" denotes the universal transformation Ocr, and ["InsertAdv", "SwapNamedEnt"] denotes a pipeline of task-specific transformations, namely InsertAdv and SwapNamedEnt.
trans_config configures the parameters of the transformation methods. The default parameters are also a good choice.
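The options described above can also be assembled in Python and written to config.json. The following is a sketch under stated assumptions: it uses only the option names mentioned in this section, omits the optional flint_model, and the helper `referenced_methods` and its consistency check are conventions of this example, not TextFlint rules.

```python
import json

# Only options named in this section are used; values mirror the SA example.
config = {
    "task": "SA",
    "out_dir": "./DATA/",
    "trans_methods": [
        "Ocr",                          # a single universal transformation
        ["InsertAdv", "SwapNamedEnt"],  # a pipeline of task-specific transformations
    ],
    "trans_config": {
        "Ocr": {"trans_p": 0.3},
    },
}


def referenced_methods(cfg):
    """Flatten trans_methods so single methods and pipelines are listed alike."""
    names = []
    for entry in cfg["trans_methods"]:
        names.extend(entry if isinstance(entry, list) else [entry])
    return names


# Hypothetical sanity check: every method configured in trans_config
# should also be requested in trans_methods.
unknown = set(config["trans_config"]) - set(referenced_methods(config))

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)
```

Generating the file from a dictionary avoids hand-editing mistakes such as a trailing comma or an unclosed bracket, which would make the JSON invalid.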
After transformation, the generated files are saved in ./DATA/.
Here, trans_Keyboard_2.json contains the 2 samples successfully transformed by the Keyboard transformation, and ori_Keyboard_2.json contains the corresponding original samples.
The content of ori_Keyboard_2.json:
{"x": "Titanic is my favorite movie.", "y": "pos", "sample_id": 0}
{"x": "I don't like the actor Tim Hill", "y": "neg", "sample_id": 1}
The content of trans_Keyboard_2.json:
{"x": "Titanic is my favorite m0vie.", "y": "pos", "sample_id": 0}
{"x": "I don't likR the actor Tim Hill", "y": "neg", "sample_id": 1}
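To inspect what a transformation changed, the original and transformed samples can be paired by sample_id. The sketch below inlines the four samples shown above instead of reading the files, and the helper `changed_tokens` is hypothetical. It assumes the transformation keeps the number of whitespace-separated tokens unchanged, which holds for these Keyboard examples but not necessarily for other transformations.

```python
def changed_tokens(original, transformed):
    """Return (before, after) pairs for whitespace tokens that differ.

    Assumes both texts have the same number of tokens.
    """
    pairs = zip(original.split(), transformed.split())
    return [(before, after) for before, after in pairs if before != after]


# Contents of ori_Keyboard_2.json, inlined for this sketch.
ori = [
    {"x": "Titanic is my favorite movie.", "y": "pos", "sample_id": 0},
    {"x": "I don't like the actor Tim Hill", "y": "neg", "sample_id": 1},
]
# Contents of trans_Keyboard_2.json, inlined for this sketch.
trans = [
    {"x": "Titanic is my favorite m0vie.", "y": "pos", "sample_id": 0},
    {"x": "I don't likR the actor Tim Hill", "y": "neg", "sample_id": 1},
]

# Index the transformed samples by sample_id, then diff each pair.
trans_by_id = {sample["sample_id"]: sample for sample in trans}
report = {
    sample["sample_id"]: changed_tokens(
        sample["x"], trans_by_id[sample["sample_id"]]["x"]
    )
    for sample in ori
}
print(report)  # {0: [('movie.', 'm0vie.')], 1: [('like', 'likR')]}
```

In both samples the label y is unchanged and a single token is perturbed.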
Based on the results from the Generation Layer, TextFlint can generate three types of adversarial samples and verify the robustness of the target model.
For example, on the Sentiment Analysis (SA) task, this is a statistical chart of the performance of XLNET with different types of Transformation on the IMDB