Automated Classification of Overfitting Patches with Statically Extracted Code Features

This is the repository for the paper "Automated Classification of Overfitting Patches with Statically Extracted Code Features" (doi:10.1109/tse.2021.3071750).

@article{ye2021ods,
 title = {Automated Classification of Overfitting Patches with Statically Extracted Code Features},
 author = {He Ye and Jian Gu and Matias Martinez and Thomas Durieux and Martin Monperrus},
 journal = {IEEE Transactions on Software Engineering},
 year = {2021},
 doi = {10.1109/tse.2021.3071750},
}

Folder Structure

├── Experiment: CSV feature data and script for reproducing our experiment
│
├── Features: ODS code features
│   ├── Code: ODS code description features in JSON format
│   ├── Patterns: ODS repair pattern features in JSON format
│   └── Context: ODS context features in JSON format
├── Source: source program files that can be used as input to Coming to generate ODS features
│
├── Tests: EvoSuite tests generated for Bugs.jar and Bears to label the correctness of RepairThemAll patches
│
└── RawRepairThemAllPatches: raw patches from the RepairThemAll experiment

ODS Feature Extraction

We have integrated ODS feature extraction into the open-source tool Coming. To extract code features, run Coming in its features mode on a pair of source and target files, such as those in the Source folder.

Parameters

We use the default parameters of XGBoost (i.e., learning_rate set to 0.3 and max_depth set to 6), tuning only gamma, which we set to 0.5. All parameters can be found in our notebooks.
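As a minimal sketch, the settings described above could be written down as follows. The names ODS_XGB_PARAMS and build_classifier are hypothetical and only serve the illustration; the notebooks remain the authoritative source for the full configuration.

```python
# Hyperparameters as stated in this README; all other values stay at XGBoost defaults.
ODS_XGB_PARAMS = {
    "learning_rate": 0.3,  # XGBoost default
    "max_depth": 6,        # XGBoost default
    "gamma": 0.5,          # the only value changed from its default of 0
}


def build_classifier():
    """Return an XGBClassifier configured with the parameters above (hypothetical helper)."""
    # Imported lazily so the parameter dict can be inspected without xgboost installed.
    from xgboost import XGBClassifier

    return XGBClassifier(**ODS_XGB_PARAMS)
```

Calling build_classifier() requires the xgboost package; the returned object exposes the usual scikit-learn style fit and predict methods.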

How to use ODS to predict new and unseen patches:

Check out the Coming repository and build it with Maven. Please note that the expected Java version is 1.8.

git clone https://github.com/SpoonLabs/coming.git
cd coming
mvn install -DskipTests

Run the following command on the demo samples in the Coming project. You will get a generated CSV file called test.csv and the code features in JSON format in the output path.

java -classpath ./target/coming-0-SNAPSHOT-jar-with-dependencies.jar fr.inria.coming.main.ComingMain -input files -mode features -location ./src/main/resources/pairsD4j -output ./out
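To run the same extraction on another folder, the command above can be assembled programmatically. This is only a sketch: the helper names coming_features_command and run_coming are hypothetical, and the default jar path is assumed to be the one produced by the Maven build in the Coming checkout.

```python
import subprocess


def coming_features_command(
    location,
    output,
    jar="./target/coming-0-SNAPSHOT-jar-with-dependencies.jar",
):
    """Build the argument list for Coming's features mode (hypothetical helper)."""
    return [
        "java", "-classpath", str(jar),
        "fr.inria.coming.main.ComingMain",
        "-input", "files",
        "-mode", "features",
        "-location", str(location),
        "-output", str(output),
    ]


def run_coming(location, output):
    """Run Coming from the root of its checkout; raises CalledProcessError on failure."""
    subprocess.run(coming_features_command(location, output), check=True)
```

For example, run_coming("./src/main/resources/pairsD4j", "./out") issues exactly the command shown above.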

Please note that Coming requires a specific directory structure for the input source and target files:

<location_arg>
├── <diff_folder>
│   └── <modif_file>
│       ├── <diff_folder>_<modif_file>_s.java
│       └── <diff_folder>_<modif_file>_t.java
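Before running Coming on your own patches, it can help to check that the input folder follows this layout. The sketch below is not part of Coming or ODS; find_pairs is a hypothetical helper that lists every folder containing both the _s.java and the _t.java file under the expected names.

```python
from pathlib import Path


def find_pairs(location):
    """Return (source, target) paths for every complete pair under `location`."""
    pairs = []
    for diff_dir in sorted(p for p in Path(location).iterdir() if p.is_dir()):
        for modif_dir in sorted(p for p in diff_dir.iterdir() if p.is_dir()):
            # Expected file stem: <diff_folder>_<modif_file>
            stem = f"{diff_dir.name}_{modif_dir.name}"
            source = modif_dir / f"{stem}_s.java"
            target = modif_dir / f"{stem}_t.java"
            # Keep only folders where both halves of the pair exist.
            if source.is_file() and target.is_file():
                pairs.append((source, target))
    return pairs
```

Folders that are missing either file are skipped silently, so comparing the number of returned pairs with the number of patches you prepared is a quick sanity check.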

Get test.csv ready and run the prediction with the following command. The prediction results are written to prediction.csv.

python3 predict.py

You may also need the following dependencies:
python3 -m pip install xgboost
python3 -m pip install scikit-learn
python3 -m pip install imblearn
python3 -m pip install matplotlib
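A quick way to see which of these packages are still missing is to check whether they can be imported. This is a small sketch using only the standard library; missing_modules is a hypothetical helper, and the list uses the import names of the packages above (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names of the packages installed above.
REQUIRED_MODULES = ["xgboost", "sklearn", "imblearn", "matplotlib"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from missing_modules() suggests that the imports used for prediction should resolve.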
