This code lets you retrain the model on the CredData dataset.
- Make sure that you are using Python 3.7.10 or higher
- Download the CredData dataset
git clone https://github.com/Samsung/CredData
cd CredData
pip install PyYAML
python download_data.py --data_dir data
- Go back to the CredSweeper/experiment directory
- Install the requirements
pip install -r requirements.txt
- Make sure that credsweeper is in the PYTHONPATH. You can add it with
export PYTHONPATH=<CredSweeper directory>:$PYTHONPATH
Example:
export PYTHONPATH=/home/user/code/CredSweeper:$PYTHONPATH
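Before launching the experiment, it can help to confirm that the PYTHONPATH change took effect. The following is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of CredSweeper.

```python
import importlib.util
import sys


def is_importable(package_name: str) -> bool:
    """Return True if Python can locate the top-level package on sys.path."""
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_importable("credsweeper"):
        print("credsweeper is importable")
    else:
        print("credsweeper not found; check PYTHONPATH", file=sys.stderr)
```

Run it from the same shell session in which PYTHONPATH was exported, since the variable is not inherited by terminals opened earlier.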
- Launch the experiment with
python main.py --data <CredData location> -j <number of parallel processes to run>
Example:
python main.py --data /home/user/datasets/CredData -j 16
- The resulting model will be saved to results/ml_model_at-<date_time>. You can now convert the model to ONNX:
python -m tf2onnx.convert --saved-model results/ml_model_at-20240225_111951 --output ../credsweeper/ml_model/ml_model.onnx --verbose
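Because the model directory name changes with every run, the conversion command has to be edited each time. The sketch below picks the newest directory under results/ and prints the matching tf2onnx command. It assumes the <date_time> suffix always follows the pattern seen in the example (20240225_111951), so that alphabetical order equals chronological order. The helper names `latest_model_dir` and `build_convert_command` are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def latest_model_dir(results_dir: Path) -> Optional[Path]:
    """Return the newest ml_model_at-<date_time> directory, or None if absent.

    Assumption: the suffix looks like 20240225_111951, so sorting the names
    alphabetically also sorts them chronologically.
    """
    candidates = sorted(p for p in results_dir.glob("ml_model_at-*") if p.is_dir())
    return candidates[-1] if candidates else None


def build_convert_command(model_dir: Path, output: Path) -> List[str]:
    """Build the tf2onnx conversion command shown above as an argument list."""
    return [
        "python", "-m", "tf2onnx.convert",
        "--saved-model", str(model_dir),
        "--output", str(output),
        "--verbose",
    ]


if __name__ == "__main__":
    model_dir = latest_model_dir(Path("results"))
    if model_dir is None:
        print("No model found under results/")
    else:
        output = Path("../credsweeper/ml_model/ml_model.onnx")
        print(" ".join(build_convert_command(model_dir, output)))
```

Run it from the CredSweeper/experiment directory so that the relative paths match the ones used in the steps above. The script only prints the command; copy it into the shell or pass the list to `subprocess.run` to execute it.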