This example demonstrates the use of two modules below from cleanlab:
The code and data for this example are taken from the repo below:
Install PyTorch with CUDA. If needed, change the CUDA version both in the requirements.txt file and in the link below.
$ pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
Run the bash script below to download all the data.
$ bash ./download_data.sh
The following will be saved in the data folder:
- CIFAR-10 train and test images (png files)
- Noisy labels (json file)
  - 20% Noise | 40% Sparsity as defined in the Confident Learning paper
- True labels (npy file)
- Pre-computed predicted probabilities from cross-validation (npy file)
- Pre-computed noisy label mask for training dataset (npy file)
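Before starting a long training run, it can help to confirm that the download produced the files the training command expects. The following is a minimal sketch, not part of the repo: `check_data_files` is a hypothetical helper, and the two paths it checks are simply the ones passed to `--train-labels` and `--dir-train-mask` in the training command in this README. Other files in the list above are not checked because their names are not given here.

```shell
# Hypothetical helper (not part of the repo): report whether the files used
# by the training command exist under the given data directory.
check_data_files() {
  local data_dir="${1:-data}"
  local missing=0
  local rel
  # Relative paths copied from the --train-labels and --dir-train-mask arguments.
  for rel in \
    "cifar10_noisy_labels__frac_zero_noise_rates__0.4__noise_amount__0.2.json" \
    "confidentlearning_and_coteaching/results/4_2/train_pruned_conf_joint_only/train_mask.npy"
  do
    if [ -f "${data_dir}/${rel}" ]; then
      echo "ok: ${data_dir}/${rel}"
    else
      echo "missing: ${data_dir}/${rel}" >&2
      missing=1
    fi
  done
  # Exit status is 0 only when every file was found.
  return "${missing}"
}
```

One possible use is `check_data_files data && echo "ready to train"`, which prints the final message only when both files are present.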
Run the command below to train a CNN model with Co-Teaching.
This script stores its output in a log file (out_4_2.log) so we can see the resulting test accuracy for each epoch.
# run Confident Learning training with Co-Teaching on labels with 20% label noise
{ time python3 cifar10_train_crossval.py \
--coteaching \
--seed 1 \
--batch-size 128 \
--lr 0.001 \
--epochs 250 \
--turn-off-save-checkpoint \
--train-labels data/cifar10_noisy_labels__frac_zero_noise_rates__0.4__noise_amount__0.2.json \
--gpu 1 \
--dir-train-mask data/confidentlearning_and_coteaching/results/4_2/train_pruned_conf_joint_only/train_mask.npy \
data/ ; \
} &> out_4_2.log &
tail -f out_4_2.log;
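Once training has run for a while, the per-epoch results can also be pulled out of the log without watching it live. This is only a sketch: the log format is not shown in this README, so the pattern `acc` is an assumption about how accuracy lines are labeled, and `last_accuracy_lines` is a hypothetical helper. Adjust the pattern to whatever the log actually prints.

```shell
# Hypothetical helper: print the last N lines of a log that mention accuracy.
# ASSUMPTION: accuracy lines contain the string "acc" (matched case-insensitively).
last_accuracy_lines() {
  local log_file="${1:-out_4_2.log}"
  local count="${2:-5}"
  grep -i "acc" "${log_file}" | tail -n "${count}"
}
```

For example, `last_accuracy_lines out_4_2.log 3` would show the three most recent matching lines, if any exist.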
Copyright (c) 2017-2022 Curtis Northcutt. Released under the MIT License. See LICENSE for details.