Cold PAWS: Unsupervised class discovery and addressing the cold-start problem for semi-supervised learning
This is the Python code for the label selection strategies from the paper. For the model fitting code, see this repo.
First, download the sample data from here. Then, to initialise t-SNE clusters and the data files, run
python main_setup.py
The next step is to select indices using the methods defined in the config files. In these configs, 'finetune' is the mini-max approach and 'repulsive' is the maxi-min approach. To run these methods on CIFAR-10 with a budget of 40 labels, use
python main_results.py --config config/test.yaml --dataset data_processed/cifar10.pickle
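For intuition about the two objectives named above, here is a minimal, generic sketch. It is not the code in main_results.py: the function names greedy_maximin and minimax_radius, the Euclidean distance and the toy points are all assumptions made for illustration. Maxi-min selection is shown as greedy farthest-point sampling, where each new label is the point farthest from everything already selected. The mini-max objective is shown only as a score: the largest distance from any point to its nearest selected point.

```python
import math


def greedy_maximin(points, budget, first_index=0):
    """Greedy farthest-point selection (illustrative, not the repo's method).

    Each new index maximises the minimum distance to the indices chosen so far.
    """
    if not 1 <= budget <= len(points):
        raise ValueError("budget must be between 1 and the number of points")
    selected = [first_index]
    # Distance from every point to its nearest selected point.
    min_dist = [math.dist(p, points[first_index]) for p in points]
    while len(selected) < budget:
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        min_dist = [
            min(d, math.dist(p, points[next_index]))
            for d, p in zip(min_dist, points)
        ]
    return selected


def minimax_radius(points, selected):
    """Mini-max score: the worst-case distance to the nearest selected point."""
    return max(min(math.dist(p, points[i]) for i in selected) for p in points)


# Toy 2-D "embeddings" (hypothetical data): two tight pairs and one outlier.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0), (0.0, 5.0)]
toy_selection = greedy_maximin(toy_points, budget=3)
toy_radius = minimax_radius(toy_points, toy_selection)
```

On the toy data, a budget of 3 picks one point from each well-separated group, so every unselected point ends up close to a selected one. A smaller minimax_radius means the selected labels cover the embedding space more evenly.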
Here are some further examples, which generate the benchmarking runs and the results for the Imagenette dataset:
python main_results.py --config config/benchmark.yaml --dataset data_processed/sw24708.pickle
python main_results.py --config config/base-extra-class-disc.yaml --dataset data_processed/imagenette.pickle
To take the output files from a particular run and put them all together in a CSV file, run
python unsupervised_class_detection.py --config config/benchmark.yaml --processed_data 'data_processed/sw24708.pickle'
To make t-SNE plots from the encoding files in data, run
python vis_clustering.py
The output folder contains some example outputs for reference, which need to be downloaded separately from the link above.