Data and Code of "Estimating the perceived dimension of psychophysical stimuli using triplet accuracy and hypothesis testing"
This repository contains the experiments and plotting routines for our publication:
Künstle, D.-E., von Luxburg, U., & Wichmann, F. A. (2022). Estimating the perceived dimension of psychophysical stimuli using triplet accuracy and hypothesis testing. Journal of Vision, 22(13), 5.
Please consider citing the paper if you use this code.
```bibtex
@article{kunstleEstimatingPerceivedDimension2022,
  title = {Estimating the Perceived Dimension of Psychophysical Stimuli Using Triplet Accuracy and Hypothesis Testing},
  author = {K{\"u}nstle, David-Elias and {von Luxburg}, Ulrike and Wichmann, Felix A.},
  year = {2022},
  month = dec,
  journal = {Journal of Vision},
  volume = {22},
  number = {13},
  pages = {5},
  issn = {1534-7362},
  doi = {10.1167/jov.22.13.5},
}
```
Contact: David-Elias Künstle, University of Tübingen, <david-elias.kuenstle[AT]uni-tuebingen.de>
The easiest setup on any platform is to install conda (e.g. miniconda), clone this repository, and run the following commands:

```sh
conda env create -f environment.yml
conda activate dimensionality
pip install cblearn-cdbacb367d3efa2bef94a70bdf8e505f6c7bbd30.zip
python setup.py develop
```
We used an Ubuntu 20.04 machine to develop and test this repository, and a Slurm Linux cluster to execute the computational jobs.
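To verify the installation, a quick sanity check can help (a minimal sketch; we assume here that the develop-install exposes the package as `tripletdim`, matching the `src/tripletdim` layout):

```sh
conda activate dimensionality
# both imports should succeed if the setup worked
python -c "import tripletdim, cblearn; print('setup ok')"
```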
Here we explain how to reproduce the computational experiments and plots of our paper.
The relevant directories are:
```
data/              input data and experiment results
src/tripletdim/    source code for experiments and plots
├── data/          code to generate artificial scales and sample triplets
├── experiment/    code of the experiments and utilities to run them
└── plot/          code to plot artificial scales
jobs/              command-line jobs of the experiment conditions to run
notebooks/         Jupyter notebooks to inspect models or datasets and plot results
```
The simulation experiments do not require any datasets and can be run right away. Experiments with psychophysical datasets require additional steps, because these datasets are not shipped with this repository. Some datasets can be downloaded automatically by calling `fetch-datasets`. However, the behavioural datasets shown in the paper are not publicly available; please request them from the original authors and place them in the `data/raw/` directory, named according to their usage in `src/tripletdim/data/datasets.py`. A sketch of this setup follows below.
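For example (a minimal sketch; the behavioural file name is hypothetical, the expected names are defined in `src/tripletdim/data/datasets.py`):

```sh
conda activate dimensionality
fetch-datasets                        # downloads the publicly available datasets
# behavioural datasets obtained from the authors go into data/raw/, e.g.:
# cp ~/Downloads/<dataset-file> data/raw/
```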
The experiment conditions in `jobs/` were defined in `src/tripletdim/experiment/make_jobs.py`. If you add conditions by changing this file, re-generate the `jobs/` directory by calling `make-jobs`, as sketched below.
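A minimal sketch (`make-jobs` is the console script mentioned above):

```sh
# after editing src/tripletdim/experiment/make_jobs.py:
make-jobs    # re-generates the command lines in jobs/
ls jobs/     # inspect the generated job files
```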
Run the experiments with the commands listed in the `jobs/` directory, either on your own computer or on a compute cluster (we ran them on a Slurm cluster; another cluster might require adjusting `batchjob.sh` and creating the `logs/` directory). These runs skip an experiment if it has already been run, i.e. if the corresponding result file exists in the `data/` directory.
```sh
conda activate dimensionality
cat jobs/* | sh
# or run on a slurm cluster
cat jobs/* | xargs -L1 sbatch batchjob.sh
# Large datasets (e.g. material, imagenet) and the hypothesis tests might run long,
# depending on your cluster's CPUs, so we added a longer batch job for convenience:
grep imagenet jobs/* | xargs -L1 sbatch batchjob-long.sh
```
You can run just the simulations with `cat jobs/simulations-embedding.txt | sh`, or filter the jobs by name, e.g. `grep normal jobs/*` to get all jobs with the normal ground-truth scale.
To avoid spamming the job queue, I prefer running the hypothesis-test jobs as an array job:

```sh
wc -l jobs/hypotest.txt  # determine the number of lines ...
sbatch --array=1-<NUMBER OF LINES>%50 batchjob-array.sh jobs/hypotest.txt
```
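If you prefer not to fill in the line count by hand, a minimal sketch that automates it (assuming a POSIX shell on Linux):

```sh
N=$(wc -l < jobs/hypotest.txt)   # number of job lines
sbatch --array=1-${N}%50 batchjob-array.sh jobs/hypotest.txt
```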
The repeated hypothesis-test runs should be aggregated before data analysis. Simply run `make all` to execute the appropriate scripts (defined in the `Makefile`).
The plots are created mainly in Jupyter notebooks and are saved to the folder `tex/plots/`.

- `notebooks/ekman_hue_embedding.ipynb`: Ekman scales (Figure 1).
- `notebooks/plotting.ipynb`: accuracy graphs, p-value bars, test-error maps (Figures 2-8, 10, appendix).
- `notebooks/normality.ipynb`: normality of accuracy (appendix).
- `plot-simulations` (script): pitch scale, Ekman distributed (appendix).
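For example, to reproduce the main result plots (a sketch assuming Jupyter is part of the `dimensionality` environment):

```sh
conda activate dimensionality
jupyter notebook notebooks/plotting.ipynb   # plots are written to tex/plots/
```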
This repository contains content that is not presented in the paper, mainly inspections of the used datasets:

- `notebooks/bosten_boehm_hue.ipynb`: inspection of the Bosten & Boehm (2014) dataset.
This work has been supported by the Machine Learning Cluster of Excellence, funded by EXC number 2064/1 – Project number 390727645. The authors would like to thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting David-Elias Künstle.
The code itself is free to use under the GNU General Public License v3.0.