Data and analysis code from "Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency"
Error consistency is a quantitative analysis for measuring whether two decision-making systems systematically make errors on the same inputs. The paper is available on arXiv.
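To make the idea concrete, here is a minimal Python sketch of error consistency computed as Cohen's kappa over trial-wise correctness. It is an illustration only, not the repository's R implementation; the function name `error_consistency` and its inputs (two equal-length sequences of booleans, one entry per trial) are assumptions made for this example.

```python
def error_consistency(correct_a, correct_b):
    """Cohen's kappa on the trial-wise correctness of two observers.

    Hypothetical helper for illustration; each argument holds one
    True/False entry per trial (True = correct response).
    """
    if len(correct_a) != len(correct_b) or len(correct_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(correct_a)
    acc_a = sum(correct_a) / n
    acc_b = sum(correct_b) / n
    # Observed consistency: share of trials where both observers are
    # correct or both are wrong.
    c_obs = sum(a == b for a, b in zip(correct_a, correct_b)) / n
    # Consistency expected from two independent observers with the
    # same accuracies.
    c_exp = acc_a * acc_b + (1 - acc_a) * (1 - acc_b)
    if c_exp == 1.0:
        # Kappa is undefined when both observers are always right
        # (or always wrong).
        return float("nan")
    return (c_obs - c_exp) / (1 - c_exp)


if __name__ == "__main__":
    human = [True, True, False, False]
    model = [True, False, True, False]
    print(error_consistency(human, model))
```

A value of 0 means the two systems overlap in their errors no more than chance predicts given their accuracies; positive values indicate more overlap than chance, negative values less.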
The R analysis scripts have the following dependencies, which can be installed via install.packages("package-name"). Data analysis was performed using R version 3.5.1.
library(lattice)
library(jpeg)
library(R.matlab)
library(graphics)
library(pROC)
library(psych)
library(grid)
library(gridExtra)
library(stats)
library(png)
library(pBrackets)
library(PET)
library(TeachingDemos)
library(binom)
library(RColorBrewer)
library(ggplot2)
library(scales)
library(xtable)
library(viridis)
The Brain-Score parsing script has the following dependencies:
pip3 install jupyterlab
# urllib is part of the Python 3 standard library and needs no separate install
pip3 install bs4
pip3 install numpy
Scripts to analyse the data from raw-data/ and plot figures to figures/.
The main analysis script is data-analysis.R. Confidence intervals are simulated via simulate-confidence-intervals.R. Brain-Score metrics are parsed from the Brain-Score website with get_data_from_Brain_Score.ipynb.
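As a rough illustration of what simulating confidence intervals can look like, the Python sketch below draws pairs of independent binomial observers and collects the kappa values they produce by chance. This is an assumption about the general approach, not a port of simulate-confidence-intervals.R; the names `kappa` and `simulate_null_interval` and all parameters (`n_sims`, `level`, `seed`) are hypothetical.

```python
import random


def kappa(correct_a, correct_b):
    """Cohen's kappa on the trial-wise correctness of two observers."""
    n = len(correct_a)
    acc_a = sum(correct_a) / n
    acc_b = sum(correct_b) / n
    c_obs = sum(a == b for a, b in zip(correct_a, correct_b)) / n
    c_exp = acc_a * acc_b + (1 - acc_a) * (1 - acc_b)
    if c_exp == 1.0:
        return float("nan")  # undefined: no room for disagreement
    return (c_obs - c_exp) / (1 - c_exp)


def simulate_null_interval(acc_a, acc_b, n_trials,
                           n_sims=1000, level=0.95, seed=0):
    """Percentile interval of kappa for two independent observers.

    Each simulated observer answers every trial correctly with its
    own fixed probability, independently of the other observer.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    kappas = []
    for _ in range(n_sims):
        a = [rng.random() < acc_a for _ in range(n_trials)]
        b = [rng.random() < acc_b for _ in range(n_trials)]
        k = kappa(a, b)
        if k == k:  # drop NaN results (NaN is never equal to itself)
            kappas.append(k)
    if not kappas:
        raise ValueError("kappa was undefined in every simulation")
    kappas.sort()
    tail = (1 - level) / 2
    lower = kappas[int(tail * (len(kappas) - 1))]
    upper = kappas[int((1 - tail) * (len(kappas) - 1))]
    return lower, upper


if __name__ == "__main__":
    print(simulate_null_interval(0.8, 0.7, n_trials=160))
```

An observed kappa outside such an interval would suggest that the error overlap between two systems is unlikely to arise from independent observers with those accuracies.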
Project-related documentation.
Figures generated by scripts from data-analysis/ using data from raw-data/.
Experiments with the prefix noise-generalisation are from this paper; the raw data is copied from the corresponding GitHub repository.
Experiments with the prefix texture-shape are from this paper; the raw data is copied from the corresponding GitHub repository.