Purpose

This toolkit analyses the underlying representation of an image classification network.
It is designed to be compatible with the outputs of the Failure Detection Benchmark (https://github.com/IML-DKFZ/fd-shifts), which is described in this paper: https://arxiv.org/abs/2211.15259.

Outputs

Generates outputs for representative images based on a k-means clustering of the latent space. Latent space dimensions are reduced to 50 by PCA and further condensed to 3 by t-SNE.
Generates the most overconfident as well as the most underconfident images based on the softmax response (this could be adjusted to another confidence score).
The app.py file can be run with "python3 app.py" and starts an interactive Dash app that lets the user explore the latent space of the neural network and the images linked to each data point. The data needed to run the app can be generated by the package if raw outputs are present: after initializing the main analyser, run analyzer.prepaire_dash() and the correct data frame is written.
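The dimensionality reduction and clustering described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the package's own implementation: the function name `embed_and_cluster`, the number of clusters, and the choice to fit k-means on the full latent vectors and to pick the row closest to each cluster centre as the representative are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def embed_and_cluster(latent, n_clusters=5, seed=0):
    """Reduce latent vectors to 3-D and pick one representative row per cluster."""
    # Step 1: PCA down to 50 dimensions (needs at least 50 samples and 50 features).
    reduced = PCA(n_components=50, random_state=seed).fit_transform(latent)
    # Step 2: t-SNE from 50 down to 3 dimensions for plotting.
    embedded = TSNE(n_components=3, init="pca", random_state=seed).fit_transform(reduced)
    # Assumption: k-means is fitted on the full latent vectors.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(latent)
    # Representative of a cluster = its member closest to the cluster centre.
    representatives = []
    for cluster in range(n_clusters):
        members = np.flatnonzero(kmeans.labels_ == cluster)
        dists = np.linalg.norm(latent[members] - kmeans.cluster_centers_[cluster], axis=1)
        representatives.append(int(members[dists.argmin()]))
    return embedded, kmeans.labels_, representatives


# Synthetic stand-in for the penultimate-layer outputs (200 samples, 64 features).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 64))
embedded, cluster_ids, representatives = embed_and_cluster(latent)
print(embedded.shape, representatives)
```

The returned row indices can then be used to look up the matching image paths, since all data files share the same row order.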

Instructions 🎥

Create a new folder
Create a new virtual environment
Update your pip (pip install --upgrade pip)
Install the package (pip install visualfailureanalysis)
Set up your folder in the form "../experiment_group_name/experiment_name/test_results"
In this folder place your data as raw_outputs.npz
Format: Outputs of the softmax in the first columns followed by the label as an integer and dataset integer index in the last column. Each row is one data point. If you only have one data set the index is always 0. See below.
In this folder place your latent space as encoded_output.npz
Format: Outputs of the penultimate layer (i.e. the inputs to the final layer) in the first columns and the dataset index in the last column. Each row is one data point. See below.
In this folder place your data as attribution.csv
attributions.csv needs to contain at least a column called "filepath" with the absolute file paths of your images. If you have multiple data sets, name the files "attributions0.csv", "attributions1.csv", ...
All three data files need the rows to be in the same order ❗
In Python: from visualfailureanalysis import analyser
Initialize the main class: my_data_visulizer = analyser.Analyser(path=path, class2name=class2name, class2plot=class2plot, ls_testsets=ls_testsets, test_datasets=test_datasets)
path = "../experiment_group_name/experiment_name"
class2name = dict({0:"myclassname0",...}) containing a mapping of integer classes to real names
ls_testsets = ["nameoftestset",...] a list with the names of all test sets
class2plot and test_datasets are two lists with a subset of classes/test set names for which to generate outputs (the output can be quite large).
Run my_data_visulizer.setup() to create the lower-dimensional representation needed.
Generate outputs and statistics with the respective class methods.
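The arguments described above might be assembled as in the following sketch. All class and test set names are hypothetical placeholders, and using integer class ids in `class2plot` is an assumption. The constructor call is copied from the instruction above and wrapped in a helper function (`build_analyser`, a name introduced here) so that nothing runs until the package is installed and the data files are in place.

```python
# Hypothetical example values; replace them with your own classes and test sets.
path = "../experiment_group_name/experiment_name"
class2name = {0: "myclassname0", 1: "myclassname1", 2: "myclassname2"}
ls_testsets = ["nameoftestset0", "nameoftestset1"]

# Subsets for which outputs are generated (keeps the output size manageable).
# Assumption: classes are selected by their integer id.
class2plot = [0, 2]
test_datasets = ["nameoftestset0"]


def build_analyser():
    """Create and set up the analyser; needs the package and the data files."""
    from visualfailureanalysis import analyser  # imported lazily on purpose

    my_data_visulizer = analyser.Analyser(
        path=path,
        class2name=class2name,
        class2plot=class2plot,
        ls_testsets=ls_testsets,
        test_datasets=test_datasets,
    )
    my_data_visulizer.setup()
    return my_data_visulizer
```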
Example tree:
|--project
   |--experiment_group_name
      |--experiment_name
         |--test_results
            |--raw_output.npz
            |--encoded_output.npz
            |--attribution.csv
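To try the layout out without real model outputs, the following sketch writes synthetic files in the formats described above into a temporary folder. The file names follow the example tree; storing a single unnamed array per .npz file (which NumPy saves under the key "arr_0") is an assumption about what the package expects, and the image paths are placeholders.

```python
import csv
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes, latent_dim = 6, 3, 8

# Build the directory layout from the example tree inside a temporary folder.
root = Path(tempfile.mkdtemp())
results_dir = root / "experiment_group_name" / "experiment_name" / "test_results"
results_dir.mkdir(parents=True)

# Synthetic softmax rows (each sums to 1), integer labels, one data set (index 0).
logits = rng.normal(size=(n_samples, n_classes))
softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, n_classes, size=(n_samples, 1))
dataset_idx = np.zeros((n_samples, 1))

raw_output = np.hstack([softmax, labels, dataset_idx])  # N x (d + 2)
encoded_output = np.hstack(
    [rng.normal(size=(n_samples, latent_dim)), dataset_idx]
)  # N x (latent_d + 1)

# Assumption: one unnamed array per file, stored by NumPy under the key "arr_0".
np.savez_compressed(results_dir / "raw_output.npz", raw_output)
np.savez_compressed(results_dir / "encoded_output.npz", encoded_output)

# One row per image, in the same order as the rows of the arrays above.
with open(results_dir / "attribution.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["filepath"])
    for i in range(n_samples):
        writer.writerow([f"/abs/path/to/images/img_{i}.png"])  # placeholder paths
```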

raw_outputs.npz
Nx(d+2)


  0, 1, ...                 d-1,   d,      d+1  
┌───────────────────────────────┬───────┬─────────────┐  
|           softmax1            | label | dataset_idx |  
├───────────────────────────────┼───────┼─────────────┤  
|           softmax2            | label | dataset_idx |  
├───────────────────────────────┼───────┼─────────────┤  
|           softmax3            | label | dataset_idx |  
└───────────────────────────────┴───────┴─────────────┘  
.  
.  
.  
┌───────────────────────────────┬───────┬─────────────┐  
|           softmaxN            | label | dataset_idx |  
└───────────────────────────────┴───────┴─────────────┘
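Given the layout above, the columns can be split with plain NumPy slicing, and the softmax response can be used to rank failures. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "overconfident" means wrong predictions with the highest maximum softmax value and "underconfident" means correct predictions with the lowest, which may differ from the package's own criteria.

```python
import numpy as np


def split_raw_output(raw_output):
    """Split an N x (d + 2) array into softmax, label and dataset index."""
    softmax = raw_output[:, :-2]
    labels = raw_output[:, -2].astype(int)
    dataset_idx = raw_output[:, -1].astype(int)
    return softmax, labels, dataset_idx


def rank_confidence_failures(raw_output, top_k=2):
    """Return row indices of the most over- and underconfident predictions."""
    softmax, labels, _ = split_raw_output(raw_output)
    confidence = softmax.max(axis=1)  # maximum softmax response
    correct = softmax.argmax(axis=1) == labels
    wrong_rows = np.flatnonzero(~correct)
    right_rows = np.flatnonzero(correct)
    # Overconfident: wrong predictions, highest confidence first.
    over = wrong_rows[np.argsort(-confidence[wrong_rows])][:top_k]
    # Underconfident: correct predictions, lowest confidence first.
    under = right_rows[np.argsort(confidence[right_rows])][:top_k]
    return over, under


# Five data points, three classes, one data set (dataset_idx is always 0).
raw_output = np.array([
    [0.90, 0.05, 0.05, 1, 0],  # predicted 0, label 1: wrong
    [0.40, 0.35, 0.25, 0, 0],  # predicted 0, label 0: correct
    [0.10, 0.80, 0.10, 1, 0],  # predicted 1, label 1: correct
    [0.20, 0.20, 0.60, 0, 0],  # predicted 2, label 0: wrong
    [0.34, 0.33, 0.33, 0, 0],  # predicted 0, label 0: correct
])
over, under = rank_confidence_failures(raw_output)
print(over, under)  # → [0 3] [4 1]
```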

encoded_output.npz
Nx(latent_d+1)

  0, 1, ...          latent_d-1,   latent_d  
┌───────────────────────────────┬─────────────┐  
|           encoded1            | dataset_idx |  
├───────────────────────────────┼─────────────┤  
|           encoded2            | dataset_idx |  
├───────────────────────────────┼─────────────┤  
|           encoded3            | dataset_idx |  
└───────────────────────────────┴─────────────┘  
.  
.  
.  
┌───────────────────────────────┬─────────────┐  
|           encodedN            | dataset_idx |  
└───────────────────────────────┴─────────────┘
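Because all three data files must share the same row order, a quick sanity check before running the analysis can catch obvious mismatches. The helper below is a hypothetical sketch, not part of the package: it compares row counts and the dataset_idx columns of the two arrays. It cannot detect rows that were shuffled within one data set, so it is a necessary check rather than a sufficient one.

```python
import numpy as np


def check_alignment(raw_output, encoded_output, filepaths):
    """Return a list of detected problems (empty if none were found)."""
    raw = np.asarray(raw_output)
    encoded = np.asarray(encoded_output)
    problems = []
    if not (len(raw) == len(encoded) == len(filepaths)):
        problems.append("row counts differ")
    elif not np.array_equal(raw[:, -1], encoded[:, -1]):
        # The last column of both arrays is the dataset index.
        problems.append("dataset_idx columns differ")
    return problems


# Two data points: two softmax columns + label + dataset_idx, three latent dims.
raw_output = np.array([[0.7, 0.3, 0, 0], [0.2, 0.8, 1, 1]])
encoded_output = np.array([[0.1, 0.2, 0.3, 0], [0.4, 0.5, 0.6, 1]])
filepaths = ["/abs/path/a.png", "/abs/path/b.png"]  # placeholder paths

print(check_alignment(raw_output, encoded_output, filepaths))        # → []
print(check_alignment(raw_output, encoded_output[::-1], filepaths))  # → ['dataset_idx columns differ']
```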
