A Deep Learning Approach to Private Data Sharing of Medical Images Using Conditional GANs

tcoroller/pGAN

Publications

Project

This project investigates the application of GANs to medical images. The scope of the project includes:

  1. Generate artificial images of vertebra units (VUs) conditioned on anatomical region.
  2. Conduct an extensive evaluation of the dataset's behavior and of the trade-off between image quality/dataset faithfulness and privacy.

Related dataset:

  • Link to the data: https://zenodo.org/record/5031881
  • The synthetic dataset (10,000 image-region pairs, 2.95 GB) is shared with the code in HDF5 format.
    With some minor tweaking, the synthetic dataset can be used to run training and analysis to validate the code. (The analysis itself will be far less meaningful, because comparing privacy between two synthetic datasets is not very useful.)
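
Before adapting the training or analysis code to the synthetic dataset, it can help to check what the HDF5 file actually contains. The sketch below is not part of the repository: it assumes `h5py` is installed, and the helper name `summarize_h5` is made up for illustration. It does not assume any dataset names inside the file; it simply lists whatever is there.

```python
import os

import h5py  # third-party dependency, assumed to be installed


def summarize_h5(path):
    """Return {dataset_path: (shape, dtype)} for every dataset in an HDF5 file."""
    summary = {}

    def visit(name, obj):
        # visititems walks both groups and datasets; keep only datasets.
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary


# File name taken from the repository tree; adjust the path if needed.
DATASET_PATH = "synthetic_dataset.h5"

if os.path.exists(DATASET_PATH):
    for name, (shape, dtype) in summarize_h5(DATASET_PATH).items():
        print(f"{name}: shape={shape}, dtype={dtype}")
```

The printed dataset names, shapes, and dtypes indicate which keys hold the images and which hold the region labels, which is the information needed for the "minor tweaking" mentioned above.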

Code Lifting

Because the original data is not anonymized, it is not shared with the code. The preprocessing is not shared either, to avoid exposing sensitive system information.
As a result, this code cannot be run end to end out of the box.
The analysis notebooks still hold their latest executed state, including figures.

Scripts, Notebooks and Demos

  1. Training and generating synthetic VUs and corresponding regions
  2. Fidelity - Analysis
  3. Diversity - Analysis
  4. Privacy - Analysis
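
The Fidelity helpers are described in the repository tree as code for interpolation between regions. One common way to interpolate the condition of a conditional GAN is to blend two condition vectors linearly and generate an image at each step. The sketch below illustrates only that idea and is not the repository's implementation. The one-hot encoding, the number of regions (`NUM_REGIONS`), and the function name `interpolate_conditions` are all assumptions.

```python
import numpy as np


def interpolate_conditions(start, end, steps):
    """Linearly blend two condition vectors; returns an array of shape (steps, dim)."""
    start = np.asarray(start, dtype=np.float32)
    end = np.asarray(end, dtype=np.float32)
    # Weights go from 0 (pure start) to 1 (pure end), one row per step.
    weights = np.linspace(0.0, 1.0, steps, dtype=np.float32)[:, None]
    return (1.0 - weights) * start + weights * end


NUM_REGIONS = 3  # hypothetical number of anatomical regions
one_hot = np.eye(NUM_REGIONS, dtype=np.float32)

# Five condition vectors moving from region 0 to region 1.
path = interpolate_conditions(one_hot[0], one_hot[1], steps=5)
print(path)
```

Each row of `path` would be passed to the generator together with a fixed latent vector, so that any change in the output image comes from the condition alone.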

Structure

.
├── README.md
├── environment.yml                  # pgan-env
├── synthetic_dataset.h5
└── src
    ├── helper.py                         # utility functions (current date-time for mlflow, grid for image visualization)
    └── manuscript                        
       ├── Diversity
       │   ├── 1_generate_synth_112_224.ipynb
       │   ├── 2_train_umap_112_224.ipynb
       │   ├── 3_plot_umap_diversity.ipynb
       │   ├── 4_classifier_analysis.ipynb
       │   ├── classifier_logs                 # restricted data (classifier on train might not be private)
       │   │   └── ... 
       │   ├── diversity_saves                 # restricted data (post processed real dataset included)
       │   │   └── ... 
       │   ├── images
       │   │   └── ... 
       │   └── train_classifier.py
       ├── Fidelity
       │   ├── 1_images_qualitative_inspection.ipynb
       │   ├── helpers
       │   │   └── utils.py                    # code for interpolation between regions
       │   └── images
       │       └── ... 
       ├── Privacy
       │   ├── 1_prepare_9_64_64_pixel_space.ipynb
       │   ├── 2_UMAP_64_64.ipynb
       │   ├── 3_compute_distances.ipynb
       │   ├── 4_plot_pairwise_attacks.ipynb
       │   ├── 5_plot_density_attacks.ipynb
       │   ├── 6_density_plot.ipynb
       │   ├── images
       │   │   ├── ...
       │   │   └── supp
       │   │       └── ... 
       │   └── privacy_saves                  # restricted data (post processed real dataset included, UMAP object might
       │       └── ...                        #  not be private)
       └── Train
            ├── batchers.py
            ├── fixed_architecture.py
            ├── Generate_image.ipynb
            ├── training_parser.py
            ├── train_region.py
            ├── restricted                    # restricted data (GAN weights, local machine preprocessing, indexes
            │   └── ...                       # of train/val split)
            └── transforms
                ├── augmentations.py
                └── transforms.py
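
The Privacy notebooks listed above include a pixel-space preparation step and a distance computation step. A simple distance-based privacy check measures how close each synthetic image lies to its nearest real image. The sketch below shows that general idea with NumPy only. It is an illustrative assumption and not the authors' exact procedure; the function name `nearest_real_distance` and the toy arrays are made up.

```python
import numpy as np


def nearest_real_distance(synthetic, real):
    """For each synthetic image, return (distance, index) of the closest real image.

    Distances are Euclidean, computed on flattened pixel values.
    """
    s = synthetic.reshape(len(synthetic), -1).astype(np.float64)
    r = real.reshape(len(real), -1).astype(np.float64)
    # Squared distances via ||s||^2 + ||r||^2 - 2 * s.r, all pairs at once.
    d2 = (s ** 2).sum(axis=1)[:, None] + (r ** 2).sum(axis=1)[None, :] - 2.0 * s @ r.T
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by rounding
    idx = d2.argmin(axis=1)
    return np.sqrt(d2[np.arange(len(s)), idx]), idx


# Toy data standing in for 64x64 images (values are arbitrary).
real = np.stack([np.full((64, 64), v, dtype=np.float32) for v in (0, 2, 5)])
synthetic = np.stack([real[1].copy(), np.full((64, 64), 6, dtype=np.float32)])

dist, idx = nearest_real_distance(synthetic, real)
print(dist, idx)
```

A synthetic image with a distance near zero is effectively a copy of a training image, which is the kind of leak a privacy analysis aims to detect.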

Team

Authors:

  • Hanxi Sun*, Purdue University, Department of Statistics
  • Jason Plawinski*, Novartis
  • Sajanth Subramaniam, Novartis
  • Amir Jamaludin, Oxford Big Data Institute
  • Timor Kadir, Oxford Big Data Institute
  • Aimee Readie, Novartis
  • Gregory Ligozio, Novartis
  • David Ohlssen, Novartis
  • Mark Baillie#, Novartis
  • Thibaud Coroller#@, Novartis

*co-first authors; #co-last authors; @ corresponding author
