Data Synthesis Methods

This directory contains wrappers for the synthesis methods that can be used in the pipeline. Each method's subdirectory contains a run script that the pipeline invokes; see "Adding another synthesis method" for the requirements this script must meet.

  • CTGAN (Conditional GAN for Tabular data)
  • SGF (Synthetic Data Generation Framework)
  • synthpop (multiple imputation library in R)

In addition, there are placeholders and worked examples for other methods:

  • Base: a synthesizer Python base class
  • mice (Multivariate Imputation by Chained Equations, an R package)
  • SDV (Synthetic Data Vault, parametric model synthesis library in Python, using the multivariate version of the Gaussian Copula)
  • simPop (micro-simulation using IPF, simulated annealing and model-based synthesis, designed for datasets with household structure; currently runs only on the embedded Austrian census data set)
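
To give a feel for what a synthesizer base class such as the Base entry above might provide, here is a minimal sketch. It is not the repository's actual Base class: the names `SynthesizerBase`, `fit`, `sample` and `IndependentColumnsSynthesizer` are hypothetical, chosen only for illustration, and the toy method simply resamples each column independently.

```python
import random
from abc import ABC, abstractmethod


class SynthesizerBase(ABC):
    """Hypothetical interface; NOT the repository's actual Base class."""

    @abstractmethod
    def fit(self, rows):
        """Learn whatever the method needs from the original rows."""

    @abstractmethod
    def sample(self, n):
        """Return n synthetic rows."""


class IndependentColumnsSynthesizer(SynthesizerBase):
    """Toy method (assumption for illustration): resample each column
    independently, ignoring any relationship between columns."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._columns = {}

    def fit(self, rows):
        if not rows:
            raise ValueError("need at least one row to fit")
        # Store the observed values of every column separately.
        self._columns = {key: [row[key] for row in rows] for key in rows[0]}
        return self

    def sample(self, n):
        if not self._columns:
            raise RuntimeError("call fit() before sample()")
        # Draw each field from its own column's empirical distribution.
        return [
            {key: self._rng.choice(values) for key, values in self._columns.items()}
            for _ in range(n)
        ]


original = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "south"},
    {"age": 27, "region": "north"},
]
synthetic = IndependentColumnsSynthesizer(seed=0).fit(original).sample(5)
```

With an interface of this shape, each wrapper would only need to implement `fit` and `sample`, and a run script could treat all methods uniformly. Whether the real base class follows this split is an assumption here.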