TODO: add some explanation on how to create a new dataset and reference it in the YAML file.
TODO: add a simple guide for this
In this section, we explain the meta-configurations relating to our benchmarking experiments reported in the paper:
We provide a sweep configuration here that runs the Sachs dataset with different seeds, set through the seed_everything option that Lightning provides. The sweep covers several hyper-parameter configurations and runs each configuration with five different seeds.
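As a rough illustration of how such a sweep expands, the sketch below pairs every hyper-parameter setting with every seed. The grid keys (`lr`, `hidden_size`) and their values are hypothetical placeholders, not the ones used in the actual sweep file; only the `seed_everything` name and the five seeds come from the description above.

```python
from itertools import product

# Hypothetical hyper-parameter grid; names and values are illustrative only.
grid = {
    "lr": [1e-3, 1e-2],
    "hidden_size": [64, 128],
}
seeds = [0, 1, 2, 3, 4]  # five seeds per configuration


def expand_sweep(grid, seeds):
    """Return one run description per (hyper-parameter setting, seed) pair."""
    names = list(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        for seed in seeds:
            run = dict(zip(names, values))
            run["seed_everything"] = seed
            runs.append(run)
    return runs


runs = expand_sweep(grid, seeds)
print(len(runs))  # 4 settings x 5 seeds = 20 runs
```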
Finally, the run writes the model results as JSON files to the experiments/saves/sachs (TODO: fix) directory. Each JSON file records, in full detail, the final ordering that the model converged to, which can later be used for pruning.
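To feed the saved orderings into a later pruning step, they first need to be read back from disk. The helper below is a minimal sketch of that step. The field name `ordering` is an assumption, since the layout of the saved files is not documented here; inspect one of the files to find the actual key.

```python
import json
from pathlib import Path


def load_orderings(save_dir, key="ordering"):
    """Collect the value stored under `key` from every JSON file in `save_dir`.

    `key` is a hypothetical field name; check a saved file for the real one.
    """
    orderings = {}
    for path in sorted(Path(save_dir).glob("*.json")):
        with path.open() as f:
            result = json.load(f)
        if key in result:
            orderings[path.name] = result[key]
    return orderings
```

For example, `load_orderings("experiments/saves/sachs")` would return a dictionary mapping each result file name to its stored ordering, skipping files that lack the key.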
Similar to Sachs, the sweep configuration for this run is available here. This simple sweep runs all of the Syntren datasets (with identifiers ranging from 0 to 1) and produces the same kind of JSON result files in experiments/saves/syntren (TODO: fix).
We provide several sweep configurations for synthetic datasets, each covering a specific set of conditions and scenarios. The results are conveniently summarized using the Weights and Biases UI.
The configuration for these experiments can be found here. It covers graphs with 3, 4, 5, and 6 covariates generated by different algorithms (tournaments, paths, and Erdős-Rényi graphs). The functional forms included are sinusoidal, polynomial, and linear, all accompanied by Gaussian noise. To compare the affine and additive settings, both options are included. Each configuration is run five times with different seeds.
We test each dataset using three algorithms: Gumbel top-k, Gumbel Sinkhorn, and Soft. In total, this sweep contains 1480 different configurations.
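The synthetic data described above can be pictured with a small sketch. It is not the repository's generator: the function names, the edge probability `p`, and the particular scale function are assumptions made for illustration. It builds a tournament, path, or Erdős-Rényi DAG over a fixed node order and samples each node as `shift(parents) + scale(parents) * noise` with a sinusoidal shift and Gaussian noise, where the additive option keeps the scale constant.

```python
import math
import random


def make_dag(n, kind, rng, p=0.5):
    """Return parents[j] for a DAG over nodes 0..n-1 with topological order 0, 1, ..., n-1."""
    parents = {j: [] for j in range(n)}
    for j in range(n):
        for i in range(j):
            if kind == "tournament":
                keep = True                 # every pair of nodes is connected
            elif kind == "path":
                keep = (i == j - 1)         # chain 0 -> 1 -> ... -> n-1
            elif kind == "erdos_renyi":
                keep = rng.random() < p     # each forward edge kept with probability p
            else:
                raise ValueError(f"unknown graph kind: {kind}")
            if keep:
                parents[j].append(i)
    return parents


def sample_row(parents, rng, affine=True):
    """Draw one sample with x_j = shift(pa_j) + scale(pa_j) * eps_j, eps_j ~ N(0, 1)."""
    x = [0.0] * len(parents)
    for j in range(len(parents)):           # nodes are already in topological order
        pa_sum = sum(x[i] for i in parents[j])
        shift = math.sin(pa_sum)            # sinusoidal shift function
        # Assumed scale function; the additive option uses a constant scale of 1.
        scale = math.exp(0.5 * math.sin(pa_sum)) if affine else 1.0
        x[j] = shift + scale * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
parents = make_dag(4, "path", rng)
data = [sample_row(parents, rng, affine=True) for _ in range(100)]
```

Fixing the node order up front keeps every generated graph acyclic by construction, so only the choice of which forward edges to keep differs between the three graph types.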
You can find the sweep configuration for these datasets here. Similar to the parametric configuration, it covers graphs with 3, 4, 5, and 6 covariates. However, these datasets are generated using Gaussian processes to sample the scale and shift functions. Both the affine and additive options are included for comparison, and each configuration is again run with 5 seeds, for a total of 240 configurations.
The configuration for the linear Laplace runs can be found here. This experiment demonstrates that our model can handle broader classes of location-scale noise models (LSNMs), providing insights into possible updates of our theoretical conditions. For these configurations, we use small graphs with different generation schemes, but we employ linear functions for the scale and shift and choose standard Laplace noise. Across the different seeds, this sweep totals 480 runs.
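For intuition, the linear Laplace setting can be sketched for a single parent-child pair as follows. This is an illustrative assumption rather than the repository's code: the weights are made up, and the linear scale is wrapped as `1 + |.|` purely to keep it positive, since the exact form used in the experiments is not stated here. Standard Laplace noise is drawn as the difference of two independent Exp(1) variables.

```python
import random


def laplace_noise(rng):
    """Standard Laplace draw: the difference of two independent Exp(1) variables."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def sample_child(parent_values, shift_weights, scale_weights, rng):
    """x = shift(pa) + scale(pa) * eps with linear shift and scale, eps ~ Laplace(0, 1)."""
    shift = sum(w * v for w, v in zip(shift_weights, parent_values))
    linear_scale = sum(w * v for w, v in zip(scale_weights, parent_values))
    scale = 1.0 + abs(linear_scale)  # assumption: keeps the scale strictly positive
    return shift + scale * laplace_noise(rng)


rng = random.Random(0)
x0 = laplace_noise(rng)                      # root node: pure noise
x1 = sample_child([x0], [0.8], [0.5], rng)   # child of x0 with hypothetical weights
```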