# Configuration file options

In the constraints and defaults below, `resolution` denotes the image side length specified by `nii_target_shape`.

| Key | Type | Constraints | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| `total_epochs` | int | > 0 | True | 1 | Total number of epochs to train the model for |
| `batch_size` | int | > 0 | True | 1 | Number of training points in each minibatch |
| `nii_target_shape` | list[int] | > 0, power of 2, length == 3 | True | [128, 128, 128] | Resolution of the training images along each dimension |
| `latents_per_channel` | list[int] | > 0, length == log2(resolution) + 1 | True | [log2(resolution) + 1, log2(resolution), log2(resolution) - 1, …, 1] | Misnomer: should be `latent_feature_maps_per_resolution`. Number of latent feature maps at each resolution |
| `channels_per_latent` | list[int] | > 0, length == log2(resolution) + 1 | True | [20, …, 20] | Number of channels per latent feature map |
| `channels` | list[int] | > 0, length == log2(resolution) + 1 | True | [20, 40, 60, …, 20 × (log2(resolution) + 1)] | Number of output channels in the encoder's ResNet blocks |
| `kernel_sizes_bottom_up` | list[int] | > 0, length == log2(resolution) + 1 | True | [3, …, 3, 2, 1] | At each resolution (in decreasing order), the side lengths of the encoder's kernels |
| `kernel_sizes_top_down` | list[int] | > 0, length == log2(resolution) + 1 | True | [3, …, 3, 2, 1] | At each resolution (in decreasing order), the side lengths of the decoder's kernels |
| `channels_hidden` | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to `channels` | Number of intermediate channels in the encoder's ResNet blocks |
| `channels_top_down` | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to `channels` | Number of output channels in the decoder's ResNet blocks |
| `channels_hidden_top_down` | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to `channels` | Number of intermediate channels in the decoder's ResNet blocks |
| `warmup_iterations` | int | > 0 | False | 50 | Number of iterations to wait before skipping excessively large gradient updates |
| `plot_recons_period` | int | > 0 | False | 1 | Frequency (in epochs) with which to plot reconstructions |
| `subjects_to_plot` | int | > 0 | False | 4 | Number of subjects to include when plotting reconstructions |
| `validation_period` | int | > 0 | False | 1 | Frequency (in epochs) with which to evaluate the model on the validation set |
| `save_period` | int | > 0 | False | 1 | Frequency (in epochs) with which to save checkpoints |
| `l2_reg_coeff` | float | > 0 | False | 1e-4 | Coefficient scaling the L2 regularisation term in the objective |
| `learning_rate` | float | > 0 | False | 1e-3 | Scalar controlling the magnitude of stochastic gradient steps |
| `train_frac` | float | in [0, 1] | False | 0.95 | Fraction of the data to use for training; the remainder is used for validation |
| `gradient_clipping_value` | float | > 0 | False | 1e2 | Upper limit on the gradient norm, used when clamping gradients before applying gradient updates |
| `gradient_skipping_value` | float | > 0 | False | 1e12 | If the gradient norm exceeds this value, skip that iteration's gradient update |
| `sequence_type` | str | {"flair", "dwi"} | False | "flair" | Deprecated; the first step towards removing it is to always set it to "flair". (The architecture is not varied based on the imaging modality.) |
| `likelihood` | str | {"Gaussian", ?} | False | "Gaussian" | Choice of likelihood function. It is unclear whether anything other than "Gaussian" is fully implemented |
| `variance_hidden_clamp_bounds` | list[float] | positive | False | [0.001, 1] | Misnomer: should be `std_clamp_bounds_hidden`. Lower and upper bounds on the standard deviation of the prior and posterior Gaussian distributions of the latent variables |
| `variance_output_clamp_bounds` | list[float] | positive | False | [0.01, 1] | Misnomer: should be `std_clamp_bounds_output`. Lower and upper bounds on the standard deviation of the Gaussian distribution of the input given the latent |
| `latents_per_channel_weight_sharing` | str \| list[bool] | {"none", "all"}, or length == log2(resolution) + 1 | False | "none" | If not "none", at each resolution (in decreasing order) True means use a shared set of weights at that resolution to predict the latents at that resolution |
| `latents_to_use` | str \| list[bool] | {"none", "all"}, or length == log2(resolution) + 1 | False | "all" | If not "none", at each resolution (in decreasing order) False means replace the latents at that resolution with the output of deterministic ResNet blocks |
| `latents_to_optimise` | str \| list[bool] | {"none", "all"}, or length == log2(resolution) + 1 | False | "all" | If not "none", at each resolution (in decreasing order) False means withhold from the optimiser the parameters of the layers that predict the latents at that resolution |
| `half_precision` | bool | | False | False | Whether to train the model using 16-bit floating-point precision |
| `print_model` | bool | | False | False | Whether to display a text representation of the model at the start of training |
| `use_tanh_output` | bool | | False | True | Use tanh as the activation function when predicting the location of the Gaussian distribution of the input given its latent; the alternative is `torch.sigmoid()` |
| `new_model` | bool | | False | True | Should always be True. It originally supported backwards compatibility with an older architecture |
| `use_abs_not_square` | bool | | False | False | Use the absolute difference rather than the sum of squares when computing the log-likelihood |
| `plot_gradient_norms` | bool | | False | True | Plot the norms of the gradients after each epoch |
| `apply_mask_in_input_space` | bool | | False | False | Deprecated. This used to turn cost-function masking on and off, but it no longer does anything |
| `include_mask_in_loader` | bool | | False | False | Deprecated (refers to the same masking as above) |
| `resume_from_checkpoint` | bool | | False | False | Resume training from a checkpoint |
| `restore_optimiser` | bool | | False | True | When resuming training, restore the state of the optimiser (set to False to reset the optimiser's parameters and start training from epoch 1) |
| `keep_every_checkpoint` | bool | | False | True | Save, and keep, a checkpoint every epoch rather than keeping only the latest one |
| `predict_x_var` | bool | | False | True | Model the scale, not just the location, of the Gaussian distribution of the input given its latent |
| `use_precision_reweighting` | bool | | False | False | Re-weight the locations and scales of the prior and posterior distributions of the latents as in the Ladder VAE paper |
| `verbose` | bool | | False | True | Print more output |
| `bottleneck_resnet_encoder` | bool | | False | True | In the encoder, use a three-layer ResNet block whose middle layer has fewer channels than the output layer (the bottleneck); otherwise, use a two-layer ResNet block whose layers have equal numbers of output channels |
| `normalise_weight_by_depth` | bool | | False | True | Normalise each convolution block's randomly initialised kernel parameters by the (square root of the) depth of that block |
| `zero_biases` | bool | | False | True | Set each convolution block's bias to zero after initialising it |
| `use_rezero` | bool | | False | False | Use skip connections in which the non-skip part of the layer is multiplied by a scalar initialised to zero, as in the ReZero paper |
| `veto_batch_norm` | bool | | False | True | Do not use batch normalisation anywhere |
| `veto_transformations` | bool | | False | False | Do not apply augmentations to the training data |
| `use_nii_data` | bool | | False | True | Deprecated. Should always be True: the only data consumed comes in the form of NIfTI files (.nii) |
| `nifti_standardise` | bool | | False | True | Appears to be deprecated (this would have controlled z-scoring of the input) |
| `shuffle_niftis` | bool | | False | False | Randomise the order of the list of NIfTIs before splitting it into training and test sets |
| `save_recons_to_mat` | bool | | False | False | Deprecated |
| `use_DDP` | bool | | False | True | Deprecated |
| `convolutional_downsampling` | bool | | False | False | Downsample using stride-two convolutions rather than ×2 nearest-neighbour downsampling |
| `predict_x_var_with_sigmoid` | bool | | False | True | Predict the scale of the Gaussian distribution of the input given its latent using a (scaled) sigmoid, rather than predicting the natural logarithm of the scale and then exponentiating |
| `base_recons_on_train_loader` | bool | | False | False | When plotting reconstructions, reconstruct the training data rather than the validation data |
| `only_use_one_conv_block_at_top` | bool | | False | False | Use a truncated sequence of layers to predict, from the latents, the location and scale of the Gaussian distribution of the input given its latent |
| `separate_hidden_loc_scale_convs` | bool | | False | False | Rather than using a single convolutional block with a two-channel output to predict the location and scale of the prior and posterior Gaussian distributions of the latents, use separate blocks for the location and the scale |
| `separate_output_loc_scale_convs` | bool | | False | False | As above, but for the distribution of the input given its latent |
| `discard_abnormally_small_niftis` | bool | | False | True | Deprecated |
| `apply_augmentations_to_validation_set` | bool | | False | False | Apply to the validation set the same augmentations applied to the training set |
| `visualise_training_pipeline_before_starting` | bool | | False | True | Plot examples of the augmented training points before training begins |
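For concreteness, here is a minimal configuration sketch for 128³ inputs, written as a Python dict (illustrative only: the repository's actual config file format, and the non-default values shown, are assumptions rather than recommendations). With a side length of 128, log2(128) + 1 = 8, so every per-resolution list has eight entries, ordered from the highest resolution to the lowest.

```python
# Illustrative configuration sketch; key names are from the table above,
# values are hypothetical. Side length 128 gives log2(128) + 1 = 8 resolutions.
config = {
    # Required keys
    "total_epochs": 100,
    "batch_size": 2,
    "nii_target_shape": [128, 128, 128],
    "latents_per_channel": [8, 7, 6, 5, 4, 3, 2, 1],  # default pattern [log2(res) + 1, ..., 1]
    "channels_per_latent": [20] * 8,
    "channels": [20, 40, 60, 80, 100, 120, 140, 160],
    "kernel_sizes_bottom_up": [3, 3, 3, 3, 3, 3, 2, 1],
    "kernel_sizes_top_down": [3, 3, 3, 3, 3, 3, 2, 1],
    "channels_hidden": [20, 40, 60, 80, 100, 120, 140, 160],           # defaults to 'channels'
    "channels_top_down": [20, 40, 60, 80, 100, 120, 140, 160],         # defaults to 'channels'
    "channels_hidden_top_down": [20, 40, 60, 80, 100, 120, 140, 160],  # defaults to 'channels'
    # A few of the optional keys
    "learning_rate": 1e-3,
    "train_frac": 0.95,
    "latents_per_channel_weight_sharing": "none",  # or a list[bool], one entry per resolution
    "latents_to_use": "all",                       # e.g. [True] * 7 + [False]
    "half_precision": False,
}
```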

## Options for specifying model architecture

### Bottom-up graph

- `channels`
- `channels_hidden`
- `kernel_sizes_bottom_up`

### Top-down graph

- `channels_top_down`
- `channels_hidden_top_down`
- `channels_per_latent`
- `latents_per_channel`
- `kernel_sizes_top_down`
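All of these per-resolution lists must have length log2(resolution) + 1, so they are easy to generate programmatically. The helper below is a hypothetical sketch (not part of the codebase) that builds mutually consistent lists following the default patterns listed in the table:

```python
import math

def default_architecture_lists(side_length: int, base_channels: int = 20) -> dict:
    """Hypothetical helper: build default per-resolution lists for a cubic
    input of the given side length, following the table's default patterns."""
    assert side_length >= 4 and side_length & (side_length - 1) == 0, \
        "side length must be a power of two"
    n = int(math.log2(side_length)) + 1  # number of resolutions
    channels = [base_channels * (i + 1) for i in range(n)]  # [20, 40, ..., 20 * n]
    return {
        "latents_per_channel": list(range(n, 0, -1)),      # [n, n - 1, ..., 1]
        "channels_per_latent": [base_channels] * n,        # [20, ..., 20]
        "channels": channels,
        "channels_hidden": channels,                       # set equal to 'channels'
        "channels_top_down": channels,                     # set equal to 'channels'
        "channels_hidden_top_down": channels,              # set equal to 'channels'
        "kernel_sizes_bottom_up": [3] * (n - 2) + [2, 1],  # [3, ..., 3, 2, 1]
        "kernel_sizes_top_down": [3] * (n - 2) + [2, 1],
    }

# For 128^3 inputs (n = 8): latents_per_channel == [8, 7, 6, 5, 4, 3, 2, 1], etc.
architecture_defaults = default_architecture_lists(128)
```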