# Configuration file options
| Key | Type | Constraints | Required | Default | Description |
|---|---|---|---|---|---|
| total_epochs | int | > 0 | True | 1 | Total number of epochs to train the model for |
| batch_size | int | > 0 | True | 1 | Number of training points in each minibatch |
| nii_target_shape | list[int] | > 0, power of 2, length == 3 | True | [128, 128, 128] | Resolution of the training images along each dimension |
| latents_per_channel | list[int] | > 0, length == log2(resolution) + 1 | True | [log2(resolution) + 1, log2(resolution), log2(resolution) - 1, …, 1] | MISNOMER: should be 'latent_feature_maps_per_resolution'. Number of latent feature maps at each resolution |
| channels_per_latent | list[int] | > 0, length == log2(resolution) + 1 | True | [20, …, 20] | Number of channels per latent feature map |
| channels | list[int] | > 0, length == log2(resolution) + 1 | True | [20, 40, 60, …, 20 × (log2(resolution) + 1)] | Number of output channels in the encoder's Resnet blocks |
| kernel_sizes_bottom_up | list[int] | > 0, length == log2(resolution) + 1 | True | [3, …, 3, 2, 1] | At each resolution (in decreasing order), the side lengths of the encoder's kernels |
| kernel_sizes_top_down | list[int] | > 0, length == log2(resolution) + 1 | True | [3, …, 3, 2, 1] | At each resolution (in decreasing order), the side lengths of the decoder's kernels |
| channels_hidden | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to 'channels' | Number of intermediate channels in the encoder's Resnet blocks |
| channels_top_down | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to 'channels' | Number of output channels in the decoder's Resnet blocks |
| channels_hidden_top_down | list[int] | > 0, length == log2(resolution) + 1 | True | Set equal to 'channels' | Number of intermediate channels in the decoder's Resnet blocks |
| warmup_iterations | int | > 0 | False | 50 | Number of iterations to wait before skipping excessively large gradient updates |
| plot_recons_period | int | > 0 | False | 1 | Frequency (in epochs) with which to plot reconstructions |
| subjects_to_plot | int | > 0 | False | 4 | Number of subjects to include when plotting reconstructions |
| validation_period | int | > 0 | False | 1 | Frequency (in epochs) with which to evaluate the model on the validation set |
| save_period | int | > 0 | False | 1 | Frequency (in epochs) with which to save checkpoints |
| l2_reg_coeff | float | > 0 | False | 1e-4 | Coefficient scaling the L2 regularisation term in the objective |
| learning_rate | float | > 0 | False | 1e-3 | Scalar controlling the magnitude of the stochastic gradient steps |
| train_frac | float | in [0, 1] | False | 0.95 | Fraction of the data to use for training; the remainder is used for validation |
| gradient_clipping_value | float | > 0 | False | 1e2 | Upper limit for the gradient norm, used when clamping gradients before applying gradient updates |
| gradient_skipping_value | float | > 0 | False | 1e12 | If the gradient norm exceeds this value, skip that iteration's gradient update |
| sequence_type | str | {"flair", "dwi"} | False | "flair" | DEPRECATED: always set this to "flair". We do not vary the architecture based on the imaging modality, so fixing this at "flair" is the first step towards removing it |
| likelihood | str | {"Gaussian", ?} | False | "Gaussian" | Choice of likelihood function. It is not clear that anything other than "Gaussian" is (fully) implemented |
| variance_hidden_clamp_bounds | list[float] | > 0 | False | [0.001, 1] | MISNOMER: should be 'std_clamp_bounds_hidden'. Lower and upper bounds on the std of the prior and posterior Gaussian distributions of the latent variables |
| variance_output_clamp_bounds | list[float] | > 0 | False | [0.01, 1] | MISNOMER: should be 'std_clamp_bounds_output'. Lower and upper bounds on the std of the Gaussian distribution of the input given the latent |
| latents_per_channel_weight_sharing | str or list[bool] | {"none", "all"} or length == log2(resolution) + 1 | False | "none" | If a list, at each resolution (in decreasing order) 'True' means use a shared set of weights to predict the latents at that resolution |
| latents_to_use | str or list[bool] | {"none", "all"} or length == log2(resolution) + 1 | False | "all" | If a list, at each resolution (in decreasing order) 'False' means replace the latents at that resolution with the output of deterministic Resnet blocks |
| latents_to_optimise | str or list[bool] | {"none", "all"} or length == log2(resolution) + 1 | False | "all" | If a list, at each resolution (in decreasing order) 'False' means withhold from the optimiser the parameters of the layers that predict the latents at that resolution |
| half_precision | bool | | False | False | Whether to train the model using 16-bit floating-point precision |
| print_model | bool | | False | False | Whether to display a text representation of the model at the start of training |
| use_tanh_output | bool | | False | True | Use tanh as the activation function when predicting the location of the Gaussian distribution of the input given its latent; the alternative is torch.sigmoid() |
| new_model | bool | | False | True | SHOULD ALWAYS BE TRUE: originally included to support backwards compatibility with an older architecture |
| use_abs_not_square | bool | | False | False | Use the absolute difference rather than the sum of squares when computing the log likelihood |
| plot_gradient_norms | bool | | False | True | Plot the norms of the gradients after each epoch |
| apply_mask_in_input_space | bool | | False | False | DEPRECATED: this used to toggle cost-function masking on and off, but it no longer does anything |
| include_mask_in_loader | bool | | False | False | DEPRECATED: relates to the same mask as 'apply_mask_in_input_space' |
| resume_from_checkpoint | bool | | False | False | Resume training from a checkpoint |
| restore_optimiser | bool | | False | True | When resuming training, restore the state of the optimiser (set to False to reset the optimiser's parameters and start training from epoch 1) |
| keep_every_checkpoint | bool | | False | True | Save, and keep, a checkpoint every epoch rather than keeping only the latest one |
| predict_x_var | bool | | False | True | Model the scale, not just the location, of the Gaussian distribution of the input given its latent |
| use_precision_reweighting | bool | | False | False | Re-weight the locations and scales of the prior and posterior distributions of the latents, as per the Ladder VAE article |
| verbose | bool | | False | True | Print more output |
| bottleneck_resnet_encoder | bool | | False | True | In the encoder, use a three-layer Resnet block whose middle layer has fewer channels than the output layer (the bottleneck); the alternative is a two-layer Resnet block whose layers have equal numbers of output channels |
| normalise_weight_by_depth | bool | | False | True | Normalise each convolution block's randomly initialised kernel parameters by the square root of the depth of that block |
| zero_biases | bool | | False | True | Set each convolution block's bias to zero after initialising it |
| use_rezero | bool | | False | False | Use skip connections in which the 'non-skip' part of the layer is multiplied by a scalar initialised to zero, as per the ReZero article |
| veto_batch_norm | bool | | False | True | Do not use batch normalisation anywhere |
| veto_transformations | bool | | False | False | Do not apply augmentations to the training data |
| use_nii_data | bool | | False | True | DEPRECATED: should always be True, since the only data we consume comes in the form of niftis (.nii) |
| nifti_standardise | bool | | False | True | APPEARS TO BE DEPRECATED: this would have controlled z-scoring of the input |
| shuffle_niftis | bool | | False | False | Randomise the order of the list of niftis before splitting it into train and test sets |
| save_recons_to_mat | bool | | False | False | DEPRECATED |
| use_DDP | bool | | False | True | DEPRECATED |
| convolutional_downsampling | bool | | False | False | Down-sample using stride-two convolutions rather than ×2 nearest-neighbour downsampling |
| predict_x_var_with_sigmoid | bool | | False | True | Predict the scale of the Gaussian distribution of the input given its latent using a (scaled) sigmoid, rather than predicting the natural logarithm of the scale and then exponentiating |
| base_recons_on_train_loader | bool | | False | False | When plotting reconstructions, reconstruct the training data rather than the validation data |
| only_use_one_conv_block_at_top | bool | | False | False | Use a truncated sequence of layers to predict, from the latents, the location and scale of the Gaussian distribution of the input given its latent |
| separate_hidden_loc_scale_convs | bool | | False | False | Rather than a single convolutional block with a two-channel output, use separate blocks to predict the location and scale of the prior and posterior Gaussian distributions of the latents |
| separate_output_loc_scale_convs | bool | | False | False | As above, but for the distribution of the input given its latent |
| discard_abnormally_small_niftis | bool | | False | True | DEPRECATED |
| apply_augmentations_to_validation_set | bool | | False | False | Apply to the validation set the same augmentations applied to the training set |
| visualise_training_pipeline_before_starting | bool | | False | True | Plot examples of the augmented training points before training begins |
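
For concreteness, here is a minimal configuration sketch for 128³ inputs, written as a Python dict (the on-disk format of the configuration file is not specified on this page, so treat the syntax as illustrative; the total_epochs and batch_size values are arbitrary). With a resolution of 128 there are log2(128) + 1 = 8 resolution levels, so each per-resolution list needs 8 entries, ordered from the highest resolution (128) down to 1:

```python
import math

resolution = 128  # matches nii_target_shape == [128, 128, 128]
levels = int(math.log2(resolution)) + 1  # 8 levels: 128, 64, ..., 1

config = {
    # Required keys (list-valued defaults reproduced from the table above)
    "total_epochs": 100,   # illustrative; the documented default is 1
    "batch_size": 2,       # illustrative; the documented default is 1
    "nii_target_shape": [resolution] * 3,
    "latents_per_channel": list(range(levels, 0, -1)),      # [8, 7, ..., 1]
    "channels_per_latent": [20] * levels,                   # [20, ..., 20]
    "channels": [20 * (i + 1) for i in range(levels)],      # [20, 40, ..., 160]
    "kernel_sizes_bottom_up": [3] * (levels - 2) + [2, 1],  # [3, ..., 3, 2, 1]
    "kernel_sizes_top_down": [3] * (levels - 2) + [2, 1],
    # channels_hidden, channels_top_down and channels_hidden_top_down
    # default to 'channels' when omitted.
    # A few optional keys, shown with their documented defaults
    "learning_rate": 1e-3,
    "gradient_clipping_value": 1e2,
    "gradient_skipping_value": 1e12,
}

# Every per-resolution list must have log2(resolution) + 1 entries
assert all(len(config[k]) == levels for k in (
    "latents_per_channel", "channels_per_latent", "channels",
    "kernel_sizes_bottom_up", "kernel_sizes_top_down",
))
```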
The per-resolution keys above each belong to one of the model's two graphs.

## Bottom-up graph

Keys configuring the bottom-up (encoder) graph:

- channels
- channels_hidden
- kernel_sizes_bottom_up
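
As a rough illustration of how 'channels' and 'channels_hidden' might parameterise one encoder Resnet block at a given resolution when bottleneck_resnet_encoder is True: the exact block design, kernel sizes, and activation function are not documented on this page, so everything below beyond the three-layer bottleneck shape is an assumption.

```python
import torch
import torch.nn as nn

class BottleneckResnetBlock(nn.Module):
    """Hypothetical three-layer encoder block: the middle (hidden) layers use
    channels_hidden channels, i.e. the bottleneck when channels_hidden < channels."""

    def __init__(self, channels: int, channels_hidden: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels_hidden, kernel_size=3, padding=1),
            nn.GELU(),  # activation choice is assumed, not documented
            nn.Conv3d(channels_hidden, channels_hidden, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv3d(channels_hidden, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)  # residual (skip) connection

# e.g. one entry each from 'channels' and 'channels_hidden' at some resolution:
block = BottleneckResnetBlock(channels=40, channels_hidden=20)
```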
## Top-down graph

Keys configuring the top-down (decoder) graph:

- channels_top_down
- channels_hidden_top_down
- channels_per_latent
- latents_per_channel
- kernel_sizes_top_down
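
Finally, the interplay between gradient_clipping_value, gradient_skipping_value and warmup_iterations described in the table might look roughly like this inside the training loop. This is a hedged sketch, not the repository's actual implementation; the names model, optimiser and iteration are illustrative.

```python
import torch

def apply_gradients(model, optimiser, iteration, config):
    """Sketch of the clip-then-maybe-skip rule described in the table.

    Gradients are clamped so their norm is at most gradient_clipping_value;
    once the warm-up period has passed, any update whose (pre-clip) norm
    exceeds gradient_skipping_value is skipped entirely.
    """
    # clip_grad_norm_ clips in place and returns the norm before clipping
    grad_norm = torch.nn.utils.clip_grad_norm_(
        model.parameters(), max_norm=config["gradient_clipping_value"]
    )
    skip = (
        iteration > config.get("warmup_iterations", 50)
        and grad_norm > config.get("gradient_skipping_value", 1e12)
    )
    if not skip:
        optimiser.step()
    optimiser.zero_grad()
    return grad_norm, skip
```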