This framework tests different aspects of hyperparameter tuning via tensor completion (HTVTC):
- Experiments for an initial iteration of HTVTC on different machine learning models for univariate regression and binary classification may be found in the `experiments` folder. These experiments use separate training and validation sets, rather than cross-validation, to evaluate hyperparameters.
- A second iteration of HTVTC over the same problems, which incorporates multi-fidelity evaluation (as used in Hyperband and BOHB), has its experiments in the folder `multi-fidelity-HTVTC`. These experiments use cross-validation to evaluate hyperparameters.
- Experiments with the final version of the technique are in the folder `final-HTVTC`. This final version automatically "narrows down" the hyperparameter search space over multiple tensor completion cycles (a minimal conceptual sketch of this narrowing loop appears after this list), and is competitive with traditional state-of-the-art hyperparameter optimisation techniques both in speed and in the quality of the suggested hyperparameter combinations. These experiments use cross-validation to evaluate hyperparameters. This is the implementation of the technique presented in the ICASSP paper "TENSOR COMPLETION FOR EFFICIENT AND ACCURATE HYPERPARAMETER OPTIMISATION IN LARGE-SCALE STATISTICAL LEARNING".
- Experiments on traditional hyperparameter optimisation techniques (see the list of these here) applied to the same problems can be found in the `traditional-methods` folder, with each technique implemented in a different subfolder. These experiments use cross-validation to evaluate hyperparameters.
- For certain machine learning problems that are too computationally expensive to optimise on everyday devices like laptops, Jupyter notebooks have been created comparing the final version of HTVTC with traditional hyperparameter optimisation techniques. These notebooks are in the folder `notebook-experiments` and are meant to be run on Google Colab. The DNN and CNN experiments use separate training and validation sets, while the random forest covertype experiment uses cross-validation.
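For orientation, below is a minimal conceptual sketch of the narrowing idea behind the final technique, under the assumption that hyperparameter scores are arranged as a partially observed tensor that is completed once per cycle. The helper names `evaluate_loss` and `complete_tensor` are hypothetical stand-ins, not the framework's API, and the completion routine here is a trivial placeholder rather than the tensor completion algorithms used in the code.

```python
import itertools
import numpy as np

def evaluate_loss(combo):
    # Hypothetical stand-in for training a model with the given hyperparameter
    # combination and returning its validation loss.
    x, y = combo
    return (x - 0.3) ** 2 + (y - 0.7) ** 2

def complete_tensor(observed, mask):
    # Hypothetical placeholder for tensor completion: fill unobserved entries
    # with the mean of the observed ones. The framework uses proper tensor
    # completion algorithms instead.
    completed = observed.copy()
    completed[~mask] = observed[mask].mean() if mask.any() else 0.0
    return completed

def narrowing_search(ranges, grid_size=5, sample_fraction=0.4, cycles=3, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(cycles):
        # Discretise the current search space into a hyperparameter grid.
        axes = [np.linspace(lo, hi, grid_size) for lo, hi in ranges]
        shape = (grid_size,) * len(ranges)
        observed = np.zeros(shape)
        mask = rng.random(shape) < sample_fraction  # sparse set of evaluated entries
        for idx in itertools.product(range(grid_size), repeat=len(ranges)):
            if mask[idx]:
                observed[idx] = evaluate_loss(tuple(ax[i] for ax, i in zip(axes, idx)))
        # Complete the sparsely observed score tensor and locate its minimum.
        best = np.unravel_index(np.argmin(complete_tensor(observed, mask)), shape)
        # Narrow each hyperparameter range around the predicted best value.
        ranges = [(max(lo, ax[i] - (hi - lo) / grid_size),
                   min(hi, ax[i] + (hi - lo) / grid_size))
                  for (lo, hi), ax, i in zip(ranges, axes, best)]
    return [(lo + hi) / 2 for lo, hi in ranges]

print(narrowing_search([(0.0, 1.0), (0.0, 1.0)]))
```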
- Programs That Can Be Run
- Performance Metrics
- Validation Loss Metrics
- Traditional Hyperparameter Optimisation Techniques
- Structure of the Framework
- Non-Standard Python Library Dependencies
- Version Notes
- Compatibility Issue
The following programs contain tests that can be run by the end user to evaluate HTVTC and traditional hyperparameter optimisation techniques. Note that no other file depends on these programs, as they are intended purely for experimentation; they may therefore be modified at will to perform different kinds of experiments.
- `*_test.py`: Runs unit tests for the module `*.py` (a sketch for running these test modules appears after this list).
- `tensorcompletion_instrumentation.py`: Runs performance measurement tests on large tensors for the tensor completion algorithms.
- `folder/algo_workspace.py`: When `folder` is `experiments` or `multi-fidelity-HTVTC`, these files run correctness tests for HTVTC on saved hyperparameter score tensors generated using the machine learning algorithm `algo`. When `folder` is `final-HTVTC`, these files run performance tests for the final HTVTC technique in optimising the hyperparameters of machine learning algorithm `algo`. These tests measure the validation loss of the suggested hyperparameter combination and one of execution time (in nanoseconds), CPU utilisation time (in nanoseconds) or maximum memory allocated to the program during runtime (in bytes).
- `experiments/algo_instrumentation.py`: Found in the folders `experiments` and `multi-fidelity-HTVTC`. Runs performance tests for HTVTC on the machine learning algorithm `algo`, measuring the validation loss of the suggested hyperparameter combination and one of execution time (in nanoseconds), CPU utilisation time (in nanoseconds) or maximum memory allocation during runtime (in bytes).
- `traditional-methods/method/algo_workspace.py`: Runs performance tests for the traditional hyperparameter optimisation method `method` on machine learning algorithm `algo`, measuring the validation loss of the suggested hyperparameter combination and one of execution time (in nanoseconds), CPU utilisation time (in nanoseconds) or maximum memory allocation during runtime (in bytes).
- `*.ipynb`: These notebooks are meant to be run on Google Colab. Each one covers a different machine learning problem, comparing the final version of HTVTC with traditional hyperparameter optimisation techniques. The notebook `DNN_HTVTC.ipynb` evaluates hyperparameter optimisation of a 3-layer deep neural network performing multi-class classification on the MNIST data set. The notebook `CNN_HTVTC.ipynb` evaluates hyperparameter optimisation of a CNN with three convolutional layers and one deep layer, performing the same multi-class classification task. Finally, `Random_Forest_Covtype.ipynb` evaluates a random forest operating on the (adjusted) forest cover type data set.
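For example, the unit-test modules can be run in one go from the repository root with a short driver like the one below. This driver is illustrative and not part of the repository, and it assumes the `*_test.py` files are written with the standard `unittest` framework.

```python
import unittest

# Discover every module matching *_test.py from the current directory down
# and run the collected tests with a verbose text runner.
suite = unittest.defaultTestLoader.discover(start_dir=".", pattern="*_test.py")
unittest.TextTestRunner(verbosity=2).run(suite)
```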
The following metrics are used in different testing and evaluation modules throughout the framework to evaluate the quality of hyperparameter optimisation:
- Validation Loss: This measures the prediction loss, on the validation data set, of the machine learning model generated from the specified machine learning algorithm with specified hyperparameters. It is measured using one of the validation loss metrics. In the folder `experiments`, validation loss is calculated using separate training and validation data sets. In the folders `traditional-methods`, `multi-fidelity-HTVTC` and `final-HTVTC`, validation loss is calculated using cross-validation with 5 folds, i.e. a single combined training and validation data set split into five equal parts.
- Norm of Difference: The norm (square root of the sum of squares of elements) of the difference between the predicted tensor (from tensor completion) and the true tensor. In some cases this may be normalised by dividing by the norm of the true tensor. This is a measure of tensor completion accuracy: the lower this metric, the more accurate the completion.
- Execution Time: Execution time of a critical program segment, e.g. tensor completion or hyperparameter optimisation. It is measured using the `perf_counter_ns()` function (link to docs) from the Python standard library `time`.
- CPU Utilisation Time: The total time spent by the critical program segment executing in user and kernel mode on the CPU core(s) of the computer. If the segment execution is parallelised across cores, this metric may be higher than the execution time of the same segment. It is measured using the `process_time_ns()` function (link to docs) from the Python standard library `time`.
- Maximum Memory Usage: The maximum amount of RAM allocated to Python objects during execution of the critical program segment. It is measured in bytes using the `tracemalloc` standard Python library (link to docs). A short sketch combining these three timing and memory measurements appears after this list.
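As a rough illustration, the three timing and memory measurements above can be taken around a critical segment roughly as follows. This is a sketch only; the framework's own instrumentation code may be structured differently.

```python
import time
import tracemalloc

def critical_segment():
    # Placeholder for e.g. a tensor completion or hyperparameter optimisation run.
    return sum(i * i for i in range(100_000))

tracemalloc.start()
wall_start = time.perf_counter_ns()  # execution time, in nanoseconds
cpu_start = time.process_time_ns()   # CPU utilisation time (user + kernel), in nanoseconds

critical_segment()

execution_time_ns = time.perf_counter_ns() - wall_start
cpu_time_ns = time.process_time_ns() - cpu_start
_, peak_bytes = tracemalloc.get_traced_memory()  # maximum memory allocated, in bytes
tracemalloc.stop()

print(execution_time_ns, cpu_time_ns, peak_bytes)
```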
The validation loss metrics are defined in `regressionmetrics.py` (univariate regression metrics) and `classificationmetrics.py` (binary classification metrics). Refer to these files for the definitions (a few illustrative versions are also sketched below). These metrics are ultimately used to calculate validation loss so as to compare different trained machine learning models.
- Mean Absolute Error (MAE)
- Mean Absolute Percentage Error (MAPE)
- Mean Squared Error (MSE)
- Mean Squared Logarithmic Error (MSLE)
- logcosh loss
- Huber loss with default `delta = 1.35`
- Poisson loss
- Indicator loss
- Binary cross-entropy (BCE)
- Kullback-Leibler divergence (KLD)
- Jensen-Shannon divergence (JSD)
Note that in `DNN_HTVTC.ipynb` and `CNN_HTVTC.ipynb`, the metric used is the implementation of categorical cross-entropy built into Keras (documentation here). This choice of metric is due to the neural networks performing multi-class classification.
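For illustration, a few of these metrics can be written as short NumPy functions as below. These are textbook definitions and may differ in detail from the authoritative implementations in `regressionmetrics.py` and `classificationmetrics.py`.

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    # MAPE: mean of the absolute error relative to the absolute true value.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true))

def huber_loss(y_true, y_pred, delta=1.35):
    # Quadratic for errors within delta, linear beyond it.
    err = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    return np.mean(np.where(err <= delta, 0.5 * err ** 2, delta * (err - 0.5 * delta)))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    # BCE between binary labels and predicted probabilities (clipped for stability).
    y = np.asarray(y_true, float)
    p = np.clip(np.asarray(p_pred, float), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
```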
- Grid Search: Research paper describing the technique here, in part 4.1.2. Implementation: `optuna.samplers.GridSampler` (docs here).
- Random Search: Research paper here. Implementation: `optuna.samplers.RandomSampler` (docs here).
- BO-TPE: Implementation: `optuna.samplers.TPESampler` (docs here, which also contain links to research papers describing the technique). A minimal sketch of driving the Optuna samplers appears after this list.
- CMA-ES: Implementation: `optuna.samplers.CmaEsSampler` (docs here, which also contain links to research papers describing the technique).
- BO-GP: Implementation library: `bayesian-optimization` (repo here, with excellent explanations of the technique as well as links to research papers describing it).
- Hyperband: Research paper here. Implementation: `optuna.pruners.HyperbandPruner` (docs here).
- BOHB: Research paper here. Two implementation libraries of BOHB are used: `bohb-hpo` (repo here, as in `experiments/BOHB-impl-1`) and `hpbandster` (docs here). The latter has documentation available and was able to perform more effective optimisation than the former.
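As a rough illustration of how the Optuna-based techniques are driven, the snippet below sets up a study with the TPE sampler and a hypothetical objective. The real experiments compute the objective as a cross-validated validation loss of the machine learning algorithm under study, and swapping the `sampler` argument switches between grid search, random search, BO-TPE and CMA-ES.

```python
import optuna

def objective(trial):
    # Hypothetical hyperparameters; the real experiments suggest the
    # hyperparameters of the machine learning algorithm being optimised.
    c = trial.suggest_float("C", 1e-3, 1e3, log=True)
    gamma = trial.suggest_float("gamma", 1e-4, 1e1, log=True)
    # Placeholder loss surface standing in for a cross-validated validation loss.
    return (c - 1.0) ** 2 + (gamma - 0.1) ** 2

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```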
Note that this image describes the structure of the framework before the file `crosstechnique.py` was added. This new file is based on the cross technique described by Zhang in this paper. The functions within this file offer more efficient replacements for the functionality of `generateerrortensor.py` and `tensorcompletion.py`.
- Used throughout the software: `numpy`, `pandas`, `scipy`, `sklearn`, `tensorly`.
- Used in the folder `traditional-methods`: `optuna`, `bayesian-optimization`, `bohb-hpo`, `hpbandster`.
- Used in the module `loadddata.py`: `requests`.
Take note of the compatibility issue between `bayesian-optimization` and `scipy`.
- Python version 3.10
- `tensorly` version 0.7.0
- `numpy` version 1.22
- `pandas` version 1.4
- `sklearn` version 1.1
- `scipy` version 1.8
- `optuna` version 2.10
- `bayesian-optimization` version 1.2
- `bohb-hpo` version 0.7
- `hpbandster` version 1.0
This issue describes a compatibility problem between the versions of `bayesian-optimization` and `scipy` listed in the version notes. The solution is described here; it is a simple change that can be made to the `bayesian-optimization` library. Alternatively, the `bayesian-optimization` library can be installed directly from its repository: `pip install git+https://github.com/fmfn/BayesianOptimization`.