Bandicoot library example to train an MNIST network #219
base: master
Makefile
@@ -0,0 +1,44 @@
# This is a simple Makefile used to build the example source code.
# This example might require some modifications in order to work correctly on
# your system.
# This example trains an mlpack neural network on the GPU via OpenCL or CUDA,
# using the Bandicoot library.

TARGET := mnist_simple_coot
SRC := mnist_simple_coot.cpp
LIBS_NAME := bandicoot

CXX := g++
CXXFLAGS += -std=c++14 -Wall -Wextra -O3 -DNDEBUG -fopenmp
# Use these CXXFLAGS instead if you want to compile with debugging symbols and
# without optimizations.
# CXXFLAGS += -std=c++14 -Wall -Wextra -g -O0

LDFLAGS += -fopenmp
# Add header directories for any includes that aren't on the
# default compiler search path.
INCLFLAGS := -I .
# If you have mlpack, ensmallen, or CUDA installed somewhere nonstandard,
# update the lines below.
INCLFLAGS += -I/opt/cuda/targets/x86_64-linux/include
INCLFLAGS += -I/meta/mlpack/src
#INCLFLAGS += -I/meta/m
CXXFLAGS += $(INCLFLAGS)

OBJS := $(SRC:.cpp=.o)
LIBS := $(addprefix -l,$(LIBS_NAME))
CLEAN_LIST := $(TARGET) $(OBJS)

# default rule
default: all

$(TARGET): $(OBJS)
	$(CXX) $(OBJS) -o $(TARGET) $(LDFLAGS) $(LIBS)

.PHONY: all
all: $(TARGET)

.PHONY: clean
clean:
	@echo CLEAN $(CLEAN_LIST)
	@rm -f $(CLEAN_LIST)
mnist_simple_coot.cpp
@@ -0,0 +1,188 @@
/**
 * An example of using a feed-forward neural network (FFN) for
 * solving the Digit Recognizer problem from the Kaggle website.
 *
 * The full description of the problem, as well as the datasets for training
 * and testing, are available here: https://www.kaggle.com/c/digit-recognizer
 *
 * mlpack is free software; you may redistribute it and/or modify it under the
 * terms of the 3-clause BSD license.  You should have received a copy of the
 * 3-clause BSD license along with mlpack.  If not, see
 * http://www.opensource.org/licenses/BSD-3-Clause for more information.
 *
 * @author Eugene Freyman
 * @author Omar Shrit
 */
#define MLPACK_ENABLE_ANN_SERIALIZATION
#define MLPACK_HAS_COOT
Review comment: Just a TODO, we should integrate the definition of MLPACK_HAS_COOT.

Reply: Yes, on my list.
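A minimal sketch of what that integration might look like (hypothetical; the detection logic is an assumption, not current mlpack code): mlpack's configuration could define the macro automatically whenever Bandicoot's headers have already been included.

// Hypothetical sketch only -- not current mlpack code.  COOT_VERSION_MAJOR is
// assumed to be defined by Bandicoot's headers (mirroring Armadillo's
// ARMA_VERSION_MAJOR), so its presence indicates that <bandicoot> was
// included before mlpack.hpp.
#if defined(COOT_VERSION_MAJOR) && !defined(MLPACK_HAS_COOT)
  #define MLPACK_HAS_COOT
#endif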
#include <bandicoot>
#include <mlpack.hpp>

#if ((ENS_VERSION_MAJOR < 2) || \
    ((ENS_VERSION_MAJOR == 2) && (ENS_VERSION_MINOR < 13)))
  #error "need ensmallen version 2.13.0 or later"
#endif

using namespace mlpack;
using namespace std;

coot::Row<size_t> getLabels(const coot::mat& predOut)
{
  // Choose the class with the maximum predicted log-probability for each
  // column (data point).  Per-column index_max() may not be available for
  // Bandicoot matrices, so pull the predictions back to the CPU and compute
  // the argmax with Armadillo instead.
  arma::mat cpuPred = coot::conv_to<arma::mat>::from(predOut);
  arma::Row<size_t> cpuLabels(cpuPred.n_cols);
  for (arma::uword i = 0; i < cpuPred.n_cols; ++i)
    cpuLabels(i) = cpuPred.col(i).index_max();
  return coot::conv_to<coot::Row<size_t>>::from(cpuLabels);
}

int main()
{
  // The dataset is randomly split into validation
  // and training parts with the following ratio.
  constexpr double RATIO = 0.1;
  // The number of neurons in the first layer.
  constexpr int H1 = 200;
  // The number of neurons in the second layer.
  constexpr int H2 = 100;
  // Step size of the optimizer.
  const double STEP_SIZE = 5e-3;
  // Number of data points in each iteration of SGD.
  const size_t BATCH_SIZE = 64;
  // Allow up to 50 epochs, unless we are stopped early by EarlyStopAtMinLoss.
  const int EPOCHS = 50;

  // The labeled dataset that contains the training data is loaded from a CSV
  // file; rows represent features, columns represent data points.
  arma::mat dataset;
  data::Load("../data/mnist_train.csv", dataset, true);

  // The original Kaggle dataset CSV file has a header row, so it is necessary
  // to get rid of it; in the Armadillo representation it is the first column.
  arma::mat headerLessDataset =
      dataset.submat(0, 1, dataset.n_rows - 1, dataset.n_cols - 1);

  // Split the training dataset into training and validation parts.
  arma::mat train, valid;
  data::Split(headerLessDataset, train, valid, RATIO);

  // Get the training and validation datasets with features only (dropping
  // the label row), normalised to the range [0, 1], and converted to
  // Bandicoot (GPU) matrices.
  const coot::mat trainX = coot::conv_to<coot::mat>::from(
      train.submat(1, 0, train.n_rows - 1, train.n_cols - 1) / 255.0);
  const coot::mat validX = coot::conv_to<coot::mat>::from(
      valid.submat(1, 0, valid.n_rows - 1, valid.n_cols - 1) / 255.0);

  // Labels should specify the class of a data point and be in the interval
  // [0, numClasses).

  // Create labels for the training and validation datasets.
  const coot::mat trainY = coot::conv_to<coot::mat>::from(train.row(0));
  const coot::mat validY = coot::conv_to<coot::mat>::from(valid.row(0));

  // Specify the NN model.  NegativeLogLikelihood is the output layer that is
  // used for classification problems.  GlorotInitialization means that the
  // initial weights are drawn from a uniform distribution scaled by the
  // fan-in and fan-out of each layer (Xavier/Glorot initialization).  All
  // matrix types are coot::mat, so the computation happens on the GPU.
  FFN<NegativeLogLikelihoodType<coot::mat>, GlorotInitialization, coot::mat>
      model;
  // This is an intermediate layer that is needed for the connection between
  // the input data and the ReLU layer.  The parameter specifies the number
  // of neurons in the next layer.
  model.Add<LinearType<coot::mat>>(H1);
  // The first ReLU layer.
  model.Add<ReLUType<coot::mat>>();
  // Intermediate layer between the ReLU layers.
  model.Add<LinearType<coot::mat>>(H2);
  // The second ReLU layer.
  model.Add<ReLUType<coot::mat>>();
  // Dropout layer for regularization.  The parameter is the probability of
  // setting a specific value to 0.
  model.Add<DropoutType<coot::mat>>(0.2);
  // Intermediate layer with one neuron per class.
  model.Add<LinearType<coot::mat>>(10);
  // The LogSoftMax layer is used together with NegativeLogLikelihood to map
  // output values to log-probabilities of being a specific class.
  model.Add<LogSoftMaxType<coot::mat>>();

cout << "Start training ..." << endl; | ||
|
||
// Set parameters for the Adam optimizer. | ||
ens::Adam optimizer( | ||
STEP_SIZE, // Step size of the optimizer. | ||
BATCH_SIZE, // Batch size. Number of data points that are used in each | ||
// iteration. | ||
0.9, // Exponential decay rate for the first moment estimates. | ||
0.999, // Exponential decay rate for the weighted infinity norm estimates. | ||
1e-8, // Value used to initialise the mean squared gradient parameter. | ||
EPOCHS * trainX.n_cols, // Max number of iterations. | ||
1e-8, // Tolerance. | ||
true); | ||
|
||
// Declare callback to store best training weights. | ||
ens::StoreBestCoordinates<coot::mat> bestCoordinates; | ||
|
||
// Train neural network. If this is the first iteration, weights are | ||
// random, using current values as starting point otherwise. | ||
model.Train(trainX, | ||
trainY, | ||
optimizer, | ||
ens::PrintLoss(), | ||
ens::ProgressBar(), | ||
// Stop the training using Early Stop at min loss. | ||
ens::EarlyStopAtMinLossType<coot::mat>( | ||
[&](const coot::mat& /* param */) | ||
{ | ||
double validationLoss = model.Evaluate(validX, validY); | ||
cout << "Validation loss: " << validationLoss << "." | ||
<< endl; | ||
return validationLoss; | ||
}), | ||
// Store best coordinates (neural network weights) | ||
bestCoordinates); | ||
|
||
  // Save the best training weights into the model.
  model.Parameters() = bestCoordinates.BestCoordinates();

  coot::mat predOut;
  // Get predictions on the training data points.
  model.Predict(trainX, predOut);
  // Calculate the accuracy on the training data points.
  coot::Row<size_t> predLabels = getLabels(predOut);
  double trainAccuracy =
      coot::accu(predLabels == trainY) / (double) trainY.n_elem * 100;
  // Get predictions on the validation data points.
  model.Predict(validX, predOut);
  // Calculate the accuracy on the validation data points.
  predLabels = getLabels(predOut);
  double validAccuracy =
      coot::accu(predLabels == validY) / (double) validY.n_elem * 100;

  cout << "Accuracy: train = " << trainAccuracy << "%,"
       << "\t valid = " << validAccuracy << "%" << endl;

  data::Save("model.bin", "model", model, false);

  // Load the test dataset (the one whose predicted labels
  // should be submitted to the Kaggle website).
  //data::Load("../data/mnist_test.csv", dataset, true);
  //coot::mat testY = dataset.row(0);
  //dataset.shed_row(0); // Strip the labels before predicting.
  //dataset /= 255.0; // Apply the same normalization as to the training data.

  //cout << "Predicting on test set..." << endl;
  //coot::mat testPredOut;
  //// Get predictions on the test data points.
  //model.Predict(dataset, testPredOut);
  //// Generate labels for the test dataset.
  //coot::Row<size_t> testPred = getLabels(testPredOut);

  //double testAccuracy = coot::accu(testPred == testY) /
  //    (double) testY.n_elem * 100;
  //cout << "Accuracy: test = " << testAccuracy << "%" << endl;

  //cout << "Saving predicted labels to \"results.csv\" ..." << endl;
  //testPred.save("results.csv", coot::csv_ascii);

  //cout << "Neural network model is saved to \"model.bin\"" << endl;
  cout << "Finished" << endl;
}
Review comment: Serialization will be a little tricky.  When this macro is enabled, we compile all the serialize() functions for all the layer types, for the type arma::mat.  Instead, we'll need to do CEREAL_REGISTER_MLPACK_LAYERS(coot::mat) or similar.  Also, we'll need to actually implement a serialize() function for Bandicoot types, a bit like in src/mlpack/core/arma_extend/serialize_armadillo.hpp.  I think the strategy should just be to convert it to an Armadillo matrix (i.e. pull it off the GPU into the CPU) and then serialize that, plus a little bit of code to put the Armadillo matrix back onto the GPU during loading.  Anyway, I can help with that when we get there.

Reply: Yes, for now the easy solution would be to convert it back to Armadillo.  In all cases, we are loading an Armadillo matrix, so we will have to do the conversion at least once.  Once we have the load functionality for Bandicoot, if it makes sense, then we can think about serialization.
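A minimal sketch of the strategy described above (hypothetical, not the actual mlpack implementation; it assumes cereal's non-member save()/load() extension points, mlpack's existing cereal support for Armadillo matrices, and Bandicoot's documented conv_to conversions to and from Armadillo):

#include <armadillo>
#include <bandicoot>
#include <cereal/cereal.hpp>

namespace coot {

// Hypothetical sketch: serialize a coot::mat by round-tripping through
// arma::mat.  cereal finds non-member save()/load() in the type's namespace
// via argument-dependent lookup.
template<typename Archive>
void save(Archive& ar, const coot::mat& m)
{
  // Pull the matrix off the GPU into CPU memory and serialize the copy.
  arma::mat cpu = coot::conv_to<arma::mat>::from(m);
  ar(cpu);
}

template<typename Archive>
void load(Archive& ar, coot::mat& m)
{
  // Deserialize into an Armadillo matrix, then push it back onto the GPU.
  arma::mat cpu;
  ar(cpu);
  m = coot::conv_to<coot::mat>::from(cpu);
}

} // namespace coot

With something like this in place, loading a saved model would put the deserialized weights back onto the device, matching the convert-once approach suggested in the comments above.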