Bandicoot library example to train an MNIST network #219

Open · wants to merge 2 commits into master
44 changes: 44 additions & 0 deletions mnist_simple_coot/Makefile
@@ -0,0 +1,44 @@
# This is a simple Makefile used to build the example source code.
# This example might require some modifications in order to work correctly on
# your system.
# This example trains an mlpack neural network on the GPU via OpenCL or CUDA,
# using the Bandicoot library.

TARGET := mnist_simple_coot
SRC := mnist_simple_coot.cpp
LIBS_NAME := bandicoot

CXX := g++
CXXFLAGS += -std=c++14 -Wall -Wextra -O3 -DNDEBUG -fopenmp
# Use these CXXFLAGS instead if you want to compile with debugging symbols and
# without optimizations.
# CXXFLAGS += -std=c++14 -Wall -Wextra -g -O0

LDFLAGS += -fopenmp
# Add header directories for any includes that aren't on the
# default compiler search path.
INCLFLAGS := -I .
# If mlpack, ensmallen, or Bandicoot are installed somewhere nonstandard,
# update the include paths below to match your system.
INCLFLAGS += -I/opt/cuda/targets/x86_64-linux/include
INCLFLAGS += -I/meta/mlpack/src
#INCLFLAGS += -I/meta/m
CXXFLAGS += $(INCLFLAGS)

OBJS := $(SRC:.cpp=.o)
LIBS := $(addprefix -l,$(LIBS_NAME))
CLEAN_LIST := $(TARGET) $(OBJS)

# default rule
default: all

$(TARGET): $(OBJS)
	$(CXX) $(OBJS) -o $(TARGET) $(LDFLAGS) $(LIBS)

.PHONY: all
all: $(TARGET)

.PHONY: clean
clean:
	@echo CLEAN $(CLEAN_LIST)
	@rm -f $(CLEAN_LIST)
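
Assuming the include paths above match your system and the Bandicoot library is installed where the linker can find it, building and running the example should look roughly like this (mnist_train.csv is expected under ../data/, as in the source below):

make
./mnist_simple_coot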
188 changes: 188 additions & 0 deletions mnist_simple_coot/mnist_simple_coot.cpp
@@ -0,0 +1,188 @@
/**
* An example of using Feed Forward Neural Network (FFN) for
* solving Digit Recognizer problem from Kaggle website.
*
* The full description of the problem as well as the datasets for training
* and testing are available at https://www.kaggle.com/c/digit-recognizer.
*
* mlpack is free software; you may redistribute it and/or modify it under the
* terms of the 3-clause BSD license. You should have received a copy of the
* 3-clause BSD license along with mlpack. If not, see
* http://www.opensource.org/licenses/BSD-3-Clause for more information.
*
* @author Eugene Freyman
* @author Omar Shrit
*/
#define MLPACK_ENABLE_ANN_SERIALIZATION
Member:
Serialization will be a little tricky. When this macro is enabled, we compile all the serialize() functions for all the layer types---for the type arma::mat. Instead, we'll need to do CEREAL_REGISTER_MLPACK_LAYERS(coot::mat) or similar. Also, we'll need to actually implement a serialize() function for Bandicoot types, a bit like in src/mlpack/core/arma_extend/serialize_armadillo.hpp. I think the strategy should just be to convert it to an Armadillo matrix (i.e. pull it off the GPU into the CPU), and then serialize that, plus a little bit of code to put the Armadillo matrix back onto the GPU during loading. Anyway, I can help with that when we get there.

Member Author:
Yes, for now the easy solution would be to convert it back to Armadillo.
In all cases we are loading an Armadillo matrix, so we will have to do the conversion at least once.
Once we have the load functionality for Bandicoot, then we can think about serialization, if it makes sense.
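
A minimal sketch of the conversion-based approach discussed above. The free save()/load() pair for cereal and the use of coot::conv_to between arma::mat and coot::mat are assumptions about how the hook could look, not mlpack's actual API:

template<typename Archive>
void save(Archive& ar, const coot::mat& m)
{
  // Pull the matrix off the GPU into host memory, then serialize it with the
  // existing Armadillo cereal support.
  arma::mat cpuMat = coot::conv_to<arma::mat>::from(m);
  ar(cpuMat);
}

template<typename Archive>
void load(Archive& ar, coot::mat& m)
{
  // Deserialize into a host matrix, then copy it back onto the GPU.
  arma::mat cpuMat;
  ar(cpuMat);
  m = coot::conv_to<coot::mat>::from(cpuMat);
}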

#define MLPACK_HAS_COOT
Member:
Just a TODO: we should integrate the definition of MLPACK_HAS_COOT into the mlpack headers (maybe in prereqs.hpp or somewhere else we can try to detect Bandicoot), so that users don't have to write it by hand. You're probably already thinking that; I just wanted to write it down so it doesn't get forgotten.

Member Author:
Yes, it is on my list.
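
A rough sketch of how that auto-detection could look in an mlpack header; the use of __has_include and the exact location (prereqs.hpp) are assumptions, not an agreed design:

// Define MLPACK_HAS_COOT automatically when the Bandicoot header is
// available, so users don't have to define it by hand.
#if !defined(MLPACK_HAS_COOT) && defined(__has_include)
  #if __has_include(<bandicoot>)
    #define MLPACK_HAS_COOT 1
  #endif
#endif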


#include <bandicoot>
#include <mlpack.hpp>

#if ((ENS_VERSION_MAJOR < 2) || \
((ENS_VERSION_MAJOR == 2) && (ENS_VERSION_MINOR < 13)))
#error "need ensmallen version 2.13.0 or later"
#endif

using namespace mlpack;
using namespace std;

coot::Row<size_t> getLabels(const coot::mat& predOut)
{
// For each data point (column), the predicted class is the row index of the
// largest output. coot::index_max(X, 0) computes this for every column at
// once on the GPU (this assumes your Bandicoot version provides index_max()
// with a dimension argument); looping over individual elements instead would
// trigger a device-to-host transfer on each access.
return coot::conv_to<coot::Row<size_t>>::from(coot::index_max(predOut, 0));
}

int main()
{
// The dataset is randomly split into training and validation parts with the
// following ratio.
constexpr double RATIO = 0.1;
// The number of neurons in the first hidden layer.
constexpr int H1 = 200;
// The number of neurons in the second hidden layer.
constexpr int H2 = 100;
// Step size of the optimizer.
const double STEP_SIZE = 5e-3;
// Number of data points in each iteration of SGD.
const size_t BATCH_SIZE = 64;
// Allow up to 50 epochs, unless we are stopped early by EarlyStopAtMinLoss.
const int EPOCHS = 50;

// The labeled dataset used for training is loaded from a CSV file; rows
// represent features and columns represent data points.
arma::mat dataset;
data::Load("../data/mnist_train.csv", dataset, true);

// The original Kaggle CSV file has a header row that must be removed; since
// data points are stored as columns, the header appears as the first column
// in the Armadillo representation.
arma::mat headerLessDataset =
dataset.submat(0, 1, dataset.n_rows - 1, dataset.n_cols - 1);

// Splitting the dataset into training and validation parts.
arma::mat train, valid;
data::Split(headerLessDataset, train, valid, RATIO);

// Extract only the features (all rows except the first, which holds the
// labels) from the training and validation sets, normalize them to [0, 1],
// and convert to coot::mat, which moves the data onto the GPU.
const coot::mat trainX =
coot::conv_to<coot::mat>::from(train.submat(1, 0, train.n_rows - 1, train.n_cols - 1) / 255.0);
const coot::mat validX =
coot::conv_to<coot::mat>::from(valid.submat(1, 0, valid.n_rows - 1, valid.n_cols - 1) / 255.0);

// Labels should specify the class of a data point and be in the interval [0,
// numClasses).

// Creating labels for the training and validation datasets.
const coot::mat trainY = coot::conv_to<coot::mat>::from(train.row(0));
const coot::mat validY = coot::conv_to<coot::mat>::from(valid.row(0));

// Specifying the NN model. NegativeLogLikelihood is the output layer used
// for classification problems. GlorotInitialization draws the initial
// weights from a uniform distribution whose range is scaled by the layer
// sizes (Xavier/Glorot initialization).
FFN<NegativeLogLikelihoodType<coot::mat>, GlorotInitialization, coot::mat> model;
// This intermediate layer connects the input data to the following ReLU
// layer. The parameter specifies the number of neurons in the layer; the
// number of input features is inferred from the data.
model.Add<LinearType<coot::mat>>(H1);
// The first ReLU layer.
model.Add<ReLUType<coot::mat>>();
// Intermediate layer between the ReLU layers.
model.Add<LinearType<coot::mat>>(H2);
// The second ReLU layer.
model.Add<ReLUType<coot::mat>>();
// Dropout layer for regularization. The parameter is the probability of
// setting any given value to 0.
model.Add<DropoutType<coot::mat>>(0.2);
// Final linear layer, with one output neuron per class (the 10 digits).
model.Add<LinearType<coot::mat>>(10);
// The LogSoftMax layer is used together with NegativeLogLikelihood to map
// the output values to log-probabilities of each class.
model.Add<LogSoftMaxType<coot::mat>>();

cout << "Start training ..." << endl;

// Set parameters for the Adam optimizer.
ens::Adam optimizer(
STEP_SIZE, // Step size of the optimizer.
BATCH_SIZE, // Batch size. Number of data points that are used in each
// iteration.
0.9, // Exponential decay rate for the first moment estimates.
0.999, // Exponential decay rate for the second moment (squared gradient) estimates.
1e-8, // Value used to initialise the mean squared gradient parameter.
EPOCHS * trainX.n_cols, // Max number of iterations.
1e-8, // Tolerance.
true);

// Declare callback to store best training weights.
ens::StoreBestCoordinates<coot::mat> bestCoordinates;

// Train the neural network. On the first call the weights are randomly
// initialized; otherwise training continues from the current weights.
model.Train(trainX,
trainY,
optimizer,
ens::PrintLoss(),
ens::ProgressBar(),
// Stop the training early when the validation loss stops decreasing.
ens::EarlyStopAtMinLossType<coot::mat>(
[&](const coot::mat& /* param */)
{
double validationLoss = model.Evaluate(validX, validY);
cout << "Validation loss: " << validationLoss << "."
<< endl;
return validationLoss;
}),
// Store the best coordinates (neural network weights).
bestCoordinates);

// Save the best training weights into the model.
model.Parameters() = bestCoordinates.BestCoordinates();

coot::mat predOut;
// Getting predictions on training data points.
model.Predict(trainX, predOut);
// Calculating accuracy on training data points.
coot::Row<size_t> predLabels = getLabels(predOut);
double trainAccuracy =
coot::accu(predLabels == trainY) / (double) trainY.n_elem * 100;
// Getting predictions on validating data points.
model.Predict(validX, predOut);
// Calculating accuracy on validating data points.
predLabels = getLabels(predOut);
double validAccuracy =
coot::accu(predLabels == validY) / (double) validY.n_elem * 100;

cout << "Accuracy: train = " << trainAccuracy << "%,"
<< "\t valid = " << validAccuracy << "%" << endl;

data::Save("model.bin", "model", model, false);

// Loading the test dataset (the one whose predicted labels
// should be submitted to the Kaggle website).
//data::Load("../data/mnist_test.csv", dataset, true);
//coot::mat testY = dataset.row(0);
//dataset.shed_row(0); // Strip labels before predicting.
//dataset /= 255.0; // Apply the same normalization as to the training data.

//cout << "Predicting on test set..." << endl;
//coot::mat testPredOut;
//// Getting predictions on test data points.
//model.Predict(dataset, testPredOut);
//// Generating labels for the test dataset.
//coot::Row<size_t> testPred = getLabels(testPredOut);

//double testAccuracy = coot::accu(testPred == testY) /
//(double) testY.n_elem * 100;
//cout << "Accuracy: test = " << testAccuracy << "%" << endl;

//cout << "Saving predicted labels to \"results.csv\" ..." << endl;
//testPred.save("results.csv", coot::csv_ascii);

//cout << "Neural network model is saved to \"model.bin\"" << endl;
cout << "Finished" << endl;
}