- Used R to implement and run various machine learning algorithms on given data sets
- Used TensorFlow and Python to explore various deep neural network architectures (Assignment 8)
- **Note: see the relevant reports for more detailed analysis and other details of what is shown below (click on "Assignment [X]")**
- Implemented a Naive Bayes classifier to predict G3 scores of student performance in Portugal using binary and multinomial models
- Performed classification both with and without NA entries
- Performed classification using the klaR and caret R packages
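A minimal Python sketch of the idea, with scikit-learn standing in for the klaR/caret R packages used above; the file name and the pass/fail cut on G3 are assumptions.

```python
# Hypothetical sketch: Naive Bayes on the UCI student performance data.
# scikit-learn stands in for the klaR/caret R packages used in the assignment;
# the file name and the pass/fail threshold on G3 are assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

df = pd.read_csv("student-por.csv", sep=";")     # assumed local copy of the data
y = (df["G3"] >= 10).astype(int)                 # binarized G3 target
X = pd.get_dummies(df.drop(columns=["G3"]))      # one-hot encode categorical columns

# "Without NA entries": keep only complete rows; "with" would impute instead.
mask = X.notna().all(axis=1)
X_cc, y_cc = X[mask], y[mask]

for name, model in [("binary (Bernoulli)", BernoulliNB()),
                    ("multinomial", MultinomialNB())]:
    acc = cross_val_score(model, X_cc, y_cc, cv=10).mean()
    print(f"{name} Naive Bayes: {acc:.3f} (10-fold CV accuracy)")
```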
- Implemented an SVM classifier trained with stochastic gradient descent, using various regularization constants, to predict adult income data from the UC Irvine Machine Learning Repository
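A NumPy sketch of the hinge-loss SGD training loop described above (not the original code); the step-size schedule and the regularization constants in the sweep are assumptions.

```python
# Linear SVM trained by stochastic gradient descent on the hinge loss with
# L2 regularization constant `lam`; labels are assumed to be in {-1, +1}.
import numpy as np

def svm_sgd(X, y, lam=1e-3, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b, step = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            step += 1
            eta = 1.0 / (lam * step)            # decaying step length (assumed schedule)
            if y[i] * (X[i] @ w + b) < 1:       # inside the margin: hinge gradient
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                               # correct side of the margin: only shrink w
                w = (1 - eta * lam) * w
    return w, b

# Sweep over regularization constants, as in the assignment:
# for lam in (1e-4, 1e-3, 1e-2, 1e-1):
#     w, b = svm_sgd(X_train, y_train, lam=lam)
#     print(lam, np.mean(np.sign(X_test @ w + b) == y_test))
```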
- Trained and evaluated Linear SVM (SVMlight), Naive Bayes, and Random Forest (randomForest) classifiers on facial recognition data provided on Kaggle (see the sketch below)
- Ran RANN's approximate nearest neighbor algorithm to match public figures from Columbia's Public Figures Face Database
- Ran SVMlight's polynomial kernel to match public figures from Columbia's Public Figures Face Database
- Training accuracy: 80.1%
- Kaggle evaluation: 76.480%
- Results posted on the Kaggle leaderboard as "Skynet Dev Team"
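A scikit-learn sketch of the classifier comparison described above; the original work used SVMlight and the R Naive Bayes and randomForest implementations, and the face images are assumed to have already been turned into numeric feature arrays.

```python
# Hypothetical comparison sketch: linear SVM, Naive Bayes, random forest, and a
# polynomial-kernel SVM (scikit-learn stand-ins for SVMlight / randomForest).
# Feature matrices and labels are assumed to have been extracted beforehand.
from sklearn.svm import LinearSVC, SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

def compare_classifiers(X_train, y_train, X_test, y_test):
    models = {
        "linear SVM": LinearSVC(C=1.0),
        "Naive Bayes": GaussianNB(),
        "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
        "polynomial-kernel SVM": SVC(kernel="poly", degree=3, C=1.0),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        print(f"{name}: {model.score(X_test, y_test):.3f}")
```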
- Plotted a scatter matrix of the iris dataset from the UC Irvine Machine Learning Repository
- Plotted the above data on the first two principal components
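A Python/matplotlib sketch of the two iris plots above (the originals were produced in R).

```python
# Scatter matrix of the iris data, then a projection onto the first two
# principal components.
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

scatter_matrix(X, c=y, figsize=(8, 8), diagonal="hist")
plt.suptitle("Iris scatter matrix")

Z = PCA(n_components=2).fit_transform(X)        # project onto the first two PCs
plt.figure()
plt.scatter(Z[:, 0], Z[:, 1], c=y)
plt.xlabel("PC 1"); plt.ylabel("PC 2")
plt.title("Iris on the first two principal components")
plt.show()
```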
- Plotted the above data using the NIPALS algorithm for PLS1 regression
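Since NIPALS/PLS1 recurs in several items below, here is a compact NumPy sketch of the algorithm from scratch (an illustration, not the code used in the assignments).

```python
# Minimal NIPALS sketch for PLS1 regression (single response y).
import numpy as np

def pls1_nipals(X, y, n_components):
    X = X - X.mean(axis=0)                  # work on centered copies
    y = y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)              # weight vector
        t = X @ w                           # scores
        p = X.T @ t / (t @ t)               # X loadings
        q = (y @ t) / (t @ t)               # y loading
        X = X - np.outer(t, p)              # deflate X and y
        y = y - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, q = np.array(W).T, np.array(P).T, np.array(Q)
    beta = W @ np.linalg.solve(P.T @ W, q)  # regression coefficients (centered data)
    return beta

# Usage: beta = pls1_nipals(X, y, 2); y_hat = (X - X.mean(0)) @ beta + y.mean()
```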
- Plotted, in sorted order, the eigenvalues of the covariance matrix (the principal components) of the wine dataset provided by the UC Irvine Machine Learning Repository, using NIPALS
- Plotted a stem plot of the first 3 principal components of the above data; below is the first principal component
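A short sketch of the two wine-data plots above; plain eigendecomposition is used here in place of NIPALS, and X is assumed to hold the UCI wine data as a NumPy array.

```python
# Sorted covariance eigenvalues and a stem plot of the first principal component.
import numpy as np
import matplotlib.pyplot as plt

def plot_principal_components(X):
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(evals)[::-1]             # largest eigenvalue first
    evals, evecs = evals[order], evecs[:, order]

    plt.figure(); plt.plot(evals, "o-")
    plt.xlabel("component"); plt.ylabel("eigenvalue")

    plt.figure(); plt.stem(evecs[:, 0])         # first principal component
    plt.xlabel("feature index"); plt.ylabel("loading")
    plt.show()
```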
- Plotted the above data on the first two principal components
- Plotted data on breast cancer diagnostics provided by the UC Irvine Machine Learning Repository across 3 principal components
- Plotted data on breast cancer diagnostics provided by the UC Irvine Machine Learning Repository across 3 discriminative directions using the NIPALS algorithm for PLS1 regression
- Implemented a multinomial mixture-of-topics model with the EM algorithm, using the NIPS dataset from the UCI Machine Learning Repository to generate clusters of topics and plot their priors
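A NumPy sketch of EM for a mixture of multinomials over word counts, the model described above; the smoothing constant and the document-by-word matrix `counts` are assumptions.

```python
# EM for a multinomial mixture of topics over a document-by-word count matrix.
# `counts` is assumed to be an (n_docs, vocab) array of word counts.
import numpy as np
from scipy.special import logsumexp

def multinomial_mixture_em(counts, n_topics=30, n_iter=50, smooth=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n_docs, vocab = counts.shape
    pi = np.full(n_topics, 1.0 / n_topics)                # topic priors
    beta = rng.dirichlet(np.ones(vocab), size=n_topics)   # topic word distributions
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        log_r = np.log(pi) + counts @ np.log(beta).T       # (n_docs, n_topics)
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)
        # M-step: update the priors and the (smoothed) word distributions.
        pi = r.mean(axis=0)
        beta = r.T @ counts + smooth
        beta /= beta.sum(axis=1, keepdims=True)
    return pi, beta
```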
- Using the EM algorithm applied to a mixture of normal distributions model, segmented color images over multiple iterations. Below is an image segmented using 20 clusters over 5 iterations.
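A sketch of the segmentation step using scikit-learn's GaussianMixture (which runs EM) on raw RGB pixels; the 20 clusters and 5 iterations mirror the description above, while the image path and diagonal covariances are assumptions.

```python
# Segment a color image by fitting a Gaussian mixture to its RGB pixels with EM,
# then recoloring each pixel with its cluster mean. The image path is an assumption.
import numpy as np
from PIL import Image
from sklearn.mixture import GaussianMixture

img = np.asarray(Image.open("example.jpg"), dtype=float) / 255.0
pixels = img.reshape(-1, 3)                                 # one row per pixel

gmm = GaussianMixture(n_components=20, max_iter=5,          # 20 clusters, 5 EM iterations
                      covariance_type="diag", random_state=0)
labels = gmm.fit_predict(pixels)
segmented = gmm.means_[labels].reshape(img.shape)           # recolor by cluster mean

Image.fromarray((segmented * 255).astype(np.uint8)).save("segmented.jpg")
```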
- Implemented linear regression on the Geographical Original of Music Data Set provided by the UCI Machine Learning Repository
- Plotted a straightforward linear regression of the features against the longitude and latitude of origin for this music data set
- Below is a latitude prediction from a linear regression on the features, which yields an R-squared value of 0.3645767
- Analyzed the residuals; below are those for latitude
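A small sketch of the latitude regression and residual analysis, with scikit-learn standing in for R's `lm`; the feature matrix `X` and latitude target `lat` are assumed to be loaded already.

```python
# Plain least-squares regression of latitude on the audio features, reporting
# the R-squared value and plotting the residuals.
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def latitude_regression(X, lat):
    model = LinearRegression().fit(X, lat)
    pred = model.predict(X)
    print("R-squared:", r2_score(lat, pred))     # roughly 0.36 in the report

    residuals = lat - pred
    plt.scatter(pred, residuals, s=5)
    plt.axhline(0, color="k", linewidth=0.5)
    plt.xlabel("fitted latitude"); plt.ylabel("residual")
    plt.show()
    return model
```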
- Produced a Box Cox transformation and analyzed its effect on linear regression performance. Below is a Box Cox plot with the maximum log-likelihood at $$\lambda = 3.6$$
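A sketch of the Box Cox step using SciPy in place of R's MASS::boxcox; the shift used to make the target positive is an assumption.

```python
# Box Cox transformation of the regression target; stats.boxcox returns the
# lambda that maximizes the log-likelihood.
from scipy import stats

def boxcox_target(y):
    shifted = y - y.min() + 1.0               # Box Cox requires strictly positive data
    transformed, lam = stats.boxcox(shifted)
    print("lambda with maximum log-likelihood:", lam)
    return transformed, lam

# The report found the maximum near lambda = 3.6; the transformed target can then
# be refit with the same linear regression to compare R-squared values.
```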
- Performed a ridge regression analysis ($$\alpha$$ values close to 0) using various regularization coefficients
- Below is a plot showing cross-validated error against the [log regularization coefficient](http://luthuli.cs.uiuc.edu/~daf/courses/LearningCourse/learning-book-31-Mar#page=200.pdf) for $$\alpha = 0.1$$ (ridge regression) (see report for detailed analysis)
- Lasso and Elastic Net regression were performed and analyzed in a similar manner with their associated $$\alpha$$ values and log regularization coefficients (see report for more details)
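A sketch of the regularized fits using scikit-learn's ElasticNetCV in place of R's glmnet; note the naming clash, since glmnet's $$\alpha$$ (the mixing weight, near 0 for ridge and 1 for lasso) corresponds to `l1_ratio` here, while the cross-validated error curve is over the log of the regularization coefficient.

```python
# Cross-validated regularization paths in the spirit of cv.glmnet.
# X and the latitude target y are assumed to be loaded already.
import numpy as np
from sklearn.linear_model import ElasticNetCV

def cv_regularized_fits(X, y):
    for name, l1_ratio in [("ridge-like (alpha = 0.1)", 0.1),
                           ("elastic net (alpha = 0.5)", 0.5),
                           ("lasso (alpha = 1)", 1.0)]:
        model = ElasticNetCV(l1_ratio=l1_ratio, n_alphas=100, cv=10).fit(X, y)
        cv_err = model.mse_path_.mean(axis=1)        # mean CV error per lambda
        best = np.argmin(cv_err)
        print(name, "best log(lambda):", np.log(model.alphas_[best]),
              "CV MSE:", cv_err[best])
```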
- A comparison of Unregularized, Lasso, Ridge, and Elastic Net Regression on the Geographical Original of Music Data Set provided by the UCI Machine Learning Repository
- Analyzed regression of spatial data using kernel functions, specifically data on temperature measurements from 112 weather stations in Oregon provided by Luke Spadavecchia of the University of Edinburgh
- Used kernel smoothing with a Gaussian kernel to predict the average minimum annual temperature at each point on a 100x100 grid spanning these stations. A scale was chosen through cross-validation, and the predictions were combined to create the image below. This closely matches Figure 4, which displays the result obtained from ordinary Kriging (see the report for further analysis).
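A NumPy sketch of the Gaussian-kernel smoother over the prediction grid; the station coordinates and temperatures are assumed to be loaded as `coords` and `temps`, and the cross-validation over candidate scales is only indicated.

```python
# Gaussian kernel smoothing of station temperatures onto a 100x100 grid.
import numpy as np

def kernel_smooth(grid_pts, coords, temps, scale):
    # Squared distances between every grid point and every station.
    d2 = ((grid_pts[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * scale ** 2))             # Gaussian kernel weights
    return (w @ temps) / w.sum(axis=1)             # weighted average per grid point

def predict_grid(coords, temps, scale, n=100):
    xs = np.linspace(coords[:, 0].min(), coords[:, 0].max(), n)
    ys = np.linspace(coords[:, 1].min(), coords[:, 1].max(), n)
    gx, gy = np.meshgrid(xs, ys)
    grid_pts = np.column_stack([gx.ravel(), gy.ravel()])
    return kernel_smooth(grid_pts, coords, temps, scale).reshape(n, n)

# The scale is chosen by cross-validation over a candidate list, e.g.
# scale = min(candidates, key=lambda s: cv_error(coords, temps, s))
```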
- Regularized the kernel method shown here using the lasso, and again predicted the average minimum annual temperatures across the 100x100 grid. As above, the scale was chosen through cross-validation, and the predictions were combined to create the image below (see report for images corresponding to a wider array of scales and kernel functions).
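A sketch of the regularized variant: one Gaussian bump per station serves as a feature and the lasso selects among them. scikit-learn's LassoCV and the choice of base points are assumptions; `coords`, `temps`, and `scale` are carried over from the previous sketch.

```python
# Lasso-regularized kernel regression over Gaussian kernel features.
import numpy as np
from sklearn.linear_model import LassoCV

def gaussian_features(points, centers, scale):
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * scale ** 2))

def lasso_kernel_fit(coords, temps, scale):
    Phi = gaussian_features(coords, coords, scale)   # one feature per base point
    model = LassoCV(cv=10).fit(Phi, temps)
    n_used = np.sum(model.coef_ != 0)                # number of surviving predictors
    print("non-zero kernel features:", n_used)
    return model
```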
- Plotted the impact of the number of predictors on accuracy, much like Figure 7.17. Note that the mean square error decreases from 0 to 24 predictors, where the regularization constant results in the residual hitting a knee, indicating the point of diminishing returns.
- A similar procedure was performed with Elastic Net Regression (see report)
- Worked with Google's TensorFlow to analyze how convolutional neural networks classify handwritten digits from the MNIST database, and analyzed the results using TensorBoard. This closely follows the Deep MNIST for Experts tutorial.
- Plotted the accuracy of the convolutional neural network over the number of training steps, in this case nearly 2000, which yields a very impressive accuracy of nearly 97.44%!
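A TensorFlow/Keras sketch in the spirit of the Deep MNIST for Experts tutorial (the original work used the low-level TensorFlow 1.x API); the optimizer and the number of epochs are assumptions.

```python
# CNN on MNIST following the architecture of the Deep MNIST for Experts
# tutorial (5x5 convs, max pooling, 1024-unit dense layer, dropout).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 5, activation="relu", padding="same",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 5, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# TensorBoard callback so accuracy over training steps can be inspected, as above;
# 2 epochs at batch size 32 is roughly 2000 optimizer steps.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs/mnist_cnn")
model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test), callbacks=[tb])
```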
- Modified the convolutional neural networks from the Deep MNIST for Experts tutorial in an effort to derive a better result.
- Below is a plot of the accuracy of a network with the following configuration:
- 7x7 kernel, 32 features, no max pooling, 3 convolutional layers deep (see report for more details and configurations)
- Note that a peak accuracy of around 98% is achieved after a minimal number of steps
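A Keras sketch of the modified configuration listed above (7x7 kernels, 32 features, no max pooling, 3 convolutional layers); the dense head, optimizer, and loss are assumptions carried over from the previous sketch.

```python
# The modified configuration: three 7x7 convolutional layers with 32 features
# each and no max pooling, trained on the same MNIST arrays as above.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 7, activation="relu", padding="same",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 7, activation="relu", padding="same"),
    tf.keras.layers.Conv2D(32, 7, activation="relu", padding="same"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```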
- A similar procedure was executed for the CIFAR data set