Here you will find various projects I completed during my Master's in Data Science program.
Index
Semester One
- Statistics_Probability_EDA_Feature_Engineering: contains a statistical analysis of Lending Club loans from 2008 to roughly 2015 (Pandas, Seaborn, Probability Distributions, Generalized Linear Models, Statsmodels).
- Linear_Algebra: continues the Lending Club loan analysis, using SVD and PCA to decompose the loan data matrices and predict whether borrowers would default or pay off their loans (Pandas, Numpy, Matplotlib).
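As a rough illustration of the decomposition step named in the Linear_Algebra entry, here is a minimal sketch of PCA computed through SVD. It is not the project's code: the function name `pca_via_svd` and the random toy matrix are assumptions made for the example, and the downstream default/paid-off prediction is not shown.

```python
import numpy as np


def pca_via_svd(X, n_components):
    """Project the rows of X onto its top principal components using SVD."""
    X_centered = X - X.mean(axis=0)  # PCA assumes mean-centered columns
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    explained_ratio = explained_variance[:n_components] / explained_variance.sum()
    scores = X_centered @ components.T  # coordinates in the reduced space
    return scores, components, explained_ratio


# Hypothetical toy data standing in for numeric loan features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]  # make two columns strongly correlated
scores, components, ratio = pca_via_svd(X, n_components=2)
```

The resulting `scores` could then be passed to a classifier in place of the raw features.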
Semester Two
- Natural_Language_Processing: contains a Mad Libs-style choose-your-own-adventure story (Information Retrieval with TF-IDF / Jaccard Similarity, word2vec, Named Entity Recognition, Part of Speech tagging, spaCy, NLTK).
- Machine_Learning_I: contains various machine learning models built on Ultimatum Game data (a variant of Prisoner's Dilemma) with the goal of maximizing your payoff (Tree Boosting, Random Forests, Regularized Regression, KNN Clustering, Stacking/Ensemble methods, Sklearn).
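For readers unfamiliar with the Jaccard Similarity retrieval mentioned in the Natural_Language_Processing entry, the idea can be sketched as follows. This is only an illustration under simple assumptions (whitespace tokenization, lowercase matching); the function names and sample sentences are hypothetical and are not taken from the project.

```python
from typing import List


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # convention: two empty documents count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def best_match(query: str, documents: List[str]) -> str:
    """Return the document with the highest Jaccard similarity to the query."""
    return max(documents, key=lambda doc: jaccard_similarity(query, doc))


# Hypothetical story passages to retrieve from
passages = ["a knight rides north", "the dragon sleeps in a cave"]
chosen = best_match("dragon in the cave", passages)
```

A TF-IDF approach would replace the set overlap with weighted term vectors and cosine similarity, which down-weights very common words.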
Semester Three
- Data_Engineering: has an overview of a BART ridership prediction system that combines real-time BART and weather information with historical BART and weather information (Spark, MLlib, S3, EC2, EMR, MongoDB, boto, Airflow). Check out the finished webpage here!
- Data_Leadership: outlines the final business case presentation we completed as well as a Bayesian A/B testing case. This case examined a hospital chain to understand how to reduce patient readmissions and to make recommendations for doing so (Data Science ROI, Cost Benefit Analysis, Managerial Clustering, Pandas, Sklearn, PyMC3, A/B testing).
Semester Four
- Deep Learning: contains an analysis of using reinforcement learning to maximize an NYC taxicab driver's daily revenue (Deep Q-Learning, Actor-Critic, Keras, TensorFlow, GPUs).
- Capstone: contains a lifetime value analysis using Markov Random Fields and a discrete event simulation framework (Probabilistic Graphical Models, Markov Random Fields, Lifetime Value analysis, Discrete Event Simulation). Check out my final paper!
Semester Five
- advanced_stats_bayesian_optimization: shows a one-dimensional and two-dimensional implementation of Bayesian Optimization for hyperparameter selection. Includes a comparison to Grid Search, Random Grid Search, and the BayesianOptimization package (https://github.com/fmfn/BayesianOptimization) (Acquisition Function, Objective Function, Gaussian Process)
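To make the terms Acquisition Function, Objective Function, and Gaussian Process concrete, here is a minimal one-dimensional sketch of Bayesian Optimization. It is an illustration only, not the repository's implementation and not the API of the BayesianOptimization package. The zero-mean prior, fixed RBF length scale, grid of candidate points, expected-improvement acquisition, and toy objective are all assumptions chosen for brevity.

```python
import math

import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean Gaussian Process."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best_y):
    """Expected-improvement acquisition function (for maximization)."""
    z = (mean - best_y) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best_y) * cdf + std * pdf


def bayesian_optimize(objective, bounds, n_init=3, n_iter=15, seed=0):
    """Maximize `objective` over a 1-D interval using a GP surrogate."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(bounds[0], bounds[1], 500)
    x = rng.uniform(bounds[0], bounds[1], size=n_init)  # random initial design
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, candidates)
        ei = expected_improvement(mean, std, y.max())
        x_next = candidates[np.argmax(ei)]  # most promising candidate
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    best = np.argmax(y)
    return x[best], y[best]


# Hypothetical objective standing in for a validation score
# as a function of a single hyperparameter
def toy_objective(v):
    return -(v - 0.3) ** 2


best_x, best_y = bayesian_optimize(toy_objective, bounds=(0.0, 1.0))
```

Unlike Grid Search or Random Search, each new evaluation here is chosen using the surrogate model fitted to all previous evaluations, which is why the approach tends to need fewer objective calls.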