marileano/Marianne-Leano-Portfolio

Marianne Leano

I am a recent graduate with a Master's in Data Science, looking for a data science role where I can work with a diverse team and support innovation and data-driven business decisions and goals. I am motivated to apply my analytical, statistical, and programming skills to collect, analyze, and interpret large datasets.

  • Started from raw, unscaled data containing both numerical and categorical variables.
  • Performed exploratory data analysis to visualize the characteristics of the variables.
  • Trained several models on the data, using Optuna hyperparameter tuning to find the parameters that maximize model accuracy.
  • Used feature engineering techniques to build new variables that help improve model accuracy.
  • Using the strategies above, built the final model and generated forest cover type predictions for the test dataset.
  • Created an interactive data visualization to raise awareness of climate change.
  • Explored the data interactively to reach insights faster and inform key project decisions.
  • Built visualizations from large datasets for the following supporting points:
    • Economic Development
    • Human Influence Factors
    • Energy Consumption
  • Performed feature selection, feature elimination, and feature importance analysis using techniques such as Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and Random Forest.
  • Developed models using supervised, unsupervised, and semi-supervised learning techniques such as decision trees, regression trees, neural networks, and support vector machines.
  • Tuned model parameters, estimated prediction errors, and validated models.
  • Compared and ensembled multiple models in a pipeline and automatically selected the best model.
  • Using a custom TRAIN dataset, built a model to predict whether a data scientist will remain a member.
  • Cleaned and pre-processed the data.
  • Performed PCA and correlation analysis to understand the relationships among the variables.
  • Fit the following models: Logistic Regression, SVM, Decision Tree, and Random Forest.
  • Used Recursive Feature Elimination (RFE) for feature selection.
  • Compared models on test accuracy and training convergence to find the best one.
  • Used PySpark ML and created a SparkSession object on Databricks.
  • Explored and analyzed different datasets to gain insights into the Lahman Baseball database.
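
Several of the projects above share the same workflow: scale the features, select a subset with Recursive Feature Elimination, fit several classifiers, and keep the one that scores best. The sketch below illustrates that workflow with scikit-learn. It is not the code from the portfolio projects: the synthetic dataset, the `candidates` dictionary, the `evaluate` helper, and the choice of six retained features are all assumptions made for illustration.

```python
# Illustrative sketch only: synthetic data stands in for the project datasets.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: 300 rows, 12 features, 4 of them informative.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, n_redundant=2, random_state=0
)

# The four model families named in the list above (default settings, untuned).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(model, n_features=6):
    """Return the mean 5-fold accuracy of a scale -> RFE -> model pipeline."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        # RFE ranks features with a logistic regression and keeps n_features.
        ("rfe", RFE(LogisticRegression(max_iter=1000),
                    n_features_to_select=n_features)),
        ("model", model),
    ])
    return cross_val_score(pipeline, X, y, cv=5, scoring="accuracy").mean()


# Score every candidate and pick the one with the highest mean accuracy.
scores = {name: evaluate(model) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
print("best model:", best_name)
```

Placing the scaler and RFE inside the pipeline means both are refit on each training fold, so the cross-validated accuracy is not inflated by information from the held-out fold.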

About

Marianne Leano's portfolio
