cindyguan28/big-data-analytics

Big Data Analytics

EEBDA big data analytics exercise project

Case Study 1: Descriptive and diagnostic analytics

Auditing, detection of duplicates

  • Data preparation: import, conversion, filter, merge
  • Data analytics with the chi-square test and p-values
  • Testing for a uniform (equal) distribution and a Benford distribution
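
The Benford check listed above can be sketched as a chi-square goodness-of-fit test on leading digits. This is a minimal illustration using only the standard library, not the code from the case study; the function names and the toy data are assumptions. Instead of computing a p-value, it compares the statistic with 15.507, the 5% critical value of the chi-square distribution with 8 degrees of freedom.

```python
import math
from collections import Counter

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (9 possible leading digits minus 1).
CRITICAL_5PCT_DF8 = 15.507


def first_digit(x):
    """Return the leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2e-03'.
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Chi-square statistic of observed leading digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    observed = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # Benford probability times n
        stat += (observed.get(d, 0) - expected) ** 2 / expected
    return stat


# Hypothetical toy data: powers of 2 follow Benford's law closely,
# while the integers 1..999 have uniformly distributed leading digits.
benford_like = [2 ** k for k in range(1, 201)]
uniform_like = list(range(1, 1000))

for name, data in [("powers of 2", benford_like), ("1..999", uniform_like)]:
    stat = benford_chi_square(data)
    verdict = "reject Benford" if stat > CRITICAL_5PCT_DF8 else "consistent with Benford"
    print(f"{name}: chi-square = {stat:.2f} -> {verdict}")
```

A statistic above the critical value suggests the leading digits deviate from Benford's law, which in an audit is a reason to look closer and not proof of manipulation.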

Case Study 2: Predictive Analytics with machine learning algorithms

Fraud and error detection using statistical methods of non-parametric classification.

  • Machine Learning / Data Mining
  • Classification algorithms:
    • K-nearest neighbors
    • Decision trees
    • Support vector machines
  • Workflow: training, predicting, evaluating
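
The train, predict, evaluate workflow can be illustrated with a small k-nearest-neighbors classifier. This is a sketch in plain Python under stated assumptions: the two features, the labels "ok" and "fraud", and all data points are invented for illustration, and the case study itself may rely on a library implementation instead.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # "Training" for k-NN is just storing the data; the work happens at prediction time.
    neighbors = sorted(
        (math.dist(point, x), label) for point, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
train_y = ["ok", "ok", "ok", "fraud", "fraud", "fraud"]
test_X = [(1.1, 0.9), (5.0, 5.0)]
test_y = ["ok", "fraud"]

predictions = [knn_predict(train_X, train_y, x) for x in test_X]
print(predictions, accuracy(test_y, predictions))
```

With real fraud data the classes are usually imbalanced, so accuracy alone can mislead; precision and recall on the fraud class are common complements.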

Case Study 3: Predictive Analytics using Regression while Incorporating Heterogeneity

Business valuation for M&A using simple multiples

  • Discounted Cash Flow model and valuation using multiples
  • Regression analyses: simple linear regression (SLR), multiple linear regression (MLR), seemingly unrelated regressions (SUR)
  • Hypothesis tests: t-test, F-test, and Chow test
  • Dealing with unbalanced data
  • Cross validation
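
The Chow test mentioned above checks whether two groups share the same regression coefficients. A minimal sketch for simple linear regression follows. The helper names and the toy data are assumptions for illustration, not the project's code. The statistic is F = ((SSR_pooled - (SSR_1 + SSR_2)) / k) / ((SSR_1 + SSR_2) / (n_1 + n_2 - 2k)), with k = 2 parameters (intercept and slope).

```python
def fit_slr(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, ssr


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic: pooled fit versus separate fits for two groups."""
    _, _, ssr_pooled = fit_slr(x1 + x2, y1 + y2)
    _, _, ssr_1 = fit_slr(x1, y1)
    _, _, ssr_2 = fit_slr(x2, y2)
    ssr_separate = ssr_1 + ssr_2
    dof = len(x1) + len(x2) - 2 * k
    return ((ssr_pooled - ssr_separate) / k) / (ssr_separate / dof)


# Hypothetical toy data: group A has a slope near 2, group B a slope near 1.
x_a, y_a = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
x_b, y_b = [1, 2, 3, 4, 5], [5.2, 5.9, 7.1, 7.9, 9.0]

F_CRITICAL_5PCT = 5.14  # 5% critical value of F with (2, 6) degrees of freedom

f_stat = chow_f(x_a, y_a, x_b, y_b)
print(f"Chow F = {f_stat:.1f}, structural break at 5%: {f_stat > F_CRITICAL_5PCT}")
```

A large F value indicates that a single pooled regression hides heterogeneity between the groups, which is the motivation for estimating them separately.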

Case Study 4: Natural Language Processing (NLP)

Company Perception based on unstructured data

  • Tokenization, Stemming and Lemmatization, Word2Vec
  • Preparation of text data for machine analysis
  • Model development and evaluation of neural networks
    • torch
    • farm
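
Preparing text for machine analysis typically starts with tokenization and mapping tokens to integer ids that a neural network can consume. The sketch below uses only the standard library. The regular-expression tokenizer, the `<unk>` token, and the example sentences are assumptions for illustration. Stemming, lemmatization, and Word2Vec are not shown, as they normally rely on dedicated libraries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(docs, min_count=1):
    """Map each sufficiently frequent token to an integer id; id 0 is reserved for unknowns."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    vocab = {"<unk>": 0}
    for tok, count in sorted(counts.items()):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text to a list of token ids, using 0 for out-of-vocabulary tokens."""
    return [vocab.get(tok, 0) for tok in tokenize(text)]


# Hypothetical example documents.
docs = [
    "The company reported strong growth.",
    "Growth slowed; the company cut costs.",
]
vocab = build_vocab(docs)
print(vocab)
print(encode("The company grew", vocab))  # "grew" is unseen, so it maps to id 0
```

The resulting id sequences could then be fed to an embedding layer in a framework such as torch.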
