Common techniques for the datathon:
- Network Yue
- Clustering
- Decision Tree / Random forest / XGBoost
- Time series
- Line plotting
- Plotting on a map
- Supervised learning techniques: https://scikit-learn.org/stable/supervised_learning.html#supervised-learning Banghua & Yiling
- Linear regression; SVM; Nearest Neighbors
- Confidence interval Yiling
- Time series analysis Yue
- Decision Tree Yue
- Random Forest Yue
- Hypothesis testing Yiling
- Scikit-learn: https://scikit-learn.org/stable/
- XGBoost: https://xgboost.readthedocs.io/en/latest/ Yue
- Feature selection https://scikit-learn.org/stable/modules/feature_selection.html#feature-selection Jiahao
- Dimension reduction: PCA, ICA, CCA, FLD, t-SNE: https://www.datacamp.com/community/tutorials/introduction-t-sne Jiahao
- Clustering Jiahao
- Text
- Word2Vec
- NumPy, pandas, pickle, Jupyter
- GitHub
- Google Drive
Potential machinery:
James-Stein estimation: an adaptive shrinkage estimator
statsmodels gives a LaTeX form automatically
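
For the James-Stein item above, here is a minimal sketch of the shrinkage idea in plain Python, so it runs without extra packages. It assumes independent normal observations with a known, common variance and shrinks toward zero. The function names `james_stein` and `squared_error`, the `sigma2` parameter, and the simulated means are illustrative assumptions, not anything fixed by these notes. It uses the positive-part variant, which clips the shrinkage factor at zero.

```python
import random


def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein shrinkage of the observations x toward zero.

    Assumes x[i] ~ N(theta[i], sigma2) independently, with sigma2 known.
    """
    p = len(x)
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 coordinates")
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return [0.0] * p
    # Data-dependent shrinkage factor, clipped at zero (positive-part variant).
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return [shrink * v for v in x]


def squared_error(estimate, truth):
    """Total squared error between an estimate and the true means."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical true means, used only for this simulation.
    theta = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    x = [t + rng.gauss(0.0, 1.0) for t in theta]
    print("raw observations:", squared_error(x, theta))
    print("James-Stein:     ", squared_error(james_stein(x), theta))
```

The estimator is "adaptive" in the sense that the amount of shrinkage is computed from the data: the smaller the squared norm of the observations relative to the noise level, the harder they are pulled toward zero.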