Models and methods students should know #2

Closed
bensoltoff opened this issue Dec 7, 2016 · 1 comment

Comments

@bensoltoff
Contributor

I've been thinking about next term and exactly which models and methods to cover. There are obviously a ton of potential approaches. Right now we have a lot of the basics covered, but I wonder if we should be more ambitious and expose our students to more advanced modeling procedures like neural nets and other deep learning algorithms. The third term as currently structured focuses a lot on distributed computing platforms and methods (e.g. parallel processing, Hadoop, MapReduce, Spark, AWS, SQL and relational databases). At least last year, CS 123 covered a lot of that, probably in far more detail than we'd want to. We could still cover some of it in our spring Perspectives course, focusing more on applications to social science datasets, but I'm not sure how much we want to duplicate content they are learning elsewhere.

Instead, we could bleed Perspectives on Modeling into Perspectives on Advanced Computational Topics and add more algorithms and methods to students' toolkits. Right now, topics I can think of specific to modeling are:

  1. Model/theory building
  2. Data generating processes
  3. Model estimation procedures
    • Maximum likelihood estimation
    • Generalized method of moments
  4. Generalized linear models
    • Ordinary least squares
    • Logistic regression (binary, ordinal, and multinomial)
    • Poisson/negative binomial regression (regression for count variables)
  5. Model assessment
    • Cross-validation
    • Bootstrapping
    • Ensemble model averaging
  6. Flexible linear methods
    • Polynomial terms
    • LOESS
    • Generalized additive models
  7. Tree-based methods
    • Decision trees
    • Random forests
  8. Support vector machines
  9. Kernel density estimation
  10. Neural networks
  11. Nearest-neighbors
  12. Unsupervised learning
    • Clustering
    • Principal components analysis
    • Latent Dirichlet allocation
  13. Structural/theoretical models

Certainly not all of that fits into a 10-week course. But what about treating the final two terms as a single 20-week course? Spend 14 or 15 weeks on methods like the ones above and the remaining 5-6 weeks on distributed computing methods. Thoughts @rickecon?

@bensoltoff
Contributor Author

Closed per discussion with @rickecon

gingcee pushed a commit that referenced this issue Jan 23, 2017
gingcee pushed a commit that referenced this issue Jan 30, 2017