Schedule

All assignments and dates are subject to change

Readings will be added to the later weeks of the course in response to student interest and new developments in the field.

Readings are due on the day indicated.

Problem sets will be distributed via CMS no later than the Friday indicated. They are due as indicated on CMS (often, but not always, two weeks from distribution).

Occasional brief responses are due as assigned on Tuesdays before 4:00pm. For details, see the Discussions section of Canvas.

  • JM = Jurafsky and Martin, Speech and Language Processing, 3rd ed. (online)
Week Monday Wednesday Friday
1 (8/21) Introduction Tokenization. Problem set 0: Setup and shakedown
2 (8/28) Dictionary methods and vector space models. PS 1: Tokens, vectors, and regression
3 (9/4) No class. Labor Day. Regression.
4 (9/11) Clustering.
  • Moretti, "Slaughterhouse of Literature" (Canvas)
PS 2: Clustering and classification
5 (9/18) Classification.
6 (9/25) Feature importance and hypothesis testing.
7 (10/2) Topic models No section meetings. Fall break.
PS 3: Features and comparisons.
8 (10/9) No class. Fall break. NLP and feature expansion.
9 (10/16) Static word embeddings. Nelson, "Leveraging the Alignment between Machine Learning and Intersectionality" (Canvas) PS 4: Entities and static embeddings
10 (10/23) BERT and contextual embeddings.
11 (10/30) Large language models and generative AI.
12 (11/6) Catch-up: Using BERT for classification No class. Prof. Wilkens out of town. PS 5: Contextual embeddings and LLMs
13 (11/13) Catch-up: The BERT architecture Text generation with LLMs
14 (11/20) LLM applications No class. Thanksgiving. No section meetings. Thanksgiving.
15 (11/27) Multilingual NLP Working with user-generated content. Review and exam preparation
16 (12/4) Summary discussion and conclusions. ----- -----