Updated slides #478

Open
qualiaMachine opened this issue Jun 4, 2024 · 3 comments
Comments

@qualiaMachine
Collaborator

I have some updated slides that I used to teach this lesson last week: https://docs.google.com/presentation/d/1uT4uvfWrpvrrQEFp84PGfAQ2r9Ylqx8tbwiVFuGfEao/edit?usp=sharing

Please feel free to use/repurpose anything in there.

I felt it was important to comment on the double descent phenomenon during the discussion of "how much data is needed?", especially in the age of increasingly large language models. Double descent is not currently mentioned in the lesson. I may make a pull request on the topic if I can find the time... it's something we may want to add to an earlier episode.

@svenvanderburg
Collaborator

Thanks for sharing @qualiaMachine! We'll think about what to do with the slides, since it doesn't make much sense for everyone to develop their own. I didn't actually know about double descent, thanks for teaching me! Is it something you come across frequently in practice?

@qualiaMachine
Collaborator Author

Glad I could share! Double descent hadn't really been discussed much until a few years ago, and older textbooks still need to be updated, since deep neural networks violate the classic bias-variance tradeoff! I have personally never observed it, but I have worked with fairly small datasets relative to other deep learning applications. Evidently double descent is observed more often with larger datasets, roughly 10,000 observations or more, which I never encountered in my research applications :(. Many other learners may be in a similar boat, but I still think it's worthwhile to point out.

I usually talk about it in the context of large language models, which, despite having billions or trillions of weights, can still avoid overfitting. It's also worth mentioning when early stopping is introduced. In general I would recommend sticking with early stopping, but those with large datasets may want to explore "overparameterized" models to see if they can get past the initial overfitting phase.
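
In case it helps make the early-stopping point concrete, here is a minimal sketch (not from the lesson or the slides; the data, model widths, and epoch counts are made-up placeholders) contrasting the usual early-stopping setup in Keras with an overparameterized run trained past the first overfitting phase:

```python
# A rough sketch, not from the lesson or the slides: the dataset here is random
# placeholder data, and the widths/epoch counts are arbitrary. It just contrasts
# the default early-stopping workflow with an "overparameterized" run that keeps
# training past the first overfitting phase.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder data standing in for a real (large) dataset.
X_train, y_train = np.random.rand(1000, 20), np.random.rand(1000, 1)
X_val, y_val = np.random.rand(200, 20), np.random.rand(200, 1)

def build_model(width):
    """Small fully connected regression model; `width` controls capacity."""
    return keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(width, activation="relu"),
        layers.Dense(width, activation="relu"),
        layers.Dense(1),
    ])

# Default recommendation: stop when validation loss stops improving.
model = build_model(width=64)
model.compile(optimizer="adam", loss="mse")
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True
)
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=500, callbacks=[early_stop], verbose=0)

# With a large dataset, one could instead drop early stopping, increase capacity,
# and keep training to see whether validation loss comes back down after the
# initial overfitting phase (double descent).
big_model = build_model(width=1024)
big_model.compile(optimizer="adam", loss="mse")
history = big_model.fit(X_train, y_train, validation_data=(X_val, y_val),
                        epochs=500, verbose=0)
# Inspect history.history["val_loss"] across epochs to look for a second descent.
```

Whether the validation loss actually dips a second time will of course depend on the dataset and model; with random placeholder data like this it won't.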

The book I recommended has a chapter on it if you're curious to learn more: https://udlbook.github.io/udlbook/.

Here are a couple other references that are worth checking out:

@svenvanderburg
Collaborator

Cool, thanks for the clear explanations! 🙏
