2024-04-18-deng24b.md

title abstract layout series publisher issn id month tex_title firstpage lastpage page order cycles bibtex_author author date address container-title volume genre issued pdf extras
On the Generalization Ability of Unsupervised Pretraining
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization. However, a rigorous understanding of how the representation function learned on an unlabeled dataset affects the generalization of the fine-tuned model is lacking. Existing theoretical research does not adequately account for the heterogeneity of the distributions and tasks in the pre-training and fine-tuning stages. To bridge this gap, this paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase, ultimately affecting the generalization capabilities of the fine-tuned model on downstream tasks. We apply our theoretical framework to analyze the generalization bounds of two distinct scenarios: Context Encoder pre-training with deep neural networks and Masked Autoencoder pre-training with deep transformers, followed by fine-tuning on a binary classification task. Finally, inspired by our findings, we propose a novel regularization method during pre-training to further enhance the generalization of the fine-tuned model. Overall, our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
inproceedings
Proceedings of Machine Learning Research
PMLR
2640-3498
deng24b
0
On the Generalization Ability of Unsupervised Pretraining
4519
4527
4519-4527
4519
false
Deng, Yuyang and Hong, Junyuan and Zhou, Jiayu and Mahdavi, Mehrdad
given family
Yuyang
Deng
given family
Junyuan
Hong
given family
Jiayu
Zhou
given family
Mehrdad
Mahdavi
2024-04-18
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
238
inproceedings
date-parts
2024
4
18