title

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

On the Generalization Ability of Unsupervised Pretraining

Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization. However, a rigorous understanding of how the representation function learned on an unlabeled dataset affects the generalization of the fine-tuned model is lacking. Existing theoretical research does not adequately account for the heterogeneity of the distributions and tasks in the pre-training and fine-tuning stages. To bridge this gap, this paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase, ultimately affecting the generalization capabilities of the fine-tuned model on downstream tasks. We apply our theoretical framework to analyze the generalization bounds of two distinct scenarios: Context Encoder pre-training with deep neural networks and Masked Autoencoder pre-training with deep transformers, followed by fine-tuning on a binary classification task. Finally, inspired by our findings, we propose a novel regularization method during pre-training to further enhance the generalization of the fine-tuned model. Overall, our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

deng24b

0

On the Generalization Ability of Unsupervised Pretraining

4519

4527

4519-4527

4519

false

Deng, Yuyang and Hong, Junyuan and Zhou, Jiayu and Mahdavi, Mehrdad

given	family
Yuyang	Deng

given	family
Junyuan	Hong

given	family
Jiayu	Zhou

given	family
Mehrdad	Mahdavi

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics

238

inproceedings

date-parts

2024

4

18

https://proceedings.mlr.press/v238/deng24b/deng24b.pdf


2024-04-18-deng24b.md
