2024-04-18-chen24g.md

---
title: Gibbs-Based Information Criteria and the Over-Parameterized Regime
software: 
abstract: Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an interpolating threshold with over-parameterization, which is not predicted by information criteria in their classical forms due to the limitations in the standard asymptotic approach. We update these analyses using the information risk minimization framework and provide Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) for models learned by the Gibbs algorithm. Notably, the penalty terms for the Gibbs-based AIC and BIC correspond to specific information measures, i.e., symmetrized KL information and KL divergence. We extend this information-theoretic analysis to over-parameterized models by providing two different Gibbs-based BICs to compute the marginal likelihood of random feature models in the regime where the number of parameters $p$ and the number of samples $n$ tend to infinity, with $p/n$ fixed. Our experiments demonstrate that the Gibbs-based BIC can select the high-dimensional model and reveal the mismatch between marginal likelihood and population risk in the over-parameterized regime, providing new insights to understand double-descent.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: chen24g
month: 0
tex_title: Gibbs-Based Information Criteria and the Over-Parameterized Regime
firstpage: 4501
lastpage: 4509
page: 4501-4509
order: 4501
cycles: false
bibtex_author: Chen, Haobo and W Wornell, Gregory and Bu, Yuheng
author:
- given: Haobo
  family: Chen
- given: Gregory
  family: W Wornell
- given: Yuheng
  family: Bu
date: 2024-04-18
address: 
container-title: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
volume: '238'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 4
  - 18
pdf: 
extras: 
---
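The abstract contrasts the paper's Gibbs-based criteria with AIC and BIC "in their classical forms". As a point of reference only, a minimal sketch of those classical formulas (not the paper's Gibbs-based variants; function names here are illustrative):

```python
import math

def classical_aic(log_likelihood: float, k: int) -> float:
    """Classical AIC: goodness-of-fit term -2 ln L plus penalty 2k."""
    return 2 * k - 2 * log_likelihood

def classical_bic(log_likelihood: float, k: int, n: int) -> float:
    """Classical BIC: penalty k ln n, a standard large-n asymptotic
    approximation of -2 times the log marginal likelihood."""
    return k * math.log(n) - 2 * log_likelihood

# Both penalties grow linearly with the parameter count k, which is why
# the classical criteria cannot favor over-parameterized models and fail
# to predict double-descent.
print(classical_aic(-10.0, 3), classical_bic(-10.0, 3, 100))
```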