| title | abstract | layout | series | publisher | issn | id | month | tex_title | firstpage | lastpage | page | order | cycles | bibtex_author | author | date | address | container-title | volume | genre | issued | extras |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Pessimistic Off-Policy Multi-Objective Optimization | Multi-objective optimization is a class of optimization problems with multiple conflicting objectives. We study offline optimization of multi-objective policies from data collected by a previously deployed policy. We propose a pessimistic estimator for policy values that can be easily plugged into existing formulas for hypervolume computation and optimized. The estimator is based on inverse propensity scores (IPS), and improves upon a naive IPS estimator in both theory and experiments. Our analysis is general, and applies beyond our IPS estimators and methods for optimizing them. | inproceedings | Proceedings of Machine Learning Research | PMLR | 2640-3498 | alizadeh24a | 0 | Pessimistic Off-Policy Multi-Objective Optimization | 2980 | 2988 | 2980-2988 | 2980 | false | Alizadeh, Shima and Bhargava, Aniruddha and Gopalswamy, Karthick and Jain, Lalit and Kveton, Branislav and Liu, Ge | | 2024-04-18 | | Proceedings of The 27th International Conference on Artificial Intelligence and Statistics | 238 | inproceedings | | |