Improved Sample Complexity Bounds for Distributionally Robust Reinforcement Learning

This repository studies characteristics of the robust value iteration algorithm. In particular, we introduce RPVL in our paper, which considers finite-horizon tabular MDPs. The Gambler's problem (also known as Gambler's Ruin) is a simple example that lets us characterize the behavior of an optimal robust policy.
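
To make the setting concrete, here is a minimal sketch of finite-horizon robust value iteration on the Gambler's problem. It is not the RPVL algorithm from the paper, and nothing in it is taken from this repository's code. The function name `robust_value_iteration`, the parameters `goal`, `p_head`, `delta`, and `horizon`, and the choice of uncertainty set (the win probability may lie anywhere in the interval `[p_head - delta, p_head + delta]`, independently for each state, stake, and step) are all assumptions made for illustration.

```python
import numpy as np


def robust_value_iteration(goal=100, p_head=0.4, delta=0.05, horizon=50):
    """Illustrative robust value iteration for a finite-horizon Gambler's problem.

    Assumed uncertainty set (hypothetical, not from the paper): the win
    probability can be any value in [p_head - delta, p_head + delta].
    """
    p_lo = max(0.0, p_head - delta)
    p_hi = min(1.0, p_head + delta)

    # V[h, s]: worst-case probability of reaching `goal` from capital s
    # when horizon - h bets remain. Capital 0 stays at value 0.
    V = np.zeros((horizon + 1, goal + 1))
    V[:, goal] = 1.0
    policy = np.zeros((horizon, goal + 1), dtype=int)

    # Backward induction over the steps.
    for h in range(horizon - 1, -1, -1):
        for s in range(1, goal):
            stakes = np.arange(1, min(s, goal - s) + 1)
            win = V[h + 1, s + stakes]
            lose = V[h + 1, s - stakes]
            # The expected value is linear in the win probability, so the
            # worst case over the interval is attained at an endpoint.
            q_lo = p_lo * win + (1.0 - p_lo) * lose
            q_hi = p_hi * win + (1.0 - p_hi) * lose
            q = np.minimum(q_lo, q_hi)
            best = int(np.argmax(q))  # ties go to the smallest stake
            policy[h, s] = stakes[best]
            V[h, s] = q[best]
    return V, policy


if __name__ == "__main__":
    V, policy = robust_value_iteration()
    print("worst-case success probability from capital 50:", V[0, 50])
    print("stake chosen at capital 50, first step:", policy[0, 50])
```

Setting `delta=0` recovers ordinary (non-robust) value iteration under this sketch, which gives a baseline for comparing the robust policy's stakes against the nominal ones.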