fusion-ml/ocbo_qlearning

Q-learning with OCBO

This code applies concepts from OCBO (Offline Contextual Bayesian Optimization) to reinforcement learning by choosing the start state of each episode deliberately. Instead of sampling start states at random, each episode begins at an "interesting" state, meaning one that is expected to yield a high improvement.
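
The idea of non-random start states can be illustrated with a minimal sketch. This is not the repository's implementation and not OCBO's acquisition function: as a loudly simplified stand-in, it scores each state by a running average of the absolute TD error seen in episodes that started there, and starts the next episode at the highest-scoring state. The chain environment and the names `chain_step`, `select_start_state`, and `train` are hypothetical and exist only for this illustration.

```python
import numpy as np

def chain_step(state, action, n_states):
    """Hypothetical 1-D chain: action 0 moves left, action 1 moves right.
    Reaching the right-most state ends the episode with reward 1."""
    next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

def select_start_state(interest, candidates, rng, explore_prob=0.1):
    """Start at the candidate with the highest interest score,
    with an occasional uniformly random pick."""
    if rng.random() < explore_prob:
        return int(rng.choice(candidates))
    return int(candidates[np.argmax(interest[candidates])])

def train(n_states=8, n_episodes=500, max_steps=30,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((n_states, 2))
    # Assumption: interest is a heuristic proxy for "expected improvement",
    # initialised optimistically so every state is tried at least once.
    interest = np.ones(n_states)
    candidates = np.arange(n_states - 1)  # never start in the terminal state
    for _ in range(n_episodes):
        start = state = select_start_state(interest, candidates, rng)
        total_abs_td, steps = 0.0, 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = int(rng.integers(2))
            else:
                # Greedy action with random tie-breaking.
                best = np.flatnonzero(q[state] == q[state].max())
                action = int(rng.choice(best))
            next_state, reward, done = chain_step(state, action, n_states)
            target = reward + (0.0 if done else gamma * q[next_state].max())
            td_error = target - q[state, action]
            q[state, action] += alpha * td_error  # standard Q-learning update
            total_abs_td += abs(td_error)
            steps += 1
            state = next_state
            if done:
                break
        # Running average of the mean absolute TD error for this start state.
        interest[start] = 0.5 * interest[start] + 0.5 * total_abs_td / steps
    return q, interest

q_table, interest_scores = train()
```

With this heuristic, start states whose episodes still produce large updates keep being selected, while states whose values have settled are visited less often. A Bayesian-optimization approach would replace the `interest` array with a model-based estimate of improvement.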

The environment is tabular (discrete states and discrete actions), and a customizable Grid-World environment is implemented.
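
For readers unfamiliar with tabular environments, the following is a minimal sketch of what a customizable grid world can look like. It is an assumption-laden illustration, not the repository's class: the name `GridWorld`, its fields, the action ordering, and the reward values are all hypothetical.

```python
from dataclasses import dataclass

# (row delta, column delta) for the actions up, right, down, left.
ACTIONS = ((-1, 0), (0, 1), (1, 0), (0, -1))

@dataclass
class GridWorld:
    """Hypothetical tabular grid world; states are integer cell indices."""
    n_rows: int = 4
    n_cols: int = 4
    goal: tuple = (3, 3)
    walls: frozenset = frozenset()  # cells the agent cannot enter
    step_reward: float = -1.0
    goal_reward: float = 10.0

    @property
    def n_states(self):
        return self.n_rows * self.n_cols

    @property
    def n_actions(self):
        return len(ACTIONS)

    def to_index(self, cell):
        row, col = cell
        return row * self.n_cols + col

    def to_cell(self, index):
        return divmod(index, self.n_cols)

    def step(self, state, action):
        """Apply an action; moves into a wall or off the grid leave the agent in place."""
        row, col = self.to_cell(state)
        d_row, d_col = ACTIONS[action]
        new_cell = (row + d_row, col + d_col)
        in_bounds = 0 <= new_cell[0] < self.n_rows and 0 <= new_cell[1] < self.n_cols
        if not in_bounds or new_cell in self.walls:
            new_cell = (row, col)
        done = new_cell == self.goal
        reward = self.goal_reward if done else self.step_reward
        return self.to_index(new_cell), reward, done

# Example: a 3x3 grid with one wall in the centre and the goal in the corner.
env = GridWorld(n_rows=3, n_cols=3, goal=(2, 2), walls=frozenset({(1, 1)}))
next_state, reward, done = env.step(env.to_index((2, 1)), 1)  # move right into the goal
```

Because states and actions are plain integers, a Q-table for this environment is simply an array of shape `(env.n_states, env.n_actions)`.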
