solve_gym

Various methods and reinforcement learning libraries (Stable Baselines and my own code) for solving OpenAI Gym environments.
Note that Stable Baselines runs in a separate virtual environment; I used autoswitch-venv to switch between the two.
So far I have implemented Monte Carlo (some borrowed here), several TD methods (the SARSA family and the Q-learning family), and Bellman-equation solvers (linear programming, policy iteration, and value iteration). Each implementation comes with a test example.
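To make the difference between the SARSA and Q-learning families concrete, here is a minimal sketch of the two tabular TD(0) updates. It is not the code in this repo: the function names, the list-of-lists `Q` table, and the default `alpha`/`gamma` values are assumptions chosen for illustration.

```python
import random


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """On-policy TD(0): bootstrap from the action actually taken next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Off-policy TD(0): bootstrap from the greedy action in the next state."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def epsilon_greedy(Q, s, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    n_actions = len(Q[s])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[s][a])


# Hypothetical 2-state, 2-action table, only to show the two targets differ.
Q = [[0.0, 0.0], [1.0, 3.0]]
sarsa_update(Q, s=0, a=0, r=1.0, s_next=1, a_next=0, done=False, alpha=0.5, gamma=0.9)
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, done=False, alpha=0.5, gamma=0.9)
```

The only difference is the bootstrap term: SARSA uses the value of the next action the behaviour policy picked, while Q-learning uses the maximum over next actions.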
TODO
- Add a rounds parameter for policy and value iteration.
- Add prioritized replay to DQN.
- Add automated tests.
- Add more complex algorithms such as the actor-critic (AC) and policy-gradient (PG) families, along with example solutions.
- Perhaps change the API a bit so that it looks more like Stable Baselines'.
- There's a good NEAT library in Python; perhaps I'd use it to reproduce a lot of the examples. (Parameter tuning can be quite time-consuming, though.)
- Another general-purpose algorithm like NEAT is MuZero, which is very effective at Atari games and board games, so perhaps implement that too. (Needs a beefier machine.)
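As a sketch of what a rounds parameter for value iteration might look like, the function below caps the number of sweeps over the state space. This is one interpretation, not the repo's API: `max_rounds`, `theta`, and the helper `q_value` are hypothetical names. The transition model `P` is assumed to follow the layout used by Gym's toy-text environments, `P[s][a] = [(prob, next_state, reward, done), ...]`.

```python
def q_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s under model P."""
    return sum(
        prob * (reward + (0.0 if done else gamma * V[s_next]))
        for prob, s_next, reward, done in P[s][a]
    )


def value_iteration(P, n_states, n_actions, gamma=0.99, theta=1e-8, max_rounds=None):
    """In-place value iteration; stops on convergence or after max_rounds sweeps."""
    V = [0.0] * n_states
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        delta = 0.0
        for s in range(n_states):
            best = max(q_value(P, V, s, a, gamma) for a in range(n_actions))
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # updated values are reused within the same sweep
        rounds += 1
        if delta < theta:
            break
    # Greedy policy with respect to the final value estimate.
    policy = [
        max(range(n_actions), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy, rounds


# Hypothetical 2-state MDP: action 1 in state 0 reaches the terminal state 1.
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, True)]},
    1: {0: [(1.0, 1, 0.0, True)], 1: [(1.0, 1, 0.0, True)]},
}
V, policy, rounds = value_iteration(P, n_states=2, n_actions=2, gamma=0.9)
```

Passing `max_rounds=1` would run a single sweep and return whatever estimate that produces, which can be handy for stepping through the algorithm or for bounding runtime on larger state spaces.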

Repository contents
- methods
- not_working
- ongoing
- others_code
- stable_baselines
- standard_apis
- .venv
- README.md
- __init__.py
- ac_cartpole.py
- bellman_cliffwalking.py
- bellman_copy.py
- bellman_frozenlake.py
- bellman_taxiv3.py
- dp_cliffwalking.py
- dp_custom_gridworld.py
- dp_frozen_lake.py
- mc_blackjack.py
- requirements.txt
- sutton_notes_and_slns.txt
- td_approx.py
- td_cliffwalking.py
- td_frozenlake.py
- td_taxi.py
