BrianSURE2021

Brian Ko's (KyungMin Ko) work for SURE 2021

All of my code uses PyTorch and Wandb. Wandb is a great tool for logging machine learning experiments. See the documentation for details: https://docs.wandb.ai/

REINFORCE

REINFORCE_Batch_Update: Implementation of the REINFORCE algorithm, tested with different update frequencies.

REINFORCE_Foward_Backward: My implementation of the REINFORCE algorithm with the backward method, which updates theta from t = T-1 down to t = 1. Tested on both the CartPole and LunarLander-v2 environments.
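A backward sweep over an episode is also a convenient way to obtain the discounted returns that REINFORCE weights each step by. The sketch below is a minimal illustration in plain Python, not the code in this repository; the function name `discounted_returns` and the default `gamma` are assumptions made for the example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t for every step t, computed in one backward sweep.

    Hypothetical helper for illustration; not taken from this repository.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk from the last step to the first: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Three steps of reward 1 with gamma = 0.5.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Going backward reuses the return of step t+1 at step t, so the whole episode costs one pass instead of one sum per step.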

REINFORCE with Baseline

Implementation of REINFORCE with a baseline, using a value function as the baseline. Tested with the backward, forward, and average loss methods, with results logged on Wandb.
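As a rough sketch of how a value-function baseline enters the loss, the example below subtracts the predicted value from the return and averages the per-step terms. It uses plain Python floats so it runs without dependencies; in a PyTorch version these would be tensors, and the advantage would be detached in the policy term. The function name `baseline_losses` and the choice to average over steps are assumptions, not a description of this repository's code.

```python
def baseline_losses(log_probs, returns, values):
    """Average policy and value losses for one episode.

    Hypothetical helper for illustration; not taken from this repository.
    log_probs[t]: log pi(a_t | s_t), returns[t]: G_t, values[t]: V(s_t).
    """
    # Advantage: how much better the return was than the baseline predicted.
    advantages = [g - v for g, v in zip(returns, values)]
    # Policy term: -log pi(a_t | s_t) * (G_t - V(s_t)).
    policy_terms = [-lp * adv for lp, adv in zip(log_probs, advantages)]
    # Value term: squared error between the return and the prediction.
    value_terms = [adv ** 2 for adv in advantages]
    n = len(advantages)
    return sum(policy_terms) / n, sum(value_terms) / n


if __name__ == "__main__":
    policy_loss, value_loss = baseline_losses(
        log_probs=[-1.0, -2.0], returns=[3.0, 1.0], values=[1.0, 1.0]
    )
    print(policy_loss, value_loss)
```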

TRPO

Experiments comparing TRPO with the DQN and ACKTR algorithms, using the Stable Baselines3 implementations on LunarLander-v2.

TD Actor Critic

Implementation of TD Actor Critic with separate neural networks for the actor and the critic, used to compare the Adam and RMSprop optimizers.
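The quantity that drives both networks in a one-step TD actor-critic is the TD error. The sketch below shows that computation in plain Python under the usual one-step formulation; the function name `td_error` and the default `gamma` are assumptions for illustration and are not taken from this repository.

```python
def td_error(reward, value, next_value, done, gamma=0.99):
    """One-step TD error: r + gamma * V(s') - V(s).

    Hypothetical helper for illustration; not taken from this repository.
    """
    # A terminal state has no future value, so the bootstrap term is dropped.
    target = reward + (0.0 if done else gamma * next_value)
    return target - value


if __name__ == "__main__":
    print(td_error(reward=1.0, value=0.5, next_value=2.0, done=False, gamma=0.5))
    print(td_error(reward=1.0, value=0.5, next_value=2.0, done=True, gamma=0.5))
```

In a typical setup the critic is trained to shrink the squared TD error, while the actor's log-probability of the chosen action is scaled by it, each network with its own optimizer.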
