Implementation Details
The project is structured around 3 main components:
- An environment (which is abstracted by the `gym.Environment` class). The environment receives actions, and outputs states and rewards.
- An agent (which is abstracted by our own `ReinforcementLearning` class). The agent receives states and outputs actions.
- A manager (which is abstracted by our own `Manager` class). The manager coordinates the sending and receiving of actions and states. Managers help with training agents, hyper-parameter optimization, executing episodes, and printing overviews of the environment. A sketch of this interaction loop follows the list.
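To make the division of responsibilities concrete, here is a minimal sketch of that interaction loop using the classic `gym` step API. `RandomAgent` and `run_episode` are illustrative placeholders, not classes from this project:

```python
# Minimal sketch of the environment/agent/manager interaction loop.
# RandomAgent and run_episode are illustrative placeholders, not project classes.
import gym


class RandomAgent:
    """Stands in for a ReinforcementLearning subclass: receives states, outputs actions."""

    def __init__(self, action_space):
        self.action_space = action_space

    def action_for_state(self, state):
        # a real agent would use the state; this placeholder acts randomly
        return self.action_space.sample()


def run_episode(env, agent):
    """Stands in for a Manager's job: shuttle actions, states, and rewards around."""
    state = env.reset()
    done, total_reward = False, 0.0
    while not done:
        action = agent.action_for_state(state)
        state, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward


env = gym.make("CartPole-v1")
print(run_episode(env, RandomAgent(env.action_space)))
```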
Besides these, the project uses a few other important abstractions:
- `LearningStatistics` collects the different metrics that agents may output during training. It provides convenient ways to retrieve the metrics and plot them, and it allows aggregation on many levels, like model, episode, time-step, environment, and metric.
- `BaseNetwork` is a base class for PyTorch `nn.Module` subclasses that provides functionality for saving/loading models, enabling the GPU, soft parameter updates, freezing weights, plotting the architecture diagram, and running backpropagation.
- `ExperienceBuffer` is an abstraction for agents that use the experience replay technique. It allows experiences to be stored and sampled, and lets the buffer report whether it contains enough experiences to start making predictions. Sketches of the soft parameter update and of a replay buffer follow this list.
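For intuition, here is a minimal sketch of the soft parameter update mentioned for `BaseNetwork`: target parameters are nudged a small step `tau` towards the online network's parameters (illustrative code, not the library's implementation):

```python
import torch


def soft_update(target: torch.nn.Module, online: torch.nn.Module, tau: float = 0.005) -> None:
    """Blend online parameters into the target network: theta_t <- tau*theta_o + (1-tau)*theta_t."""
    for t_param, o_param in zip(target.parameters(), online.parameters()):
        t_param.data.copy_(tau * o_param.data + (1.0 - tau) * t_param.data)
```

And here is a sketch of the experience replay idea behind `ExperienceBuffer` (the class and method names are placeholders, not the project's API):

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", "state action reward next_state done")


class SimpleReplayBuffer:
    """Stores transitions and samples random mini-batches for training."""

    def __init__(self, capacity: int = 10_000, min_experiences: int = 100):
        self.buffer = deque(maxlen=capacity)  # old experiences are evicted automatically
        self.min_experiences = min_experiences

    def store(self, state, action, reward, next_state, done) -> None:
        self.buffer.append(Experience(state, action, reward, next_state, done))

    def sample(self, batch_size: int = 32):
        return random.sample(self.buffer, batch_size)

    def ready(self) -> bool:
        # whether enough experiences were collected to start learning from the buffer
        return len(self.buffer) >= self.min_experiences
```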
The implemented managers are:
- `pydeeprecsys.rl.manager.MovieLensFairnessManager`

The implemented agents are:
- `pydeeprecsys.rl.agent.dqn.DQNAgent`
- `pydeeprecsys.rl.agent.rainbow.RainbowDQNAgent` [WIP]
- `pydeeprecsys.rl.agent.reinforce.ReinforceAgent`
- `pydeeprecsys.rl.agent.actor_critic.ActorCriticAgent`
- `pydeeprecsys.rl.agent.soft_actor_critic.SoftActorCritic` [WIP]
The implemented networks are:
- `pydeeprecsys.rl.networks.dueling.DuelingDDQN`
- `pydeeprecsys.rl.networks.value_estimator.ValueEstimator`
- `pydeeprecsys.rl.networks.policy_estimator.PolicyEstimator`
- `pydeeprecsys.rl.networks.deep_q_network.DeepQNetwork` [WIP]
- `pydeeprecsys.rl.networks.gaussian_actor.GaussianActor` [WIP]
- `pydeeprecsys.rl.networks.q_value_estimator.TwinnedQValueEstimator` [WIP]
The implemented experience replay buffers are:
- `pydeeprecsys.rl.experience_replay.experience_buffer.ExperienceReplayBuffer`
- `pydeeprecsys.rl.experience_replay.priority_replay_buffer.PrioritizedExperienceReplayBuffer`

Note: buffer parameters are abstracted in separate classes, to improve code readability.
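As an illustration of that note, grouping buffer settings in a small dedicated class might look like the following hypothetical sketch (`BufferParameters` and its fields are examples, not the library's actual names):

```python
from dataclasses import dataclass


@dataclass
class BufferParameters:
    """Groups replay-buffer settings so a buffer constructor takes one object instead of many arguments."""

    max_experiences: int = 10_000
    minimum_experiences_to_start: int = 100
    batch_size: int = 32
    # prioritized-replay settings (hypothetical field names)
    alpha: float = 0.6  # how strongly prioritization is applied
    beta: float = 0.4   # importance-sampling correction strength
```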
The implemented exploration methods are:
- `pydeeprecsys.rl.agents.epsilon_greedy.DecayingEpsilonGreedy`
- `pydeeprecsys.rl.neural_networks.noisy_layer.NoisyLayer`

Note: `DecayingEpsilonGreedy` can be parametrized to behave as a standard ϵ-greedy method, by setting `decay_rate=1`.
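To make that note concrete, a minimal decaying ϵ-greedy rule looks like the sketch below (illustrative, not the project's implementation). With `decay_rate=1`, ϵ never shrinks, which is exactly the standard ϵ-greedy behaviour:

```python
import random


class DecayingEpsilonGreedySketch:
    """Illustrative decaying epsilon-greedy exploration; not the library's implementation."""

    def __init__(self, epsilon: float = 1.0, decay_rate: float = 0.99, minimum_epsilon: float = 0.05):
        self.epsilon = epsilon
        self.decay_rate = decay_rate
        self.minimum_epsilon = minimum_epsilon

    def choose(self, greedy_action, random_action):
        # explore with probability epsilon, otherwise exploit the greedy action
        action = random_action if random.random() < self.epsilon else greedy_action
        # decay epsilon after every decision; decay_rate=1 keeps epsilon constant,
        # reducing this to the standard epsilon-greedy method
        self.epsilon = max(self.epsilon * self.decay_rate, self.minimum_epsilon)
        return action
```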