Releases: LovelyBuggies/MFG-RL-PIDL
Version 2.0.0
- Source code for the paper "A Hybrid Framework of Reinforcement Learning and Physics-Informed Deep Learning for Spatiotemporal Mean Field Games".
- Pure PIDL and RL+PIDL algorithms.
Version 1.0.0
Makes RL (DDPG) + PIDL work; we can now train three networks together for the three rewards.
Contributions:
- Using DDPG, the actor can output continuous speeds (sketched after this entry).
- Integrated with PIDL.
- Uses fictitious play to calculate the speed.
- Apart from plotting, no arrays are involved (everything is a network).
- Allows options such as starting with a supervised critic, turning PIDL training off, smooth plotting, etc.
- Provides notebooks for pure PIDL and for the critic.
Known issue: the separable reward depends on the actor's initial weights, so it sometimes fails to produce the expected results.
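A minimal sketch of the continuous-speed actor mentioned above, assuming a PyTorch MLP with a sigmoid-bounded output; the architecture, speed bound, and names are illustrative rather than the repo's code. With DDPG the policy is deterministic, so the actor can emit a speed directly instead of scoring discrete actions.

```python
import torch
import torch.nn as nn

class SpeedActor(nn.Module):
    """Deterministic DDPG-style actor mapping (x, t) to a continuous speed."""
    def __init__(self, u_max=1.0, hidden=64):
        super().__init__()
        self.u_max = u_max
        self.body = nn.Sequential(
            nn.Linear(2, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xt):
        # Sigmoid keeps the speed in [0, u_max]; the bound itself is an assumption.
        return self.u_max * torch.sigmoid(self.body(xt))

actor = SpeedActor()
u = actor(torch.tensor([[0.3, 0.5]]))  # continuous speed at (x=0.3, t=0.5)
```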
Version 0.3.1
Replaces get_rho_from_u and the one-step supervised training with get_rho_network_from_u (sketched after the notes below).
Notes:
- Initialization still uses supervised learning.
- The rho array is not used in the outer loop, only for plotting.
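A hedged sketch of what a routine like get_rho_network_from_u could look like; the function name comes from this note, while the continuity-equation residual, the random collocation sampling, and the omission of initial/boundary terms are assumptions rather than the repo's implementation. The point is that the routine returns a trained rho network instead of a rho array plus one supervised step.

```python
import torch
import torch.nn as nn

def get_rho_network_from_u(u_fn, rho_net=None, n_iters=500, lr=1e-3):
    """Fit a density network rho(x, t) to a given speed function u_fn(x, t)."""
    if rho_net is None:
        rho_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                                nn.Linear(64, 64), nn.Tanh(),
                                nn.Linear(64, 1))
    opt = torch.optim.Adam(rho_net.parameters(), lr=lr)
    for _ in range(n_iters):
        # Random collocation points in [0, 1]^2; column 0 is x, column 1 is t.
        xt = torch.rand(256, 2, requires_grad=True)
        rho = rho_net(xt)
        flux = rho * u_fn(xt)
        # Residual of the continuity equation rho_t + (rho * u)_x = 0
        # (initial/boundary terms are omitted in this sketch).
        drho = torch.autograd.grad(rho.sum(), xt, create_graph=True)[0]
        dflux = torch.autograd.grad(flux.sum(), xt, create_graph=True)[0]
        loss = (drho[:, 1] + dflux[:, 0]).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return rho_net
```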
Version 0.3.0
Makes the non-separable case work with RL, training the actor, critic, and rho_net together; the separable case still has some problems.
Algorithm (sketched below):
- Every tau steps, train the critic.
- Train the actor.
- Train the rho network.
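A minimal sketch of that ordering; the helper names, tau value, and iteration count are placeholders, since the actual update rules live in the repo.

```python
def training_loop(train_critic, train_actor, train_rho_net,
                  n_iterations=1000, tau=10):
    """Only the ordering comes from the note above; the updates are callables."""
    for it in range(n_iterations):
        if it % tau == 0:      # critic is refreshed only every tau steps
            train_critic()
        train_actor()          # actor updated every iteration
        train_rho_net()        # rho network updated every iteration

# No-op wiring just to show the call pattern:
training_loop(lambda: None, lambda: None, lambda: None, n_iterations=4, tau=2)
```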
Version 0.2.2
Unlike V 0.2.1, this version replaces the actor and critic with u and V, but updates rho by one step in each iteration.
Notes:
- Change MFG_VI.py line 10 and value_iteration_ddpg.py lines 90-92 and 95-100 to switch among LWR, Non-SEP, and SEP.
- The rho network is supervised on (x, t) inputs (sketched after these notes).
- LWR needs a learning rate of 0.1; the other two need 0.001.
- The V 0.2.1 Non-SEP failure may have been caused by the learning rate.
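A hedged sketch of the supervised rho network on (x, t); the helper name, grid shape, and unit domain are assumptions, and only the one-step-per-outer-iteration idea and the learning rates come from the notes.

```python
import torch
import torch.nn as nn

def supervise_rho_network(rho_net, rho_grid, n_steps=1, lr=1e-3):
    """Regress rho_net(x, t) onto a target density grid."""
    n_x, n_t = rho_grid.shape
    xs = torch.linspace(0.0, 1.0, n_x)
    ts = torch.linspace(0.0, 1.0, n_t)
    X, T = torch.meshgrid(xs, ts, indexing="ij")
    xt = torch.stack([X.flatten(), T.flatten()], dim=1)   # (n_x * n_t, 2)
    target = rho_grid.reshape(-1, 1)
    opt = torch.optim.Adam(rho_net.parameters(), lr=lr)
    for _ in range(n_steps):   # one step per outer iteration in V 0.2.2
        loss = (rho_net(xt) - target).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return rho_net

# Example wiring with a random target grid; per the notes, use lr=0.1 for LWR
# and lr=0.001 for the other two cases.
rho_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
supervise_rho_network(rho_net, torch.rand(32, 32), lr=0.001)
```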
Version 0.2.1
Like the tag iterative-rl: training the three networks together and iterative RL both work for LWR, and we can get results from the inaccurate rho (which is where the .1 in the version comes from), but not for the non-separable case.
Version 0.2.0
For the ring road, like the tag make-lwr-work, this version updates the actor and critic synchronously (compared with V 0.1.0), based on an initially accurate rho.
Version 0.1.0
Deliveries:
- Makes DDPG work on a ring road.
- The actor uses a continuous (fake) critic rather than interpolation; the critic then uses the actor and the fake critic to learn.
Notes:
- No fictitious play for the actor yet, only for the u array (sketched after these notes).
- Routing.
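A minimal sketch of fictitious play on the u array; the running-average rule is the standard one, while the grid size and the random stand-in for the best-response speeds are purely illustrative. At this stage only the array is averaged, not the actor network itself.

```python
import numpy as np

def fictitious_play_update(u_avg, u_best_response, iteration):
    """Running average of the speed profile over iterations 0..iteration."""
    return (iteration * u_avg + u_best_response) / (iteration + 1)

u_avg = np.zeros((32, 32))            # speeds on an (x, t) grid
for k in range(10):
    u_br = np.random.rand(32, 32)     # stand-in for the DDPG best response
    u_avg = fictitious_play_update(u_avg, u_br, k)
```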
Version 0.1.0 alpha
This version is a preparation for V 0.1.0, using a single-link road and discretized actions with the parameter n_actions (sketched below).
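A hedged sketch of the discretized action space; only n_actions comes from the note, and the speed range and helper name are assumptions. Each action index maps to an evenly spaced speed, which the continuous DDPG actor in V 0.1.0 later made unnecessary.

```python
import numpy as np

def make_action_space(n_actions, u_max=1.0):
    """Evenly spaced candidate speeds in [0, u_max]."""
    return np.linspace(0.0, u_max, n_actions)

speeds = make_action_space(n_actions=5)
print(speeds)  # [0.   0.25 0.5  0.75 1.  ]
```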