Commit

Merge pull request #19 from joseph-jnl/gridworld
update ch3 docs with images
joseph-jnl authored Jan 30, 2025
2 parents 4b02a23 + 4b75e1c commit eb348bd
Showing 2 changed files with 4 additions and 2 deletions.
4 changes: 3 additions & 1 deletion docs/chap3_finite_mdps.md
@@ -3,21 +3,23 @@
```bash
python run.py
```
![image](https://github.com/user-attachments/assets/216229c6-5925-4479-9cd7-52a62f4ad4fb)
/// caption
Figure 3.2: State-value function for a random policy (equal probability for all directions). Config uses this example as a default.
///
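
As a rough illustration of what the figure above represents, the sketch below runs iterative policy evaluation for an equiprobable random policy on a 5x5 gridworld. It is not taken from this repository's `run.py`. The grid size, discount factor, special-state positions, and rewards are assumptions based on the gridworld example in Sutton and Barto's book, and names such as `SPECIAL`, `step`, and `evaluate_random_policy` are hypothetical.

```python
# Minimal sketch, NOT the repository's implementation.
# Assumed setup (from the book's gridworld example): 5x5 grid, gamma = 0.9,
# moving off the grid costs -1 and leaves the state unchanged, and two
# special states teleport the agent with a positive reward.

N = 5
GAMMA = 0.9

# Assumed special states: (row, col) -> ((next_row, next_col), reward)
SPECIAL = {
    (0, 1): ((4, 1), 10.0),
    (0, 3): ((2, 3), 5.0),
}

# Up, down, left, right, each chosen with probability 0.25
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action):
    """Return (next_state, reward) for a deterministic move."""
    if state in SPECIAL:
        return SPECIAL[state]
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < N and 0 <= c < N:
        return (r, c), 0.0
    # Off-grid move: stay in place and receive a penalty
    return state, -1.0


def evaluate_random_policy(tol=1e-6):
    """Sweep the Bellman expectation backup until the largest update is below tol."""
    v = [[0.0] * N for _ in range(N)]
    while True:
        new_v = [[0.0] * N for _ in range(N)]
        delta = 0.0
        for r in range(N):
            for c in range(N):
                total = 0.0
                for a in ACTIONS:
                    (nr, nc), reward = step((r, c), a)
                    total += 0.25 * (reward + GAMMA * v[nr][nc])
                new_v[r][c] = total
                delta = max(delta, abs(total - v[r][c]))
        v = new_v
        if delta < tol:
            return v


values = evaluate_random_policy()
for row in values:
    print(" ".join(f"{x:6.1f}" for x in row))
```

Under these assumptions the printed grid should be comparable to the state values shown in the figure, although the repository may use a different solver or stopping criterion.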

```bash
python run.py grid=example_3_8_optimal_grid plots=example_3_8_optimal_grid
```

![image](https://github.com/user-attachments/assets/9b2a7a65-5f23-4b64-9217-f96b20dd8ebd)
/// caption
Figure 3.5: Optimal state-value function and policy for a gridworld.
///

```bash
python run.py grid.n_rows=10 grid.n_cols=10 grid.special_states=[[0,0,8,1],[1,3,7,9]] grid.special_states_prime=[[4,2,1,8],[1,3,7,1]] grid.special_states_rewards=[10,5,8,15] plots.policy=true
```
![image](https://github.com/user-attachments/assets/0a808f1d-e82b-4f31-9d4a-cc8b9833313d)
/// caption
Example of creating a new gridworld with arbitrary size and rewards.
///
2 changes: 1 addition & 1 deletion docs/index.md
@@ -12,4 +12,4 @@ python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +band
Figure 2.3 (rlbook): The `+bandit.random_argmax=true` flag was used to switch over to an argmax implementation that randomizes between tiebreakers rather than first occurence used in the default numpy implementation to better align with the original example.
[Link to wandb artifact](https://api.wandb.ai/links/josephjnl/53gxgbcc)

- Further details on experimental setup and results can be found corresponding chapter docs.
+ Further details on experimental setup and results can be found within the corresponding chapter docs.
