This repository is an example DQfD baseline submission built with PFRL, derived from the main MineRL Competition submission template and starter kit.
For detailed and up-to-date documentation about the competition and the template, see the original template repository.
This repository is a sample "Round 1" submission, i.e., the agents are trained locally.
test.py is the entrypoint script for Round 1.
Please ignore train.py, which will be used in Round 2.
The train/ directory contains the baseline agent's model weight files, trained on MineRLObtainDiamondDenseVectorObf-v0.
After signing up for the competition, specify your account data in aicrowd.json.
See the official doc for detailed information.
Then you can create a submission by pushing a tag to your repository on https://gitlab.aicrowd.com/. Any pushed tag whose name begins with "submission-" is considered a submission.
If everything works out correctly, you should be able to see your score on the competition leaderboard.
This baseline consists of two main steps:
- Apply K-means clustering for the action space with the demonstration dataset.
- Apply the DQfD algorithm on the discretized action space.
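The first step above can be sketched as follows. This is a minimal illustration using scikit-learn's KMeans, not the repository's actual code: the helper names, the number of clusters (30), the 64-dimensional action vectors, and the {"vector": ...} action format are assumptions, and random data stands in for the demonstration dataset.

```python
import numpy as np
from sklearn.cluster import KMeans


def fit_action_clusters(demo_actions, n_clusters=30, seed=0):
    """Fit K-means on demonstration action vectors (hypothetical helper)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans.fit(demo_actions)
    return kmeans


def to_discrete(kmeans, actions):
    """Map continuous action vectors to the index of the nearest centroid."""
    return kmeans.predict(actions)


def to_continuous(kmeans, index):
    """Map a discrete action index back to its centroid vector."""
    return kmeans.cluster_centers_[index]


# Synthetic stand-in for demonstration actions (assumed shape: N x 64).
rng = np.random.default_rng(0)
demo_actions = rng.uniform(-1.0, 1.0, size=(1000, 64)).astype(np.float32)

kmeans = fit_action_clusters(demo_actions, n_clusters=30)

# Demonstrations relabelled with discrete actions, usable by a DQN-style agent.
discrete_demo_actions = to_discrete(kmeans, demo_actions)

# A discrete action chosen by the agent, converted back for the environment
# (the dict layout is an assumption about the obfuscated action space).
env_action = {"vector": to_continuous(kmeans, 3)}
```

With this mapping, the agent only ever chooses among the cluster indices, and each index is translated back to its centroid before being sent to the environment.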
Each step uses an existing library.
K-means in step 1 is from scikit-learn,
and DQfD in step 2 is based on the DoubleDQN agent in PFRL,
which is a PyTorch-based RL library.
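For readers unfamiliar with Double DQN, the sketch below shows the standard double Q-learning target on a discrete action space, written in plain NumPy. It is not PFRL's implementation and it omits the demonstration-specific parts of DQfD; the function name and the toy values are made up for illustration.

```python
import numpy as np


def double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Compute Double DQN targets for a batch (illustrative helper).

    The online network selects the greedy next action, and the target
    network evaluates it: y = r + gamma * (1 - done) * Q_target(s', a*).
    """
    greedy_actions = np.argmax(q_online_next, axis=1)
    batch_indices = np.arange(len(greedy_actions))
    next_values = q_target_next[batch_indices, greedy_actions]
    return rewards + gamma * (1.0 - dones) * next_values


# Toy batch of two transitions with three discrete actions each.
q_online_next = np.array([[1.0, 2.0, 0.5], [0.2, 0.1, 0.3]])
q_target_next = np.array([[0.9, 1.5, 2.0], [0.4, 0.6, 0.8]])
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])  # the second transition is terminal

targets = double_dqn_targets(q_online_next, q_target_next, rewards, dones)
```

In the first transition the online network prefers action 1, so the target uses the target network's value for that action (1.5) rather than the target network's own maximum (2.0).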
To train your agent, you can call the main function in train.py, as done in the commented-out lines in run.py.
The mod/ directory contains everything you need to train the agent locally:
# Don't forget to set this environment variable
export MINERL_DATA_ROOT=<directory you want to store demonstration dataset>
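Since a missing MINERL_DATA_ROOT tends to surface as a confusing error later on, a training script could check it up front. The following is a small sketch of such a check; the helper name get_minerl_data_root is hypothetical and not part of this repository or of MineRL.

```python
import os
from pathlib import Path


def get_minerl_data_root(environ=os.environ):
    """Return the demonstration dataset directory, failing early if unset."""
    root = environ.get("MINERL_DATA_ROOT")
    if not root:
        raise RuntimeError(
            "MINERL_DATA_ROOT is not set; export it before training."
        )
    return Path(root).expanduser()


# Example with an explicit mapping, so the snippet runs without the variable set.
data_root = get_minerl_data_root({"MINERL_DATA_ROOT": "/tmp/minerl_data"})
```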
The quick-start kit was authored by Shivam Khandelwal with help from William H. Guss.
The competition is organized by the following team:
- William H. Guss (Carnegie Mellon University)
- Mario Ynocente Castro (Preferred Networks)
- Cayden Codel (Carnegie Mellon University)
- Katja Hofmann (Microsoft Research)
- Brandon Houghton (Carnegie Mellon University)
- Noboru Kuno (Microsoft Research)
- Crissman Loomis (Preferred Networks)
- Keisuke Nakata (Preferred Networks)
- Stephanie Milani (University of Maryland, Baltimore County and Carnegie Mellon University)
- Sharada Mohanty (AIcrowd)
- Diego Perez Liebana (Queen Mary University of London)
- Ruslan Salakhutdinov (Carnegie Mellon University)
- Shinya Shiroshita (Preferred Networks)
- Nicholay Topin (Carnegie Mellon University)
- Avinash Ummadisingu (Preferred Networks)
- Manuela Veloso (Carnegie Mellon University)
- Phillip Wang (Carnegie Mellon University)