- Project Overview
- Installation
- Project Structure
- Usage
- Training Process
- Customization
- Checkpointing
- Contributing
- License
- To Do
## Project Overview

This project implements a Deep Q-Network (DQN) agent that plays Sekiro: Shadows Die Twice using reinforcement learning. The agent uses a pre-trained ResNet18 model (EfficientNet-B0 is also supported) as its backbone and learns to make optimal decisions in the game environment. Integrating a vision transformer/decision transformer is planned for the future.
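As a rough illustration of this design (not the exact code in `network.py`; the class name and action count below are placeholders), a Q-network built on a pre-trained ResNet18 might look like:

```python
import torch
import torch.nn as nn
import torchvision.models as models

class SekiroDQN(nn.Module):
    """ResNet18 backbone with a linear Q-value head.

    num_actions is a placeholder; the real value depends on the
    key bindings used in the Sekiro environment.
    """
    def __init__(self, num_actions: int = 8):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()                # drop the ImageNet classifier
        self.backbone = backbone                   # 512-dim feature extractor
        self.q_head = nn.Linear(512, num_actions)  # one Q-value per action

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.q_head(self.backbone(x))
```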
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/sekiro_rl.git
  cd sekiro_rl
  ```

- Install the required dependencies:

  ```bash
  pip install torch torchvision numpy matplotlib
  ```

- Ensure you have Sekiro: Shadows Die Twice installed and set up for the custom environment.
## Project Structure

- `train.py`: Main script for training the RL agent
- `network.py`: Contains the DQN model architecture
- `env.py`: Custom Sekiro environment (not provided in the snippets)
- `checkpoints/`: Directory for storing model checkpoints
## Usage

The training process is divided into two stages: behavior cloning, followed by reinforcement learning.
To start the first stage (behavior cloning), run:

```bash
python train_bc.py
```

You can customize the training process using various command-line arguments. For example:

```bash
python train_bc.py --lr 0.0001 --batch_size 128 --epochs 500 --cuda
```
Make sure you load the weights of the behavior cloning model you trained in the previous stage (a loading sketch follows). To train the reinforcement learning model, run:

```bash
python train_rl.py
```

Run `python train_rl.py --help` to see all available options.
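If you need to load those weights manually, a minimal sketch is shown below; the class name, checkpoint path, and file layout are assumptions, and `train_rl.py` may handle this through a flag instead:

```python
import torch
from network import DQN  # the architecture defined in network.py (class name assumed)

model = DQN()
# Placeholder path; point this at the checkpoint produced by train_bc.py.
state_dict = torch.load("checkpoints/bc_model.pth", map_location="cpu")
model.load_state_dict(state_dict)
```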
Prerequisites:

- Data Collection: Ensure you have collected and preprocessed data from Sekiro: Shadows Die Twice.
- Label File: Use the `label.csv` file to map actions to specific frames.
- Config File: Use the `sekiro_config.json` file to set the training parameters.
- Image Folder: Ensure the `images` folder is in the correct path.
- Game Resolution: Set the desired game resolution in the `sekiro_config.json` file. (In this repository, the resolution is set to 1280x720; a config-reading sketch follows this list.)
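Reading the resolution back out of the config might look like this; the key names are hypothetical and should be matched to your `sekiro_config.json` schema:

```python
import json

with open("sekiro_config.json") as f:
    config = json.load(f)

# "width"/"height" are hypothetical key names; adjust to your config's schema.
width, height = config["width"], config["height"]  # e.g. 1280, 720
```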
## Training Process

The training process involves the following steps (a minimal code sketch follows the list):
- Initialize the Sekiro environment and the DQN model.
- For each episode:
- Reset the environment to get the initial state.
- For each step in the episode:
- Select an action using an epsilon-greedy policy.
- Perform the action and observe the next state and reward.
- Store the transition in the replay buffer.
- Optimize the model using a batch of experiences from the replay buffer.
- If the episode is done, move to the next episode.
- Periodically save checkpoints of the model.
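A minimal sketch of that loop, assuming hypothetical `env`, `buffer`, `policy_net`/`target_net`, and `optimize` helpers, plus `num_episodes` and `checkpoint_interval` values; the repository's actual interfaces may differ:

```python
import math
import random
import torch

def epsilon_by_step(step, eps_start=0.9, eps_end=0.05, eps_decay=10000):
    # Exponentially decay exploration from eps_start toward eps_end.
    return eps_end + (eps_start - eps_end) * math.exp(-step / eps_decay)

step = 0
for episode in range(num_episodes):
    state = env.reset()                      # reset to get the initial state
    done = False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < epsilon_by_step(step):
            action = env.sample_action()     # explore (assumed env helper)
        else:
            with torch.no_grad():
                action = policy_net(state.unsqueeze(0)).argmax(dim=1).item()  # exploit
        next_state, reward, done = env.step(action)           # assumed interface
        buffer.push(state, action, reward, next_state, done)  # store transition
        optimize(policy_net, target_net, buffer)  # one gradient step on a sampled batch
        state = next_state
        step += 1
    if episode % checkpoint_interval == 0:   # periodic checkpointing
        torch.save(policy_net.state_dict(), f"checkpoints/ep{episode}.pth")
```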
## Customization

You can customize various aspects of the training process (see the `argparse` sketch after this list):

- Learning rate (`--lr`)
- Batch size (`--batch_size`)
- Number of training epochs (`--epochs`)
- Epsilon values for exploration (`--eps_start`, `--eps_end`, `--eps_decay`)
- Discount factor for future rewards (`--gamma`)
- Checkpoint interval (`--checkpoint_interval`)
- Checkpoint directory (`--checkpoint_dir`)
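These options would typically be declared with `argparse`; a sketch of how the flags above might be wired up follows, with defaults that are illustrative rather than the repository's actual values:

```python
import argparse

# Illustrative defaults; the real scripts may use different values.
parser = argparse.ArgumentParser(description="DQN training options")
parser.add_argument("--lr", type=float, default=1e-4, help="learning rate")
parser.add_argument("--batch_size", type=int, default=128, help="minibatch size")
parser.add_argument("--epochs", type=int, default=500, help="number of training epochs")
parser.add_argument("--eps_start", type=float, default=0.9, help="initial exploration rate")
parser.add_argument("--eps_end", type=float, default=0.05, help="final exploration rate")
parser.add_argument("--eps_decay", type=float, default=10000, help="exploration decay constant")
parser.add_argument("--gamma", type=float, default=0.99, help="discount factor")
parser.add_argument("--checkpoint_interval", type=int, default=50, help="episodes between checkpoints")
parser.add_argument("--checkpoint_dir", type=str, default="checkpoints", help="checkpoint directory")
args = parser.parse_args()
```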
## Checkpointing

The training process automatically saves checkpoints at regular intervals. Each checkpoint contains:
- Policy network state
- Target network state
- Optimizer state
- Training arguments
You can use these checkpoints to resume training or evaluate the model at different stages of training.
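Saving and resuming from such a checkpoint might look like this; the dictionary keys mirror the list above but are assumptions, and `policy_net`, `target_net`, `optimizer`, and `args` are assumed to already exist:

```python
import torch

# Save a checkpoint (key names are illustrative).
torch.save({
    "policy_net": policy_net.state_dict(),
    "target_net": target_net.state_dict(),
    "optimizer": optimizer.state_dict(),
    "args": vars(args),
}, "checkpoints/checkpoint_100.pth")

# Load it later to resume training.
ckpt = torch.load("checkpoints/checkpoint_100.pth", map_location="cpu")
policy_net.load_state_dict(ckpt["policy_net"])
target_net.load_state_dict(ckpt["target_net"])
optimizer.load_state_dict(ckpt["optimizer"])
```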
## Contributing

Contributions to this project are welcome! Please follow these steps:
- Fork the repository
- Create a new branch for your feature
- Commit your changes
- Push to your branch
- Create a pull request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## To Do

- ✅ Implement behavior cloning to lightly fine-tune the model before applying heavier reinforcement learning algorithms.
- ✅ Collect and preprocess expert gameplay data for behavior cloning.
- ✅ Integrate behavior cloning into the training pipeline.
- Investigate the effect of different data-preprocessing schemes (would a center crop be useful in my case?).
- Integrate a recurrent structure, such as a ConvLSTM or Transformer, into the model to improve performance.
- Collect more training data (at least an order of magnitude more).
- Find a workaround for imbalanced data.
- Experiment with different reinforcement learning algorithms to improve agent performance.
- Add more detailed logging and visualization of training progress.
For any questions or issues, please open an issue on the GitHub repository.
We would like to thank the authors of the following repositories for their contributions and inspiration:
- Counter-Strike Behavioural Cloning by TeaPearce
- Train Your Own Game AI by ricagj