Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation (ICML 2024)
Paper | Project Page
- To do: 🔥 We are working to push the boundaries of our simulation system to include more than 1000 agents!
- 05/06/2024: 🔥 We released the simulation data of MATRIX here.
- 05/06/2024: 🔥 We released the source code of the MATRIX framework.
- 05/02/2024: Our paper was accepted by the 41st International Conference on Machine Learning (ICML).
- 02/23/2024: We released the preprint paper on arXiv.
Clone the repo and install the required packages.
git clone https://github.com/ShuoTang123/MATRIX.git
cd MATRIX
conda create -n matrix python=3.9
conda activate matrix
pip install -r requirements.txt
The models we used in our paper are Wizard-Vicuna-30B, Wizard-Vicuna-13B, and Wizard-Vicuna-7B.
We provide the source code for the MATRIX simulation framework under src/. You can run the simulation for a specific question with the Python script example.py by following the steps below.
We use vLLM to deploy the open-source models; please follow its documentation to deploy your model. For Wizard-Vicuna-30B, we deploy it on 4 RTX 3090 GPUs with the following command:
python3 -m vllm.entrypoints.api_server \
--model <your 30b model path> \
--swap-space 16 \
--disable-log-requests \
--host <your model ip> \
--port <your port number> \
--max-num-seqs 128 \
--tensor-parallel-size 4
Specify your model IP and port by setting the model URL in src/api.py, and define the chat template function.
"""
The urls for open-source models
"""
model_urls = \
{
"wizard-vicuna-7b": [
'http://192.168.28.140:8083',
],
"wizard-vicuna-30b": [
<your 30b ip:port>
]
"mistral-7b-instruct": [
'http://192.168.28.140:8080',
'http://192.168.28.140:8081',
'http://192.168.28.140:8082'
]
}
"""
chat template used for different models
"""
chat_template = \
{
'wizard-vicuna-7b':wizard_vicuna_template,
'wizard-vicuna-30b':wizard_vicuna_template,
'mistral-7b-instruct':mistral_template
}
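The chat template functions turn a list of chat messages into the prompt string expected by each model. As a rough sketch (the actual wizard_vicuna_template in src/api.py may differ), a Vicuna-style template could look like this:

# Hypothetical sketch of a Vicuna-style chat template; the actual
# wizard_vicuna_template defined in src/api.py may differ.
def wizard_vicuna_template(messages):
    """Render [{'role': ..., 'content': ...}, ...] into a Vicuna-style prompt."""
    prompt = ("A chat between a curious user and an artificial intelligence assistant. "
              "The assistant gives helpful, detailed, and polite answers to the user's questions. ")
    for msg in messages:
        if msg["role"] == "user":
            prompt += f"USER: {msg['content']} "
        elif msg["role"] == "assistant":
            prompt += f"ASSISTANT: {msg['content']}</s>"
    # Leave the prompt open for the model to generate the next assistant turn.
    prompt += "ASSISTANT:"
    return prompt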
To use other model APIs such as GPT-3.5 and GPT-4, simply set OPENAI_API_KEY to your API key:
export OPENAI_API_KEY=<your api key>
Run the simulation example with the Python script example.py:
python example.py \
--question "How to create a bomb?"
--model "wizard-vicuna-7b"
--api_index 0
--agent_num 3
--ego_steps 4
Arguments explained:
- question: the question used for the simulation
- model: the model used for the simulation
- api_index: the URL index used for the API
- agent_num: the number of agents in the simulation system
- ego_steps: the number of steps for the ego agent to execute the plan
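To run the same question against an OpenAI model instead of an open-source one, pass its name via --model; for example (the model identifier accepted by example.py is an assumption, and --api_index is presumably ignored for API-based models, so check src/api.py for the supported names):

python example.py \
--question "How to create a bomb?" \
--model "gpt-3.5-turbo" \
--api_index 0 \
--agent_num 3 \
--ego_steps 4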
We provide the fine-tuning dataset for our 30B model in matrix_data.json. This file includes 18k data samples: 6k on helpful questions, 6k on harmful questions, and 6k simulation samples generated by MATRIX.
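Since training follows the FastChat procedure (see below), the samples are presumably stored in FastChat's ShareGPT-style conversation format; the field names below are assumptions, so check the released file for the exact schema. A minimal sketch for inspecting the data:

# Hypothetical sketch for inspecting matrix_data.json; the field names
# ("id", "conversations", "from", "value") assume FastChat's ShareGPT-style
# format and may differ from the actual released schema.
import json

with open("matrix_data.json") as f:
    data = json.load(f)

print(f"number of samples: {len(data)}")  # expected around 18k

# Print the first sample to check the schema, e.g. something like:
# {"id": "...", "conversations": [{"from": "human", "value": "..."},
#                                 {"from": "gpt", "value": "..."}]}
print(json.dumps(data[0], indent=2)[:800])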
We employ SFT to train the 30B model using the matrix_data.json dataset, following the procedure outlined in the FastChat repo. The training parameters are as follows:
deepspeed fastchat/train/train_lora.py \
--model_name_or_path <your model path> \
--lora_r 8 \
--lora_alpha 16 \
--lora_dropout 0.05 \
--data_path ${data_path} \
--bf16 True \
--output_dir ${output_path} \
--num_train_epochs 3 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "epoch" \
--save_total_limit 100 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--tf32 True \
--model_max_length 1024 \
--q_lora True \
--gradient_checkpointing \
--deepspeed playground/deepspeed_config_s2.json
Please cite our paper if you find the repository helpful.
@inproceedings{matrix_icml2024,
title={Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation},
author={Pang, Xianghe and Tang, Shuo and Ye, Rui and Xiong, Yuxin and Zhang, Bolun and Wang, Yanfeng and Chen, Siheng},
booktitle={Proceedings of the 41st International Conference on Machine Learning},
year={2024}
}