PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization

Dataset

Training data

We provide the training data for PA-RAG, available at
https://drive.google.com/file/d/1agP7fi1iX-3qFK7XFBvRu6rC5X_-M8Iy/view?usp=drive_link

The download includes the following files; a short loading sketch follows the statistics table below.

  • sft_data.json: 58.9k instruction fine-tuning examples
  • dpo_data_ri.json: 11.8k response informativeness preference pairs
  • dpo_data_rr.json: 13.4k response robustness preference pairs
  • dpo_data_cq.json: 22.5k citation quality preference pairs

The questions used for constructing our training data are sourced from ASQA, WebQuestions, and Natural Questions. Detailed statistics are as follows:

(IFT = instruction fine-tuning, RI = response informativeness, RR = response robustness, CQ = citation quality)

        IFT     RI      RR      CQ
ASQA    1,714   1,046   962     631
WebQ    1,681   326     357     653
NQ      55,463  10,416  12,080  21,241
Sum     58,858  11,788  13,399  22,525
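
A minimal sketch for inspecting the downloaded files, assuming each one is a plain JSON list of examples (the record fields are not documented here, so only counts are printed):

# Assumes each training file is a JSON array of examples; adjust the paths
# to wherever the downloaded data was extracted.
import json

for name in ["sft_data.json", "dpo_data_ri.json", "dpo_data_rr.json", "dpo_data_cq.json"]:
    with open(name, encoding="utf-8") as f:
        examples = json.load(f)
    print(f"{name}: {len(examples)} examples")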

Evaluation data

The data for evaluation is available at
https://drive.google.com/file/d/1vn5O_PtUnV3rOC7CAbSsZITG6NQ1EZtx/view?usp=drive_link.
The questions are sourced from the test splits of ASQA, WebQuestions, Natural Questions, and TriviaQA. The documents are retrieved with the dense retriever GTR over the Wikipedia dump from December 20, 2018.

Training

We use the LLaMA-Factory framework to train our models. We select three general-purpose LLMs as base RAG generators: Llama2-7b-chat, Llama2-13b-chat, and Llama3-8b-instruct.
We use full fine-tuning for all training stages and the same hyperparameter settings for all models.
During the instruction fine-tuning phase, we set the batch size to 128 and the learning rate to 2e-5, and train for one epoch.
In the preference optimization phase, we set the batch size to 64 and train for one epoch at each stage. For the response informativeness and response robustness stages, the learning rate is 2e-6; for the citation quality stage, it is 2e-7.
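
For quick reference, the per-stage settings above can be summarized as follows (an illustrative Python summary only; actual training is configured and launched through LLaMA-Factory, and these keys are not LLaMA-Factory option names):

# Illustrative summary of the hyperparameters described above; not a
# LLaMA-Factory config. Data files correspond to the Dataset section.
STAGES = {
    "ift": {"data": "sft_data.json",    "batch_size": 128, "learning_rate": 2e-5, "epochs": 1},
    "ri":  {"data": "dpo_data_ri.json", "batch_size": 64,  "learning_rate": 2e-6, "epochs": 1},
    "rr":  {"data": "dpo_data_rr.json", "batch_size": 64,  "learning_rate": 2e-6, "epochs": 1},
    "cq":  {"data": "dpo_data_cq.json", "batch_size": 64,  "learning_rate": 2e-7, "epochs": 1},
}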

Inference

Inference in the zero-shot setting

CUDA_VISIBLE_DEVICES=0 python inference/inference_vllm.py \
    --model model_path \
    --prompt_file prompts/default.json \
    --eval_file data_path (e.g. data/asqa_dev.json) \
    --output_file output_path \
    --shot 0 \
    --ndoc 5

Evaluation

Download the TRUE NLI model before running the evaluation.

CUDA_VISIBLE_DEVICES=0 python inference/eval.py --f response_to_eval_path --no_rouge --citation
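
A minimal download sketch, assuming the TRUE NLI checkpoint used for citation evaluation is the T5-based model published on the Hugging Face Hub as google/t5_xxl_true_nli_mixture (check inference/eval.py for the exact model path it expects):

# Hypothetical download step: assumes the TRUE NLI checkpoint is the
# Hugging Face model "google/t5_xxl_true_nli_mixture"; verify the path
# expected by inference/eval.py before running.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="google/t5_xxl_true_nli_mixture")
print("NLI model downloaded to", local_dir)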
