RankDVQA: Deep VQA based on Ranking-inspired Hybrid Training
Chen Feng1, Duolikun Danier1, Fan Zhang1, David Bull1
1Visual Information Laboratory, University of Bristol, Bristol, UK, BS1 5DD
in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024.
We propose a new VQA method based on a two-stage training methodology, which allows us to build a large-scale VQA training database without employing human subjects to provide ground-truth labels. This database was used to train a new transformer-based network architecture by exploiting the quality ranking of different distorted sequences rather than minimizing the distance from ground-truth quality labels. RankDVQA consists of two parts: PQANet, which uses convolutional and Swin Transformer layers for feature extraction and local (patch-level) quality prediction, and STANet, which refines the assessment using adaptive spatio-temporal pooling.
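As a rough illustration of the ranking-inspired objective (this is not the repository's actual loss code, and all names and values below are illustrative assumptions), a pairwise margin ranking loss penalizes the network when it ranks the lower-quality sequence above the higher-quality one:

```python
def margin_ranking_loss(s1, s2, target, margin=1.0):
    """Pairwise margin ranking loss, averaged over pairs (illustrative sketch).

    target[i] = +1 means s1[i] should outrank s2[i]; -1 means the reverse.
    loss_i = max(0, -target[i] * (s1[i] - s2[i]) + margin)
    This is the same formula implemented by PyTorch's nn.MarginRankingLoss.
    """
    losses = [max(0.0, -t * (a - b) + margin) for a, b, t in zip(s1, s2, target)]
    return sum(losses) / len(losses)

# A correctly ordered pair with a large score gap incurs a small loss;
# a misordered pair incurs a larger one.
loss_good = margin_ranking_loss([0.9], [0.1], [1.0])  # max(0, -0.8 + 1.0) = 0.2
loss_bad = margin_ranking_loss([0.1], [0.9], [1.0])   # max(0,  0.8 + 1.0) = 1.8
```

Training on such pairwise comparisons only requires knowing which of two distorted sequences has higher quality, which is what lets the training database be built without subjective ground-truth scores.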
Python >= 3.8.6
PyTorch >= 1.10
GPU memory >= 16 GB
Please fill in the registration form to request access; I will then share the download link as soon as possible.
- Download the RankDVQA dataset (please fill in the registration form first).
- Stage 1: Run the training code of PQANet:
python train.py --model=multiscale_v33 --expdir=./models/
- Extract the features and predicted scores from PQANet:
cd STANet
python data_generation.py --json_path=./path_to_database_json_file.json
- Stage 2: Run the training code of STANet:
python train.py --pretrained_model_path=./models/FR_model --data_path=./data_VMAFplus.pkl --save_path=./exp/stanet/
To use test.py, the dataset folders should be structured as follows.
└──── <data directory>/
├──── database/
| ├──── VMAFplus/
| | ├──── ORIG/
| | ├──── TEST/
| | └──── subj_score.json
Stage 1: Run the testing code of PQANet:
python test.py --database=./path_to_database/ --width=1920 --height=1080 --bitDepth=8
In Stage 1, the patch-level results are aggregated with a simple arithmetic average, and the predicted quality scores of the patches are saved.
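The Stage 1 aggregation step amounts to a plain mean over patch predictions; the variable names and scores below are made-up examples, not values from the repository:

```python
# Stage 1 aggregation sketch: PQANet's patch-level predictions for one
# sequence are combined by a simple arithmetic average.
patch_scores = [0.72, 0.65, 0.80, 0.75]  # illustrative PQANet outputs
sequence_score = sum(patch_scores) / len(patch_scores)  # 0.73
```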
Stage 2: Run the testing code of STANet:
cd STANet
python test.py --model_path=./exp/stanet/stanet_epoch_20.pth --json_path=./path_to_database_json_file.json
In Stage 2, the final sequence-level quality score is obtained with adaptive spatio-temporal pooling, which accepts both the patch-level score tensors and the distorted feature maps extracted in Stage 1.
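Conceptually, the adaptive pooling replaces the uniform average of Stage 1 with a weighted one. In STANet the weights are predicted from the feature maps by a learned network; in this hand-rolled sketch (an assumption for illustration only) they are supplied directly:

```python
def weighted_pool(patch_scores, weights):
    """Weighted average of patch scores (conceptual sketch).

    In STANet the per-patch weights would come from the learned network
    operating on the feature maps, not be supplied by hand as here.
    """
    total = sum(weights)
    return sum(s * w for s, w in zip(patch_scores, weights)) / total

# A high-weight patch dominates the pooled sequence-level score.
pooled = weighted_pool([1.0, 0.0], [3.0, 1.0])  # (3.0 + 0.0) / 4.0 = 0.75
```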
@InProceedings{Feng_2024_WACV,
author = {Feng, Chen and Danier, Duolikun and Zhang, Fan and Bull, David},
title = {RankDVQA: Deep VQA Based on Ranking-Inspired Hybrid Training},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024},
pages = {1648-1658}
}
If you find any bugs in the code, feel free to send me an email: [email protected].
The authors appreciate the funding from the UKRI MyWorld Strength in Places Programme (SIPF00006/1), the University of Bristol, and the China Scholarship Council.