
# End-to-end dense video grounding via parallel regression (CVIU 2024)

Fengyuan Shi, Weilin Huang, Limin Wang

arXiv

## Requirements

```shell
conda create -n prvg python=3.10
conda activate prvg
bash install.txt
```

## Dataset

### Visual Features on ActivityNet Captions

Please download the C3D features from the official ActivityNet website: Official C3D Feature.

### Visual Features on TACoS

Please download the C3D features for the training and test sets of the TACoS dataset.
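After downloading, it can be useful to sanity-check a feature file before training. The sketch below is a minimal, hypothetical check assuming each video's C3D features are stored as a 2-D NumPy array of shape `(num_clips, feature_dim)`; the function name and the expected layout are illustrative, not part of this repository.

```python
import numpy as np

def check_feature(path):
    """Load one C3D feature file and return its shape.

    Hypothetical helper: assumes a .npy file holding a 2-D array
    of shape (num_clips, feature_dim).
    """
    feats = np.load(path)
    assert feats.ndim == 2, "expected a (num_clips, feature_dim) array"
    return feats.shape
```

If a file fails this check, the download may be incomplete or stored in a different format (e.g. HDF5), in which case the loading code should be adapted accordingly.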

## Inference

### Checkpoints

```shell
# ActivityNet Captions
export CUDA_VISIBLE_DEVICES=0
python eval.py --verbose --cfg ../experiments/activitynet/acnet_test.yaml

# TACoS
export CUDA_VISIBLE_DEVICES=1
python eval.py --verbose --cfg ../experiments/tacos/tacos_test.yaml
```

## Training

```shell
# ActivityNet Captions
export CUDA_VISIBLE_DEVICES=0
python main.py --verbose --cfg ../experiments/activitynet/acnet.yaml

# TACoS
export CUDA_VISIBLE_DEVICES=1
python main.py --verbose --cfg ../experiments/tacos/tacos.yaml
```

## Citation

If you make use of our work, please cite our paper:

```bibtex
@article{shi2024end,
  title={End-to-end dense video grounding via parallel regression},
  author={Shi, Fengyuan and Huang, Weilin and Wang, Limin},
  journal={Computer Vision and Image Understanding},
  volume={242},
  pages={103980},
  year={2024},
  publisher={Elsevier}
}
```

## Acknowledgments

This project is built upon DepNet. Thanks for their contributions!