This repository is the official implementation of the paper: DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting. Please consider starring the repository if you find it interesting.
The paper was accepted at WSDM'24. You can view our project page.
main.py is the implementation of DeSCo.
subgraph_counting contains all the modules needed by the Python scripts.
baseline.py is the implementation of the two neural baselines (DIAMNet and LRP) that are compared with DeSCo in the paper.
ablation_gnns.py is used for the ablation study of the expressive power of SHMP. It implements other expressive GNNs.
ablation_wo_canonical.py is used for the ablation study of canonical partition. It implements DeSCo's neighborhood counting stage without canonical partition.
Python >= 3.9
To install requirements:
pip install -r requirements.txt
The neighborhood counting and gossip propagation models in our paper are trained on our synthetic dataset. Users can download our pre-trained models from here.
To evaluate the trained models on real-world datasets, please run the following command:
python main.py --test_dataset COX2 --neigh_checkpoint ckpt/{checkpoint_path}/neigh/{model_name}.ckpt --gossip_checkpoint ckpt/{checkpoint_path}/gossip/{model_name}.ckpt --test_gossip
The above command gives an example of evaluating the trained models on COX2. The path of checkpoints should be replaced by the real path of your trained model checkpoints.
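To evaluate the same checkpoints on several datasets, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name build_eval_command is hypothetical, it uses only the flags shown in the command above, and the checkpoint paths are the same placeholders that must be replaced with your actual paths.

```python
import shlex


def build_eval_command(test_dataset, neigh_checkpoint, gossip_checkpoint):
    """Return the argument list for evaluating trained checkpoints on one dataset.

    Hypothetical helper: it only mirrors the flags of the README command.
    """
    return [
        "python", "main.py",
        "--test_dataset", test_dataset,
        "--neigh_checkpoint", neigh_checkpoint,
        "--gossip_checkpoint", gossip_checkpoint,
        "--test_gossip",
    ]


if __name__ == "__main__":
    # Placeholders copied from the README; replace them with real checkpoint paths.
    neigh = "ckpt/{checkpoint_path}/neigh/{model_name}.ckpt"
    gossip = "ckpt/{checkpoint_path}/gossip/{model_name}.ckpt"
    for dataset in ("COX2", "MUTAG"):
        # Print a shell-safe command line; pass the list to subprocess.run to execute it.
        print(shlex.join(build_eval_command(dataset, neigh, gossip)))
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting issues.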
The code comes with analysis methods in subgraph_counting/workload.py, which output the counts inferred by the model. Users should be able to compute any desired metric from these counts.
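As an illustration of computing metrics from the counts, here is a minimal sketch. It assumes the predicted and ground-truth counts have already been collected into two equal-length Python sequences; the exact output format of workload.py is not described here, so the function name count_metrics and its inputs are hypothetical. The metrics shown (MSE, MAE, and MSE normalized by the variance of the ground truth) are common choices for count regression and are given only as examples.

```python
from statistics import fmean, pvariance


def count_metrics(predicted, truth):
    """Compute error metrics between predicted and ground-truth subgraph counts.

    Hypothetical helper: both arguments are equal-length sequences of numbers.
    """
    if len(predicted) != len(truth) or len(truth) == 0:
        raise ValueError("predicted and truth must be non-empty and equal in length")

    errors = [p - t for p, t in zip(predicted, truth)]
    mse = fmean(e * e for e in errors)
    mae = fmean(abs(e) for e in errors)

    # Normalize the MSE by the population variance of the ground truth so that
    # datasets with very different count scales become comparable.
    variance = pvariance(truth)
    normalized_mse = mse / variance if variance > 0 else float("nan")

    return {"mse": mse, "mae": mae, "normalized_mse": normalized_mse}


if __name__ == "__main__":
    print(count_metrics([1.0, 2.0, 5.0], [1.0, 3.0, 4.0]))
```

The normalized MSE is undefined when all ground-truth counts are identical, so the sketch returns NaN in that case rather than dividing by zero.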
Alternatively, if you wish to train your own model instead of using our pre-trained version, here are the instructions you may need.
To benefit future research, we release the large synthetic dataset, with ground-truth subgraph counts, that we used to train our pre-trained models. Users can download the dataset zip file from here and move the unzipped folder under DeSCo/data/ to train from scratch.
To train with the official configuration of DeSCo, simply run this command:
python main.py --train_dataset Syn_1827 --valid_dataset Syn_1827 --test_dataset MUTAG --train_neigh --train_gossip --test_gossip
To train the model(s) in the paper with other configurations, please specify the parameters in the command.
The boolean parameters train_neigh, train_gossip, and test_gossip determine whether to train the neighborhood counting model, train the gossip propagation model, and test the gossip propagation model, respectively.
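The effect of the boolean parameters on the command line can be sketched as follows. This is an illustration rather than repository code: the helper build_train_command is hypothetical, and it assumes each boolean parameter is enabled by passing its flag without a value, as in the official training command, and disabled by omitting it. Whether a stage can run on its own (for example, training only the gossip model) may depend on additional arguments such as checkpoints.

```python
def build_train_command(train_dataset, valid_dataset, test_dataset,
                        train_neigh=True, train_gossip=True, test_gossip=True):
    """Return the argument list for a training run of main.py.

    Hypothetical helper: a boolean flag is appended only when it is enabled.
    """
    cmd = [
        "python", "main.py",
        "--train_dataset", train_dataset,
        "--valid_dataset", valid_dataset,
        "--test_dataset", test_dataset,
    ]
    flags = (
        ("train_neigh", train_neigh),
        ("train_gossip", train_gossip),
        ("test_gossip", test_gossip),
    )
    for name, enabled in flags:
        if enabled:
            cmd.append(f"--{name}")
    return cmd


if __name__ == "__main__":
    # With the defaults this reproduces the official training command.
    print(" ".join(build_train_command("Syn_1827", "Syn_1827", "MUTAG")))
```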
Please refer to the Appendix for the detailed training parameters.
If you find our work useful, please consider citing:
@inproceedings{fu2024desco,
title={DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting},
author={Fu, Tianyu and Wei, Chiyue and Wang, Yu and Ying, Rex},
booktitle={Proceedings of the 17th ACM International Conference on Web Search and Data Mining},
pages={218--227},
year={2024}
}
You are welcome to use the code or contribute to the project!