Investigating Weight Sharing in VGAE

This repository contains the Python code and data needed to reproduce the experiments of the article "To Share or Not to Share: Investigating Weight Sharing in Variational Graph Autoencoders" by Guillaume Salha-Galvan and Jiaying Xu. The article has been accepted for publication as a short paper in the proceedings of the 2025 ACM Web Conference (WWW 2025) and will be available online soon.

Introduction

This paper investigates the understudied practice of weight sharing (WS) in variational graph autoencoders (VGAE), a powerful family of unsupervised node embedding methods for Web-related applications. WS presents both benefits and drawbacks for VGAE model design and node embedding learning, leaving its overall relevance unclear and the question of whether it should be adopted unresolved.

We rigorously analyze its implications and, through extensive experiments on a wide range of graphs and VGAE variants, demonstrate that the benefits of WS consistently outweigh its drawbacks. Based on our findings, we recommend WS as an effective approach to optimize, regularize, and simplify VGAE models without significant performance loss.
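To make this design choice concrete, the following is a minimal sketch of a two-layer VGAE encoder with and without WS, written in TensorFlow. It is not the repository's actual model.py: the function and weight names, and the dense A_hat (normalized adjacency) and X (node feature) inputs, are simplifications of our own.

import tensorflow as tf

def gcn_layer(A_hat, H, W, activation=tf.identity):
    # One graph convolution: propagate with A_hat, transform with W, activate.
    return activation(A_hat @ H @ W)

def encode(A_hat, X, W_hidden, W_hidden_sigma, W_mu, W_sigma, weight_sharing=True):
    H = gcn_layer(A_hat, X, W_hidden, tf.nn.relu)  # first (hidden) GCN layer
    # With WS, the mean and variance branches reuse the same hidden layer.
    # Without WS, the variance branch recomputes it with its own weights,
    # roughly doubling the number of encoder parameters.
    H_sigma = H if weight_sharing else gcn_layer(A_hat, X, W_hidden_sigma, tf.nn.relu)
    mu = gcn_layer(A_hat, H, W_mu)                  # Gaussian means
    log_sigma = gcn_layer(A_hat, H_sigma, W_sigma)  # Gaussian log std. devs.
    return mu, log_sigma

Under WS, W_hidden_sigma is simply ignored; the gcn_vae_nows and deep_gcn_vae_nows models used below correspond to the weight_sharing=False case.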

To encourage further research, we have made our source code publicly available in this GitHub repository. Our code builds upon the FastGAE repository by Deezer Research, which itself extends Thomas Kipf's original TensorFlow implementation of VGAE models.

Installation

Please run the following commands to complete the installation:

git clone https://github.com/GuillaumeSalhaGalvan/ws_vgae
cd ws_vgae
python setup.py install

Requirements: Python 3.10, networkx 2.6, numpy 2.0.2, scikit-learn 1.5.2, scipy 1.14.1, tensorflow 2.18.
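
If the setup script does not install these exact versions in your environment, they can also be pinned manually with pip (an illustrative command based on the version list above; tensorflow 2.18 corresponds to the 2.18.0 release on PyPI):

pip install networkx==2.6 numpy==2.0.2 scikit-learn==1.5.2 scipy==1.14.1 tensorflow==2.18.0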

Datasets

We provide direct access to all public datasets used in our experiments in the data folder.

Run Experiments

Quick tests:

Navigate to the ws_vgae folder.

cd ws_vgae

Train a VGAE model (20 training iterations only) on the Blogs dataset with weight sharing (WS):

python train.py --model=gcn_vae --dataset=blogs --learning_rate=0.01 --iterations=20

Train a VGAE model (20 training iterations only) on the Blogs dataset without WS:

python train.py --model=gcn_vae_nows --dataset=blogs --learning_rate=0.01 --iterations=20

Reproduce experiments from Table 1 of the article:

Navigate to the ws_vgae folder, if needed.

cd ws_vgae

Train complete VGAE models with WS on all graph datasets:

python train.py --model=gcn_vae --dataset=cora --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=cora --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=citeseer --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=citeseer --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=pubmed --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=pubmed --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=webkd --task=link_prediction --features=False --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=webkd --task=link_prediction --features=True --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=blogs --task=link_prediction --features=False --learning_rate=0.01 --iterations=200 --nb_run=1
python train.py --model=gcn_vae --dataset=hamster --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae --dataset=arxiv-hep --task=link_prediction --features=False --learning_rate=0.05 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=cora-large --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=google-medium --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=google --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=sbm --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae --dataset=artists --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1

Train complete VGAE models without WS on all graph datasets:

python train.py --model=gcn_vae_nows --dataset=cora --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=cora --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=citeseer --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=citeseer --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=pubmed --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=pubmed --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=webkd --task=link_prediction --features=False --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=webkd --task=link_prediction --features=True --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=blogs --task=link_prediction --features=False --learning_rate=0.01 --iterations=200 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=hamster --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=gcn_vae_nows --dataset=arxiv-hep --task=link_prediction --features=False --learning_rate=0.05 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=cora-large --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=google-medium --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=google --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=sbm --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=gcn_vae_nows --dataset=artists --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1

Train complete Deep VGAE models with WS on all graph datasets:

python train.py --model=deep_gcn_vae --dataset=cora --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=cora --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=citeseer --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=citeseer --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=pubmed --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=pubmed --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=webkd --task=link_prediction --features=False --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=webkd --task=link_prediction --features=True --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=blogs --task=link_prediction --features=False --learning_rate=0.01 --iterations=200 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=hamster --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae --dataset=arxiv-hep --task=link_prediction --features=False --learning_rate=0.05 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=cora-large --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=google-medium --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=google --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=sbm --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae --dataset=artists --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1

Train complete Deep VGAE models without WS on all graph datasets:

python train.py --model=deep_gcn_vae_nows --dataset=cora --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=cora --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=citeseer --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=citeseer --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=pubmed --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=pubmed --task=link_prediction --features=True --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=webkd --task=link_prediction --features=False --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=webkd --task=link_prediction --features=True --learning_rate=0.005 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=blogs --task=link_prediction --features=False --learning_rate=0.01 --iterations=200 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=hamster --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1
python train.py --model=deep_gcn_vae_nows --dataset=arxiv-hep --task=link_prediction --features=False --learning_rate=0.05 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=cora-large --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=google-medium --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=google --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=sbm --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1
python train.py --model=deep_gcn_vae_nows --dataset=artists --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=1 --fastgae --nb_node_samples=5000 --alpha=1

Notes:

  • The above commands execute a single training run for each model. Use the --nb_run parameter to average scores over multiple training runs (see the example after these notes).
  • The above commands evaluate models on the link prediction task. To switch to community detection, replace --task=link_prediction with --task=community_detection.
  • Some of the above commands include --fastgae to leverage the FastGAE technique for scalability.
  • A complete list of parameters is available in the train.py file.
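
For instance, the following command (illustrative; it reuses only the Cora flags documented above) averages link prediction scores over 10 training runs:

python train.py --model=gcn_vae --dataset=cora --task=link_prediction --features=False --learning_rate=0.01 --iterations=300 --nb_run=10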

On VGAE Model Variants

This repository currently provides complete instructions for the VGAE and Deep VGAE models.

To run similar experiments on the other model variants mentioned in Table 2 of the article, simply incorporate the WS/no WS distinction from our model.py file into the model.py files of the repositories implementing these variants.

Citation

Please cite our paper if you use this code in your own work:

@inproceedings{salhagalvan2025toshare,
  title={To Share or Not to Share: Investigating Weight Sharing in Variational Graph Autoencoders},
  author={Salha-Galvan, Guillaume and Xu, Jiaying},
  booktitle={Companion Proceedings of the 2025 ACM Web Conference},
  year={2025}
}
