Pipeline Scripts

Barry edited this page Jul 15, 2024 · 2 revisions

data_preprocess.ipynb

This notebook loads the network data, generates subsamples, and exports them in a uniform dataset format.

Output location: /data/processed/networks

  • subsamples_edge_lists.csv
  • subsamples_metadata.csv
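
As a rough illustration of the kind of output this step produces, the sketch below subsamples a toy network and writes the two CSV files. It is not the notebook's code: the subsampling method (uniform random edge sampling), the function names `subsample_edges` and `export_subsamples`, and all column names are assumptions made for the example.

```python
import csv
import random
from pathlib import Path


def subsample_edges(edges, fraction, seed):
    """Draw a random subset of edges (assumed method; the notebook may differ)."""
    rng = random.Random(seed)
    k = min(len(edges), max(1, round(len(edges) * fraction)))
    return rng.sample(edges, k)


def export_subsamples(networks, fractions, out_dir, seed=0):
    """Write one edge-list CSV and one metadata CSV covering all subsamples."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    edge_rows, meta_rows = [], []
    subsample_id = 0
    for network_id, edges in networks.items():
        for fraction in fractions:
            sample = subsample_edges(edges, fraction, seed + subsample_id)
            for source, target in sample:
                edge_rows.append((subsample_id, source, target))
            meta_rows.append((subsample_id, network_id, fraction, len(sample)))
            subsample_id += 1

    # Hypothetical column names; the real files may use a different schema.
    with open(out_dir / "subsamples_edge_lists.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["subsample_id", "source", "target"])
        writer.writerows(edge_rows)
    with open(out_dir / "subsamples_metadata.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["subsample_id", "network_id", "fraction", "n_edges"])
        writer.writerows(meta_rows)
    return edge_rows, meta_rows
```

Keeping every subsample in one long edge-list file, keyed by an identifier that also appears in the metadata file, is one way to give both Python and R scripts the same input.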

topology_features.py & topology_features.R

These scripts extract topological features from each network and its subsamples, as prepared in the data_preprocess.ipynb notebook.

Output location: /data/processed/features

  • features_R.csv
  • features_py.csv
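
To make "topological features" concrete, here is a minimal sketch that computes a few standard ones from an edge list using only the Python standard library. The feature set shown (node count, edge count, density, mean degree) and the function name `topology_features` are assumptions for illustration; the actual scripts may extract different or additional features.

```python
def topology_features(edges):
    """Compute basic features of an undirected simple graph given as (u, v) pairs."""
    # Treat edges as undirected, dropping self-loops and duplicates.
    unique_edges = {frozenset((u, v)) for u, v in edges if u != v}
    nodes = {node for edge in unique_edges for node in edge}
    n, m = len(nodes), len(unique_edges)
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    mean_degree = 2 * m / n if n > 0 else 0.0
    return {
        "n_nodes": n,
        "n_edges": m,
        "density": density,
        "mean_degree": mean_degree,
    }


# Example: a triangle is a complete graph on three nodes.
triangle = topology_features([(1, 2), (2, 3), (3, 1)])
```

Applying such a function to each network and each of its subsamples yields one feature row per graph, which is the shape a file like features_py.csv would naturally take.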

run_other_models.R

This script runs additional predictive models (such as the stochastic block model, SBM) for comparative analysis.

Output location: /results/raw/

  • other_models.csv

run_predictions.ipynb

This notebook runs the predictive models specified for the research paper and exports the results for further analysis in R.

Output location: /results/reduced

results_preprocessing.Rmd

This R Markdown file processes the results from run_predictions.ipynb, producing smaller, lighter datasets for plots and tables that are easier to handle and share with collaborators.
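
The reduction itself is done in R, but the idea can be sketched in Python: collapse many raw result rows into one summary row per group. The column names `model` and `score` and the function name `summarise` are hypothetical and not taken from the actual results files.

```python
from collections import defaultdict
from statistics import mean


def summarise(rows, group_key="model", value_key="score"):
    """Collapse raw result rows into one row per group with a count and a mean."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(float(row[value_key]))
    return [
        {group_key: name, "n": len(values), "mean": mean(values)}
        for name, values in sorted(groups.items())
    ]


# Toy input standing in for rows read from a results CSV.
raw = [
    {"model": "a", "score": "1"},
    {"model": "a", "score": "3"},
    {"model": "b", "score": "2"},
]
summary = summarise(raw)
```

A summary table like this is small enough to commit alongside the plotting code, whereas the raw per-run output usually is not.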

results_figs.Rmd

This R Markdown file uses the processed results to generate visualizations for the paper.