Data
Raw data were taken from Brimacombe, C. (2023, March 30). Shortcomings of using freely available open species interaction networks produced by different publications. https://doi.org/10.17605/OSF.IO/MY9TV
- link-predict
  - data
    - processed
      - features
        - features_py.csv
        - features_R.csv
      - networks
        - subsamples_edge_lists.csv
        - subsamples_metadata.csv
fields (features_py.csv and features_R.csv):
- link_ID - Auto-generated ID of a link (existing and non-existing)
The other fields are the features themselves. They differ between the two files because each file is produced by a different script: one by a Python script and the other by an R script.
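Because both feature files share link_ID, their feature columns can be combined per link. The following is a minimal sketch using only the Python standard library. The feature column names (feat_py_a, feat_r_a) and all values are made up for illustration, and whether the project combines the files this way is an assumption.

```python
import csv
import io

# Made-up rows for illustration only; real files hold many feature columns.
features_py_csv = """link_ID,feat_py_a
1,0.5
2,0.1
3,0.9
"""
features_r_csv = """link_ID,feat_r_a
1,3
2,7
"""


def read_by_link(text):
    """Index the rows of a CSV string by their link_ID value."""
    return {row["link_ID"]: row for row in csv.DictReader(io.StringIO(text))}


py_rows = read_by_link(features_py_csv)
r_rows = read_by_link(features_r_csv)

# Keep only links present in both files and merge their columns.
merged = {
    link_id: {**py_rows[link_id], **r_rows[link_id]}
    for link_id in py_rows.keys() & r_rows.keys()
}
```

With real data, the in-memory strings would be replaced by the two files opened from the features folder.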
fields (subsamples_metadata.csv):
- subsample_ID - Auto-generated ID of a sampled network
- name - Name of the network
- community - Ecological community (Plant-Pollinator, Plant-Seed Dispersers, etc.)
- fraction - Proportion of observed links kept after sub-sampling. Currently only 0.8 (80% of links observed) and 1.0 (original network) are used
- type - Deprecated
- layer - Deprecated
- repetition - Deprecated
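As an illustration of how the fraction field can be used, the sketch below groups subsample IDs by network name and fraction, so that each original network (1.0) can be paired with its sub-sampled versions (0.8). The rows and the network name net_A are made up, and the grouping itself is an assumption rather than a step taken from the project code.

```python
import csv
import io
from collections import defaultdict

# Made-up metadata rows that follow the documented columns.
metadata_csv = """subsample_ID,name,community,fraction
1,net_A,Plant-Pollinator,1.0
2,net_A,Plant-Pollinator,0.8
3,net_A,Plant-Pollinator,0.8
"""

# network name -> fraction -> list of subsample IDs
by_network = defaultdict(lambda: defaultdict(list))
for row in csv.DictReader(io.StringIO(metadata_csv)):
    by_network[row["name"]][float(row["fraction"])].append(int(row["subsample_ID"]))

original_ids = by_network["net_A"][1.0]
subsampled_ids = by_network["net_A"][0.8]
```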
fields (subsamples_edge_lists.csv):
- link_ID - Auto-generated ID of a link (existing and non-existing)
- subsample_ID - Auto-generated ID of a sampled network
- higher_level - Name of the species at the higher trophic level
- lower_level - Name of the species at the lower trophic level
- weight - Weight of the link. Weights are currently not used, so every link is converted to binary and stored as 1.0
- class - link (1), non-link (0), or subsampled link (-1). Subsampled links are converted to 0 or 1 depending on the step (0 for feature extraction, 1 in the test set, etc.)
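The step-dependent conversion of the class field can be sketched as follows. The helper resolve_class and the step names "feature_extraction" and "test" are hypothetical names chosen for this example. Only the two steps named above are covered, and any other step raises an error instead of guessing.

```python
# Hypothetical step names; only the two documented cases are handled.
SUBSAMPLED_VALUE_BY_STEP = {
    "feature_extraction": 0,  # removed links are hidden, so they look like non-links
    "test": 1,                # removed links are true links to be recovered
}


def resolve_class(label, step):
    """Return the binary class of a link for the given pipeline step."""
    if label in (0, 1):
        return label
    if label == -1:
        if step not in SUBSAMPLED_VALUE_BY_STEP:
            raise KeyError(f"no conversion documented for step: {step}")
        return SUBSAMPLED_VALUE_BY_STEP[step]
    raise ValueError(f"unexpected class value: {label}")


# Made-up rows: one link, one non-link, one subsampled link.
rows = [
    {"link_ID": 1, "class": 1},
    {"link_ID": 2, "class": 0},
    {"link_ID": 3, "class": -1},
]
features_view = [resolve_class(r["class"], "feature_extraction") for r in rows]
test_view = [resolve_class(r["class"], "test") for r in rows]
```

The same subsampled link is therefore labeled 0 while features are computed and 1 when predictions are scored.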
The /results/ directory is organized according to the following file tree. Folders and code files are listed in order of execution.
- results
  - results_preprocess.Rmd
  - results_figs.Rmd
  - raw
    - results_domains.csv
    - results_models.csv
    - results_other_models.csv
    - feature_importance.csv
  - intermediate
    - df_pred_heatmap.csv
    - metrics_df_long.csv
    - metrics_multi_df_long.csv
    - metrics_type_df_long.csv
    - compare_other_models_metrics_df.csv
    - network_lvl_features.csv
    - pr_df.csv
    - roc_df.csv
    - auc_df.csv
    - test_data.csv
    - bounds_summary_df.csv
    - pca_df.csv
  - final
    - communities.pdf
    - eval_all.pdf
    - features.csv
    - importance_pres.pdf
    - kruskal_wallis.csv
    - mann_whitney.csv
    - networks_table.csv
    - networks_summary_properties.csv
    - predictions.pdf
    - ROC.pdf
    - split_set.pdf
    - SI_community.pdf
    - SI_complete
    - SI_features_hist.pdf
    - SI_importance.pdf
    - SI_models.pdf
    - SI_probabilities.pdf
    - SI_sensitivity
    - SI_sensitivity_com
    - SI_tradeoff.pdf
common fields in the CSV files:
- link_ID - Auto-generated ID of a link (existing and non-existing)
- community - Ecological community (Plant-Pollinator, Plant-Seed Dispersers, etc.)
- name - Name of the network
- fold - Number of the cross-validation fold the instance comes from (usually 1-5 or 1-3)
- model - Name of the ML model used
- y_proba - Probability that the instance is a link, as given by the model
- metric - Name of the evaluation metric used
- feature - Name of the feature
- importance - Importance value of the feature
- SBM_Prob - Probability that the instance is a link, as given by the SBM model
- C_Prob - Probability that the instance is a link, as given by the connectance model
- type_train - Communities whose links form the training data
- type_test - Communities whose links form the test data