This is a collection of algorithms for learning network structure from effective resistances and other random walk-based similarities, as described in the paper Learning Networks from Random Walk-Based Node Similarities.
The repository includes methods for exact graph recovery, heuristic methods, and optimization-based approaches (both convex and non-convex). See the paper for details and comparision of these methods.
The code is in Matlab. A number of functions depend on files in the /utils
folder. Ensure that this folder is added to your Matlab path.
exactRecover.m: Given a full set of (n choose 2)
effective resistances, recovers the unique graph with these resistances. May also be used with regularization as a heuristic method to match a noisy or incomplete set of effective resistances. See Section 4.2 of the paper.
exactPageRankRecover.m: Given an n x n
matrix of all pairwise personalized PageRank scores, recovers the unique graph matching these scores. As with exactRecover.m
, this method may be used heuristically with a regularization parameter.
exactRecoveryDemo.m: Demonstrates how to use exactRecover.m
and exactPageRankRecover.m
to recover a graph from a full set of pairwise node similarities.
recoverMissing.m: Given an incomplete set of effective resistances, fills in the missing resistances via a shortest path heuristic described in Section 4.2 of the paper. The method then attempts to recover a set of edge weights using exactRecover.m
, possibly with regularization. Note that some of these edge weights may be negative. One option to clean them up is via utils/noisyRecoveryCleanup.m
.
We provide two gradient/stochastic coordinate descent based methods which attempt to solve the non-convex problem of finding a graph whose effective resistances are as close as possible to the given resistances in l2
norm. See Sections 3.3 and 4.2 of the paper for details.
Any effective resistance input as 0 is considered to be un-constrained. See comments in the code for details on tuning and optimization method options. Both methods allow for a regularization parameter lambda
, and minimize the distance to the target resistances plus lambda*tr(L)
.
effResGDSmall.m: Use this method for small graphs, where is is possible to fit an (n choose 2) x n
edge-vertex incidence matrix in memory.
effResGD.m: Use this method for large graphs. It will avoid computing an (n choose 2) x n
edge-vertex incidence matrix. For large graphs, it is generally advised to set batchSize << (n choose 2)
.
Note that both the above methods make use of parallelism via parfor
loops. You can set the number of parallel workers before running these methods, using a snippet like:
myCluster = parcluster('local');
myCluster.NumWorkers = 4;
parpool(4);
leastSquaresDemo.m: Demonstrates how to use graphGDSmall.m
and graphGD.m
on a small random k-nearest neighbors graph.
sdpRecover.m: Given a set of (n choose 2)
effective resistance constraints, finds the graph with minimal total degree with all effective resistances below these constraints, by solving the SDP described in Section 4.3 of the paper. Any resistance input as 0 is considered to be un-constrained. Requires CVX convex programming system to be installed.
@article{muscostsourakakis2018similarities, title={Learning Networks from Random Walk-Based Node Similarities}, author={Hoskins, Jeremy and Musco, Cameron, and Musco, Christopher, and Tsourakakis, Charalampos}, year={2018},