Ego-splitting Framework in C++
Implementation-flavor
https://dl.acm.org/doi/10.1145/3097983.3098054
C++, R
Data sets are provided in the Inputs directory. We have tested our source code on 4 different data sets:
This is the small graph the authors use to explain the framework in the research paper: a graph with 9 nodes and 11 edges.
Data collected about Facebook pages (November 2017). These datasets represent blue verified Facebook page networks of different categories. Nodes represent the pages and edges are mutual likes among them.
The DBLP computer science bibliography provides a comprehensive list of research papers in computer science. We construct a co-authorship network where two authors are connected if they have published at least one paper together. A publication venue, e.g., a journal or conference, defines an individual ground-truth community; authors who published in a certain journal or conference form a community. We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have fewer than 3 nodes. We also provide the top 5,000 communities with highest quality which are described in our paper. As for the network, we provide the largest connected component.
This is a small graph the authors use to explain the framework in the research paper: a graph with 10 nodes and 28 edges.
We implemented the project in C++ from scratch and did not use any complex libraries, so any C++ compiler with the right configuration should be able to build and run it.
Verify your g++ version (for example with g++ --version). If it is not installed, download GCC from https://formulae.brew.sh/formula/gcc, or for VS Code see https://code.visualstudio.com/docs/languages/cpp
Set the "file_name" variable in the main function of ego_split.cpp to the correct input file path in order to run the algorithm on the provided data set and get the split egos for each node.
Compile and run the program, appending its output to a log file, with the command below:
g++ /path/to/file/ego_split.cpp -o /path/to/file/ego_split && /path/to/file/ego_split >> /path/to/log_file/tester_edges.log
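To make the input step and the "split egos" idea more concrete, here is a minimal sketch, not the code in ego_split.cpp. It assumes the input file holds one undirected edge per line as two whitespace-separated integer node ids, and it uses connected components as the local partition of each ego-net, which is only one possible choice. The names Graph, read_edge_list, read_edge_list_file and ego_net_components are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <map>
#include <set>
#include <sstream>  // std::istringstream, handy for trying the loader on an inline string
#include <string>
#include <vector>

// Hypothetical adjacency representation: node id -> set of neighbor ids.
using Graph = std::map<int, std::set<int>>;

// Reads "u v" pairs (assumed format: one undirected edge per line) into a Graph.
Graph read_edge_list(std::istream& in) {
    Graph g;
    int u, v;
    while (in >> u >> v) {
        if (u == v) continue;  // drop self-loops so a node is never its own neighbor
        g[u].insert(v);
        g[v].insert(u);
    }
    return g;
}

// Convenience wrapper for a file path; returns an empty graph if the file cannot be opened.
Graph read_edge_list_file(const std::string& path) {
    std::ifstream in(path);
    return read_edge_list(in);
}

// Splits the ego-net of `ego` (the subgraph induced by its neighbors, with
// `ego` itself removed) into connected components. Each component would
// correspond to one persona of `ego` in an ego-splitting style approach.
std::vector<std::set<int>> ego_net_components(const Graph& g, int ego) {
    std::vector<std::set<int>> components;
    auto it = g.find(ego);
    if (it == g.end()) return components;  // unknown node: no ego-net
    const std::set<int>& nbrs = it->second;
    assert(!nbrs.count(ego));  // holds because self-loops are dropped at load time

    std::set<int> seen;
    for (int start : nbrs) {
        if (seen.count(start)) continue;
        std::set<int> component;
        std::vector<int> stack{start};
        seen.insert(start);
        while (!stack.empty()) {
            int x = stack.back();
            stack.pop_back();
            component.insert(x);
            for (int y : g.at(x)) {
                // Stay inside the ego-net: only follow edges between neighbors of `ego`.
                if (nbrs.count(y) && !seen.count(y)) {
                    seen.insert(y);
                    stack.push_back(y);
                }
            }
        }
        components.push_back(component);
    }
    return components;
}
```

Calling read_edge_list_file with the same path you assign to "file_name", and then ego_net_components for each node, gives one group of neighbors per persona under these assumptions.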
We have uploaded our execution logs for the above 4 data sets to the outputs folder.
We have uploaded an Evaluation.ipynb file to our project. Our evaluation follows these steps:
Because our source code is developed entirely in C++, we created some modules in Python, pasted the final overlapping-partitions output from the log file into the notebook, and generated the communities from it.
We then generated communities from the overlapping partitions of each data set using the function below.
Here is a sample output for the communities generated for the Tester_edges data set:
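For readers who prefer to stay in C++, the same step can be sketched as follows. This is not the notebook's function. It assumes each overlapping partition is available as a list of persona ids and that a persona-to-original-node map is known. The name personas_to_communities and both input shapes are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Maps each partition of persona ids back to the original node ids.
// A node with personas in several partitions ends up in several
// communities, which is what makes the result overlapping.
std::vector<std::set<int>> personas_to_communities(
    const std::vector<std::vector<int>>& persona_partitions,  // assumed input shape
    const std::map<int, int>& persona_to_node) {              // assumed persona -> node map
    std::vector<std::set<int>> communities;
    for (const auto& partition : persona_partitions) {
        std::set<int> community;
        for (int persona : partition) {
            auto it = persona_to_node.find(persona);
            assert(it != persona_to_node.end());  // every persona must have an original node
            community.insert(it->second);
        }
        if (!community.empty()) communities.push_back(community);
    }
    return communities;
}
```

Using a std::set per community removes duplicates when two personas of the same node land in one partition.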