Repository containing various graph algorithm implementations using Flink's Gelly Graph API.
All jobs were run on Flink 1.8.0.
Information about setting up Flink can be found in Flink's Official Documentation.
- `cd` into an algorithm implementation directory (e.g. `cd PageRank`)
- Run `mvn clean package` in order to build the project
- The JAR is located within the `target` folder
To configure the system, edit the `conf/flink-conf.yaml` file:
- `jobmanager.heap.size: Nm` (the heap size for the JobManager JVM, e.g. 4096m)
- `taskmanager.heap.size: Nm` (the heap size for the TaskManager JVM, e.g. 4096m)
- `taskmanager.numberOfTaskSlots: N` (the number of parallel operator or user function instances that a single TaskManager can run; default: 1). If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager’s machine has, e.g. equal to the number of cores, or half the number of cores.
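The slot-count rule of thumb above (one slot per core, or one per two cores) can be sketched as a small shell helper. This is only an illustration: `suggest_slots` is a hypothetical function that is not part of this repository or of Flink, the 4096m heap sizes are simply the example values quoted above, and `getconf` reports logical processors, which may be more than the number of physical cores.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository or of Flink): suggest a
# value for taskmanager.numberOfTaskSlots from a core count.
#   $1 = number of cores
#   $2 = "full" (one slot per core, the default) or "half" (one slot per two cores)
suggest_slots() {
  n=$1
  mode=${2:-full}
  if [ "$mode" = "half" ]; then
    slots=$(( n / 2 ))
  else
    slots=$n
  fi
  # Never suggest fewer than one slot.
  if [ "$slots" -lt 1 ]; then
    slots=1
  fi
  echo "$slots"
}

# Number of online logical processors; this may exceed the physical core count.
cores=$(getconf _NPROCESSORS_ONLN)

# Print a fragment that could be copied into conf/flink-conf.yaml.
# The heap sizes are the example values from the list above, not recommendations.
echo "jobmanager.heap.size: 4096m"
echo "taskmanager.heap.size: 4096m"
echo "taskmanager.numberOfTaskSlots: $(suggest_slots "$cores" half)"
```

Because the memory available to a TaskManager is divided between its slots, a larger slot count leaves less memory per operator instance, so the heap size may need to be raised alongside it.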
- To start the (local) Flink cluster run: `./Flink-1.8.0/bin/start-cluster.sh`
- You can check jobs through the web UI (default: http://localhost:8081/#/overview)
- To stop the cluster run: `./Flink-1.8.0/bin/stop-cluster.sh`
To run a Flink job via the Flink CLI execute the following command:
./Flink-1.8.0/bin/flink run path/to/JAR --links path/to/edgelist/csv/file
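A submission can fail late if the JAR or the edge list path is mistyped, so it may help to check both before calling the CLI. The wrapper below is a minimal sketch under that assumption; `run_flink_job` is a hypothetical name, and the only Flink command it relies on is the `flink run ... --links ...` invocation shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of this repository): check that the inputs
# exist, then submit the job with the same command line as shown above.
#   $1 = Flink installation directory (e.g. ./Flink-1.8.0)
#   $2 = path to the job JAR
#   $3 = path to the edge list CSV file
run_flink_job() {
  flink_home=$1
  jar=$2
  links=$3
  if [ ! -f "$jar" ]; then
    echo "JAR not found: $jar" >&2
    return 1
  fi
  if [ ! -f "$links" ]; then
    echo "Edge list not found: $links" >&2
    return 1
  fi
  "$flink_home/bin/flink" run "$jar" --links "$links"
}
```

It would be called with the same placeholders as the command above, for example `run_flink_job ./Flink-1.8.0 path/to/JAR path/to/edgelist/csv/file`.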
- Scala 2.12.8
- Java 1.8
- Maven 3.6.0
This project is licensed as stated in the LICENSE file.