Artifact Author: Adones Rukundo Contact: [email protected], [email protected], [email protected]
The artifact design is based on the adapted version of ASCYLIB framework. Memory is managed using ssmem library, a simple object-based memory allocator with epoch-based garbage collection.
TSLQueue: An Efficient Lock-free Design for Priority Queues. https://link.springer.com/chapter/10.1007/978-3-030-85665-6_24
The artifact contains three implantations of a priority queue abstract data structure:
- TSLQueue:
- TSLQueue: An Efficient Lock-free Design for Priority Queues.
- Linden:
- A skiplist-based concurrent priority queue with minimal memory contention.
- Adapted version of source code from https://github.com/jonatanlinden/PR
- Lotan:
- Skiplist-based concurrent priority queues.
- Adapted version of source code from https://github.com/LPD-EPFL/ASCYLIB/tree/master/src/priorityqueue-lotanshavit_lf
The following are the hardware and software requirements to run the artifact
- Computer Architecture:
- The artifact is designed and tested for x86_64 computer architectures
- OS Platform:
- Linux OS distribution
- Compiler:
- gcc compiler is required to compile the artifact tests
- Memory:
- Test machine should have at least 30 GB of RAM.
- Software:
- To automatically generate throughput graphs, the test machine must have python 2.7.5 and above installed together with matplot and numpy modules. The graphs are generated using matplot and data is read using numpy.
- Step 1: Unzip the artifact to any directory e.g Europar2021.
- Step 2: Navigate to the home directory of the artifact where the
script
directory is located - Step 3: Run
chmod -R 774 "scripts"
This grants you permission to run execution commands on the script files
benchmark
: In this directory you will find all the necessary files (source code) related to the implemented data structures and tests.benchmark/src
: Contains the data structures' source codebenchmark/includes
: Contains support filesbenchmark/common
: Contains make definitions filebenchmark/external
: Contains external libraries e.g the ssmem librarybenchmark/bin
: Contains the binary files for the compiled data structures
scripts
: In this directory you find some scripts that can be used to run batch tests, generate data tables and graphs.data
: The data generated by the scripts in the scripts directory is saved in this (data) directory.graphs
: The graphs generated by the scripts in the scripts directory are saved in this (graphs) directory.
- Step 1: Navigate to the benchmark directory
- Step 2: Run
make
This compiles all the implementations using the default make definitions
Binary files will be saved in benchmark/bin/
- Step 1: Navigate to the bin directory
benchmark/bin/
- Step 2: Run
./pq-tsl_wcas -n 2
or./pq-linden -n 2
or./pq-lotan -n 2
or./pq-tsl_cas -n 2
An output of the test results will be displayed.
- Step 1: Navigate to the benchmark directory
- Step 2: Run
make
Without arguments, the make command compiles all data structure tests with the default settings.make
can take arguments to define specific execution settings e.gmake VERSION=O3 GC=1 INIT=all
- make settings:
VERSION
defines the optimisation levels e.gVERSION=O3
. It takes one of the following five values:DEBUG
compile with optimisation zero and debug flagsSYMBOL
compile with optimisation level three and -gO0
compile with optimisation level zeroO1
compile with optimisation level oneO2
compile with optimisation level twoO3
compile with optimisation level three (Default)
GC
defines if deleted nodes should be recycledGC=1
(Default) or notGC=0
.INIT
defines if data structure initialisation should be performed by all active threadsINIT=all
(Default) or one of the threadsINIT=one
AFFINITY
defines the thread to core mapping,AFFINITY=dense
defines dense pinning whileAFFINITY=sparse
defines sparse pinning andAFFINITY=none
lets the system decide on which cores the threads should run.
- More information about available make settings options can be found in the make file
benchmark/common/Makefile.common
- make settings:
Different machines have different thread id layouts especially if the machine has more than one socket. For each affinity, dense or sparse, the machine thread id order must be defined e.g 0,2,4
means that pin the first thread to hardware thread id 0
, second thread to hardware thread id 2
and third thread to hardware thread id 4
. Machine affinity definitions are in file benchmark/includes/utils.h
.
- Run
lscpu
to determine the order of hardware thread ids. - Run
cat /proc/cpuinfo
to determine how the hardware thread ids are distributed to the core ids. - Use this information to define the order in which threads are pinned for either dense or sparse mapping.
dense
mapping should be defined to pin one thread per core close together.sparse
pinning should be defined to pin one thread per core far apart e.g different sockets/tiles. - In the file
benchmark/includes/utils.h
, theDEFAULT
definition can be replaced with the specific mapping/affinity. Or the specific machine definition can be added.
Binary files for each data structure implementation are in the directory benchmark/bin/
.
pq-tsl_wcas
TSLQueue compiled with double word CASpq-tsl_cas
TSLQueue compiled with single word CASpq-linden
Linden priority queuepq-lotan
Lotan priority queue
To run individual tests for each compiled data structure implementation navigate to the bin directory benchmark/bin/
.
-
Run e.g
./pq-tsl_wcas -h
to get the information about the different parameters. An output similar to the following will be displayedHelp information: Usage: ./pq-tsl_wcas [options...] Options: -h, --help Print this message -d, --duration <int> Test duration in milliseconds -i, --initial-size <int> Number of elements to insert before test -n, --num-threads <int> Number of threads -r, --range <int> Range of integer keys inserted in set -p, --put-rate <int> Percentage of put (insert) operations
-
Run e.g
./pq-tsl_wcas -n 6
to run test with six threads. An output similar to the following will be displayed:Initial, 12000 Range, 1073741824 Algorithm, TSLQueue Size of node padded, 64 B Size of initial, 750.00 KB is 0.73 MB [ALLOC] initializing allocator with fs size, 10000 objects BEFORE size is, 12000 AFTER size is, 42293 Inserting_count_total , 3497542 Inserting_count_total_succ , 3497444 Inserting_perc_succ , 100.0 Inserting_perc , 50.2 Inserting_effective , 50.2 DeleteMin_count_total , 3467151 DeleteMin_count_total_succ , 3467151 DeleteMin_perc_succ , 100.0 DeleteMin_perc , 49.8 DeleteMin_effective , 49.8 Number of Threads , 6 Million Operations per Second , 6.965 Operations per Second , 6964693.00
The default parameters are defined in the file benchmark/includes/common.h
The script file script/pq-experiments.sh
is used to run batch test executions . In this file, execution settings such as number of repeats can be set. Follow the following steps to run batch tests and generate graphs automatically.
- Step 1: Navigate to the scripts directory
- Step 2: Run
nohup ./pq-experiments.sh 1,4,8,12,16,20,24,28,32,36 &> log.out&
, where 1,4,8,..,36 is a list of number of threads per test. The list of threads can be modified to fit the tests and the machine on which the tests are being run. The longer the thread list the more time the batch execution test will take. The execution events are logged inlog.out
and saved in the scripts directory Data is saved in the data directory as.csv
files, while the graphs will be saved in the graphs directory as png files. The file name includes the put rate (-p), key range (-r), queue initial size (-i), affinity (dense) and experiment timestamp.
NOTE: The results in the paper "TSLQueue: An Efficient Lock-free Design for Priority Queues" are for dense mapping affinity (pinning threads one per core, filling one socket at a time.