This repository contains the experimental artefact used in the paper "Program Representations for Predictive Compilation: State of Affairs in the Early 20’s".
The repository is organized hierarchically. The directories are:
- Directory `datasets` contains all datasets used in the experiments;
- Directory `scripts` contains scripts for executing the experiments (extracting features, executing models, plotting charts, etc.);
- Directory `reproduce-experiments` contains scripts for running the experiments with the pre-defined arguments;
- Directory `results` contains the experimental results;
- Directory `tools` contains the support tools, including the YaCoS framework;
- Directory `docker` contains the scripts to build and execute a Docker container environment.
A Docker environment is provided to reproduce the experiments. To create the environment:
- Enter the `docker` directory.
- Run the `1.build_docker_image.sh` script, which creates the `cola-2022` Docker image based on the Dockerfile.
- Run the `2.extract_yacos_data.sh` script, which extracts the YaCoS-related data.
Running the `run_docker_command.sh` script starts a shell inside the artifact's environment.
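The setup steps above can be sketched as a small shell helper. This is a minimal sketch, assuming the two scripts can be invoked with `bash` from inside the docker directory; the function name `setup_artifact_env` and its optional path argument are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: build the Docker image, then extract the YaCoS data,
# stopping at the first step that fails.
# Assumption: both scripts can be run with bash from inside the docker directory.
setup_artifact_env() {
    # Hypothetical parameter: path to the docker directory (defaults to ./docker).
    local docker_dir="${1:-docker}"
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$docker_dir" || exit 1
        bash 1.build_docker_image.sh || exit 1
        bash 2.extract_yacos_data.sh || exit 1
    )
}
```

Called from the repository root as `setup_artifact_env docker`, this would run both steps in order and return a non-zero status if either one fails.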
The directory `reproduce-experiments` can be used to generate the LLVM IR files, generate the embeddings, and run the experiments. Each script name starts with a number, and the scripts must be run in ascending numerical order to obtain an end-to-end pipeline.
Some notes:
- `0.create-ir.sh` generates LLVM IR files from the source files located at `datasets/src/`.
- The scripts matching `1.*.sh` generate different embeddings from the LLVM IR files. They can be run in any order.
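Running the numbered scripts in ascending order could be automated along the following lines. This is only a sketch: the function name `run_pipeline` and its optional directory argument are hypothetical, and it assumes script names follow a `<number>.<name>.sh` pattern without whitespace. Scripts sharing the same number are run in alphabetical order purely for determinism.

```shell
#!/usr/bin/env bash
# Sketch: run every numbered script in a directory in ascending numeric order,
# stopping at the first script that fails.
# Assumption: names look like "<number>.<name>.sh" and contain no whitespace.
run_pipeline() {
    # Hypothetical parameter: directory holding the numbered scripts.
    local dir="${1:-reproduce-experiments}"
    local script
    (
        cd "$dir" || exit 1
        # Sort on the numeric prefix before the first dot, so 10.* runs after 2.*;
        # ties on the prefix are broken alphabetically by the rest of the name.
        for script in $(printf '%s\n' [0-9]*.sh | sort -t . -k 1,1n -k 2); do
            # Skip the unexpanded glob pattern when no script matches.
            [ -f "$script" ] || continue
            echo "Running $script"
            bash "$script" || exit 1
        done
    )
}
```

From the repository root, `run_pipeline reproduce-experiments` would then execute the whole pipeline end to end.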