Note: For the version of the repository that we used for the CASCON 22 paper, please refer to the cascon-release branch.
In this work, we create a human-in-the-loop pipeline to produce accurate annotation usage rules. The pipeline includes the following steps:
- Mining candidate annotation usage rules (see this repo for more details)
- Validating the candidate usage rules to produce confirmed usage rules (using the Rule Validation Tool)
- Using confirmed usage rules for misuse detection purposes (using the Violation Detector)

Getting started
Published papers:
Mining Annotation Usage Rules: A Case Study with MicroProfile. Batyr Nuryyev, Ajay Kumar Jha, Sarah Nadi, Yee-Kang Chang, Emily Jiang, Vijay Sundaresan. ICSME'22: Industry Track.
A Human-in-the-loop Approach to Generate Annotation Usage Rules: A Case Study with MicroProfile. Mansur Gulami, Ajay Kumar Jha, Sarah Nadi, Karim Ali, Yee-Kang Chang, Emily Jiang. CASCONxEVOKE '22.
- Docker (tested on version 20)
- The build files make use of BuildKit for caching purposes, which was introduced in version 18.09. This means that the minimum Docker version required is 18.09.
- Docker Compose (tested on version 1.25)
- The following ports need to be free to ensure the application runs correctly:
- 3306 - mysql
- 5000 - backend
- 8000 - ui tutorial
- 8888 - ui
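If you want to verify up front that these ports are free, here is a minimal sketch (an assumption of ours, not part of the pipeline; it relies on bash's /dev/tcp redirection, where a failed connection means the port is free):

```shell
# Check whether each required port is already in use.
# A successful TCP connection means something is listening there.
check_port() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "port $1 is already in use"
  else
    echo "port $1 is free"
  fi
}

for port in 3306 5000 8000 8888; do
  check_port "$port"
done
```

If any port is reported as in use, stop the conflicting service before running the pipeline.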
The complete documentation is provided in DOCS.md
The pipeline has three major steps (i.e. mining, validating, and creating the misuse detector). There are commands available to perform each of these steps, plus some other auxiliary commands. In this section, we will quickly demonstrate each step using sample data. To run the pipeline on your own data, please refer to the complete documentation above.
To use the pipeline, we need to build it first and then run it. We can use the build.sh and run.sh scripts to achieve this as follows:
your-host-machine> ./build.sh && ./run.sh
# some build related output
# ...
========================================================================
This tool allows you to mine candidate annotation usage rules
from the target projects located in /pipeline/mining-sources.
Once you mine candidate rules, you can review and validate them
using the Rule Validation Tool (RVT). Confirmed rules can be exported
to be used for misuse detection.
Available commands:
mine - Mines candidate rules from target projects located in /pipeline/mining-sources
validate - Uploads the mined candidate rules into the RVT for validation
export-rules - Exports the validated correct rules from RVT
build-detector - Builds the misuse detector Maven plugin jar file, and provides installation directions
download-jars - Downloads the required jar files mentioned in /pipeline/config/configuration.json file
clone-projects - Clones the projects mentioned in the input file into /pipeline/mining-sources directory
info - Shows information about the available commands
pipeline>
A successful execution should land you in a bash shell and print information about all the available commands. Now the pipeline is ready!
Note: The mining phase (including downloading the input projects) takes a while. If you want to simply see what the output of the mining step is, please head over to the Validation section.
To be able to mine candidate annotation usage rules (a.k.a. candidate rules), we need two inputs:
- a set of Java projects that we will use for mining
- a set of JAR files for resolving the types that we are interested in
You can provide your own set of Java projects for mining (input 1) as well, but for the purpose of this demo, we will clone some predefined MicroProfile projects. To do this, please issue the following command:
pipeline> clone-projects --file /pipeline/examples/example_projects.txt
This will download all the projects defined in the example_projects.txt file into the /pipeline/mining-sources directory.
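Conceptually, clone-projects iterates over the input file and clones each repository. The sketch below is our own illustration, not the tool's actual implementation: it assumes the file lists one Git URL per line (the real format of example_projects.txt may differ), and it prints the clone commands instead of executing them.

```shell
# Create a tiny sample projects file so this sketch is self-contained
# (assumed format: one Git repository URL per line).
cat > /tmp/example_projects.txt <<'EOF'
https://github.com/eclipse/microprofile-config.git
https://github.com/eclipse/microprofile-health.git
EOF

while IFS= read -r url; do
  [ -z "$url" ] && continue              # skip blank lines
  name=$(basename "$url" .git)           # derive a directory name from the URL
  echo "git clone $url /pipeline/mining-sources/$name"   # echo stands in for the real clone
done < /tmp/example_projects.txt
```

Replacing echo with the actual git clone would populate /pipeline/mining-sources the way the command does.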
Next, we need to download the JAR files. To do this, please issue the following command:
pipeline> download-jars
This will download all the JAR files mentioned in the configuration.json file into the /pipeline/lib-sources directory.
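As a rough sketch of what this step amounts to: fetch each JAR into /pipeline/lib-sources. The URL below is an illustrative assumption of ours (the actual structure of configuration.json and the set of JARs it references are not shown here), and echo stands in for the real download.

```shell
# Illustrative jar URL list (assumption: configuration.json ultimately
# resolves to downloadable jar URLs; the real mechanism may differ).
jar_urls='
https://repo1.maven.org/maven2/org/eclipse/microprofile/config/microprofile-config-api/2.0/microprofile-config-api-2.0.jar
'

for url in $jar_urls; do
  file=$(basename "$url")
  echo "curl -L -o /pipeline/lib-sources/$file $url"   # echo stands in for downloading
done
```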
Now, we are ready to mine! To mine candidate rules, we can simply use the following command:
pipeline> mine
Mining candidate rules might take a while, so we also provide an example output file that can be used in the validation process described in the next section.
The next step after mining is to validate the candidate annotation usage rules. Usually, once the mining step is done, it is enough to issue the validate command to use the newly produced candidate rules for validation.
However, if you have skipped the mining section and want to see how the validation works, we provide an example file with 6 candidate rules. To validate these rules, issue the following command:
pipeline> validate --file /pipeline/examples/candidate_rules_example.json
Using the following rules file for the validation: /pipeline/examples/candidate_rules_example.json
No username has been provided, generating a random one...
==========================================================
Successfully loaded the mined candidate rules!
To start validating candidate rules, please head over to:
http://localhost:8888
Username: magnetic-gallery
==========================================================
Now, all you need to do is go to http://localhost:8888, log in with the provided username, and start validating the candidate rules. The landing page will look something like this:
and once logged in, the rule validation page should look like this:
To get familiar with the validation tool and the domain-specific language used for validating the rules, please head over to the tutorial page (it is also accessible from the UI using the question mark (?) button at the top right corner).
Once you're done with validation (you do not necessarily need to validate all the candidate rules), you can simply close the browser tab.
Once the validation is done, we can move on to building the detector, which consists of two steps:
- Exporting the confirmed rules
- Building the detector
Please note that if you have skipped the mining step entirely and have not yet downloaded the JAR files, issue the following command first, as those JARs are required for building the detector:
pipeline> download-jars
To perform both steps, simply run the command:
pipeline> export-rules && build-detector
However, if you simply want to build the detector from some predefined confirmed rules, issue the following command:
pipeline> build-detector --file /pipeline/examples/confirmed_rules_example.json
The predefined file contains 2 confirmed rules, named Rule-Foo and Rule-Bar.
Regardless of the rules file you selected, after a successful build-detector execution, head over to the /pipeline/exports/detector directory, where you'll find install-plugin.sh alongside a JAR file of the detector. The /pipeline/exports directory is a volume; by default it is mounted to a directory called exports in the project root on the host machine. You can try installing the detector on your host machine using the provided script. Simply running install-plugin.sh should suffice.
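For orientation, a common way such a script registers a locally built Maven plugin is mvn install:install-file. The coordinates below reuse the groupId and artifactId from the scan command shown later, but the jar file name and version are assumptions of ours, and echo stands in for the actual invocation; install-plugin.sh may well work differently.

```shell
# Hypothetical sketch of installing the detector jar into the local
# Maven repository (jar name and version are assumed, not confirmed).
mvn_cmd="mvn install:install-file -Dfile=violation-detector-maven-plugin.jar -DgroupId=ca.ualberta -DartifactId=violation-detector-maven-plugin -Dversion=1.0.0 -Dpackaging=maven-plugin"
echo "$mvn_cmd"   # echo stands in for running Maven
```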
If you have built the detector using the predefined confirmed rules, you can install the plugin within the pipeline and test it on a dummy project. To do this, first install the plugin (1), then go to the dummy project directory (2), and issue the scanning task (3):
pipeline> cd /pipeline/exports/detector
pipeline> ./install-plugin.sh # (1)
# it will install the plugin
pipeline> cd /pipeline/examples/example_project # (2)
pipeline> mvn ca.ualberta:violation-detector-maven-plugin:scan # (3)
After a successful execution, it should print out the misuses of Rule-Foo.
File an issue on this repo and we will get back to you as soon as possible.
All credit related to RulePad goes to Sahar Mehrpour.