ducalpha/PurPlianceOpenSource

PurPliance

This repository contains the PurPliance component that extracts privacy-statement tuples from privacy-policy sentences.

Run code

Download and extract the NER model

Download the NER model archive en_core_web_lg.high_f1_data_org.model.tar.xz, place it in src/oppnlp/analyze/pded/models/, and extract it there:

src/oppnlp/analyze/pded/models $ tar -xvf en_core_web_lg.high_f1_data_org.model.tar.xz
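On systems without a `tar` binary that handles `.xz`, the same extraction can be sketched with Python's standard `tarfile` module. The helper name `extract_model` is hypothetical and not part of this repository; only the archive name and target directory come from the instructions above.

```python
import tarfile
from pathlib import Path


def extract_model(archive_path, dest_dir):
    """Extract a .tar.xz archive into dest_dir and return its member names.

    Hypothetical helper for illustration; it is not part of PurPliance.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:xz" opens an LZMA-compressed tar archive for reading.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        names = archive.getnames()
        archive.extractall(path=dest)
    return names


# Example call from the repository root (paths as given above):
#   models_dir = "src/oppnlp/analyze/pded/models"
#   extract_model(models_dir + "/en_core_web_lg.high_f1_data_org.model.tar.xz",
#                 models_dir)
```

As with `tar -xvf`, this should only be run on an archive obtained from a trusted source.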

Set up dependencies and run test code

To run the code, use a virtual environment and run the test as follows:

# Create a new conda Python environment.
conda create -n purpliance_oss python=3.8
conda activate purpliance_oss

# Install the current package.
pip install -e .
pip install -r requirements.txt

# Test: Extract privacy statements from each file in test/policies and output
# them as JSON files in test/policies/stmt.
bash test/test.sh

# Extract privacy statements from each file in $INPUT_DIR and output them as
# JSON files in $INPUT_DIR/stmt.
# INPUT_DIR contains the plain, sentencized text files (*.txt) to be analyzed.
# Each non-blank line in a file should contain exactly one sentence.
python src/runner/analyze/priv_stmt/run_priv_stmt_extractor.py "$INPUT_DIR"
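The extractor expects input files with exactly one sentence per non-blank line. A minimal sketch of preparing such a file from raw policy text follows. It uses a naive punctuation rule rather than whatever sentence segmentation PurPliance itself relies on, and the names `split_sentences` and `write_policy_file` are hypothetical.

```python
import re
from pathlib import Path

# Naive boundary: sentence-ending punctuation followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using a naive punctuation rule."""
    # Collapse newlines and repeated spaces so each sentence fits on one line.
    flat = " ".join(text.split())
    return [s for s in _SENTENCE_BOUNDARY.split(flat) if s]


def write_policy_file(text, out_path):
    """Write one sentence per line to out_path and return the sentence count."""
    sentences = split_sentences(text)
    Path(out_path).write_text("\n".join(sentences) + "\n", encoding="utf-8")
    return len(sentences)
```

This rule splits incorrectly on abbreviations such as "e.g." or "Inc.", so a proper sentence segmenter is likely preferable for real policies. The sketch only illustrates the expected file layout.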

The code was tested on Ubuntu 18.04 and macOS 12.6.

Publication

Duc Bui, Yuan Yao, Kang G. Shin, Jong-Min Choi, and Junbum Shin.
Consistency Analysis of Data-Usage Purposes in Mobile Apps.
In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS '21).
Association for Computing Machinery, New York, NY, USA, 2824–2843.

License

PurPliance is licensed under the BSD-3-Clause License (see LICENSE.txt).

Acknowledgement

This repository uses code from the PolicyLint/PoliCheck repository.
