Code repository for the PrivateNLP 2024 paper: "A Collocation-based Method for Addressing Challenges in Word-level Metric Differential Privacy"

sjmeis/CLMLDP

Collocation Extractor

Code for the PrivateNLP 2024 paper: A Collocation-based Method for Addressing Challenges in Word-level Metric Differential Privacy

Getting Started

In this repository, you will find the following items:

  • CollocationExtractor.py: class code for the GST and MST algorithms, as described in the paper
  • MLDP.py: code for running MLDP mechanisms, based on the code from this code repository
  • data: folder containing the data for bigram and trigram collocation extraction

In addition, we make our two trained embedding models public. Because of their large size, they are not stored in this repository and must be downloaded from the following link: https://drive.google.com/drive/folders/1b_2QNSBBtmCuUAOLrQ2ZK41cKqHY-o-s?usp=sharing

Note that each embedding model consists of two files, and both are required. To load a model, use KeyedVectors from the gensim package.
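
As a rough illustration of the loading step, the sketch below checks that a model's second file sits next to the main file and then loads it with gensim. It rests on assumptions that this README does not confirm: that the models were saved in gensim's native format (where large arrays are stored in a sidecar file such as `<name>.vectors.npy`, which is why both files must stay in the same folder), and that gensim 4.x is installed. The helper names `companion_files` and `load_embeddings` and the example path are hypothetical and are not part of this repository.

```python
from pathlib import Path


def companion_files(model_path):
    """List files next to model_path whose names extend it, e.g. '<name>.vectors.npy'.

    Assumption: the model was written with gensim's native save(), which
    places large arrays in sidecar files beside the main file.
    """
    model_path = Path(model_path)
    prefix = model_path.name + "."
    return sorted(
        p for p in model_path.parent.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )


def load_embeddings(model_path):
    """Load embeddings with gensim after checking that both files are present."""
    model_path = Path(model_path)
    if not model_path.is_file():
        raise FileNotFoundError(f"main model file not found: {model_path}")
    if not companion_files(model_path):
        raise FileNotFoundError(f"no companion file found next to {model_path}")

    # Imported here so the file checks above run even without gensim installed.
    from gensim.models import KeyedVectors

    # Assumption: native gensim format. Files in word2vec text/binary format
    # would need KeyedVectors.load_word2vec_format(...) instead.
    return KeyedVectors.load(str(model_path))


# Example usage (placeholder path, replace with the downloaded file name):
# kv = load_embeddings("models/embedding_model")
# print(kv.vector_size)         # dimensionality of the vectors
# print(kv.index_to_key[:10])   # first ten vocabulary entries (gensim 4.x)
```

If loading fails with a missing-file error from gensim itself, the most likely cause under these assumptions is that only one of the two files was downloaded or that the files were renamed independently.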
