forked from uber-archive/sql-differential-privacy
Commit
Noah Johnson committed Jul 1, 2017
0 parents, commit 9ab7d87
Showing 62 changed files with 7,182 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
*.class | ||
|
||
# Eclipse | ||
.classpath | ||
.project | ||
.settings/ | ||
|
||
# Intellij | ||
.idea/ | ||
*.iml | ||
*.iws | ||
|
||
# Mac | ||
.DS_Store | ||
|
||
# Maven | ||
dependency-reduced-pom.xml | ||
target/ | ||
|
||
log/ | ||
tmp/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Copyright (c) 2017 Uber Technologies, Inc. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# Overview | ||
|
||
This repository contains a full implementation of a differential privacy mechanism for SQL queries using elastic sensitivity, | ||
including the SQL analysis framework used to build it. | ||
|
||
|
||
Elastic sensitivity is an approach for efficiently approximating the local sensitivity of a query, which can be used to | ||
enforce differential privacy for the query. The approach requires only a static analysis of the query and therefore | ||
imposes minimal performance overhead. Importantly, it does not require any changes to the database. | ||
Details of the approach are available in [this paper](https://arxiv.org/abs/1706.09479). | ||
|
||
The framework for implementing elastic sensitivity is designed to perform dataflow analyses over complex SQL queries. | ||
It provides an abstract representation of queries, plus several kinds of built-in dataflow analyses tailored to this | ||
representation. This framework can be used to implement other types of dataflow analyses and will soon support additional differential privacy mechanisms for SQL. | ||
|
||
## Building & Running | ||
|
||
This framework is written in Scala and built using Maven. To build the code: | ||
|
||
``` | ||
$ mvn package | ||
``` | ||
|
||
## Example: Differential Privacy using Elastic Sensitivity | ||
|
||
Elastic sensitivity can be used to determine the scale of random noise necessary to make the results of a query | ||
differentially private. For a given output column of a query with elastic sensitivity *s*, achieving differential | ||
privacy for that column takes two steps. First, *smooth* *s* to obtain *S*, using the smooth sensitivity approach | ||
introduced by [Nissim et al](http://www.cse.psu.edu/~ads22/pubs/NRS07/NRS07-full-draft-v1.pdf). Second, add random noise | ||
drawn from the Laplace distribution, centered at 0 and scaled to *(S/epsilon)*, to the true result of the query. | ||
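
The noise-addition step can be sketched in a few lines of Scala. This is only an illustration, not the code in `DPExample`: the names `LaplaceNoiseSketch`, `sampleLaplace`, and `addNoise` are hypothetical, and the sketch assumes the smoothed sensitivity *S* has already been computed. It samples Laplace noise as the scaled difference of two independent exponential draws.

```scala
import scala.util.Random

// Hypothetical helper, not part of this repository's API.
object LaplaceNoiseSketch {
  /** Draws one sample from a Laplace distribution centered at 0 with the given scale.
    * Uses the fact that the difference of two independent Exp(1) variables is Laplace(0, 1).
    */
  def sampleLaplace(scale: Double, rng: Random): Double = {
    require(scale > 0, "scale must be positive")
    // 1.0 - nextDouble() lies in (0, 1], so the logarithms below are always finite.
    val u1 = 1.0 - rng.nextDouble()
    val u2 = 1.0 - rng.nextDouble()
    scale * (math.log(u1) - math.log(u2))
  }

  /** Adds Laplace noise scaled to (S / epsilon) to the true result of one output column.
    * Assumes smoothedSensitivity (S) was obtained separately by smoothing the elastic sensitivity.
    */
  def addNoise(trueResult: Double, smoothedSensitivity: Double, epsilon: Double, rng: Random): Double = {
    require(epsilon > 0, "epsilon must be positive")
    trueResult + sampleLaplace(smoothedSensitivity / epsilon, rng)
  }
}
```

For instance, `LaplaceNoiseSketch.addNoise(1000.0, 2.0, 0.1, new Random())` would perturb a true count of 1000 with noise of scale 20. A plain floating-point sampler like this one is meant for exposition only.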
|
||
Example code demonstrating this approach is available in `com.uber.engsec.dp.util.DPExample`. | ||
|
||
To run this example: | ||
``` | ||
mvn exec:java -Dexec.mainClass="com.uber.engsec.dp.util.DPExample" | ||
``` | ||
|
||
|
||
## Analysis Framework | ||
|
||
This framework can perform additional analyses on SQL queries, and can be extended with new analyses. | ||
Each analysis in this framework extends the base class `com.uber.engsec.dp.sql.AbstractAnalysis`. | ||
|
||
To run an analysis on a query, call the method `com.uber.engsec.dp.sql.AbstractAnalysis.analyzeQuery`. | ||
The parameter of this method is a string containing a SQL query, and its return value is an abstract domain representing | ||
the results of the analysis. | ||
|
||
The source code includes several example analyses to demonstrate features of the framework. The simplest example is `com.uber.engsec.dp.analysis.taint.TaintAnalysis`, which returns an abstract domain containing information about which output columns of the query might contain data flowing from "tainted" columns in the database. The database schema determines which columns are tainted. You can invoke this analysis as follows: | ||
|
||
```scala | ||
scala> (new com.uber.engsec.dp.analysis.taint.TaintAnalysis).analyzeQuery("SELECT my_col1 FROM my_table") | ||
BooleanDomain = my_col1 -> False | ||
``` | ||
|
||
This code includes several built-in analyses, including: | ||
|
||
- The elastic sensitivity analysis, available in `com.uber.engsec.dp.analysis.differential_privacy.ElasticSensitivityAnalysis`, returns an abstract domain (`com.uber.engsec.dp.analysis.differential_privacy.SensitivityDomain`) that maps each output column of the query to its elastic sensitivity. | ||
- `com.uber.engsec.dp.analysis.columns_used.ColumnsUsedAnalysis` lists the original database columns | ||
from which the results of each output column are computed. | ||
- `com.uber.engsec.dp.analysis.histogram.HistogramAnalysis` reports, for each output column of the query, | ||
whether the column is an aggregation and, if so, which type. | ||
- `com.uber.engsec.dp.analysis.join.JoinKeysUsed` lists the original database columns used as equijoin | ||
keys for each output column of the query. | ||
|
||
## Writing New Analyses | ||
|
||
New analyses can be implemented by extending one of the abstract analysis classes and implementing *transfer functions* | ||
which describe how to update the analysis state for relevant query constructs. Analyses are written to update a | ||
specific type of *abstract domain* which represents the current state of the analysis. Each abstract domain type | ||
implements the trait `com.uber.engsec.dp.dataflow.AbstractDomain`. | ||
|
||
The simplest way to implement a new analysis is to use `com.uber.engsec.dp.dataflow.column.AbstractColumnAnalysis`, | ||
which automatically tracks analysis state for each column of the query independently. Most of the example analyses are | ||
of this type. | ||
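
The relationship between an abstract domain, transfer functions, and a per-column analysis can be sketched with a self-contained toy model. Every type below (`Domain`, `StringSetDomain`, `Node`, `TableColumn`, `Expr`, `ToyColumnAnalysis`, `ToyColumnsUsed`) is a hypothetical, simplified stand-in written for illustration; none of it is the framework's actual API, which operates on a richer query representation.

```scala
// Hypothetical stand-in for an abstract domain: a bottom element plus a join operation.
trait Domain[E] {
  def bottom: E
  def join(a: E, b: E): E
}

// A set-of-strings domain: bottom is the empty set, join is set union.
object StringSetDomain extends Domain[Set[String]] {
  val bottom: Set[String] = Set.empty
  def join(a: Set[String], b: Set[String]): Set[String] = a union b
}

// Toy query tree describing how one output column is computed.
sealed trait Node
case class TableColumn(table: String, column: String) extends Node
case class Expr(inputs: List[Node]) extends Node // e.g. a function or aggregation over its inputs

// Toy analysis base class: walks the tree bottom-up, joins the facts of child nodes,
// and applies the transfer function for each node type. Defaults leave the fact unchanged.
abstract class ToyColumnAnalysis[E](domain: Domain[E]) {
  def transferTableColumn(c: TableColumn, fact: E): E = fact
  def transferExpr(e: Expr, fact: E): E = fact

  def analyze(node: Node): E = node match {
    case c: TableColumn => transferTableColumn(c, domain.bottom)
    case e: Expr =>
      val joined = e.inputs.map(analyze).foldLeft(domain.bottom)(domain.join)
      transferExpr(e, joined)
  }
}

// A concrete analysis overrides only the transfer functions it cares about.
class ToyColumnsUsed extends ToyColumnAnalysis[Set[String]](StringSetDomain) {
  override def transferTableColumn(c: TableColumn, fact: Set[String]): Set[String] =
    fact + s"${c.table}.${c.column}"
}
```

Applying `(new ToyColumnsUsed).analyze(...)` to an `Expr` over `my_table.col1` and `my_table.col2` yields the set of both qualified column names. The design point is that the traversal and the joining of facts live in the base class, so a new analysis only states how each construct changes the fact.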
|
||
New analyses can be invoked in the same way as the built-in example analyses. | ||
|
||
## Reporting Security Bugs | ||
|
||
Please report security bugs through [HackerOne](https://hackerone.com/uber). | ||
|
||
## License | ||
|
||
This project is released under the MIT License. | ||
|
||
## Contact Information | ||
|
||
This project is developed and maintained by Noah Johnson and Joe Near. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,148 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
~ Copyright (c) 2017 Uber Technologies, Inc. | ||
~ | ||
~ Permission is hereby granted, free of charge, to any person obtaining a copy | ||
~ of this software and associated documentation files (the "Software"), to deal | ||
~ in the Software without restriction, including without limitation the rights | ||
~ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
~ copies of the Software, and to permit persons to whom the Software is | ||
~ furnished to do so, subject to the following conditions: | ||
~ | ||
~ The above copyright notice and this permission notice shall be included in | ||
~ all copies or substantial portions of the Software. | ||
~ | ||
~ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
~ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
~ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
~ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
~ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
~ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
~ THE SOFTWARE. | ||
--> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" | ||
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
|
||
<groupId>com.uber.engsec</groupId> | ||
<artifactId>sql-differential-privacy</artifactId> | ||
<version>1.0-SNAPSHOT</version> | ||
<packaging>jar</packaging> | ||
|
||
<name>sql-differential-privacy</name> | ||
<description>Differential privacy for SQL queries</description> | ||
<url>https://github.com/uber/sql-differential-privacy</url> | ||
|
||
<licenses> | ||
<license> | ||
<name>MIT License</name> | ||
<url>http://www.opensource.org/licenses/mit-license.php</url> | ||
</license> | ||
</licenses> | ||
|
||
<properties> | ||
<maven.compiler.source>1.8</maven.compiler.source> | ||
<maven.compiler.target>1.8</maven.compiler.target> | ||
<encoding>UTF-8</encoding> | ||
<scala.version>2.12.1</scala.version> | ||
<scala.compat.version>2.12</scala.compat.version> | ||
</properties> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>org.scala-lang</groupId> | ||
<artifactId>scala-library</artifactId> | ||
<version>${scala.version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>com.facebook.presto</groupId> | ||
<artifactId>presto-parser</artifactId> | ||
<version>0.148</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>com.fasterxml.jackson.dataformat</groupId> | ||
<artifactId>jackson-dataformat-csv</artifactId> | ||
<version>2.9.0.pr3</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>com.fasterxml.jackson.dataformat</groupId> | ||
<artifactId>jackson-dataformat-yaml</artifactId> | ||
<version>2.9.0.pr3</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>com.fasterxml.jackson.module</groupId> | ||
<artifactId>jackson-module-scala_${scala.compat.version}</artifactId> | ||
<version>2.9.0.pr3</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.apache.calcite</groupId> | ||
<artifactId>calcite-core</artifactId> | ||
<version>1.13.0</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.slf4j</groupId> | ||
<artifactId>slf4j-api</artifactId> | ||
<version>1.7.5</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.slf4j</groupId> | ||
<artifactId>slf4j-simple</artifactId> | ||
<version>1.7.5</version> | ||
</dependency> | ||
|
||
<!-- Test --> | ||
<dependency> | ||
<groupId>junit</groupId> | ||
<artifactId>junit</artifactId> | ||
<version>4.12</version> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
|
||
<build> | ||
<sourceDirectory>src/main/scala</sourceDirectory> | ||
<testSourceDirectory>src/test/scala</testSourceDirectory> | ||
<plugins> | ||
<plugin> | ||
<groupId>net.alchim31.maven</groupId> | ||
<artifactId>scala-maven-plugin</artifactId> | ||
<version>3.2.0</version> | ||
<executions> | ||
<execution> | ||
<goals> | ||
<goal>compile</goal> | ||
<goal>testCompile</goal> | ||
</goals> | ||
<configuration> | ||
<args> | ||
<arg>-dependencyfile</arg> | ||
<arg>${project.build.directory}/.scala_dependencies</arg> | ||
</args> | ||
</configuration> | ||
</execution> | ||
</executions> | ||
</plugin> | ||
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-surefire-plugin</artifactId> | ||
<version>2.18.1</version> | ||
<configuration> | ||
<useFile>false</useFile> | ||
<disableXmlReport>true</disableXmlReport> | ||
<!-- If you have classpath issues such as NoClassDefFoundError, ... --> | ||
<!-- useManifestOnlyJar>false</useManifestOnlyJar --> | ||
<includes> | ||
<include>**/*Test.class</include> | ||
</includes> | ||
</configuration> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
</project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
databases: | ||
- database: "my_database" | ||
  dialect: "postgres" | ||
  tables: | ||
  - name: "my_table" | ||
    columns: | ||
    - name: "col1" | ||
    - name: "col2" | ||
|
35 changes: 35 additions & 0 deletions
src/main/scala/com/uber/engsec/dp/analysis/columns_used/ColumnsUsedAnalysis.scala
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
/* | ||
* Copyright (c) 2017 Uber Technologies, Inc. | ||
* | ||
* Permission is hereby granted, free of charge, to any person obtaining a copy | ||
* of this software and associated documentation files (the "Software"), to deal | ||
* in the Software without restriction, including without limitation the rights | ||
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
* copies of the Software, and to permit persons to whom the Software is | ||
* furnished to do so, subject to the following conditions: | ||
* | ||
* The above copyright notice and this permission notice shall be included in | ||
* all copies or substantial portions of the Software. | ||
* | ||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
* THE SOFTWARE. | ||
*/ | ||
|
||
package com.uber.engsec.dp.analysis.columns_used | ||
import com.uber.engsec.dp.dataflow.column.DataflowGraphColumnAnalysis | ||
import com.uber.engsec.dp.dataflow.domain.SetDomain | ||
import com.uber.engsec.dp.sql.dataflow_graph.relation.DataTable | ||
|
||
/** Returns the set of all data table columns influencing each output column. | ||
  */ | ||
class ColumnsUsedAnalysis extends DataflowGraphColumnAnalysis(new SetDomain[String]) { | ||
  // Transfer function for data tables: add the table-qualified column name to the current fact. | ||
  override def transferDataTable(d: DataTable, idx: Int, fact: Set[String]): Set[String] = { | ||
    val qualifiedColName = s"${d.name}.${d.getColumnName(idx)}" | ||
    fact + qualifiedColName | ||
  } | ||
} |