Merge pull request #4 from vmware/master
pull from upstream
remysucre authored Jul 3, 2019
2 parents 1ec1c54 + c66f104 commit 84530c4
Showing 27 changed files with 430,543 additions and 799 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -33,7 +33,7 @@ env:
- TOOL=stack TEST_SUITE='-p path'
- TOOL=stack TEST_SUITE='-p ovn'
- TOOL=stack TEST_SUITE='-p modules'
- TOOL=stack TEST_SUITE='-p span_string'
- TOOL=stack TEST_SUITE='-p redist'

#addons: {apt: {packages: [ghc-8.2.2], sources: [hvr-ghc]}}

2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
[![Build Status](https://travis-ci.com/ryzhyk/differential-datalog.svg?branch=master)](https://travis-ci.com/ryzhyk/differential-datalog)
[![Build Status](https://travis-ci.com/vmware/differential-datalog.svg?branch=master)](https://travis-ci.com/vmware/differential-datalog)

# Differential Datalog (DDlog)

143 changes: 118 additions & 25 deletions doc/tutorial/tutorial.md
@@ -729,7 +729,7 @@ The `not` operator in the last line eliminates all endpoint values
that appear in `Blacklisted`. In database terminology this is known
as *antijoin*.

### Assignments in rules
#### Assignments in rules

We can directly use assignments in rules:

@@ -1459,41 +1459,51 @@ library [`std.dl`](../../lib/std.dl), which defines types like `Vec`, `Set`, `Ma
This library is imported automatically into every DDlog program; therefore the path to the
`lib` directory must always be specified using the `-L` switch.

## Advanced topics
## Using DDlog programs as libraries

**TODO** probably these should be moved out of the tutorial into a more
detailed reference document.
The text-based interface to DDlog, described so far, is primarily intended for
testing and debugging purposes. In production use, a DDlog program is typically
invoked as a library from a program written in a different language (e.g., C or Java).

### Using DDlog programs as libraries
Compile your program as explained [above](#compiling-the-hello-world-program).
Let's assume your program is `playpen.dl`.
When compilation completes, the `playpen_ddlog` directory will be created,
containing the generated Rust crate for the DDlog program. You can invoke this
crate from programs written in Rust, C/C++, Java, or any other language that
can call DDlog's C API through a foreign function interface.

Place your program in the `test/datalog_tests` folder. Let's assume your
program is `playpen.dl`.
Run `stack test --ta '-p playpen'` to compile the `playpen` program.
When compilation completes the following artifacts are
produced in the same directory with the input file:

1. Three Rust packages (or "crates") in separate directories:
* `./differential_dataflow/`
* `./cmd_parser/`
* `./playpen/` (this is the main crate, which imports the other two)
1. **Rust**
In order to use the generated crate directly from another Rust program,
add it as a dependency to your `Cargo.toml`, e.g.,
```
[dependencies]
playpen = {path = "../playpen_ddlog"}
```
And invoke it through the API defined in the `./playpen_ddlog/api.rs` file.

1. If you plan to use this library directly from a Rust program, have a look at the
`./playpen/lib.rs` file, which contains the Rust API to DDlog.
**TODO: link to a separate API document**

**TODO: link to a separate document explaining the structure and API of the Rust project**
1. **C/C++**
The compiled program will contain a static library `playpen_ddlog/target/release/libplaypen_ddlog.a`
that can be linked against your C or C++ application. The C API to DDlog is defined in
`playpen_ddlog/ddlog.h`. To generate a dynamic library (a `.so` file, or `.dylib`
on a Mac) instead, pass the `--dynlib` flag to `ddlog`,
along with `--no-staticlib` to disable generation of the static library.

1. If you plan to use the library from a C/C++ program, your program must link against the
`./playpen/target/release/libplaypen.so` library, which wraps the DDlog program into a C API. This
API is declared in the auto-generated `./playpen/playpen.h` header file.
1. **Java**
If you plan to use the library from a Java program, make sure to use the
`--dynlib` and `--no-staticlib` flags to generate a dynamically linked library.
The Java bindings to the DDlog API are implemented by classes in the `java/ddlogapi`
directory. See an example Java project in `java/test`.

**TODO: link to a separate document explaining the use of the C FFI**
**TODO: link to a separate API document**

1. The text-based interface is implemented by an
auto-generated executable `./playpen/target/release/playpen_cli`. This interface is
auto-generated executable `./playpen_ddlog/target/release/playpen_cli`. This interface is
primarily meant for testing and debugging purposes, as it does not offer the same performance and
flexibility as the API-based interfaces.

### Input/output to DDlog
## Input/output to DDlog

DDlog offers several ways to feed data to a program:

@@ -1509,7 +1519,7 @@ DDlog offers several ways to feed data to a program:

In the following sections, we expand on each method.

#### Specifying ground facts statically in the program source code
### Specifying ground facts statically in the program source code

This method is useful for specifying ground facts that are guaranteed to hold in every instantiation
of the program. Such facts can be specified statically to save the hassle of adding them manually
@@ -1520,3 +1530,86 @@ declarations to the program to pre-populate `Word1` and `Word2` relations:
Word1("Hello,", CategoryOther).
Word2("world!", CategoryOther).
```

## Profiling

DDlog's profiling features are designed to help the programmer understand
which parts of the DDlog program use the most CPU and memory. DDlog supports two
commands related to profiling (also available through the Rust, C, and Java APIs):

1. `profile cpu on/off;` - enables/disables recording of CPU usage info in
addition to memory profiling. CPU profiling is not enabled by default, as
it can slow down the program somewhat, especially for large programs
that handle many small updates.

1. `profile;` - returns information about the program's CPU and memory usage. CPU
usage is expressed as the total amount of time DDlog spent evaluating each operator,
assuming CPU profiling was enabled. For example, the following CPU profile
record:
```
CPU profile
...
0s005281us ( 112calls) Join: DdlogDependency(.parent=parent, .child=child), LabeledNode(.node=parent, .scc=parentscc), LabeledNode(.node=child, .scc=childscc) 165
...
```
indicates that the program spent `5,281` microseconds in 112 activations of the
join operator that joins the prefix of the rule (`DdlogDependency(.parent=parent, .child=child), LabeledNode(.node=parent, .scc=parentscc)`)
with the `LabeledNode(.node=child, .scc=childscc)` literal.
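As a rough illustration of how such a record can be read programmatically, here is a minimal Java sketch. It assumes the layout `<seconds>s<microseconds>us (<n>calls) <operator>`, which is inferred only from the single sample line above and is not a documented format; the class name `CpuProfileRecord` and its methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One CPU profile record; the layout is an assumption inferred from the sample above. */
final class CpuProfileRecord {
    // Assumed layout: "<seconds>s<microseconds>us (<n>calls) <operator description>".
    private static final Pattern RECORD =
        Pattern.compile("^\\s*(\\d+)s(\\d+)us\\s*\\(\\s*(\\d+)calls\\)\\s*(.*)$");

    final long totalMicros;  // total time spent in the operator, in microseconds
    final long calls;        // number of operator activations
    final String operator;   // remainder of the line, kept verbatim

    private CpuProfileRecord(long totalMicros, long calls, String operator) {
        this.totalMicros = totalMicros;
        this.calls = calls;
        this.operator = operator;
    }

    /** Returns the parsed record, or null if the line does not match the assumed layout. */
    static CpuProfileRecord parse(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.matches()) {
            return null;
        }
        long micros = Long.parseLong(m.group(1)) * 1_000_000L + Long.parseLong(m.group(2));
        return new CpuProfileRecord(micros, Long.parseLong(m.group(3)), m.group(4).trim());
    }

    /** Average time per activation, in microseconds (0 when there were no calls). */
    double microsPerCall() {
        return calls == 0 ? 0.0 : (double) totalMicros / calls;
    }

    public static void main(String[] args) {
        String sample = "0s005281us (     112calls)     Join: DdlogDependency(.parent=parent, .child=child), "
            + "LabeledNode(.node=parent, .scc=parentscc), LabeledNode(.node=child, .scc=childscc) 165";
        CpuProfileRecord r = parse(sample);
        System.out.printf("%d us in %d calls (%.1f us/call)%n",
            r.totalMicros, r.calls, r.microsPerCall());
    }
}
```

Dividing the total time by the number of activations can help tell an operator that is expensive per call from one that is simply invoked often.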

The memory profile reports the current (at the time the profile is generated)
and peak (since the start of the program) number of records in each DDlog
*arrangement*. An arrangement is similar to an indexed representation of a
relation in databases. Arrangements are responsible for the majority of memory
consumption of a DDlog program. For example, the following memory profile
fragment:
```
Arrangement peak sizes
...
451529 Arrange: LabeledNode{.node=_, .scc=_0} 136
372446 Arrange: LabeledNode{.node=_0, .scc=_} 132
```
indicates that the program contains two different arrangements of the `LabeledNode`
relation, indexed by the second and first fields, whose peak sizes are
451,529 and 372,446 records respectively (the numbered variables, e.g., `_0`,
indicate the fields used to index the relation).
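Since arrangements dominate memory use, it can be handy to rank them by peak size. The following Java sketch does that under the assumption that each data line starts with an integer record count followed by a description, as in the fragment above; header and `...` lines are skipped. The class name `ArrangementSizes` is hypothetical, and the trailing number on each line is kept as part of the description because its meaning is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Ranks memory-profile lines by record count; the line layout is an assumption. */
final class ArrangementSizes {
    static final class Entry {
        final long records;        // leading integer on the line
        final String arrangement;  // rest of the line, kept verbatim

        Entry(long records, String arrangement) {
            this.records = records;
            this.arrangement = arrangement;
        }
    }

    /** Parses "<count> <description>" lines and returns them sorted by count, largest first. */
    static List<Entry> parse(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String line : lines) {
            String t = line.trim();
            int sp = t.indexOf(' ');
            if (sp <= 0) {
                continue;  // no separator: not a data line (e.g. "...")
            }
            try {
                out.add(new Entry(Long.parseLong(t.substring(0, sp)), t.substring(sp + 1).trim()));
            } catch (NumberFormatException e) {
                // First token is not a number (e.g. a section header): skip the line.
            }
        }
        out.sort(Comparator.comparingLong((Entry e) -> e.records).reversed());
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = parse(Arrays.asList(
            "Arrangement peak sizes",
            "...",
            "372446 Arrange: LabeledNode{.node=_0, .scc=_} 132",
            "451529 Arrange: LabeledNode{.node=_, .scc=_0} 136"));
        for (Entry e : entries) {
            System.out.println(e.records + "\t" + e.arrangement);
        }
    }
}
```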

## Replay debugging

When using DDlog as a library, it may be difficult to isolate bugs in the DDlog
program from those in the application using DDlog. Replay debugging is a DDlog
feature that intercepts all DDlog invocations made by a program and dumps them
in a DDlog command file that can later be replayed against the DDlog program
running standalone, via the CLI. This has several advantages. First,
it allows the user to inspect DDlog inputs in human-readable form. Second, one
can easily modify the recorded command file, e.g., to dump relations of interest,
to print profiling information at various points throughout the execution, or
to simplify inputs in order to narrow down the search for an error. Third, one can
replay the log against a modified DDlog program, e.g., to check that the program
behaves correctly after a bug fix. As another example, it may be useful to dump
the contents of intermediate relations by labeling them as `output relation` and
adding a `dump <relation_name>;` command to the command file.

Finally, by replaying recorded commands, one can obtain a relatively accurate
estimate of the amount of CPU time and memory spent in the DDlog computation
in isolation from the host program. For example, here we use the UNIX `time`
program (note: this is not the same as the `time` command in `bash`)
to measure the time and memory footprint of a DDlog computation:
```
/usr/bin/time playpen_ddlog/target/release/playpen_cli -w 2 --no-print --no-store < replay.dat
```
where `-w 2` runs DDlog with two worker threads,
`--no-print` stops DDlog from printing every update to output tables on `stdout`,
`--no-store` tells DDlog not to cache the content of output relations (which takes
time and memory), and
`replay.dat` is the name of the file that contains recorded DDlog commands.

To enable replay debugging, call the `ddlog_record_commands()` function in C (see
`ddlog.h`), the `DDlogAPI.record_commands()` method in Java (`DDlogAPI.java`), or
the `HDDlog.record_commands()` method in Rust right after starting the
DDlog program, and before pushing any data to it.
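If recording has to start after data has already been pushed, the `SpanTest.java` change in this commit shows one approach: write a `start;` command, dump the current input snapshot in append mode, write `commit;`, and only then start recording in append mode. The Java sketch below mirrors that sequence. `CommandRecorder` is a hypothetical stand-in for the two `DDlogAPI` methods involved (with the signatures added in this commit) so that the example compiles without the native library, and treating negative return codes as failures is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for the two DDlogAPI methods used below. */
interface CommandRecorder {
    int dump_input_snapshot(String filename, boolean append);
    int record_commands(String filename, boolean append);
}

final class ReplaySetup {
    /** Builds a replay file that begins with a snapshot transaction, then starts recording. */
    static void startRecordingWithSnapshot(CommandRecorder api, String filename) throws IOException {
        Path path = Paths.get(filename);
        // With no explicit options, Files.write creates the file or truncates an existing one.
        Files.write(path, "start;\n".getBytes(StandardCharsets.UTF_8));
        // Append the snapshot so that the "start;" line is preserved.
        check(api.dump_input_snapshot(filename, true), "dump_input_snapshot");
        Files.write(path, "commit;\n".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
        // Subsequent commands are appended after the snapshot transaction.
        check(api.record_commands(filename, true), "record_commands");
    }

    // Assumption: a negative return code signals failure.
    private static void check(int rc, String what) throws IOException {
        if (rc < 0) {
            throw new IOException(what + " failed with code " + rc);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("replay", ".dat");
        // A do-nothing recorder, used only to exercise the file handling.
        CommandRecorder noop = new CommandRecorder() {
            public int dump_input_snapshot(String filename, boolean append) { return 0; }
            public int record_commands(String filename, boolean append) { return 0; }
        };
        startRecordingWithSnapshot(noop, tmp.toString());
        System.out.print(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        Files.delete(tmp);
    }
}
```

Passing `append = false` to either call would truncate the file and discard the lines written before it, which is why both calls use append mode here.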

**TODO: checkpointing feature**

## Logging

** TODO **
27 changes: 25 additions & 2 deletions java/ddlogapi.c
@@ -115,15 +115,16 @@ JNIEXPORT jlong JNICALL Java_ddlogapi_DDlogAPI_ddlog_1run(
}

JNIEXPORT jint JNICALL Java_ddlogapi_DDlogAPI_ddlog_1record_1commands(
JNIEnv *env, jobject obj, jlong handle, jstring filename) {
JNIEnv *env, jobject obj, jlong handle, jstring filename, jboolean append) {
int ret;
int fd;

const char *c_filename = (*env)->GetStringUTFChars(env, filename, NULL);
if (c_filename == NULL) {
return -1;
}
fd = open(c_filename, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
fd = open(c_filename, O_CREAT | O_WRONLY | (append ? O_APPEND : O_TRUNC),
S_IRUSR | S_IWUSR);
(*env)->ReleaseStringUTFChars(env, filename, c_filename);

if (fd < 0) {
@@ -143,6 +144,28 @@ JNIEXPORT jint JNICALL Java_ddlogapi_DDlogAPI_ddlog_1stop_1recording(
return 0;
}

JNIEXPORT jint JNICALL Java_ddlogapi_DDlogAPI_ddlog_1dump_1input_1snapshot(
    JNIEnv *env, jobject obj, jlong handle, jstring filename, jboolean append) {
    int ret;
    int fd;

    const char *c_filename = (*env)->GetStringUTFChars(env, filename, NULL);
    if (c_filename == NULL) {
        return -1;
    }
    fd = open(c_filename, O_CREAT | O_WRONLY | (append ? O_APPEND : O_TRUNC),
              S_IRUSR | S_IWUSR);
    (*env)->ReleaseStringUTFChars(env, filename, c_filename);

    if (fd < 0) {
        return fd;
    } else {
        ret = ddlog_dump_input_snapshot((ddlog_prog)handle, fd);
        close(fd);
        return ret;
    }
}

JNIEXPORT jint JNICALL Java_ddlogapi_DDlogAPI_ddlog_1stop(
JNIEnv *env, jobject obj, jlong handle, jlong callbackHandle) {

12 changes: 9 additions & 3 deletions java/ddlogapi/DDlogAPI.java
@@ -16,8 +16,9 @@ public class DDlogAPI {
* The C ddlog API
*/
native long ddlog_run(boolean storeData, int workers, String callbackName);
static native int ddlog_record_commands(long hprog, String filename);
static native int ddlog_record_commands(long hprog, String filename, boolean append);
static native int ddlog_stop_recording(long hprog, int fd);
static native int ddlog_dump_input_snapshot(long hprog, String filename, boolean append);
native int dump_table(long hprog, int table, String callbackMethod);
static native int ddlog_stop(long hprog, long callbackHandle);
static native int ddlog_transaction_start(long hprog);
@@ -136,7 +137,8 @@ public int getTableId(String table) {

// Record DDlog commands to file.
// Set `filename` to `null` to stop recording.
public int record_commands(String filename) {
// Set `append` to `true` to open the file in append mode.
public int record_commands(String filename, boolean append) {
if (this.record_fd != -1) {
DDlogAPI.ddlog_stop_recording(this.hprog, this.record_fd);
this.record_fd = -1;
@@ -145,7 +147,7 @@ public int record_commands(String filename) {
return 0;
}

int fd = DDlogAPI.ddlog_record_commands(this.hprog, filename);
int fd = DDlogAPI.ddlog_record_commands(this.hprog, filename, append);
if (fd < 0) {
return fd;
} else {
@@ -154,6 +156,10 @@ public int record_commands(String filename) {
}
}

public int dump_input_snapshot(String filename, boolean append) {
return DDlogAPI.ddlog_dump_input_snapshot(this.hprog, filename, append);
}

public int stop() {
/* Close the file handle. */
if (this.record_fd != -1) {
20 changes: 19 additions & 1 deletion java/test/SpanTest.java
@@ -1,6 +1,8 @@
import java.io.IOException;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.*;
@@ -102,6 +104,8 @@ public static class SpanParser {
String terminator;
/// List of commands to execute
List<DDlogCommand> commands;
/// `true` when command recording is enabled
boolean recording;

private final DDlogAPI api;
private static boolean debug = true;
@@ -115,7 +119,6 @@ public static class SpanParser {
if (localTables) {
//this.api = new DDlogAPI(1, r -> this.onCommit(r));
this.api = new DDlogAPI(1, r -> this.onCommitDirect(r), false);
this.api.record_commands("replay.dat");
this.ruleSpanTableId = this.api.getTableId("RuleSpan");
this.containerSpanTableId = this.api.getTableId("ContainerSpan");
this.ruleSpan = new TreeSet<RuleSpan>(new SpanComparator());
@@ -297,6 +300,21 @@ void parseLine(String line)
this.exitCode = this.api.commit();
this.checkExitCode();
this.checkSemicolon();

// Start recording after the first commit. Dump current
// database snapshot to the replay file first
if (!this.recording) {
try {
Files.write(Paths.get("./replay.dat"), "start;\n".getBytes());
this.api.dump_input_snapshot("replay.dat", true);
Files.write(Paths.get("./replay.dat"), "commit;\n".getBytes(), StandardOpenOption.APPEND);
this.api.record_commands("replay.dat", true);
this.recording = true;
} catch (Exception ex) {
ex.printStackTrace();
throw new RuntimeException(ex);
}
}
break;
case "insert":
case "delete":
4 changes: 1 addition & 3 deletions lib/intern.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,4 @@
extern crate lazy_static;

pub use self::lazy_static::lazy_static;
use lazy_static::lazy_static;
use std::marker;
use std::thread;
use std::vec;
2 changes: 0 additions & 2 deletions lib/intern.toml

This file was deleted.

2 changes: 1 addition & 1 deletion lib/std.rs
@@ -183,7 +183,7 @@ pub fn std_vec_push<X: Ord+Clone>(v: &mut std_Vec<X>, x: &X) {
v.push((*x).clone());
}

pub fn std_vec_insert_imm<X: Ord+Clone>(v: &std_Vec<X>, x: &X) -> std_Vec<X> {
pub fn std_vec_push_imm<X: Ord+Clone>(v: &std_Vec<X>, x: &X) -> std_Vec<X> {
let mut v2 = v.clone();
v2.push((*x).clone());
v2