Ad hoc polymorphism.
This commit implements RFC #670.

We already have support for argument polymorphism, e.g.:

```
function vec_length(v: Vec<'X>): usize
```

Here we introduce ad hoc polymorphism, which additionally allows defining
functions with the same name for different types, e.g.:

```
function size(v: Set<'X>): usize {...}
function size(v: Map<'X>): usize {...}
```

The compiler uses the number of arguments and the type of the _first_ argument to
disambiguate the callee:

```
// (1)
function size(v: Vec<'X>): usize {...}

// OK: different number of arguments
function size(v: Vec<'X>, foo: 'Y): usize {...}

// OK: different type of the first argument.
function size(v: Set<'X>): usize {...}

// ERROR: conflicts with (1)
function size(v: Vec<bool>): usize {...}
```
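
The rule above can be sketched as a lookup keyed on name, argument count, and first-argument type. The following Rust sketch is illustrative only: `Decls`, `OverloadKey`, and `ctor` are hypothetical names, not the DDlog compiler's data structures, and it assumes that two first-argument types clash whenever their outermost type constructor matches (which is how `Vec<bool>` collides with `Vec<'X>` in the example).

```rust
use std::collections::HashMap;

/// Key that tells overloads apart in this sketch: function name,
/// number of arguments, and the outermost type constructor of the
/// first argument (e.g. "Vec" for both Vec<'X> and Vec<bool>).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct OverloadKey {
    name: String,
    arity: usize,
    first_arg_ctor: String,
}

/// Hypothetical declaration table (an assumption for illustration).
#[derive(Default)]
struct Decls {
    by_key: HashMap<OverloadKey, String>,
}

impl Decls {
    /// Registers a declaration, or reports a conflict with an earlier one.
    fn declare(&mut self, name: &str, arg_types: &[&str], label: &str) -> Result<(), String> {
        let key = OverloadKey {
            name: name.to_string(),
            arity: arg_types.len(),
            first_arg_ctor: ctor(arg_types.first().copied().unwrap_or("")),
        };
        if let Some(prev) = self.by_key.get(&key) {
            return Err(format!("{} conflicts with {}", label, prev));
        }
        self.by_key.insert(key, label.to_string());
        Ok(())
    }
}

/// Strips type arguments: "Vec<bool>" becomes "Vec".
fn ctor(ty: &str) -> String {
    ty.split('<').next().unwrap_or(ty).trim().to_string()
}
```

Declaring the four `size` functions from the example in order would accept the first three and reject the last, because it shares name, arity, and first-argument constructor with (1).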

This improves DDlog code in a couple of ways:
- We no longer need to prefix function names with type or module names
- Once we support object-oriented function call notation, it will be
  nicer to write `x.length()` vs `x.vec_length()`
- We will be able to get rid of the horrible hack with string
  conversion functions where the function name is computed by
  adding `2string` to the type name, and just name all of them
  `to_string()`.

Implementation details:

- Within a module, we always report conflicts regardless of whether
conflicting functions are invoked anywhere in the program.  When
conflicting declarations appear in different modules, we postpone the
check until the user makes a function call that cannot be unambiguously
resolved to one of the candidates, i.e., we won't complain even if the
user imports both modules unqualified unless they try calling the
function.

- Rust does not allow functions with the same name unless they appear
inside different modules or impl blocks. Neither option works well
for us. Instead, the compiler mangles the names of polymorphic functions by
adding the type of the first argument and the number of arguments to the
name. This is invisible to the user, except in extern functions, which
must be name-mangled manually. This is ugly and fragile; hence the
preferred solution is to make sure that extern function names are
unique, and to define aliases to them in DDlog. E.g., in std.dl we keep old
function names like string_substr and add new functions, e.g.,
substr(string,...), that invoke them.
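
As a rough illustration of the mangling idea, the Rust sketch below builds a unique identifier from the function name, the first-argument type, and the argument count. The `mangle` helper and its `name_type_arity` output format are assumptions made for this example; the commit message does not spell out the exact format the compiler emits.

```rust
/// Hypothetical mangling scheme: name + first-argument type + arity.
/// The real format produced by the DDlog compiler may differ.
fn mangle(name: &str, first_arg_type: &str, arity: usize) -> String {
    // Replace characters that are not valid in a Rust identifier.
    let ty: String = first_arg_type
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect();
    format!("{}_{}_{}", name, ty, arity)
}
```

Under this scheme the two `size` overloads for `Set` and `Map` would map to distinct Rust names (`size_Set_1` and `size_Map_1`), which is why a hand-written extern function would have to follow the same convention unless its name is already unique.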

This commit also adds support for recursive functions.
ryzhyk committed Jul 15, 2020
1 parent c07fbe3 commit a5cdc35
Showing 54 changed files with 2,322 additions and 691 deletions.
79 changes: 58 additions & 21 deletions doc/tutorial/tutorial.md
@@ -436,9 +436,9 @@ Other string operations are implemented as library
[standard library](#the-standard-library), e.g.,
```
-extern function string_len(s: string): usize
-extern function string_contains(s1: string, s2: string): bool
-extern function string_join(strings: Vec<string>, sep: string): string
+extern function len(s: string): usize
+extern function contains(s1: string, s2: string): bool
+extern function join(strings: Vec<string>, sep: string): string
...
```
@@ -758,11 +758,39 @@ EndpointString(addr_port(ip, proto, preferred_port)) :-
### Functions
DDlog functions are pure (side-effect-free) computations. A function
may not modify its arguments. The body of a function is an expression
whose type must match the function's return type. A function call can
be inserted anywhere an expression of the function's return type can
be used. DDlog currently does not allow recursive functions.
We have already encountered several functions in this tutorial. This section
gives some additional details on writing DDlog functions.
#### Polymorphic functions
DDlog supports two forms of polymorphism: parametric and ad hoc polymorphism.
The following declarations from `std.dl` illustrate both:
```
function size(s: Set<'X>): usize {...}
function size(m: Map<'K, 'V>): usize {...}
```
Parametric polymorphism allows declaring functions generic over their argument
types. The `size` functions above work for sets and maps that store values of
arbitrary types. This is indicated by using type arguments (`'X`, `'K`, `'V`)
instead of concrete argument types.
Ad hoc polymorphism allows multiple functions with the same name but different
arguments. The two `size()` functions above do not introduce any ambiguity,
since the compiler is able to infer the correct function to call in each case
from the type of the argument. Specifically, the compiler uses the number of
arguments and the type of the **first** argument to disambiguate the callee.
#### Modifying function arguments
By default, function arguments cannot be modified inside the function. Writable
arguments can be declared using the `mut` qualifier:
```
// This function modifies its first argument.
function insert(m: mut Map<'K,'V>, k: 'K, v: 'V): () { ... }
```
> #### Legacy function syntax
>
@@ -781,14 +809,13 @@ be used. DDlog currently does not allow recursive functions.
> ```
### Extern functions
#### Extern functions
Functions that cannot be easily expressed in DDlog can be implemented as
*extern* functions. Currently these must be written in Rust; the Rust
implementation may in turn invoke implementations in C or any other language.
For instance, DDlog does not provide a substring function. We can
declare such a function as `extern`:
Example:
```
extern function string_slice(x: string, from: bit<64>, to: bit<64>): string
@@ -815,6 +842,24 @@ pub fn string_slice(x: &String, from: &u64, to: &u64) -> String {
DDlog will automatically pickup this file and inline its contents in the
generated `lib.rs`.
#### Functions with side effects
Functions implemented completely in DDlog without calls to any extern functions
are pure (side-effect-free) computations. It is however possible to declare
extern functions with side effects. The DDlog compiler needs to know about these
side effects, as they may interfere with its optimizations. The programmer is
responsible for labeling such functions with the `#[has_side_effects]` attribute,
e.g., the following function is defined in the `log.dl` library:
```
#[has_side_effects]
extern function log(module: module_t, level: log_level_t, msg: string): ()
```
The compiler automatically infers these annotations for non-extern functions
that invoke extern functions with side effects, so only extern functions must
be annotated.
### Advanced rules
#### Negations and antijoins
@@ -900,19 +945,11 @@ are *generic* types that can be parameterized by any other
DDlog types, e.g., `Vec<string>` is a vector of strings, `Map<string,bool>` is
a map from strings to Booleans.
Let us assume that we have an extern function that splits a string
We will use a DDlog standard library function that splits a string
into a list of substrings according to a separator:
```
extern function split(s: string, sep: string): Vec<string>
```
The Rust implementation can be as follows:
```
pub fn split_ip_list(s: &String, sep: &String) -> Vec<String> {
s.as_str().split(sep).map(|x| x.to_string()).collect()
}
function split(s: string, sep: string): Vec<string>
```
We define a DDlog function which splits IP addresses at spaces:
8 changes: 4 additions & 4 deletions lib/graph.rs
@@ -20,7 +20,7 @@ pub fn graph_SCC<S, V, E, N, EF, LF>(
where
S: Scope,
S::Timestamp: Lattice + Ord,
-V: Val + 'static,
+V: differential_dataflow::Data,
N: differential_dataflow::ExchangeData + std::hash::Hash,
E: differential_dataflow::ExchangeData,
EF: Fn(V) -> E + 'static,
@@ -53,7 +53,7 @@ pub fn graph_ConnectedComponents<S, V, E, N, EF, LF>(
where
S: Scope,
S::Timestamp: Lattice + Ord,
-V: Val + 'static,
+V: differential_dataflow::Data,
N: differential_dataflow::ExchangeData + std::hash::Hash,
E: differential_dataflow::ExchangeData,
EF: Fn(V) -> E + 'static,
@@ -81,7 +81,7 @@ where
S: Scope,
S::Timestamp: Lattice + Ord,
u64: From<N>,
-V: Val + 'static,
+V: differential_dataflow::Data,
N: differential_dataflow::ExchangeData + std::hash::Hash,
E: differential_dataflow::ExchangeData,
EF: Fn(V) -> E + 'static,
@@ -114,7 +114,7 @@ pub fn graph_UnsafeBidirectionalEdges<S, V, E, N, EF, LF>(
where
S: Scope,
S::Timestamp: TotalOrder + Lattice + Ord,
-V: Val + 'static,
+V: differential_dataflow::Data,
N: differential_dataflow::ExchangeData + std::hash::Hash,
E: differential_dataflow::ExchangeData,
EF: Fn(V) -> E + 'static,
41 changes: 41 additions & 0 deletions lib/internment.dl
@@ -41,3 +41,44 @@ extern function istring_trim(s: istring): string
extern function istring_to_lowercase(s: istring): string
extern function istring_to_uppercase(s: istring): string
extern function istring_reverse(s: istring): string

function contains(s1: istring, s2: string): bool {
istring_contains(s1, s2)
}

function join(strings: Vec<istring>, sep: string): string {
istring_join(strings, sep)
}
function len(s: istring): usize {
istring_len(s)
}
function replace(s: istring, from: string, to: string): string {
istring_replace(s, from, to)
}
function split(s: istring, sep: string): Vec<string> {
istring_split(s, sep)
}
function starts_with(s: istring, prefix: string): bool {
istring_starts_with(s, prefix)
}
function ends_with(s: istring, suffix: string): bool {
istring_ends_with(s, suffix)
}
function substr(s: istring, start: usize, end: usize): string {
istring_substr(s, start, end)
}
function to_bytes(s: istring): Vec<u8> {
istring_to_bytes(s)
}
function trim(s: istring): string {
istring_trim(s)
}
function to_lowercase(s: istring): string {
istring_to_lowercase(s)
}
function to_uppercase(s: istring): string {
istring_to_uppercase(s)
}
function reverse(s: istring): string {
istring_reverse(s)
}
2 changes: 1 addition & 1 deletion lib/json.dl
@@ -151,7 +151,7 @@ function jval_get(v: JsonValue, attr: istring): Option<JsonValue> =
function jval_get_or(v: JsonValue, attr: istring, def: JsonValue): JsonValue =
{
match (v) {
-JsonObject{o} -> option_unwrap_or(map_get(o, attr), def),
+JsonObject{o} -> unwrap_or(map_get(o, attr), def),
_ -> def
}
}
6 changes: 6 additions & 0 deletions lib/regex.rs
@@ -21,6 +21,12 @@ impl Deref for regex_Regex {
}
}

impl Default for regex_Regex {
fn default() -> Self {
Self::new("").unwrap()
}
}

impl PartialOrd for regex_Regex {
fn partial_cmp(&self, other: &regex_Regex) -> Option<std::cmp::Ordering> {
self.as_str().partial_cmp(other.as_str())