
Allow TypeArg for a row of types #787

Closed
Tracked by #416
acl-cqc opened this issue Jan 8, 2024 · 11 comments · Fixed by #804
acl-cqc (Contributor) commented Jan 8, 2024

TypeParam::List / TypeArg::List allows one to declare a list of types, and provide them as type args (also other kinds of list, not relevant here). But there's no way to use these outside a custom binary function. We should allow such a parameter to be used inside a tuple, sum, or function type (inputs or outputs), or anywhere a row is expected, so that one can say, e.g., unpack_tuple has a TypeParam::List(TypeParam::Type), and then inputs Tuple(TypeArg::new_variable(0, TypeParam::List(TypeParam::Type))) and outputs that type variable.
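To make the request concrete, here is a toy sketch (deliberately *not* the real hugr API; `Ty`, `RowVar`, and `subst_row` are illustrative names) of the idea: a row variable stands for several types inside a tuple, and is later spliced out by substitution with a concrete list of types.

```rust
// Toy model of a row variable inside a tuple type.
// This is an illustration of the feature requested above, not hugr code.

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Int,
    Bool,
    /// A variable standing for a whole row of types (de Bruijn index).
    RowVar(usize),
    /// A tuple whose element list may contain row variables.
    Tuple(Vec<Ty>),
}

/// Substitute row variable `idx` with the given row, splicing it into tuples.
/// A single type maps to a row of types, since a row variable expands to many.
fn subst_row(ty: &Ty, idx: usize, row: &[Ty]) -> Vec<Ty> {
    match ty {
        Ty::RowVar(i) if *i == idx => row.to_vec(),
        Ty::Tuple(elems) => {
            let spliced = elems.iter().flat_map(|e| subst_row(e, idx, row)).collect();
            vec![Ty::Tuple(spliced)]
        }
        other => vec![other.clone()],
    }
}

fn main() {
    // unpack_tuple's input would be Tuple(RowVar(0));
    // instantiating the row as [Int, Bool] yields Tuple(Int, Bool).
    let input = Ty::Tuple(vec![Ty::RowVar(0)]);
    let concrete = subst_row(&input, 0, &[Ty::Int, Ty::Bool]);
    assert_eq!(concrete, vec![Ty::Tuple(vec![Ty::Int, Ty::Bool])]);
}
```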

acl-cqc (Contributor, Author) commented Apr 15, 2024

Discussion from #804:
[ @acl-cqc ]:
The value of the feature is to allow us to make operations like UnpackTuple, MakeTuple, etc., as library operations in the prelude, rather than the current LeafOp enum of ops which (without this PR) cannot be expressed in the extension system...

[ @doug-q ]:
I disagree. There is only aesthetic value in making these ops library operations.

I instead understand the value to be to allow operations with variadic inputs/outputs to be defined. The cost of not allowing these to be defined is that one is forced, by their absence, to instead work with tuples. The requisite UnpackTuple operations introduce barriers to parallelism.

acl-cqc (Contributor, Author) commented Apr 15, 2024

There is only aesthetic value in making these ops library operations.

I instead understand the value to be to allow operations with variadic inputs/outputs to be defined. The cost of not allowing these to be defined is that one is forced, by their absence, to instead work with tuples. The requisite UnpackTuple operations introduce barriers to parallelism.

So the concern here is, if MakeTuple etc. need to be "special" (defined in core Hugr) because they can't be defined in an extension - who says we've got all the other operations that also need to be special?

In particular, the core does not include such Tierkreis staples as partial and eval. One could imagine defining eval as taking a function with only a single argument type (perhaps but not necessarily a tuple), and a single value of that type. However, isn't one then going to need a "special" operation to turn a function of many arguments into a function of just one?

partial seems even harder in that current Tierkreis allows providing (or not) arguments in any order, so we might need operations to pull an arbitrary argument to the front of a row. (So partial is (Function[A, B -> C], A -> Function[B -> C]) where all arguments other than the first have been tupled into the B.)
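A monomorphic Rust sketch of that partial signature (names and concrete types are illustrative only; the actual op would be polymorphic, with all remaining arguments tupled into B as described above):

```rust
// Sketch of partial: (Function[A, B -> C], A) -> Function[B -> C],
// where B stands for the tuple of all remaining arguments.
// Illustrative only; not a hugr operation definition.

fn partial<A: Clone, B, C>(f: impl Fn(A, B) -> C, a: A) -> impl Fn(B) -> C {
    move |b| f(a.clone(), b)
}

fn main() {
    // The "remaining args" are tupled into B.
    let add3 = |x: i32, rest: (i32, i32)| x + rest.0 + rest.1;
    let add_from_1 = partial(add3, 1);
    assert_eq!(add_from_1((2, 3)), 6);
}
```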

However, we can do arbitrary computation (using List&lt;Type&gt; params etc.) in custom binary compute_signature functions. This may let us do much or all of this, but on the condition that all "row variables" have their arity fixed at "compile time" (graph-definition time). I'm pretty sure that this will rule out some higher-order graphs that we would otherwise be able to write (including in current Tierkreis), but maybe we need to exhibit some examples to see what we would be missing here.

doug-q (Collaborator) commented Apr 15, 2024

So the concern here is, if MakeTuple etc. need to be "special" (defined in core Hugr) because they can't be defined in an extension - who says we've got all the other operations that also need to be special?

This is a good point. However, we are free to add row variables at any time without breaking anything. We can change from builtin ops to prelude ops without breaking serialisation, although this would likely break Rust users. If we do add row variables, there is no strict obligation to remove the builtin ops. There is no urgency if we do not have a use case.

partial and eval: I have been envisioning these not as ops, but as "compilation patterns". Each source-level partial compiles to a "closure" and a FuncDefn that takes the "closure" and the "remaining args" and calls the function using args from the "closure" or the "remaining args" as appropriate.

However, isn't one then going to need a "special" operation to turn a function of many arguments into a function of just one?

One can write a FuncDefn per eval, that has the appropriate (one-input, one-output) shape and has UnpackTuple -> Call -> MakeTuple in the body.
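The shape of that per-eval wrapper can be sketched in plain Rust (concrete types chosen only for illustration; in the Hugr itself these would be UnpackTuple, Call, and MakeTuple nodes):

```rust
// Sketch of the per-eval wrapper FuncDefn: one input, one output,
// with UnpackTuple -> Call -> MakeTuple in the body.
// multi_arg stands in for the function being eval'd.

fn multi_arg(x: i32, y: i32) -> (i32, i32) {
    (x + y, x * y)
}

// The (one-input, one-output) wrapper.
fn wrapper(args: (i32, i32)) -> ((i32, i32),) {
    let (x, y) = args;            // UnpackTuple
    let (s, p) = multi_arg(x, y); // Call
    ((s, p),)                     // MakeTuple
}

fn main() {
    assert_eq!(wrapper((2, 3)), ((5, 6),));
}
```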

acl-cqc (Contributor, Author) commented Apr 15, 2024

I have been envisioning these not as ops, but as "compilation patterns". Each source level partial

This is an interesting possibility! At the moment the "TierkreisGraph" protobuf closely matches the Rust structs internal to the interpreter (modulo things like types, perhaps), and one option is that this becomes the Hugr IR. I think that whatever input (or "source") format we use, is probably what error-mitigation protocols and other composable algorithm components are written in (i.e. is an interchange format), and that this format includes "partial" and "eval" at a similarly high level of abstraction as they are now. (Rather than much lower-level representation by mangling to FuncDefns etc.) However perhaps instead we translate from a high-level input format (that is not Hugr) to a lower-level Hugr format that is only internal to the runtime. (Indeed things like Tierkreis type inference might happen in this translation.) @ss2165

One possible downside is that this might lead to us wanting to do rewrites (as optimization) at both levels, which would seem like a shame compared to doing them only at the Hugr level.

acl-cqc (Contributor, Author) commented Apr 15, 2024

So perhaps, given I think we can do #816 without this, we should just put this issue "on hold" until we've figured out the basics of Tierkreis. (I guess #797 !)

doug-q (Collaborator) commented Apr 15, 2024

I have been envisioning these not as ops, but as "compilation patterns". Each source level partial

This is an interesting possibility! At the moment the "TierkreisGraph" protobuf closely matches the Rust structs internal to the interpreter (modulo things like types, perhaps), and one option is that this becomes the Hugr IR. I think that whatever input (or "source") format we use, is probably what error-mitigation protocols and other composable algorithm components are written in (i.e. is an interchange format), and that this format includes "partial" and "eval" at a similarly high level of abstraction as they are now. (Rather than much lower-level representation by mangling to FuncDefns etc.) However perhaps instead we translate from a high-level input format (that is not Hugr) to a lower-level Hugr format that is only internal to the runtime. (Indeed things like Tierkreis type inference might happen in this translation.) @ss2165

One possible downside is that this might lead to us wanting to do rewrites (as optimization) at both levels, which would seem like a shame compared to doing them only at the Hugr level.

My view is that if partial and eval need to be reasoned about as-is then they need to be ops, and we need these row vars. Surely another non-Hugr IR is unacceptable.

I don't see now that they do need to be reasoned about as-is, but I wouldn't necessarily expect to :).

So perhaps, given I think we can do #816 without this, we should just put this issue "on hold" until we've figured out the basics of Tierkreis. (I guess #797 !)

Is #816 right here? My understanding is that we surely do need row vars for that (and that it is in any case nothing but a tidying-up). Regardless, I do agree we should delay a decision here until we understand Tierkreis.

EDIT: I think you mean writing compute_signature for those ops?

acl-cqc (Contributor, Author) commented Apr 15, 2024

Yes, writing compute_signature (in place of the current, e.g.

fn signature(&self) -> FunctionType {
    FunctionType::new(vec![self.ty.clone()], vec![self.ty.clone()])
}

) is what it'd take to move those out of the core and into the prelude, I think, and I think that is what #816 is.

ss2165 (Member) commented Apr 15, 2024

However perhaps instead we translate from a high-level input format (that is not Hugr) to a lower-level Hugr format that is only internal to the runtime. (Indeed things like Tierkreis type inference might happen in this translation.)

I think this issue highlights a current tension in the design: HUGR is very machine friendly and hence can use fairly arcane expressions over the small-ish spec to express complicated things (e.g. the "compilation patterns" @doug-q refers to). However, Tierkreis is currently human writable and readable. If we instead say that guppy is going to take on the writing role, what about reading? For example, when viewing a tierkreis computation in progress in a nexus visualisation, does the compilation pattern for eval suffice...? (I think it would look very ugly)

My hope is that adding these higher order operations with compute_signature in an extension is enough to allow the "simple" expressivity in tierkreis read/write contexts, which can be "compiled away" using such patterns if we are compiling down away from tierkreis. Therefore I want to reinforce

but maybe we need to exhibit some examples to see what we would be missing here.

to see what this wouldn't cover.

acl-cqc (Contributor, Author) commented Apr 16, 2024

So let's think about having say an eval op, parameterized by two lists of types: eval[*A,*B](Function[*A -> *B], *A) -> (*B). And then let's think about writing a Hugr for the "map" function (mapping a function over a list): map[A,B](Function[A -> B], List[A]) -> (List[B]).

Without #787, eval has to compute its type by binary compute_signature. The restriction that TypeArgs given to binary compute_signature cannot include type variables bites us here, as the graph for map[A,B] cannot have an op eval[List(A), List(B)] 👎 (i.e. passing two singleton lists of type).

However, what we can do is - anywhere we'd have a row of types - we statically pass just the length to compute_signature, which computes away and returns a type scheme polymorphic (via infallible substitution, rather than binary computation) over that many types. So eval has two usize params, and eval[2,3] returns forall A,B,C,D,E. (Function(A,B -> C,D,E), A, B) -> (C, D, E). Now map[A,B] can have an op that's eval[1,1,A,B] (the first two TypeArgs are given to the binary compute_signature as per defn of eval, the remaining TypeArgs are fed into substitution).

I think this "trick" gets us a reasonable distance, and applies to #816 also. Obviously all arities must be fixed statically, and this'll be the main restriction on what we can express. (Never mind that this is not the easiest interface!)
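The "arity first" trick above can be sketched as ordinary computation. This toy function (eval_scheme is an illustrative name, rendering the scheme as a string rather than building real hugr types) takes eval's two usize params and produces the polymorphic type scheme that the binary compute_signature would return:

```rust
// Given eval's two usize params (input and output arity), compute the
// type scheme polymorphic over that many type variables.
// Renders to a string purely for illustration.

fn eval_scheme(n_in: usize, n_out: usize) -> String {
    let var = |i: usize| ((b'A' + i as u8) as char).to_string();
    let ins: Vec<String> = (0..n_in).map(&var).collect();
    let outs: Vec<String> = (n_in..n_in + n_out).map(&var).collect();
    let all: Vec<String> = (0..n_in + n_out).map(&var).collect();
    format!(
        "forall {}. (Function({} -> {}), {}) -> ({})",
        all.join(","),
        ins.join(","),
        outs.join(","),
        ins.join(", "),
        outs.join(", ")
    )
}

fn main() {
    // eval[2,3] from the comment above:
    assert_eq!(
        eval_scheme(2, 3),
        "forall A,B,C,D,E. (Function(A,B -> C,D,E), A, B) -> (C, D, E)"
    );
}
```

The remaining TypeArgs of an instantiation like eval[1,1,A,B] would then be fed into ordinary (infallible) substitution against this scheme.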

doug-q (Collaborator) commented Apr 16, 2024

I think this issue highlights a current tension in the design: HUGR is very machine friendly and hence can use fairly arcane expressions over the small-ish spec to express complicated things (e.g. the "compilation patterns" @doug-q refers to). However, Tierkreis is currently human writable and readable. If we instead say that guppy is going to take on the writing role, what about reading? For example, when viewing a tierkreis computation in progress in a nexus visualisation, does the compilation pattern for eval suffice...? (I think it would look very ugly)

This is a good point. I don't think readability should be a design goal of HUGR. I claim readability can and should be accomplished by labelling nodes with metadata, and tooling that uses that metadata to reconstruct a view that is useful to a user. Surely this is sufficient to cover eval and partial, but perhaps there are more difficult examples. This brings to the fore the problem of preserving metadata through transformations, which is a hard problem.

doug-q (Collaborator) commented Apr 16, 2024

So let's think about having say an eval op, parameterized by two lists of types: eval[*A,*B](Function[*A -> *B], *A) -> (*B). And then let's think about writing a Hugr for the "map" function (mapping a function over a list): map[A,B](Function[A -> B], List[A]) -> (List[B]).

Without #787, eval has to compute its type by binary compute_signature. The restriction that TypeArgs given to binary compute_signature cannot include type variables bites us here, as the graph for map[A,B] cannot have an op eval[List(A), List(B)] 👎 (i.e. passing two singleton lists of type).

However, what we can do is - anywhere we'd have a row of types - we statically pass just the length to compute_signature, which computes away and returns a type scheme polymorphic (via infallible substitution, rather than binary computation) over that many types. So eval has two usize params, and eval[2,3] returns forall A,B,C,D,E. (Function(A,B -> C,D,E), A, B) -> (C, D, E). Now map[A,B] can have an op that's eval[1,1,A,B] (the first two TypeArgs are given to the binary compute_signature as per defn of eval, the remaining TypeArgs are fed into substitution).

I think this "trick" gets us a reasonable distance, and applies to #816 also. Obviously all arities must be fixed statically, and this'll be the main restriction on what we can express. (Never mind that this is not the easiest interface!)

I like this. Don't we also need bounds on each of these type variables, though?

The point of friction with row variables is that an op must have a fixed number of {in,out}ports; this idea seems to capture that constraint sharply.

One issue with row variables which I have not previously noted is that (I claim) we do not want to find ourselves in the position of needing the type system to inspect type rows (e.g. length, uncons). A static length helps here.

github-merge-queue bot pushed a commit that referenced this issue May 22, 2024
* Add a RowVariable variant of Type(Enum) that stands for potentially
multiple types, created via Type::new_row_var_use.
* This can appear in a TypeRow, i.e. the TypeRow is then of variable
length; and can be instantiated with a list of types (including row
vars) or a single row variable
* Validation enforces that RowVariables are not used directly as
wire/port types, but can appear *inside* other types
* OpDef's may be polymorphic over RowVariables (allowing "varargs"-like
operators equivalent to e.g. MakeTuple); these must be replaced by
non-rowvar types when instantiating the OpDef to an OpType
* FuncDefn's/FuncDecl's may also be polymorphic over RowVariables as
long as these are not directly argument/result types
* Also add TypeParam::new_sequence

closes #787

BREAKING CHANGE: Type::validate takes extra bool (allow_rowvars);
renamed {FunctionType, PolyFuncType}::(validate=>validate_var_len).

---------

Co-authored-by: doug-q <[email protected]>