Docs arrears #152

Merged
merged 26 commits into from
Mar 19, 2024

Conversation

countvajhula
Collaborator

Summary of Changes

Address some review comments by @benknoble prior to merging the compiler work in #146, along with a few other improvements and updates.

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

@countvajhula
Collaborator Author

countvajhula commented Jan 21, 2024

This docs PR is ready for review! Tagging Michael as well since the latest bit talks about the 2-level architecture and the use of Syntax Spec (actually, looking at it now, there's a lot in here that would benefit from your review, @michaelballantyne).

@michaelballantyne
Collaborator

The bit re architecture and syntax-spec looks good to me.

Collaborator

@benknoble benknoble left a comment

I didn't go back to see if this addresses all my old comments, but this looks like a great set of changes. Some minor nits in the comments below.

qi-doc/scribblings/field-guide.scrbl (outdated)
@@ -396,6 +396,8 @@ Yet, either implementation produces the same output: @racket[(list 1 9 25)].

So, to reiterate, while the output of Qi flows will be the same as the output of equivalent Racket expressions, they may nevertheless exhibit a different order of effects.

If you'd like to ensure a specific order of effects, use @racket[effect] at the appropriate points in your flow. If you'd like to use Racket's order of effects, define your flow using @racket[esc] (although this would lose any Qi compiler optimizations).
Collaborator

What does it mean to "define" a flow "using esc"? I think I understand you to mean to do something like (define-flow foo (esc …)), but I'm not sure I understand the connection.

Collaborator Author

"Define" here is intended to mean something like "specify", without the connotation of binding. That is, in (~> a b (esc (lambda ...))), we are considering the third flow in the thread to be defined using Racket instead of defined using Qi. Would "specify" be more clear here, or maybe just "write"?

Collaborator

Either of the latter 2 words would be clearer to me; "define" has such a specific meaning in Racket documentation to my ears.

qi-doc/scribblings/forms.scrbl
qi-doc/scribblings/principles.scrbl (outdated)
qi-doc/scribblings/qi.scrbl (outdated)

In functional programming, "effects" refer to anything that the function does that is not captured in its inputs and outputs. This could include things like printing to the screen, writing to a file, or mutating a global variable.

A @tech{flow} should either be pure (that is, free of such side effects), or its entire purpose should be to fulfill a side effect. It is considered inadvisable to have a function with sane inputs and outputs (resembling a pure function) that also performs a side effect. It would be better to decouple the effect from the rest of your function (@seclink["Use_Small_Building_Blocks"]{splitting it into smaller functions}, as necessary) and perform the effect explicitly via the @racket[effect] form, or otherwise escape from Qi using something like @racket[esc] (note that @seclink["Identifiers"]{function identifiers} used in a flow context are implicitly @racket[esc]aped) in order to perform the effect. This will ensure that there are no surprises with regard to @seclink["Order_of_Effects"]{order of effects}.
Collaborator

I don't like "considered inadvisable" here. I think it would be better to make the reasons explicit up front. Something like, "Qi may reorder operations not marked as effects, so it's better to separate effects from other computations to ensure they work as expected".

I don't think I understand how esc and effect relate to suppressing effect re-ordering. Are you promising that neither will ever be reordered, or are you promising that effect will never be reordered but also suggesting that pairs of effects inside the same esc in Racket code will of course not be re-ordered relative to each other, but not making any promises re: pairs of effects in separate esc forms?

Collaborator Author

Yes, something like the latter. I think we are promising:

  • If you use esc, anything inside that will be untouched (but it may still be reordered wholesale at a higher level)
  • If you use effect, the effect will happen along with the annotated flow, whenever that happens (but the whole side-effecting flow (effect f g) may itself be moved around)

To take an example, with this normalization rule:

[(thread _0 ... (pass f) (amp g) _1 ...)
 #'(thread _0 ... (amp (if f g ground)) _1 ...)]

It would rewrite:

(~> (pass (ε displayln positive?)) (amp sqr)) --> (amp (if (ε displayln positive?) sqr ⏚))

Here, the effect stays with the flow it annotated, but it happens at a different time.

So what we are promising may be some kind of "effect locality" rather than any particular order of effects. Tbh I'm not sure if this is a good way to think about it and what exactly our guarantees imply. I will think about it some more.

On a side note, the former of these bullets (re: esc) makes me feel that matching literal uses of racket/list's map, filter, etc. is probably not the right long term thing, and that we'd probably want a qi/list language that actually includes map and filter forms. This would ensure that we do not cross the esc boundary to do optimizations.

Collaborator

So the only way to really guarantee order of effects is to wrap each flow in effect? Maybe? [I don't think this will matter most of the time to most people, and it suggests that highly effectful programs benefit from a more "direct" imperative style.]

Collaborator Author

I'm not sure that wrapping every flow with effect would guarantee an order. At most, doing so could mean that no compiler pattern would match. But even then, if there are just a lot of empty effect forms, we would likely normalize those away anyway; for example, (effect ground flo) should, I think, be rewritten to flo.

Here's a first attempt at formalizing our guarantees about effects:

For two flows f and g, we could define the relation "g is downstream of f" to mean that the outputs of f are used, either directly or transitively, as inputs to g.

Then, our guarantees about effects are:

  1. If g is downstream of f, any effects on f will occur before any effects on g.
  2. Effects on any flow φ will happen when φ is called.

That is, we guarantee that an effect will never be separated from the flow it annotates (what we could call "locality"), though when this is invoked is not guaranteed, aside from point (1) above.

Do we feel this is a useful characterization?

Collaborator

This might seem silly, but do we optimize either of the flows within (effect f g)?

Collaborator Author

Yes, we would optimize both.


Consider the Racket expression: @racket[(map sqr (filter odd? (list 1 2 3 4 5)))]. As this invokes @racket[odd?] on all of the elements of the input list, followed by @racket[sqr] on all of the elements of the intermediate list, if we imagine that @racket[odd?] and @racket[sqr] print their inputs as a side effect before producing their results, then executing this program would print the numbers in the sequence @racket[1,2,3,4,5,1,3,5].

- The equivalent Qi flow is @racket[(~> ((list 1 2 3 4 5)) (filter odd?) (map sqr))]. As this sequence is @seclink["Don_t_Stop_Me_Now"]{"deforested" by Qi's compiler} to avoid multiple passes over the data and the memory overhead of intermediate representations, it invokes the functions in sequence @emph{on each element} rather than @emph{on all of the elements of each list in turn}. The printed sequence with Qi would be @racket[1,1,2,3,3,4,5,5].
+ The equivalent Qi flow is @racket[(~> ((list 1 2 3 4 5)) (filter odd?) (map sqr))]. As this sequence is @seclink["Don_t_Stop_Me_Now"]{deforested by Qi's compiler} to avoid multiple passes over the data and the memory overhead of intermediate representations, it invokes the functions in sequence @emph{on each element} rather than @emph{on all of the elements of each list in turn}. The printed sequence with Qi would be @racket[1,1,2,3,3,4,5,5].

Yet, either implementation produces the same output: @racket[(list 1 9 25)].

So, to reiterate, while the output of Qi flows will be the same as the output of equivalent Racket expressions, they may nevertheless exhibit a different order of effects.
Collaborator

This statement is wrong. The output of a function can depend on effects, so there's no guarantee the output of a flow using effectful functions will be the same as if you hadn't reordered things with deforestation.

e.g. mapping with a function add-count that uses a global variable:

(define add-count
  (let ([v 0])
    (lambda (arg)
      (set! v (+ v 1))
      (+ arg v))))

(~> ((list 1 2 3)) (map add-count) (map add-count))
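
To make the difference concrete, here is a minimal plain-Racket sketch that does not use Qi at all. It assumes that a deforested flow behaves like a single fused pass over the list; `make-add-count`, `two-passes`, and `one-fused-pass` are hypothetical names introduced only for this illustration.

```racket
#lang racket

;; Hypothetical helper: returns a fresh counting function, so that each
;; strategy below starts from v = 0.
(define (make-add-count)
  (let ([v 0])
    (lambda (arg)
      (set! v (+ v 1))
      (+ arg v))))

;; Two full passes over the list, as nested calls to Racket's `map` do.
(define (two-passes lst)
  (define add-count (make-add-count))
  (map add-count (map add-count lst)))

;; One fused pass: both calls happen for an element before moving on.
;; This only models deforestation; it is not Qi's generated code.
(define (one-fused-pass lst)
  (define add-count (make-add-count))
  (map (lambda (x) (add-count (add-count x))) lst))

(two-passes (list 1 2 3))     ; => '(6 9 12)
(one-fused-pass (list 1 2 3)) ; => '(4 9 14)
```

Under these assumptions the two strategies return different lists, because the counter is read at different moments in each.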

Collaborator Author

I've fixed this, thanks! I'm still thinking about how best to phrase the other section re: your other comment and what the precise guarantees about effects are that we provide.

Decouple the discussion about separating effects from other
computations from the discussion about order of effects.
@countvajhula
Collaborator Author

I retitled one of the sections "separate effects from other computations" to keep it focused on why you should do this and what benefits it brings, unrelated to order of effects, which is emerging as a different, orthogonal concern. Depending on whether we feel the ideas on "effect locality" are reasonable, I could add some explanation about that in the section on "order of effects" (e.g. "in a threading form, upstream effects are guaranteed to happen before downstream effects").

Collaborator

@benknoble benknoble left a comment

This gets better each time I look at it! 🎉 I don't have much specific to say about the current state. If you feel any of my previous comments are addressed, please feel free to "Resolve" them.

@benknoble
Collaborator

I can't recall: are we still hashing out the effects stuff before merging this? If so, would it be worth it to cherry-pick the non-effects related bits and get those merged sooner?

@countvajhula
Collaborator Author

@benknoble Sorry for the delay on this, and thanks for staying on top of it! It is indeed still waiting on clarification of the effects stuff but we did resolve that a few weeks ago when I was on vacation. I've been down with flu since my return so I haven't gotten around to writing it out, but I'm aiming to get to it this week and then this should be ready to go. If I can't get to it this week then I'll aim to separate it out and merge the rest as you suggested.

@benknoble
Collaborator

Understood! I must have missed that you were sick; hope your trip went well and that you recover in peace.

qi-doc/scribblings/principles.scrbl (outdated)
Comment on lines 110 to 112
In the above example, @racket[filter] and @racket[map] are obviously ordered by @racket[~>] in this way, so that @racket[(filter my-odd?)] is upstream of @racket[(map my-sqr)]. But it's not so obvious how @racket[my-odd?] and @racket[my-sqr] should be treated. These are employed "internally" by the higher-order flows @racket[filter] and @racket[map], and are not directly ordered by the @racket[~>] form. Should @racket[my-odd?] be considered to be upstream of @racket[my-sqr] here?

This is where the distinction between flows and flow invocations comes into play. In fact, not all invocations of @racket[my-odd?] are upstream of any particular invocation of @racket[my-sqr]. Rather, specific invocations of @racket[my-sqr] that use values computed by individual invocations of @racket[my-odd?] are downstream of those invocations, and notably, these invocations involve the individual elements of the input list rather than the entire list, so that the computational dependency expressed by this relation is as fine-grained as possible.
Collaborator

Very enlightening, thank you!


@definition["Effect locality"]{For @tech{flow} invocations @${f} and @${g} and corresponding effects @${ε(f)} and @${ε(g)},

@$${f < g ⇒ ε(f) < ε(g)}
Collaborator

If this is a LaTeX-style context (e.g., MathJax), should we use \lt and \implies here?


@$${f < g ⇒ ε(f) < ε(g)}

where @${<} on the left denotes the relation of being upstream, and @${<} on the right denotes one effect happening before another.
Collaborator

Punning these symbols is convenient, but it can be bothersome (it requires the reader to work out types to know which symbol is referencing which operation). How about a subscript or a different operator for each relation?

Collaborator Author

I can imagine it could be confusing (but I think I'd need to do actual proofs or something to see if one is easier than the other!), but for the moment I feel punning on the familiar order symbol is easier on the reader since we don't actually use the RHS < anywhere else besides here, and subscripts could make it more technical than we currently have a use for.

Comment on lines 154 to 165

In the earlier example, with an input list @racket[(list 1 2 3)], Racket's order of effects follows the invocation order:

@racketblock[
(my-odd? 1) (my-odd? 2) (my-odd? 3) (my-sqr 1) (my-sqr 3)
]

Qi's order of effects is:

@racketblock[
(my-odd? 1) (my-sqr 1) (my-odd? 2) (my-odd? 3) (my-sqr 3)
]
Collaborator

It may be helpful to repeat the programs here for context.
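
For context, the two programs behind these invocation orders can be modeled in plain Racket, without Qi. This is a sketch under stated assumptions: the deforested flow is approximated by a single `for/list` pass, and `call-log`, `record!`, `run`, `eager`, `fused`, and `local?` are hypothetical helpers, not part of Qi or of the docs under review.

```racket
#lang racket

;; Hypothetical tracing helpers: every invocation is logged as (name arg).
(define call-log '())
(define (record! name x) (set! call-log (cons (list name x) call-log)))
(define (my-odd? x) (record! 'my-odd? x) (odd? x))
(define (my-sqr x) (record! 'my-sqr x) (* x x))

;; Run a thunk with a fresh log; return its result and the log in order.
(define (run thunk)
  (set! call-log '())
  (define result (thunk))
  (values result (reverse call-log)))

;; Racket: filter the whole list, then map over the intermediate list.
(define (eager lst) (map my-sqr (filter my-odd? lst)))

;; Model of the deforested flow: test and square one element at a time.
(define (fused lst)
  (for/list ([x (in-list lst)] #:when (my-odd? x))
    (my-sqr x)))

;; Check that every (my-sqr x) is preceded by the matching (my-odd? x).
(define (local? log)
  (for/and ([entry (in-list log)] [i (in-naturals)])
    (or (not (eq? (first entry) 'my-sqr))
        (let ([j (index-of log (list 'my-odd? (second entry)))])
          (and j (< j i))))))

(define-values (r1 log1) (run (lambda () (eager (list 1 2 3)))))
(define-values (r2 log2) (run (lambda () (fused (list 1 2 3)))))

log1 ; => '((my-odd? 1) (my-odd? 2) (my-odd? 3) (my-sqr 1) (my-sqr 3))
log2 ; => '((my-odd? 1) (my-sqr 1) (my-odd? 2) (my-odd? 3) (my-sqr 3))
```

Both runs return `'(1 9)` and both logs pass `local?`, even though the logs themselves differ. That is the sense in which either order is consistent with the upstream-before-downstream relation.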

Collaborator Author

Yes, it helps a lot!

@countvajhula
Collaborator Author

OK, I think this is ready! Final review?

Comment on lines 430 to 451
@subsubsection{Schrodinger's Probe}

Another curious thing to watch out for is that use of the @seclink["Using_a_Probe"]{probe debugger} can affect the @seclink["Order_of_Effects"]{order of effects}, as it could suppress optimizations that would otherwise be performed if the @tech{flow} were unobserved.

Consider this example:

@racketblock[
(define-flow foo
  (~> (pass (effect E₁ odd?)) readout (>< (effect E₂ sqr))))

(probe (foo 1 2 3))
]

Here, with the @racket[readout], all the effects E₁ would occur first, followed by all of the E₂ effects. Without the @racket[readout], the flow would be deforested by the compiler to:

@racketblock[
(>< (if (effect E₁ odd?) (effect E₂ sqr) ⏚))
]

… and the effects E₁ and E₂ would be interleaved. Either order is consistent with @seclink["Effect_Locality"]{locality} but they are @emph{different}. Indeed, the @racket[readout] in this case would not even represent a valid point in the a priori optimized program.

So it's important to bear in mind that one cannot observe a flow using @racket[probe] without changing the program being observed, a change which in some cases has no observable impact, and which in other cases is significant. But now that you understand this phenomenon, you can develop intuition for the nature of such changes, and how best to use the tool to find the answers you are looking for.
Collaborator

This makes sense, though I'm curious if the following train of thought elucidates any simpler explanations.

I wonder: isn't readout a kind of effect? (I forget what happens if you omit probe, so let me ignore that for now.)

If so, that would suggest adding readout changes the effects present in the flow, so (of course!) it changes the observed effects. This might answer "when is it significant to add readout": whenever there are other effects.

The bit about readout affecting optimizations might also be explainable by saying that it's a kind of effect; but we could also probably wave our hands a bit and say that readout introduces a kind of optimization barrier across which the Qi compiler cannot reason (interesting follow-up: why? could some reasoning be possible?).

Collaborator Author

This is a very interesting point and I hope we'll explore it in more depth over time as I suspect Qi could provide a natural setting for understanding this (if it isn't already well understood in the field, which it well might be!).

First re: probe and readout, readout is a simple lambda (and thus a flow) that escapes into a continuation bound to a parameter (the parameter is used so that it is in scope everywhere during evaluation). If probe is omitted then readout escapes into the continuation that the parameter was defined with, which just raises an error saying that probe wasn't used.

The only thing probe does is take a fresh continuation and wrap the flow with it, leaving it otherwise intact.
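
As a rough model of that mechanism (not Qi's actual source), the sketch below uses a parameter holding an escape continuation. All names here (`current-readout`, `my-readout`, `my-probe`) are hypothetical, and the use of `let/ec` is an assumption made for simplicity.

```racket
#lang racket

;; The parameter's default value stands in for "the continuation the
;; parameter was defined with": it just raises an error.
(define current-readout
  (make-parameter
   (lambda vs (error 'my-readout "used outside of my-probe"))))

;; A plain function (so it can sit anywhere a function can) that hands
;; its inputs to whichever continuation is currently installed.
(define (my-readout . vs)
  (apply (current-readout) vs))

;; Capture a fresh escape continuation and install it while `body` runs,
;; leaving `body` otherwise intact.
(define-syntax-rule (my-probe body)
  (let/ec k
    (parameterize ([current-readout k])
      body)))

;; The values at the readout point become the result of the probe;
;; the outer `map` never runs.
(my-probe (map sqr (my-readout (filter odd? (list 1 2 3))))) ; => '(1 3)
```

Calling `my-readout` outside `my-probe` raises the error from the parameter's default value, which mirrors the behavior described above when probe is omitted.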

I don't know what to consider escaping into a continuation, from the perspective of effects. By the loose definition we have in the Field Guide, "an effect is anything a function does that isn't reflected in its inputs and outputs." Escaping prior to returning almost seems to match this definition, except that doing so also seems to make the outputs undefined in the original context of evaluation. So, I don't know if it's an effect or something else, that we could perhaps have a pretty interesting formal model for, which we could maybe think about in a clean functional way.

I am basically making all this up, but this is just to say, I think it will be interesting to explore! I don't know if, for now, we have enough to go on to simplify the explanation of the "Schrodinger's probe" phenomenon, though :)

re: the compiler reasoning across the readout barrier, the issue is that due to optimizations, sometimes readouts can be placed in source programs in places that don't represent valid points in the target program. So technically, the values read out at that point in the source program would never be encountered in the target program (e.g. an intermediate list that would not be constructed in a deforested sequence). Yet, conceptually those values do exist just in terms of the logical structure of the program, and as you said, in the absence of effects, whether or not the values are defined there in the running program is irrelevant to the actual output. It's all a bit confusing! I agree we can probably make this clearer.

This discussion may be of interest to @jairtrejo and @dzoep.

Collaborator

It still seems that the explanation in your reply is too much about the mechanism (continuations) and not about the semantics (reading out values). One of the things that caught my eye is that the documentation text pins down particulars of deforestation in the example.

Anyway, I'm not going to be a stick in the mud about it—just wanted to voice a question and some concerns.

Collaborator Author

There's no question of being a stick in the mud! Your commitment to a high standard for the docs is always appreciated.

But I think I agree with your earlier assessment that the effects stuff is potentially quite involved, and best tackled separately from the more surface level stuff that's in this PR. I will aim to separate that out into a new PR soon so we can continue discussions there while unblocking the other stuff that's in here.

Collaborator Author

Well, since you've forced me to come up with clearer explanations 😜, I've started a new PR dedicated to developing Qi's "theory of effects" (#165). It captures discussions on the topic so far and also starts to develop it more precisely. In thinking about it more, I felt that what we've been calling "effect locality" is actually (at least) two distinct notions: what the PR now calls "effect locality" and "well-ordering of effects." I also revised the explanation of Schrodinger's probe to hopefully make it clearer, but I'm not sure if I've captured your feedback.

For anyone following this discussion, please follow #165 (not sure if there is a way to do that without explicitly commenting on it - but if you thumbs-up this comment I will tag you there. I will probably tag you there in any case if your input is needed, but there is no particular time pressure on this new PR and we can take the time we need so it's reasonably accurate).

@countvajhula
Collaborator Author

OK, I've removed everything about effects from this PR and will start a new one for that shortly. This PR should be ready to go!

@countvajhula countvajhula mentioned this pull request Mar 19, 2024
1 task
@benknoble
Collaborator

Sounds good to me!

@countvajhula
Collaborator Author

OK, I will merge 👍 Thank you for reviewing @benknoble and all!

@countvajhula countvajhula merged commit 1bc0721 into drym-org:main Mar 19, 2024
5 of 6 checks passed
3 participants