Deforest within the core language #2

countvajhula

Summary of Changes

This introduces the #%deforestable core form and starts to make some of the changes we talked about.

Planned work:

  • initial proof of concept just introducing the #%deforestable core form
  • hardcoded support for core list operations like map, filter, foldl, foldr, and range
  • discern a usable generic core form syntax from these initial forms
  • change the core form syntax to the generic version discerned and reimplement the list forms, still hardcoded in the compiler
  • write Qi macros for each list operation and expand to the new core form
  • make it more generic, using compile time datatype, etc.
  • TBD

I hope to give us a head start with this PR so that we can pick up on it in next week's meeting.

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

@countvajhula
Author

FYI @dzoep @michaelballantyne @benknoble

This is WIP. For next steps, I'd like to:

  • make the forms use Qi syntax in the f or op positions (e.g. (map f) and (foldl op init)). I think this would involve modifying the codegen step but also invoking codegen on demand while applying the deforestation rewrite rules
  • move the matching (which is very clean and modular thanks to your "rockin' refactor" @dzoep ! Has been so easy to modify) up to the expander

Other things to talk about:

  • the syntax of forms like range when they are macros rather than functions. Range accepts 3 arguments, and all of them are numbers. Should we support template syntax even though, as a macro, it isn't a partial application of a function anymore? What are the alternatives?
  • what should the generic syntax of #%deforestable be, that would allow it to be extensible (instead of explicitly encoding qi/list into the code generation step as it currently does in this WIP)?

Some of these are longer term discussions and we don't need to have answers at this very moment. The main immediate goal is to structure qi/list so that it (1) exhibits forms with Qi syntax and (2) does not do any host language transformations. Then deforestation of remaining forms can proceed in a manner decoupled from these other design considerations, even if they remain closely coupled in qi-lib in the immediate future.

dzoep force-pushed the racket-list-deforestation branch 2 times, most recently from 121b212 to 016ab41 (July 27, 2024 21:21)

benknoble commented Jul 28, 2024

Presumably once Dominik's branch settles down, you'll rebase this onto there? I was able to try it myself with something like (my remotes are named dominik and sid):

git log --oneline main..sid/deforestable-core-form --author=sid | tail
# copy the hash of the first (oldest) commit you made on the branch; it is the last line of the output
git rebase --onto=dominik/racket-list-deforestation <hash>^ sid/deforestable-core-form

(The ^ after the pasted <hash> is important; otherwise that commit disappears.) Right now the <hash> is for feeb211 (Introduce a #%deforestable core form, 2024-06-28), and its parent is ce0bc49 (Add attach-form-property to both places needed by current deforestation (CPS) implementation., 2024-06-28).

My reflog also contains enough fetches of Dominik's force-pushes that this is all equivalent to git rebase --fork-point dominik/racket-list-deforestation sid/deforestable-core-form, which also worked for me locally. To see if you can use the short form, try git merge-base --fork-point dominik/racket-list-deforestation sid/deforestable-core-form and confirm it's the same as git rev-parse <hash>^ (i.e., that it gives you the "parent" commit you'd specify with --onto).

@countvajhula
Author

@benknoble That is some next level Git fu! 🥋 I will try it out when it's time to rebase.

countvajhula force-pushed the deforestable-core-form branch from 04ff858 to 10d2d85 (August 2, 2024 23:21)
dzoep and others added 24 commits August 2, 2024 17:13
Deforest all variants of cad*r:

- car
- cadr
- caddr
- cadddr
- caddddr
- cadddddr

Deforest list-ref as well, using the same underlying implementation.

- split syntax matching from syntax production
- improve naming of syntax classes
- remove unused template variables
- preliminary splitting of the compiler into separate modules for separate passes
- update tests to reflect new paths
- rename compiler "passes" subdirectory to "compiler"
- strip the passes modules file name pass- prefix
- scribblings for qi/list module
- scribble the new literals for matching in deforestation pass
- ensure for-label bindings in the generated documentation
- new bindings.rkt module

…ler meeting on 2024-06-21.

- add detailed explanation for inline-consing syntax
- use Racket's conventions for parentheses
- add description of fsp-, fst-, and fsc- prefixes
- move define-and-register-deforest-pass and related code to a separate module; add comments

This form is intended to express any deforestable expression, allowing
the core language to express deforestation semantics. Formerly, we
could not do this within the language and so resorted to matching and
optimizing host language syntax, leading to a "host" of problems.

This new form is groundwork for defining compiler optimizations
purely on the core language, thus representing a clean boundary, or
contract, between Qi and the host (Racket).

The initial implementation here just introduces the form, and code
generation for `filter` specifically, as a proof of concept for the
more generic and extensible planned implementation.
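
As an illustration of what deforestation buys (not the Qi implementation, and written in Python rather than Racket purely to keep the example self-contained), the sketch below contrasts a staged filter-then-map with a fused single pass. The function names `staged` and `fused` are hypothetical.

```python
from typing import Iterable, List


def staged(xs: Iterable[int]) -> List[int]:
    """Unfused pipeline: filter builds an intermediate list that map consumes."""
    kept = [x for x in xs if x % 2 == 1]   # intermediate list
    return [x * x for x in kept]           # second traversal


def fused(xs: Iterable[int]) -> List[int]:
    """Fused pipeline: one traversal and no intermediate list."""
    out: List[int] = []
    for x in xs:
        if x % 2 == 1:
            out.append(x * x)
    return out


print(staged(range(10)))  # [1, 9, 25, 49, 81]
print(fused(range(10)))   # [1, 9, 25, 49, 81]
```

Both functions compute the same result; the point of a deforestation pass is to derive something like the second from a program written like the first.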

See the meeting notes for more, e.g.:
https://github.com/drym-org/qi/wiki/Qi-Meeting-Jun-21-2024#implementing-it

The function positions in deforestable operations are Racket
expr positions, but we want them to be Qi floe positions instead. This
modifies the code generation step to recursively invoke codegen on
these nested floe positions.

Based on recent discussions, as a general maxim:

  Our core language should be rich enough
  to express desired optimizations.

Initially, as this wasn't the case, we were performing deforestation
by matching host language forms. This of course meant that we were
constrained to Racket syntax in such functional operations. Now that
we are broadening our core language to express deforestation, in
keeping with the above maxim, we would prefer to support Qi syntax in
function positions in these operations.

Towards this goal, this new syntax for the `#%deforestable` core form
introduces support for `floe` positions.

Right now, it simply segregates arguments into `expr` and `floe`
positions so that these are appropriately expanded. The code
generation still matches the name of the functional list
transformation (e.g. `map`, `filter`) and "hardcodes" the known
invocation of the corresponding underlying operation. Eventually we
hope to make deforestation user-extensible to arbitrary functional
list (at least) operations. At that stage, we wouldn't have this kind
of standard information that we could leverage during code generation,
so we will need to modify the syntax of `#%deforestable` to encode
enough information to be able to perform appropriate code generation
for arbitrary user-defined operations. We are not there yet :)

As the deforestation pass generates escaped Racket, we need to compile
any higher-order `floe` positions in the fusable list operations at
this stage, since the regular code generation step at the end of
compilation would not operate on these resultant escaped expressions.
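
The commit messages above describe a form that separates `floe` from `expr` arguments, code generation that dispatches on the operation's name, and nested `floe` positions that are compiled on demand. A minimal sketch of that shape follows, in Python rather than Racket and under stated assumptions: the `Deforestable` class, the `("thread", ...)` tuple encoding, and the `compile_floe` and `codegen` helpers are all hypothetical stand-ins, not Qi's actual data structures or API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class Deforestable:
    """Hypothetical stand-in for a core form with segregated arguments."""
    name: str                 # e.g. "map", "filter", "foldl"
    floes: Tuple[Any, ...]    # flow positions: compiled recursively
    exprs: Tuple[Any, ...]    # host-expression positions: used as they are


def compile_floe(floe: Any) -> Callable:
    """Compile a flow: a core form, a host function, or a ("thread", ...) sequence."""
    if isinstance(floe, Deforestable):
        return codegen(floe)
    if callable(floe):
        return floe
    tag, *steps = floe
    if tag != "thread":
        raise ValueError(f"unknown flow: {floe!r}")
    fns = [compile_floe(step) for step in steps]

    def run(x):
        for fn in fns:
            x = fn(x)
        return x

    return run


def codegen(form: Deforestable) -> Callable:
    """Dispatch on the operation name, as the hardcoded version would."""
    fs = [compile_floe(f) for f in form.floes]  # nested flows compiled on demand
    if form.name == "map":
        (f,) = fs
        return lambda xs: [f(x) for x in xs]
    if form.name == "filter":
        (pred,) = fs
        return lambda xs: [x for x in xs if pred(x)]
    if form.name == "foldl":
        (op,) = fs
        (init,) = form.exprs

        def fold(xs):
            acc = init
            for x in xs:
                acc = op(x, acc)
            return acc

        return fold
    raise ValueError(f"unknown deforestable operation: {form.name}")


# The function position of `map` is itself a flow, not a plain host function.
pipeline = (
    "thread",
    Deforestable("filter", (lambda x: x % 2 == 1,), ()),
    Deforestable("map", (("thread", lambda x: x + 1, lambda x: x * 2),), ()),
)
run = compile_floe(pipeline)
total = codegen(Deforestable("foldl", (lambda x, acc: x + acc,), (0,)))
print(run([1, 2, 3, 4, 5]))  # [4, 8, 12]
print(total([1, 2, 3]))      # 6
```

The name-based dispatch in `codegen` is exactly the part that would not extend to user-defined operations, which is why the form's syntax would eventually need to carry that information itself.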
@benknoble

Is the base branch wrong? Should this PR be trying to merge into drym-org/qi:deforest-all-the-things?

@benknoble

Ah, drym-org#180 suggests yes.

Now that these semantics tests simply test the behavior of newly
defined Qi forms rather than ensuring low-level rewriting of host
language syntax, they can be ordinary unit tests validating semantics,
just like the tests for any other built-in Qi macros.

We now deforest via the `#%deforestable` core form and no longer need
host language bindings (though provided by Qi) for this purpose.

Use Qi equivalents as lambda is no longer valid in this position (at
least until/unless drym-org#177 is merged).

Coverage was reporting this case uncovered

In the provisional syntax of Qi's `range`, we expect the range to be
specified syntactically, as it compiles to a lambda accepting no
arguments.

- use left-threading in most tests
- one test using right-threading to validate deforestation is
  invariant to threading direction
- use `range` with syntactically specified arguments; remove tests
  using templates
- consolidate `deforest-pass` tests since we no longer have a separate
  test suite for individual applications of the deforestation rewrite
  rule (should we?)

When a nested form has a different chirality (threading direction)
than a containing form, normalization would not collapse them, but
deforestation may not care about the difference.

Possible approaches:

  A. Introduce normalization rules designed to detect
     when change of chirality is irrelevant.
  B. Look for patterns in the deforestation pass involving
     differing threading directions.

Probably (A) is the right approach, and we could introduce a set of
chirality normalization rules that "trim" forms on either end of a
nested form which could be collapsed into the containing form. This
would include anything that isn't a host language function
application (which is the only case where chirality matters).

Actually, thinking again, chirality is already represented in the core
language simply as the presence of a blanket template in a function
application form, and nested threading is already collapsed by
normalization, so, I'm not sure anymore why this test is
failing ¯\_(ツ)_/¯
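
The reasoning in the last paragraph can be sketched as follows, in Python and with a made-up tuple encoding: if threading direction is recorded only as the position of a blanket marker inside each function application, then collapsing nested threading forms needs no knowledge of chirality at all. The tags `"~>"`, `"~>>"`, `"thread"`, and `"app"`, and the helpers `desugar` and `normalize`, are hypothetical and do not reflect Qi's actual core forms or normalization pass.

```python
BLANKET = "__"  # marks where the threaded inputs go in an application


def desugar(form):
    """Record chirality as the blanket position; emit a direction-free thread."""
    if isinstance(form, tuple) and form and form[0] in ("~>", "~>>"):
        left = form[0] == "~>"
        steps = []
        for step in form[1:]:
            if isinstance(step, tuple) and step and step[0] == "app":
                _, fn, *args = step
                args = [BLANKET, *args] if left else [*args, BLANKET]
                steps.append(("app", fn, *args))
            else:
                steps.append(desugar(step))
        return ("thread", *steps)
    return form


def normalize(form):
    """Collapse nested ("thread", ...) forms into the containing one."""
    if not (isinstance(form, tuple) and form and form[0] == "thread"):
        return form
    steps = []
    for step in form[1:]:
        step = normalize(step)
        if isinstance(step, tuple) and step and step[0] == "thread":
            steps.extend(step[1:])
        else:
            steps.append(step)
    return ("thread", *steps)


# A right-threading form nested inside a left-threading one.
nested = ("~>", ("app", "f", 1), ("~>>", ("app", "g", 2)), ("app", "h", 3))
core = normalize(desugar(nested))
print(core)
```

In this encoding the nested form flattens into a single sequence of three applications, each keeping its own blanket position, so a later pass would see no trace of the original nesting.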