Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC007] Migrate the typechecker to the new AST - Part I #2121

Merged
merged 24 commits into from
Dec 23, 2024

Conversation

yannham
Copy link
Member

@yannham yannham commented Dec 9, 2024

Migrate the typechecker to the new AST (Part I)

Content

After the parser, this PR undertakes the migration of the typechecker to operate on the new AST.

Actually swapping the typechecker isn't done as part of this PR, because this requires even more changes - in particular migrating the parts of NLS that handle the term visitor pattern, and keeping part of the mainline contract equality checking code because it's still used at runtime for contract deduplication.

In the meantime, this PR duplicates the whole typechecker under bytecode::typecheck temporarily, so that we can iterate there.

How to review

Most of those changes are pretty mechanic: we switched type signatures to use components from bytecode::ast::* instead of term::*, and then fixed the myriad of compiler errors. Because we duplicated the code, it's not easy to see the difference with the original version either; but most of the changes should really be boring.

Some parts that might be worth looking at:

  • all the Traverse mechanic has been moved to a new dedicated top-level module traverse (it was previously defined in term). As for combine::Combine, we added a second version of the trait for arena-allocated values: TraverseAlloc, and implemented it for all the new AST subcomponents.
  • typechecking records as defined in the new AST is not trivial, because all we have now is a list of path-defined fields, that might overlap in different ways. The machinery to handle record is now implemented in bytecode::typecheck::record. Most of it is a clean implementation of what we did in the parser before RFC007: transform a list of path-defined fields to nested maps. The typechecking part of bytecode::typecheck::record has been mostly taken from typecheck, and hasn't been substantially modified.
  • While I tried to keep this PR focused on doing the minimal changes so that the repo compiles again, saving more substantial refactoring/cleaning for later, it was just too tempting for typecheck::eq, which was previously complicated by the fact that it needs to handle both typechecking-time and run-time contract equality check. the new bytecode::typecheck::eq has been re-organized around traits instead of free-standing functions, plus other simplifications coming from the fact that runtime contract equality will have to be done entirely differently in the bytecode VM, so we can afford to be less generic here.

All in all, I believe that we'll have a net reduction of LoC once we scrap the original typechecker, because many edge cases just don't happen with the new, simpler AST.

@yannham yannham force-pushed the rfc007/typechecking branch from 39c022a to f5fca42 Compare December 20, 2024 17:35
@yannham yannham changed the title [RFC007] Migrate the typechecker to the new AST [RFC007] Migrate the typechecker to the new AST - Part I Dec 20, 2024
Copy link
Contributor

github-actions bot commented Dec 20, 2024

@yannham yannham marked this pull request as ready for review December 20, 2024 19:24
@yannham yannham requested a review from jneem December 20, 2024 19:33
Copy link
Member

@jneem jneem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of changes! I left a comment mostly just to show that I read something 😉

Seriously, though, I read the parts you mentioned in the description, and skimmed the rest.

#[derive(Default)]
pub(super) struct ResolvedRecord<'ast> {
/// The static fields of the record.
pub stat_fields: IndexMap<LocIdent, ResolvedField<'ast>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not important in this context, but in principle there are multiple locations for each field, right? It looks like ResolvedRecord::combine is just choosing one. (The LSP certainly wants them all, but I guess it isn't using this resolution anyway...)

Copy link
Member Author

@yannham yannham Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. For now I'm just maintaining the old behavior, which was to use TermPos::None as soon as the parser had to merge multiple definitions either statically or dynamically, precisely because the RichTerm interface requires that you provide at most one position, which isn't possible. If you look at ResolvedField::pos,it uses a XOR so that we return a defined position iff there is only one position defined; if there is none or there are multiple ones, we bail out and return None.

We can certainly do better in the future, I suppose. Although having a multi-pieces definitions in a statically typed block is currently un-ergonomic enough (since we can't type merging correctly, everything has to be of type Dyn) that I don't expect it to happen much. Still, I expect to re-use the record resolution part for compilation, where you might want all the positions. And one day we might type & more precisely as well.

@yannham yannham added this pull request to the merge queue Dec 23, 2024
Merged via the queue into master with commit 002f0fc Dec 23, 2024
6 checks passed
@yannham yannham deleted the rfc007/typechecking branch December 23, 2024 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants