-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Move type checking code and make it bidirectional #58
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly just some clarifying questions.
I think the checker modules (at least) could benefit from a module level doc string/comment which explains what they're targeting and what assumptions are being made about the AST before it reaches them. For example when it's expected that control flow operations should have already been removed from the AST
globals = ctx.globals | ||
|
||
# Check if the body contains a recursive occurrence of the function name | ||
if func_def.name in cfg.live_before[cfg.entry_bb]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it's checking that the function name is defined as a variable in the outer scope. Is that the mechanism for recursive calls? What decides whether this is true or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cfg.live_before[cfg.entry_bb]
stores the variables that are used in the CFG but not defined in the CFG. In other words, this checks if the function name occurrs freely in the CFG. This is just to make sure that we don't get a false positive when a user shadows the name of the function with a local variable.
The code inside the if
statement then determines whether we make the name available as a global variable or as an implicit argument, depending on the presence of captured arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cfg.live_before[cfg.entry_bb]
stores the variables that are used in the CFG but not defined in the CFG. In other words, this checks if the function name occurrs freely in the CFG. This is just to make sure that we don't get a false positive when a user shadows the name of the function with a local variable.
Could you add this to the comment please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits for now, will try to read the code in more detail tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. I do see that I need to play more with guppy and hugr to have any nuanced feedback.
pred.successors[num_output] = compiled[bb] | ||
|
||
checked_cfg.bbs = list(compiled.values()) | ||
checked_cfg.exit_bb = compiled[cfg.exit_bb] # TODO: Fails if exit is unreachable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to catch a KeyError
exception here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case should never occur since Guppy doesn't let users write functions that don't return:
def main() -> int:
while True:
pass
return 0 # Unreachable statement, but still required by Guppy
This is because Guppy program analysis doesn't take the semantics of True
and False
into account. As a result, the exit BB in a CFG will always be reachable in the graph.
In the future, we might allow users to write infinitely looping functions that don't require the return
statement at the end. Then we would need to be more careful about this line
# For the entry BB we have to separately check that all used variables are | ||
# defined. For all other BBs, this will be checked when compiling a predecessor. | ||
if bb == cfg.entry_bb: | ||
assert len(bb.predecessors) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failures on assert
s should probably raise internal errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I typically use asserts as a concise check and documentation of invariants. We could catch AssertionErrors
on the top-level and turn them into InternalGuppyErrors
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine. ruff tells me it's usually a good idea to avoid them in production code as they might get erased with certain compilation flags. Unsure how much of a concern are they in your case.
if var.ty.linear and var.used: | ||
raise GuppyError( | ||
f"Variable `{x}` with linear type `{var.ty}` was " | ||
"already used (at {0})", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"already used (at {0})", | |
"already used (at 0)", |
would this mean something different? Confused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this line is a regular string, not an fstring, so it is the actual char sequence "{0}"
.
When referring to code locations in error messages, we put the format placeholders {0}
, {1}
, etc and store the corresponding AST nodes in the error, instead of baking the location into the string. The reasoning is that this gives us more flexibility in how we want to display source locations when printing the error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I must have misread that as part of an f-string. Thanks for the explanation.
Most of this code is copied from previously existing type checking code:
guppy/expression.py
->guppy/checker/expr_checker.py
guppy/statement.py
->guppy/checker/stmt_checker.py
guppy/function.py
->guppy/checker/func_checker.py
guppy/cfg/cfg.py
->guppy/checker/cfg_checker.py
The main new feature is that expressions can synthesize their type (what we did before), but also be checked against a type. Checking is used for function arguments, return values, and annotated assignments. At the moment, checking just synthesizes a type and compares it to the expected type, but in the future we can use it to instantiate type arguments.
Note that this PR does not pass tests on its own, only once everything is merged into #57