-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Yul AST nodes #163
Comments
I started working on this here: https://github.com/d1ll0n/solc-typed-ast/tree/yul-ast |
Hello @d1ll0n. First of all, thanks for the interest in the project and your time investments. Also, sorry for late response - due to business, I'm not able to invest monitor incoming messages from all of the projects. I would try to do this more regularly. Now, to your questions:
We originally implemented Yul support for writing AST back to source. There were no additial requirements aside of this, and we also did not invested much time due to lack of functionality requests or ideas on Yul processing. The root in Solidity AST is
So beware that AST support would be limited to certian compiler version... Otherwise it would require to write a parser to also handle strings with Yul content. The only other Yul-related component is located in ast-to-source writer that also would need to be updated. UPD: I think that we should skip scope resolution changes for AST due to it would impact BC for consumers (mainly, https://github.com/ConsenSys/scribble). The option here is to introduce a standalone function for scope resolution for
I'm have no ideas also. I tried to do the same thing some time ago, but there was something that prevented me to do this. AST node contains some references to identifiers, that are used in assembly (
I do not mind to pull documentation, but beware that solc-typed-ast does not always have an access to the source. We pull documentation if source is accessible. Also our current approach is not perfect. Also a would suggest to go with iterations and do not try to introduce all-at-once. Another this that I'm think of, is that the
This was a particular reason why we picked compiler AST over ANTLR some time ago. Compiler AST already produces In Yul this is a compilcated thing... Mostly because it would require to model how compiler does this. We sometimes look under-the-hood to figure out what compiler does in certain cases. But here I guess it would need to invest time into test harnesses to be sure that AST matches the compiler rules.
+1 for not touch legacy compilers yet. Also related to what I tried to explain in answer to question 1. Investing time into a parser for legacy code without a reason seem to be an overkill (at least for now).
We can start with this and reconsider if there would be a reason. For now it seem to be fine decision. Other options that I can think of is to rename Regards. Let me know what you think. Thanks. |
Also, I guess @cd1m0 could also review your points and share an opinion. We are trying carefully test changes to not break dependent packages much. |
@blitz-1306 Thanks a lot for getting back to me on this! Couple of follow-ups if you don't mind:
Sorry, what does BC stand for? I think it makes sense to separate scope resolution for Yul nodes, particularly since you need to handle lookups slightly differently. Would you still be fine with running this inside of a post-processor for all modern Yul AST nodes? As long as it works properly I don't think it'd negatively impact any dependent packages.
At the moment I think it'll be easiest to have it as a descendant to ensure all the other tools continue to work, e.g. it's useful to be able to do a child search from a normal
Ha, indeed. I've been reading the yul package in solc and going through changelogs / testing stuff in remix, and there are lots of little undocumented oddities. One example is that in 0.6 yul identifiers could have a period, but the prefix couldn't shadow existing declarations unless they were functions. Aside from things that cause compiler errors though, the basic scope resolution process isn't too bad. Should have something ready next week.
I was thinking about this today, and I think the only way to do this properly is with code evaluation. If I add support for yul AST to the constant evaluation methods, we could test the scope resolution for Yul nodes by comparing the evaluation output to the output from @ethereumjs/vm when actually running the code. |
Hello @d1ll0n.
Backward compatibility. If we change something, even if we maintain API, some of these changes may break or impact dependent packages. In this case we should be careful, if we change
And here I'm +1 with idea to develop other function for Yul scope resolution.
Post-processing idea is fine if you want to compute only initial referenced declarations. If you want different, that would also work during tree modifications - you might need something more powerful (something like
Beware that other tools may not expect to get new nodes. But lets see what it may lead into.
We have some VM-related tests in Scribble. I suggest to try to come up with a sample source, that would serve as a proof that references are fine and code behaves as expected. Still, the implementation have higher priority now. We can keep testing ideas in mind for now, then discuss when time would be more appropriate. Regards. |
Totally understand. I'm going to continue to try to modify as little behavior as I can with the existing AST nodes so that the only change is to add features rather than introduce any new issues in existing ones. I'm making rapid progress on the basic functionality needed for these changes, once I get a little further along I will focus more on unit testing. So far all of the tests pass other than the formatting output for
Ah so to clarify on this point, I just meant run it in the post-processor to populate the initial
Perfect, thanks! I made a pr at #166 to track the progress on this. One thing I'd love to get feedback on is the way I am currently handling Yul node ids e7bc8ca. Some of the copy tests check for hard-coded ids in the nodes after copying them, so if we generate ids sequentially from the highest one in the original AST, we'd need to modify the tests. At the moment, I am just sidestepping this by offsetting all yul ids by 1e5, but I wrote a utility a8167f3 which allows the ASTReader to locate the highest node id when it first reads in an AST from the compiler, then use that as the first ID for all new nodes. This ensures there are no conflicts with any of the AST nodes in the original sources imported, however it would prevent importing fragments of AST data sequentially since IDs would already be taken. Given that the ASTContext is separate for every compilation, I think this should be ok, but I am not familiar with how other projects are using this. Can you think of any reason that it would be important for backwards compatibility (or generally) to be able to read in additional compiler outputs after the fact? They would have to already have ids consistent with the previous ones, so I would assume there is never going to be a situation where someone would do Aside from doing multiple imports, which seems unlikely, and rigid tests that could be modified, is there any reason that the specific values of AST nodes' ids (other than the ones initially loaded from the compiler output) would be relevant? |
@d1ll0n Hey, Dillon. Sorry it took me a week to take a look into what you submitted. I left one very raw idea here. I would try to separate ID spaces, due to compiler is free to do whatever it wants and bring complete havok in terms of ID generation. We have two languages, so closest thing I think we can do is to try to reflect / replicate the language ID space difference in the logic. Still, I'm unsure if consequences would be applicable practially. It worth a reasearch... |
Description
I've made my own kind of janky mixins to force solc-typed-ast to work with Yul in my projects, but I'd like to directly integrate it in the package so that it works smoothly with all the tooling you've made for solidity. I'm assuming the current lack of support for Yul is due to time and priorities rather than intentionally being excluded, but please let me know if I'm wrong.
I have a good idea of what all I need to update, but there are a few questions with regard to your preferences and general repository understanding. I'd love any additional feedback on this or requirements you would have to get these changes merged into the repo.
For clarity, I imagine adding support for Yul isn't a top priority for you, so I'm not requesting any time from your team to integrate these features beyond any input you'd be willing to give and then a code review when it's ready.
Todo
make
andcopy
support to ASTNodeFactoryYulBlock
,YulForLoop
,YulFunctionDefinition
as scopes and recognizeYulVariableDeclaration
andYulFunctionDefinition
as definitions.SrcDesc
outputs with documentation like the other writers, rather than strings.Questions
I have a couple specific things I'd like additional input on:
1. Missing requirements
Am I missing anything in the todo list above re: critical areas that will need to be updated?
2. External reference expressions
solc does not produce any specific details about the expression used to reference external identifiers, e.g. if you set a calldata value's offset with
x.offset := 10
the AST node will simply be aYulIdentifier
with{ name: "x.offset" }
, whereas a solidity AST node would wrap that with aMemberAccess
. Any thoughts for how to handle this? I am planning to add areferencedDeclaration
field toYulIdentifier
, but this should ideally always point to an actual declaration. Maybe there could be another propertyexternalReferenceExpression
with a full AST node for the expression used?I also have some questions regarding your (Consensys team's) preferences:
3. Documentation
I don't believe solc recognizes natspec/documentation for Yul nodes. Are you generally opposed to extending the AST node classes beyond the original types, or would it be alright to add documentation support to these nodes? It looks like there is already a post-processor to pull additional documentation beyond what solc produces, so I assume this is fine.
4. Referenced declarations
solc does not record any information about referenced declarations for Yul identifiers. I was planning to integrate an identifier resolution step as a post-processor to populate that information in the nodes. Do you have any issue with this? I ask because it'd add a fair number of additional steps to generate an AST beyond what most of the processors do.
5. Older versions of solc
Since older versions of solc simply produce strings for yul blocks, how should this be handled? I'd assume you'd want to leave those as is and only process into objects for modern AST outputs.
6. Directory structure
I was thinking I'd add a new
yul
directory undersrc/ast/implementation
withstatement
andexpression
subdirectories. Any preferences?Context
Yul AST nodes currently have minimal support in the library, the data is just copied from solc's output into the parent
InlineAssembly
object and cast to a very vagueYulNode
type. Because they are supported by the writer, one can still manipulate yul nodes manually and translate back to solidity code, but because they are not instantiated with their own classes the way other nodes are, none of the tools in solc-typed-ast can be applied to them without significant work-arounds.The text was updated successfully, but these errors were encountered: