Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outline work needed for cljs and cljc support #195

Open
julienfantin opened this issue Apr 15, 2017 · 39 comments
Open

Outline work needed for cljs and cljc support #195

julienfantin opened this issue Apr 15, 2017 · 39 comments

Comments

@julienfantin
Copy link
Contributor

julienfantin commented Apr 15, 2017

I'd like to use this issue to start a discussion on what needs to be done in order to add support for cljs and cljc.

Most of what I consider to be the important functionality involves the analyzer and is unavailable outside of clj files.

I'm positive others would like to improve this state too, but for someone who's not familiar with the project trying to tackle these issues seems a bit hopeless.

It would be helpful to start outlining what the current problems are, what solutions could work and lay a path for people who are interested in working on this but don't know where to start.

Some questions I currently have in mind:

  1. Why does the current approach cannot work for cljs?
    This is not specific to refactor-nrepl, but some pointers on how exactly does the analysis process differ between clj and cljs would make a good resource.

  2. In principle, are all the existing middlewares applicable to a cljs ast?
    It is unclear if there are platform-specific blockers that would prevent this from happening.

  3. What are some potential solutions for implementing the existing functionality in cljs?
    Some high level on how the cljs runtime could integrate with the rest of the middleware.

  4. Are there any potential problems with having to deal with two distinct analysis results?

Cheers!

@benedekfazekas
Copy link
Member

awesome you opened this. let me collect my thoughts and previous discussions around this before answering in detail

@julienfantin
Copy link
Contributor Author

julienfantin commented Apr 15, 2017

  1. What strategies could be used for testing?
    How do we prevent regressions and assert that any given feature works in all three contexts? There seems to be a strategy in place already with rest/resources but it's not clear whether this would scale? Also the current model could use some documentation if there is indeed a method to it.

@benedekfazekas
Copy link
Member

Just after a bit of analysis the list of refactor-nrepl functions using ASTs and the list of cljr features using these refactor-nrepl functions.

  • refactor-nrepl.find.find-locals

    • promote fn
    • extract function
    • inline symbol
    • find usages/rename symbol
    • create-fn from example
    • change fn signature
  • refactor-nrepl.find.find-symbol

    • find local and global sybmol
    • rename local and global symbol
  • refactor-nrepl.find.find-used-publics

    • replace refer all

and now your questions

Why does the current approach cannot work for cljs?
This is not specific to refactor-nrepl, but some pointers on how exactly does the analysis process differ between clj and cljs would make a good resource.

Essentially the current approach does not work for cljs because the analyzer we've choosen (clojure contrib analyzer: org.clojure/tools.analyzer) does not have a working cljs/js implementation. The existing cljs/js implementation has been abandoned due to lack of time to keep up with cljs, see https://github.com/clojure/tools.analyzer.js#this-project-has-been-abandoned.

So as simple as that we decided to "buy" an analyzer solution and build features on top of it expecting the cljs part working too which is not the case unfortunately.

There is one exception, clean-ns which uses its own means to gather information about the namespace it works on. It does not use the AST and supports clj, cljs and cljc.

This answers your 2nd question too I suppose. In the context of clojure.contrib analyzer there is no such thing as cljs AST at the moment unfortunately.

What are some potential solutions for implementing the existing functionality in cljs?
Some high level on how the cljs runtime could integrate with the rest of the middleware.

In short find an other solution the get the necessary information about the clj/cljs code we work with. Apart from the missing cljs story of the currently used analyzer there are other problems with it. A good discussion of these problems by some much brighter ppl than myself can be found here: #134.

The above discussion touches on a load of things so perhaps a very careful thinking over all those points raised could be benefitial. Might raise new questions etc.

(I guess it would be also possible to pick up the maintenance of org.clojre/tools.analyzer.js No idea how much work is that but I am guessing quite a big amount. Also because of the other problems of the tools.analyzer in the context of tooling I would not prefer this way.)

A bit about options on potential solutions

One option is to run our own minimalistic analyzer. clean-ns does this in a very very limited sense. It collects all symbols in a file using tools.reader, see details in the refactor-nrepl.find.symbols-in-file namespace. We could go down this route and build our own minimal ASTs collecting only the info we need. This is in a way what cursive does.

As the cljs story is really getting more and more interesting these days we might have a look on self hosted cljs repls like lumo and planck if they could help us out for cljs.

Also joker looks very interesting. If I understand the impliciations there it should support both clj and cljs. Also as implemented in go it does not need a running REPL. On the flip side tho there are limitations and it is mentioned in the project's non-goals section that feature parity with clojure is not in scope of the project. this may bite us later if we go down this way. we should also consider the fact that going this way would break the architecture compatibility between cider and clj-refactor/refactor-nrepl. this could be a good thing in the sense if this works out that might be a way for cider to go as well in the future (talking about a really huge rewrite now). but also it would very likely defeat my goal of the past year of merging cljr gradually into cider.

disclaimer: although i am very excited to do a POC with joker i have very limited time nowadays to hack on these things unfortunately.

disclaimer: this is more like a vision like thing than anything else and mainly mine. other people in the team and around all things clojure in emacs may not agree with this last section.

i appreciate i might not answered all your questions but hopefully gave some food for thoughts. let me know what you think and if this in any way helpful or just even more confusing...

@julienfantin
Copy link
Contributor Author

julienfantin commented Apr 16, 2017

@benedekfazekas thanks fot the great response.

Please allow me to not address your points right away, because I realize there are some big gaps in my understanding of the various analyzer toolchains.

I was under the impression we'd have no choice but to use https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/analyzer/api.cljc or maybe https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/analyzer.cljc

I found an old post that does clarify some of the goals of tools.analyzer here and puts thnigs in perspective.

It is not clear to me whether using the cljs analyzer directly would be possible. I guess this is a downside of the data-driven approach right here: I cannot look at the refactor-nrepl codebase and easily determine what analyzer functionality it depends on. There is just one or two calls into the library but the rest of the project depends on specific ast keys.

I think what would go a long way towards determining the right path forward would be to have an exhaustive reference of the ast data that's actually used throughout the project. Then we could look at the outputs of different alternatives and easily determine what's different and/or missing. Am I correct that this would help?

If so, maybe a test namespace where each middleware would have to assert against an analysis pass checking for its dependencies, or maybe better, a set of specs for the ast?

Edit: I'm going off another assumption here, which is that with some undetermined amount of shims, we'd be able to build upon the clojurescript analyzer directly and recreate the ast that refactor-nrepl currently requires. I now realize sounds a lot like rewriting a subset of tools.analyzer.js, and without familiarity with those tools it's unclear how much work this represents.

@expez
Copy link
Member

expez commented Apr 16, 2017

I think what would go a long way towards determining the right path forward would be to have an exhaustive reference of the ast data that's actually used throughout the project. Then we could look at the outputs of different alternatives and easily determine what's different and/or missing. Am I correct that this would help?

I don't think so. The AST is the API. That was part of the reason in choosing tools.analyzer to begin with, that we could get a general AST (whether the source was clj, cljs or cljc) and then do work on that AST.

If that's no longer viable, we should look into replicating the capabilities afforded by tools.analyzer. @benedekfazekas listed those above.

I'm not aware of another library that does analysis, like tools.analyzer, but we might be able to get the same capabilities by running the cljs compiler.

This takes us to point 4. in your OP:

Are there any potential problems with having to deal with two distinct analysis results?

No, there isn't. And this is starting to look like the only way forward. The lack of progress is mostly because neither @benedekfazekas or myself write cljs in our day to day job and nobody else has shown any interest in helping us out (until now? :))

@julienfantin
Copy link
Contributor Author

I don't think so. The AST is the API. That was part of the reason in choosing tools.analyzer to begin with, that we could get a general AST (whether the source was clj, cljs or cljc) and then do work on that AST.

Right, the point I was trying to make is: if I run a cljc file through both tools.analyzer.jvm and clojurescript.analyzer I'll get different asts even if the file has no reader conditionals. How different is the big question here, and I haven't checked yet, but I'm guessing substantially different given how much code there is in tools.analyzer.js.

So imagine we were to diff those two asts and make fixing those discrepancies our todo list, we potentially could ignore a bunch of those if we limited ourselves to what refactor-nrep uses, the API would then become a subset of the tools.analyzer AST.

Does that make any sense or am I missing something?

@expez
Copy link
Member

expez commented Apr 16, 2017

I think you're right that the ASTs will be nothing alike, and that's fine. Even if it was possible to find common ground in the two ASTs and normalize them into some more generalized thing, I don't think that would be a good idea. We'd have to track changes in both projects to keep our normalization step in sync.

I think a much better approach would be to just add a new code path to e.g. find-symbol which uses clojurescript.analyzer, or something else, to find symbol occurrences.

@expez
Copy link
Member

expez commented Apr 16, 2017

If you want to talk in real-time, you can find us both on gitter. I'm around on freenode and I think @benedekfazekas still hangs out on slack (in the clojure channel)

@julienfantin
Copy link
Contributor Author

So now, what about: https://github.com/clojure/jvm.tools.analyzer#usage-clojurescript?

The landscape is even more confusing than I initially though, thinking out loud here in hopes that you guys can correct me if I'm missing something:

  • tools.analyzer is an umbrella project that was aiming to provide a consistent, platform agnostic ast. What I didn't know is that it seems to re-implement the compilation process of the target. Which actually would explain why I sometimes get weird analysis error messages for code that seems to compile just fine.

  • tools.analyzer.jvm is the clojure backend refactor-nrepl currently uses.

  • tools.analyzer.js would have been our best option if it were maintained since it's another backend for tools.analyzer. However it's no wonder it's been abandoned if they had to re-implement the compiler given how fast clojurescript changes.

  • clojurescript.analyzer is a possible integration target, though it will require multiplying code paths in refactor-nrepl and might be difficult to support across clojurescript version?

  • jvm.tools.analyzer is yet another analyzer project that we haven't mentioned yet. Despite its name it seems to also support cljs and uses both the regular clojure compiler and clojurescript.analyzer.

At first glance jvm.tools.analyzer seems interesting since it's the only project that could give us a unified api? There are various gotchas listed in the readme, like the analysis causing code to be eval'd, which I think relates to #134 mentioned earlier, and I'm not sure how different its output is from tools.analyzer.jvm.

@expez
Copy link
Member

expez commented Apr 16, 2017

Your concerns for clojurescript.analyzer would be true for jvm.tools.analyzer as well, since that's just wrapping clojurescript.analyzer in addition to the jvm clojure compiler.

For that reason, we might as well use jvm.tools.analyzer to access clojurescript.analyzer.

If you want to get involved @julienfantin, I suggest the following route forward:

  1. Write new code to replicate the current clj functionality for cljs, using jvm.tools.analyzer.
  2. Once we're feature complete, investigate what it would take to use the newly written code to replace the old code (differences in the ASTs, caching / other performance concerns)
  3. Throw out tools.analyzer.

I don't think we should care about duplicating code while investigating. I'd rather have some extra stuff, than break the currently working clj features while adding cljs support.

It's also too early to worry about supporting multiple versions of cljs. Tracking the latest major release is miles better than no support at all. Most likely the cljs analyzer isn't moving very fast, even if cljs itself is!

@julienfantin
Copy link
Contributor Author

Ok that sounds like a plan!

I do have a couple of remaining questions and a comment:

  1. I'm not familiar with the project's architecture and design so when you talk about duplicating code, I'm not sure what you have in mind and where this duplication would start. Would it be entirely on the clojure side, say something like modifying an operation's nrepl message handler to branch between calling one, the other, or both analysis processes depending on the file-type? Or did you mean defining separate refactorings all the way to the emacs client side?

  2. Something we haven't discussed yet, but that sounds like a good first step to me, would be to ensure that we can make a jvm analysis pass on cljc files while ignoring all the cljs reader conditionals. This is certainly not something we would want to release since it could cause all sorts of issues post-refactoring (like stale symbols in cljs files and so on), but it'd help make refactor-nrepl useful for people working in cljc codebases and willing to get involved with an in-progress fork/PR. Are there any reasons besides the one I mentioned why we wouldn't want to do that? If not could you expand on what would need to happen for this to work?

Lastly I'll say that I have very limited yak-shaving time over the next few months, so progress on this will be slow.

@expez
Copy link
Member

expez commented Apr 16, 2017

  1. Let's say you want to start with adding support for cljs for the refactorings that use find-used-locals. This would enable quite a few refactorings (see the list earlier in the thread). You'd basically branch here which checks if the file is a cljs file, and create two new functions:
  • One containing the existing clojure code, which we're leaving completely unchanged
  • A new function which contains all sorts of wonderful and magical new code to make this all work for cljs, using jvm.tools.analyzer.
  1. This sounds like a valuable step to do during research / early implementation. We've been using snapshots to dogfood new features, so as soon as something is in a reasonable state, with some benefit, releasing it to early adopters is cool.

@benedekfazekas is planning to cut a stable release sometime later this month, which makes it easy for people to opt-out of any alpha-quality features that are coming to cljs.

Lastly I'll say that I have very limited yak-shaving time over the next few months, so progress on this will be slow.

Slow is infinitely better than nothing!

@bbatsov
Copy link
Member

bbatsov commented Apr 17, 2017

And there's the option of writing a simplified portable parser - something that won't really try to expand macros and so on. On having more of the refactorings depend on just the loaded code instead of analyzing all the source code.

In general, there's also the bigger question of how to better support ClojureScript in CIDER as well. Most of the limitations come from the stagnation in nREPL's development - in theory if first call support for ClojureScript was added there instead of relying on piggieback a lot of things would become simpler. Also in theory - supporting some evaluation backend that nREPL would simplify the ClojureScript situation a lot. But that's also a lot of work that no one really wants/has the time to tackle.

@benedekfazekas
Copy link
Member

I like that fact that there is a plan formulating how attack this. and agree with @expez that find-used-locals is a nice candidate for POCs. are you guys sure about jvm.tools.analyzer tho? it seems to be an inactive and out of date project to me.. last commit with some real content is 8 months old... am I missing something?

@bbatsov
Copy link
Member

bbatsov commented Apr 18, 2017

Guess you can always ping the maintainer about its state.

@benedekfazekas
Copy link
Member

also there are several self hosted cljs options now. they might worth a check.

@expez
Copy link
Member

expez commented Apr 18, 2017

it seems to be an inactive and out of date project to me.. last commit with some real content is 8 months old... am I missing something?

It's just a wrapper around the clojure and cljs compiler, so unless the API of those two change, the wrapper doesn't have to be updated.

@benedekfazekas
Copy link
Member

ah right. thanks for clarification @expez

@julienfantin
Copy link
Contributor Author

julienfantin commented May 25, 2017

Unfortunately haven't had time to work on this at all, but I did gain some insight on what will need to happen, so I'll leave some notes here, if not only for myself:

Add support for end-column and end-line in the ClojureScript analyzer

The ClojureScript analyzer currently doesn't include end-line and end-column in the AST's :env, but refactor-nrepl's own analyzer namespace relies on that information, notably for node-at-loc?.

I don't know that there is a way to replicate the existing logic around node-at-loc? without being able to bound an ast node with a buffer position, and even if there were it would probably be non-trivial and prevent reusing analyzer logic between clj and cljs.

There is a simpler solution I think, which consist in writing a simple patch for the Clojurescript analyzer. This patch will need to replicate getting the line data from the environment for end-line and end-column and should take care of updating all the call sites (about a dozen or so) to propagate that information throughout the AST.

I mentioned this on the cljs-dev slack channel and it seems like it would be fairly easy to get this merged.

Use clojure.tools.reader.reader-types/indexing-push-back-reader

jvm.tools.analyzer uses a vanilla pusback reader, which I don't think would get us the end-line and end-column metadata that the patch described above would use. Fortunately the analyze-ns function is quite small and easy to change for that purpose.

It is not clear at this point whether there is anything else in that project that we would use and would warrant patching the project rather than rolling our own wrapper for the cljs analyzer.

@julienfantin
Copy link
Contributor Author

ClojureScript patch submitted: https://dev.clojure.org/jira/browse/CLJS-2051

@benedekfazekas
Copy link
Member

awesome!

@arichiardi
Copy link
Contributor

also there are several self hosted cljs options now. they might worth a check.

I have to say that when I was wondering how to do this as well I basically ended up considering the above the easiest approach. In particular, Antonio Monteiro has already ported the ClojureScript analyzer to self-host (node) and this would be a very valuable library to spin off.

I might be the only one here, but I think, and I am really trying hard in orchard, that a better approach would be to have something that works with or without the JVM so that eventually it could be used for static analysis and therefore compete with Cursive in terms of functionality like jump to def and other things without waiting n seconds for a REPL. Purely personal battle I know 😄

Spinning off a new analyzer library has so many upsides that I am considering doing it myself...maybe merging code from tools.analyzer.js as well. However, who knows when I will have time 🕐

I have also looked into https://github.com/thunknyc/crntl and I had an old idea of using universal-ctags for static analysis -> thunknyc/crntl#9. All well and good but no POC so far 😞.

By the way this is a great discussion so thanks @julienfantin!

@kommen
Copy link

kommen commented Jan 5, 2019

The conforming work of the ClojureScript analyzer AST to the tools.analyzer spec which shipped as part of ClojureScript 1.10.439 doesn't help here?

https://dev.clojure.org/jira/browse/CLJS-1461
https://github.com/clojure/clojurescript/blob/master/changes.md#110439

@expez
Copy link
Member

expez commented Jan 5, 2019

The conforming work of the ClojureScript analyzer AST to the tools.analyzer spec which shipped as part of ClojureScript 1.10.439 doesn't help here?

Think that patch was news to all of us :) It should be largely plug and play, I think!

@benedekfazekas
Copy link
Member

yeah sounds like it. good shout @kommen. we discussed this with @vemv in the #cider room on clj slack.

@benedekfazekas
Copy link
Member

while playing with this, realised that https://dev.clojure.org/jira/browse/CLJS-2051 actually never got merged

@benedekfazekas
Copy link
Member

after working around the missing end-line, end-column information the next hurdle seems to be that the stages of macro expansion are not persisted in the cljs AST, there is no raw-forms tag. the code expects this tag to be present to be able to look at the original form in case macro expansion happened.

benedekfazekas added a commit that referenced this issue Apr 13, 2019
as outlined in #195, see specially
#195 (comment)

Take aways:
- one integration test is duplicated for cljs (together with the
source file)
- `cljs.analyzer` used directly instead of `jvm.tools.analyzer` --
latter errored with latest cljs, did not investigate further
- workarounds needed for
  - no `end-line`, `end-column` info in CLJS AST (also see
  https://dev.clojure.org/jira/browse/CLJS-2051)
  - no `:raw-forms` in cljs AST containing the stages of macro
  expansion including the original form
  - :op = `:binding` nodes in CLJS ASTs seems to be missing
  `:children` entry so the AST can not be walked properly
@benedekfazekas
Copy link
Member

see POC for find-locals support cljs on cljs-support branch. thoughts?

@arichiardi
Copy link
Contributor

I like the POC 😄! Also that piece of functionality could be a very good candidate for the orchard!

@bbatsov
Copy link
Member

bbatsov commented Apr 15, 2019

@arichiardi Although maybe we can achieve pretty much the same result using rewrite-clj and just ignoring some of the complexity that goes alongside tools.jvm.analyzer and ClojureScript's own analyzer. Looking at projects like clj-kondo it seems you can go pretty far with rewrite-clj and without having to evaluate any code to build the AST.

@benedekfazekas
Copy link
Member

Thanks @arichiardi

also recorded my finding here https://dev.clojure.org/jira/browse/CLJS-1461?focusedCommentId=51581#comment-51581

next step is doing a POC for the same find-locals using a rewrite-clj based not evaling parser

benedekfazekas added a commit that referenced this issue Apr 25, 2019
as outlined in #195, see specially
#195 (comment)

Take aways:
- one integration test is duplicated for cljs (together with the
source file)
- `cljs.analyzer` used directly instead of `jvm.tools.analyzer` --
latter errored with latest cljs, did not investigate further
- workarounds needed for
  - no `end-line`, `end-column` info in CLJS AST (also see
  https://dev.clojure.org/jira/browse/CLJS-2051)
  - no `:raw-forms` in cljs AST containing the stages of macro
  expansion including the original form
  - :op = `:binding` nodes in CLJS ASTs seems to be missing
  `:children` entry so the AST can not be walked properly
@benedekfazekas
Copy link
Member

and a PoC with a rewrite-clj based, not evaling analyzer here: https://github.com/benedekfazekas/trin/tree/master/trin

@vemv
Copy link
Member

vemv commented Jul 3, 2021

https://github.com/clojure-lsp/clojure-lsp/ became a thing in the meantime. Since very recently it features an API.

It works with clojurescript.

So I'd propose that refactor-nrepl is defined in terms of defprotocols, with typically two possible implementations: "traditional" (i.e. refactor-nrepl's current one) and LSP.

  • traditional for projects targeting the JVM
  • LSP for projects (also) targeting cljs, or for projects too hard/slow to analyze with t.ana
  • There could be extra options/heuristics for making the right choice on a per-project basis.

@expez , what would you think?

Personally I'd see it as one of the very few options ahead. As usual in OSS "who's doing the work" is an important factor: refactor-nrepl only has sporadic maintainance and tools.analyzer less so. So delegating the work seems sensible.

Protocols have a certain cleanliness to them anyway. e.g. what if the clojurescript compiler itself becomes a nice "ast backend" in a future :)


Final note:

a defprotocol allows us to even not bother with integrating LSP ourselves at all. It could be done in an external project gluing our protocol to an impl.

@bbatsov
Copy link
Member

bbatsov commented Jul 3, 2021 via email

@vemv
Copy link
Member

vemv commented Jul 3, 2021

Yes I'd implement the functionality purely in terms of protocols which would abstract over the choice of tools.analyzer, clojure-lsp or others.

Does that answer the q?

@bbatsov
Copy link
Member

bbatsov commented Jul 3, 2021 via email

@expez
Copy link
Member

expez commented Jul 4, 2021

I agree with you both :) That said I haven't done enough work to have an opinion about the best way to add support for cljs.

The alternative of using something like rewrite-clj to add this functionality doesn't excite me, though. The naive approach is already extremely well covered by everything in the wgrep family. Those tools work especially well for clojure(script) if you have some discipline around not using refer and not re-using aliases.

@vemv
Copy link
Member

vemv commented Jul 4, 2021

I meant I’d prefer if this functionality in Clojure LSP was detached from
LSP. Now the dep is going to be bigger than it could be. Not a big deal for
me, just a concern.

Personally I like that clojure-lsp with its new API is kind of "just a lib". The fact that it was originally meant for LSP is incidental.

If we were to write some sort of broader full-flown protocol compatible with all sort of LSP impls, wouldn't we be reinventing the LSProtocol?

I'd say there's already https://github.com/emacs-lsp/lsp-mode / etc for that

@vemv
Copy link
Member

vemv commented Jul 4, 2021

The alternative of using something like rewrite-clj to add this functionality doesn't excite me, though. The naive approach is already extremely well covered by everything in the wgrep family. Those tools work especially well for clojure(script) if you have some discipline around not using refer and not re-using aliases.

Yes I get what you say and share the sentiment to some degree. wgrep could be yet another impl.

refactor-nrepl could bundle whatever impls seem high-quality and still leave the door open for user-provided impls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants