Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Rust] Sync with Rust Enhanced #2305

Merged
merged 6 commits into from
Jan 8, 2022
Merged

Conversation

ehuss
Copy link
Contributor

@ehuss ehuss commented Mar 1, 2020

This is an attempt to sync the official rust-lang syntax definition from https://github.com/rust-lang/rust-enhanced/ with the changes made here in the sublimehq repository.

@jrappen
Copy link
Contributor

jrappen commented Apr 9, 2020

@ehuss FYI:

In case you have questions, I suggest joining the SublimeHQ Discord:

SublimeHQ Discord

Rust/Rust.sublime-syntax Outdated Show resolved Hide resolved
Rust/Rust.sublime-syntax Outdated Show resolved Hide resolved
Rust/Rust.sublime-syntax Outdated Show resolved Hide resolved
Rust/Rust.sublime-syntax Outdated Show resolved Hide resolved
Rust/Rust.sublime-syntax Outdated Show resolved Hide resolved
@wbond
Copy link
Member

wbond commented Jul 16, 2020

This is going to be a lot of work to accept since all of the tests and a significant amount of the syntax changed at once. From a review standpoint I've got to go through it like it is a ground-up rewrite.

In the future, if there are large changes, it is much easier to accept if tests are modified via:

  • additions
  • commits where tests must be changed with an explanation of why

@ehuss
Copy link
Contributor Author

ehuss commented Jul 16, 2020

I completely understand and recognize it will be difficult. There was unfortunately quite a lot of changes since things forked 3 or 4 years ago, and trying to preserve each change on top of the latest master would have also meant more or less rewriting everything. FWIW, there is a list of changes from sublimehq to rust-lang here. I did what I could to preserve the sublimehq changes and conventions when there were conflicts.

I also intend, if this is accepted, to upstream each change individually, to avoid big dumps of changes.

I can maybe try to write a summary of all the PRs that changed things on the rust-lang side, would that help?

@wbond
Copy link
Member

wbond commented Aug 3, 2020

I can maybe try to write a summary of all the PRs that changed things on the rust-lang side, would that help?

Not really. With the scope of the changes and the tests changing it will need to be a thorough, all-details review, including making sure that appropriate tokens are using the correct choices from https://www.sublimetext.com/docs/scope_naming.html.

@rwols
Copy link
Contributor

rwols commented Dec 4, 2020

@ehuss this syntax breaks the greedy punctuation.accessor trick: scope . and :: as punctuation.accessor as soon as possible in order to trigger auto-complete.

In fact, this syntax doesn't account for punctuation.accessor at all.

use std::io;
//     ^^ missing punctuation.accessor

fn main() {
    io.into();
    //^ missing punctuation.accessor
    println!("Hello, world!");
}

@rwols
Copy link
Contributor

rwols commented Dec 4, 2020

Note that the current syntax in master does apply this idea correctly.

@FichteFoll
Copy link
Collaborator

The snytax also doesn't allow using union as an identifier and highlights it as storage.type.union everywhere.

The following is legal code (example for HashSet::union):

use std::collections::HashSet;
let a: HashSet<_> = [1, 2, 3].iter().cloned().collect();
let b: HashSet<_> = [4, 2, 3, 4].iter().cloned().collect();

// Print 1, 2, 3, 4 in arbitrary order.
for x in a.union(&b) {
    println!("{}", x);
}

let union: HashSet<_> = a.union(&b).collect();
assert_eq!(union, [1, 2, 3, 4].iter().collect());

playground

@FichteFoll
Copy link
Collaborator

FichteFoll commented Dec 6, 2020

Otherwise, simply the very much more consistent highlighting of types makes this syntax basically a must have for me. There are still some edges (like those two issues reported above), but I believe it to be a solid base for further improvements. If an all-details review is required for merging it, then I can't promise when I would find the time to go through it, since it takes quite a bit of time to review 4k lines of a syntax definition, and a diff no less.

Instead, I would probably go through the definition as if it was brand new and annotate things that aren't aligned with the current guidelines. Unfortunately, our guidelines themselves are constantly moving since we haven't standardized quite a few aspects of syntaxes, notably storage, but I'd just treat them as-is and not comment on them unless they are obviously wrong.

@FichteFoll
Copy link
Collaborator

FichteFoll commented Dec 10, 2020

I forgot to mention that every function call is scoped support.function.rust, which is definitely against our standards. That should be variable.function.rust instead.

Also, every occurance of an Uppercase identifier is scoped as storage.type. We haven't decided on how to proceed with the different occurances of type names, unfortunately (#1842), but currently this seems unprecedented.

@ehuss
Copy link
Contributor Author

ehuss commented Dec 15, 2020

this syntax breaks the greedy punctuation.accessor trick:

@rwols Can you clarify what you are seeing? The examples you gave seem to set punctuation.accessor:

image

image

The snytax also doesn't allow using union as an identifier

It works correctly in some cases (such as fn union()), but not all. This is a known issue (rust-lang/rust-enhanced#399), and I think will be quite difficult to fix because it is context dependent, and the current syntax does not distinguish those contexts.

I forgot to mention that every function call is scoped support.function.rust, which is definitely against our standards. That should be variable.function.rust instead.

Can you clarify, support.function seems to be used for built-in and standard library functions in the other syntax definitions, is that the correct usage? That is, in Rust, should support.function be used by standard library functions? There aren't many free functions, so I'm not sure if it's worth having support.function at all. There are dozens (hundreds?) of common methods exposed by traits, and I doubt those should be scoped support.function, but maybe they should?

Also, every occurance of an Uppercase identifier is scoped as storage.type

In practice, those are always types. That will take some time to untangle, since I'm not really sure how types should be scoped in the different circumstances they are used. (This looks to be inherited from the tmLanguage definition from over 5 years ago.)

@deathaxe
Copy link
Collaborator

... built-in and standard library functions in the other syntax definitions, is that the correct usage?

That's correct in general, while the "standard library" part may be a beast for some syntaxes. There's no real rule what to include and what not here, I guess.

The difficulty of this rule maybe the difference of syntaxes. Some define real sets of builtin functions while others implement those in standard libraries only but interpret them as defacto builtins.

While fundamental functions of a core part of a standard library whould probably be good to be scoped support.function it maybe somewhat cumbersome to try to add hundreds of thouthands of functions of a huge standard library or even try to highlight methods of classes of a standard library this way.

We should find a sane balance here. When in doupt, a syntax should use variable.function to scope function call identifiers in general and support.function for most important core parts.

@rwols
Copy link
Contributor

rwols commented Dec 15, 2020

Can you clarify what you are seeing? The examples you gave seem to set punctuation.accessor:

You're correct. I don't know where I got my syntax from then.

Overall I think this is indeed definitely mergeable!

@FichteFoll
Copy link
Collaborator

and I think will be quite difficult to fix because it is context dependent, and the current syntax does not distinguish those contexts.

As long as it's not a regression, this is definitely fine with me and can be worked on later.

Can you clarify, support.function seems to be used for built-in and standard library functions in the other syntax definitions, is that the correct usage?

Generally, yes, but this syntax applies this scope to all functions, even those I defined myself or bogus names, which makes this separation inapplicable.

The problem with highlighting built-in functions in a language like Rust that heavily uses method chaining is that a static syntax highlighter simply cannot know which function is being executed, since it has no information on the underlying type (or trait). Now, it could try to only match function names that can be found in std, but that will inevitably lead to false positives so mach that I'd rather not have built-in highlighting at all.

You can do this for Languages like Python or Php, which have a set of default global functions, but for Rust (or Java, for comparison), I simply believe this to not be applicable. (For Java, you could argue for Object methods, since all objects will have those, but then you might as well not bother.)

@ehuss
Copy link
Contributor Author

ehuss commented Dec 29, 2020

I removed the use of support.function, that seemed clearly incorrect.

I also added support for const generics which will be on the stable channel in March.

I spent some time trying to better understand the concern about storage.type. I'm not sure if there is a clear alternative to that, as different languages seem to treat type constructors differently. For example, in C#, new MyStruct {} scopes MyStruct as storage.type.cs, which is equivalent in Rust to MyStruct {}. It isn't clear to me how to narrow the usage of storage.type, since there are various things that are similar, like MyStruct() or SomeEnumVariant or even a bare SomeTypeConstructor. I'd be fine with trying to figure out something better, but it is not clear to me from the existing docs and syntaxes what would be better.

@FichteFoll
Copy link
Collaborator

I'd be fine with trying to figure out something better, but it is not clear to me from the existing docs and syntaxes what would be better.

You're not alone with this. I say, until we find a common ground for this topic, just leave it as-is. (As I mentioned before.)

@deathaxe
Copy link
Collaborator

deathaxe commented Jan 5, 2021

I spent some time trying to better understand the concern about storage.type.

I think that's the very appropriate scope for all kinds of user defined data types. Nothing to be concerned about.

Many syntaxes used to scope class names support.type or support.class. One major reason might have been distincting them from keywords like struct, enum, class etc., which used storage.type.

This conflict has been resolved by scoping those keywords keyword.declaration rather than storage.type. Keywords like fun, trait, etc. may be renamed this way to comply with the most recent developments, but otherwise using storage.type for all user defined classes/types is ok. Java and Haskell also follow that scheme of C# by doing so. Others will follow.

scope: support.macro.rust

- match: '{{support_type}}'
scope: support.type.rust
- include: support-type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This context is already included in line 135. Is there a reason for it being here again?

pop: true
- match: '(crate|in|self|super)'
scope: keyword.other.rust
- match: '::'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good candidate for punctuation.accessor.

- match: '::'
scope: meta.path.rust
- match: '{{identifier}}'
scope: meta.path.rust
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would variable.namespace or variable.other be a good scope here?


attribute-call:
- match: \)
scope: meta.function-call.rust meta.group.rust punctuation.section.group.end.rust
pop: true
- match: '({{identifier}})(\()'
- match: '({{identifier}})\s*(\()'
scope: meta.function-call.rust
captures:
1: variable.function.rust
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is meta.group probably missing in line 292? I'd expect meta.group to cover the whole function arguments list.

- match: '\]'
scope: punctuation.section.group.end.rust
pop: true
- match: '[ \t]+'
- include: strings
- include: comments

attribute-call:
- match: \)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attribute-call is included in line 264, which already defines a pattern for \) to pop a context. This makes this one being a duplicate then. Would this pattern probably be better placed before the -include: atttribute-call in row 293?

@jwortmann
Copy link
Contributor

... storage.type for all user defined classes/types is ok. Java and Haskell also follow that scheme of C# by doing so.

Java and C# seem to use support.class and support.type currently.

I was under the impression that storage.type is meant to be used for a closed set of intrinsic/primitive types, to distinguish them from user-defined types. From a quick look at some of the syntaxes in this repo, storage.type is used for intrinsic types in C#, C++, Java, while support.type or support.class being used for user-defined classes in C#, Java and JavaScript. Python on the other hand uses support.type for type annotations with built-in types and meta.generic-name for user-defined classes. As far as I can tell, Go seems to use storage.type for all types.

So overall there doesn't seem to be a common convention which scope name to use for (user-defined) types.

But for example the Mariana color scheme uses the same color for keyword and for storage.type, so it might not be the best choice to use this scope for all kind of user-defined types and classes. I already commented on this recently in #2643 (comment).
(Of course everyone can choose a different color scheme as they like, but since Mariana is (one of) the default color scheme(s) and probably widely used, it should at least be considered in this decision. And I would guess that various 3rd-party color schemes apply a similar style as well, because as already mentioned, storage.type was often used for declaration keywords before.)
Actually, I neither think that support.* really fits for user-defined types, but there doesn't seem to be a good alternative yet.

@deathaxe
Copy link
Collaborator

deathaxe commented Jan 6, 2021

And support.type has the same color as variable.function, which is not much better.

This is what it looks like in rewritten Java with classes using storage.type.

grafik

I don't find it too bad, honestly.

@jwortmann
Copy link
Contributor

Definitely doesn't look bad.

But what about this:

img1

Wouldn't it be nicer if it looks like this:

spoiler <- click me

img2

So if e.g. built-in classes get storage.type and user-defined classes get support.class, you can distinguish between them, for example to easily spot typos like above. In general it's a nice typing experience if the highlighting color for a type/class or built-in function changes after typing its last letter. This isn't possible if only a single storage.type scope is used for all kind of types and classes, including user-defined ones. And that way you can't even visually recognize primitive types as "type keywords" (for example int, boolean, etc. are reserved words in Java).

But I must mention that the current Java syntax doesn't distinguish between built-in and user-defined classes either. Some syntaxes like PHP seem to maintain huge lists of built-in classes, functions and constants. Maybe it's infeasible in Java due to the large amount of built-in classes in the Java core libraries?

JavaScript uses support.class.builtin.js for a handful of predefined classes and only support.class.js for unrecognized class names. Perhaps that's an even better approach, i.e. keep storage.type for actual reserved words?

@deathaxe
Copy link
Collaborator

deathaxe commented Jan 7, 2021

I don't want to hijack this PR for general discussions, but...

I agree with you in general and follow/second your intention. I just don't get why we want to use support.function for builtin or library functions with the argument of support being for such kinds of things and do the opposite for types. If support is meant for builtins and library stuff, it should so for all kinds of entities (functions and types). When scoping library classes support.class or support.type and user defined ones storage.type you can find spellings such as your Strign as well.

Type keywords may still be distinguished by a 3rd level scope, even though they look the same as user defined functions by default then. But I don't find that too bad, as it is just a data type as well. Of course, it would make sense then to put those under support.type as well, if we want to be consequent. But as scoping library stuff is somewhat optional and not even possible with sane efforts for some languages, I don't find it too bad to just say I have a storage.type.[primitive|class|struct|enum] etc.

@jrappen
Copy link
Contributor

jrappen commented Aug 18, 2021

@ehuss As things are moving again in this repo, maybe you could:

  • re-sync with rust-enhanced, this should also re-run tests with the Build 4112 test binary
  • provide some numbers regarding performance changes using Tools > Build with ... > Syntax Tests - Performance?

support.function is intended only for functions that are provided by the
standard library. Rust doesn't really have very many free functions
(`drop` is about the only one in the prelude). variable.function is
supposed to be used for function calls.
@ehuss ehuss force-pushed the sync-rust-enhanced branch from 21c26ba to 0ea7b78 Compare September 26, 2021 16:20
@ehuss
Copy link
Contributor Author

ehuss commented Sep 26, 2021

Sorry about the delay. I rebased, but it looks like CI requires approval.

I combined all the tests into one file (3046 lines) and ran the performance tests on my machine. There was a bit of variability, but in general the old syntax was about 13.5ms and the new one was about 15ms.

@jrappen
Copy link
Contributor

jrappen commented Sep 29, 2021

First of all, thanks for this.


After a quick glance at this, a few remarks:

  • Snippets have not been updated for extended functionality in newer Builds.
  • Comments from deathaxe's review from January have not been addressed, yet.
  • This rewrite of the syntax does not include supporting any new functionality from the 4xxx Builds of Sublime Text at all.
  • This rewrite ignores some new syntax definition guidelines discussed in RFC discussions like meta.* scopes.

Is this meant to be a first sync with the 3rd party package to be then worked on in a second step? If yes, this seems fine.

Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using this syntax definition yet another time for my advent of code solutions in Rust and it did a good job, definitely better than the current built-in one. There are a few minor scope inconsistencies, but those are easily fixed in follow-ups and I have made some notes for them that I intend to address myself later, including the remaining open discussions by @deathaxe.

I didn't do a comparison with RustEnhanced in its current state. I believe the PR to be mergable as-is, because it is definitely an improvement over the current default syntax and it's kinda shameful it took this long. Any further issues, however small they are, can be fixed later.

@deathaxe deathaxe merged commit 79b81b5 into sublimehq:master Jan 8, 2022
@deathaxe
Copy link
Collaborator

deathaxe commented Jan 8, 2022

So let's go ahead.

varungandhi-src added a commit to sourcegraph/Packages that referenced this pull request Jan 14, 2022
Additions:
- Added documentation on making changes for JS/JSX/TS/TSX,
  along with a helper shell script.

Conflicts fixed:
- README.md
  - slimsag/master had a bunch of additional information on how to
    add new definitions, information on licensing etc.
  - In upstream, there is a short description on how to submit PRs.
  - Fix: I moved all the additional information into MakingChanges.md,
    which is local to this fork. This prevents future merge conflicts
    from popping up in the README.
- Rust.sublime-syntax:
  - slimsag/master had moved the file to RustEnhanced.sublime-syntax
    in 789a76e due to problems with the definitions upstream,
    and made some minor tweaks.
  - In upstream, there was a recent fix (sublimehq#2305)
    for improving Rust highlighting.
  - Fix: We delete RustEnhanced.sublime-syntax and use Rust.sublime-syntax
    from upstream.
- Clojure.sublime-syntax:
  - slimsag/master use Clojure highlighting for .cljs and .cljx files.
  - Upstream did not use cljs and cljx for Clojure highlighting.
    Instead, it supported .cljs via a separate ClojureScript syntax,
    which itself is based on Clojure syntax.
  - The common ancestor did use Clojure highlighting for .cljs files.
    This was removed in cc5ce9b.
  - Fix: I added cljx to ClojureScript.sublime-syntax and removed it
    from Clojure.sublime-syntax.
- JavaDoc.sublime-syntax:
  - slimsag/master largely matched with the common ancestor, whereas
    upstream had made some changes. I picked most of the changes
    from usptream.
  - There is a regex for 'escape:' which was changed:
    - common ancestor: \*/
    - slimsag/master: \s*\*/
    - upstream: \*+/
    - merged (maximally lenient): \s*\*+/
- JavaScript related changes:
  - slimsag/master was based on a generated version of
    Thom1729/Sublime-JS-Custom.
  - Upstream had updated to a version based off Thom1729/Sublime-JS-Custom
    (see 40ec1f2), but it differs significantly from the
    generated syntax.
  - Additionally there are filename differences (the old syntax
    definitions in slimsag/master used "React" in the name).
  - Fix: I regenerated the files from Thom1729/Sublime-JS-Custom,
    and used them to replace the existing files for JSX and TSX.
    - Additionally, I removed the 'js' file extension from the
      JSX sublime-syntax file.
  - This change required some changes to the tests as well.
    However, all changes to tests involved erroneous code,
    so the change in highlighting is not significantly problematic.
mitranim pushed a commit to mitranim/Packages that referenced this pull request Mar 25, 2022
This commit syncs the official rust-lang syntax definition from https://github.com/rust-lang/rust-enhanced/ with the changes made here in the sublimehq repository.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants