feat: Typst Language Support #302

grantlemons · 2024-12-04T18:12:02Z

Re-opening of #289 from my fork instead of a branch

grantlemons · 2024-12-18T17:36:20Z

Mostly just needs more tests. I'm no expert in Typst syntax, so if someone wants to help it would be much appreciated.

harper-core/src/parsers/typst.rs

Andrew15-5 · 2024-12-18T18:35:42Z

If I can understand the core concepts mentioned above (what should/could be checked), then I may be able to add some tests.

harper-core/src/parsers/typst.rs

harper-core/tests/run_tests.rs

grantlemons · 2025-01-04T20:48:11Z

I decided to move typst parsing into its own crate, since it simplifies the feature dependency and establishes a good precedent that can be followed for unofficial/external parsers in the future.

elijah-potter

There's some stuff I didn't see last time. Sorry again, but I just need some clarification.

harper-cli/Cargo.toml

harper-cli/src/main.rs

harper-ls/Cargo.toml

harper-ls/src/backend.rs

harper-typst/src/lib.rs

elijah-potter · 2025-01-13T16:46:18Z

@grantlemons, don't forget to make the necessary changes to nvim-lspconfig and the Visual Studio Code plugin as well as the documentation.

You may open another PR to do those, or push to master. Up to you.

Andrew15-5

Eyyy! It was already merged, kewl. I guess I'm late to the party, but I still have a few things I want to discuss about the final patch.

P.S. I was really busy and only now found time to look into this.

Andrew15-5 · 2025-01-15T22:02:26Z

harper-typst/src/lib.rs

+    #[test]
+    fn contraction() {
+        let document = Document::new_curated("doesn't", &Typst);
+        let token_kinds = document.tokens().map(|t| t.kind).collect_vec();
+        dbg!(&token_kinds);
+
+        assert_eq!(token_kinds.len(), 1);
+        assert!(!token_kinds.into_iter().any(|t| {
+            matches!(
+                t,
+                TokenKind::Word(WordMetadata {
+                    noun: Some(NounData {
+                        is_possessive: Some(true),
+                        ..
+                    }),
+                    ..
+                })
+            )
+        }))
+    }


Uhh, why is a contraction identified as a "possessive"? To me, it looks obviously wrong.

You're right, that must have happened when I was removing the contraction logic from the typst parser itself. I'm actually going to remove the contraction tests entirely, since that's document logic.

Andrew15-5 · 2025-01-15T22:06:40Z

harper-typst/src/lib.rs

+    #[test]
+    fn non_adjacent_spaces_not_condensed() {
+        let source = r#"#authors_slice.join(", ", last: ", and ")  bob"#;
+
+        let document = Document::new_curated(source, &Typst);
+        let token_kinds = document.tokens().map(|t| t.kind).collect_vec();
+        dbg!(&token_kinds);
+
+        assert!(matches!(
+            &token_kinds.as_slice(),
+            &[
+                TokenKind::Unlintable, // authors_slice.join
+                TokenKind::Punctuation(Punctuation::Comma),
+                TokenKind::Space(1),
+                TokenKind::Unlintable, // Ident
+                TokenKind::Punctuation(Punctuation::Comma),
+                TokenKind::Space(1),
+                TokenKind::Word(_), // and
+                TokenKind::Space(1),
+                TokenKind::Space(2),
+                TokenKind::Word(_),
+            ]
+        ))
+    }


So how would Harper work in this case? Will it complain that there are 2 consecutive commas in the "sentence"? (And also that there are not 2 but 3 consecutive spaces, even though the first of the 3 spaces isn't directly next to the last 2?) Or do Unlintable tokens split lintable tokens into isolated sentences/blocks?

, , and bob

And also, are 2 spaces intentional?

P.S. the // bob comment is missing.

Yes, the unlintable tokens split it up because our matching patterns don't expect them.

The two spaces are intentional; they are included because of some changes I was making to how Space spans are condensed. Really, I don't think it's relevant to typst though, so I'm going to remove it from the test.
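To make the splitting behaviour described in this reply concrete, here is a minimal sketch. It uses a simplified, hypothetical `Kind` enum and two hypothetical helpers (`lintable_runs`, `has_repeated_comma`); none of these are Harper's real types or rules, and Harper's actual pattern matching may work differently. It only illustrates why a toy "repeated comma" rule would not fire across an Unlintable token.

```rust
/// Simplified stand-in for Harper's token kinds (assumption: NOT the real
/// `harper_core::TokenKind`, just enough variants for this illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Word,
    Comma,
    Space(usize),
    Unlintable,
}

/// Split a token stream into runs that contain no `Unlintable` token.
/// Empty runs (for example before a leading `Unlintable`) are dropped.
fn lintable_runs(tokens: &[Kind]) -> Vec<&[Kind]> {
    tokens
        .split(|k| *k == Kind::Unlintable)
        .filter(|run| !run.is_empty())
        .collect()
}

/// Toy rule: flag a run when two commas appear with only spaces between them.
fn has_repeated_comma(run: &[Kind]) -> bool {
    let mut last_was_comma = false;
    for kind in run {
        match kind {
            Kind::Comma if last_was_comma => return true,
            Kind::Comma => last_was_comma = true,
            // Spaces do not separate the two commas for this toy rule.
            Kind::Space(_) => {}
            // Any other token resets the state.
            _ => last_was_comma = false,
        }
    }
    false
}
```

With the token sequence from the test above, `lintable_runs` yields two runs (`, ` and `, and   bob`), and neither contains a repeated comma. Only if the Unlintable tokens were dropped instead of treated as separators would the toy rule see `, ,` and fire.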

Andrew15-5 · 2025-01-15T22:15:07Z

harper-typst/src/lib.rs

+    #[test]
+    fn header_parsing() {
+        let source = r"= Header
+                       Paragraph";
+
+        let document = Document::new_curated(source, &Typst);
+        let token_kinds = document.tokens().map(|t| t.kind).collect_vec();
+        dbg!(&token_kinds);
+
+        let charslice = source.chars().collect_vec();
+        let tokens = document.tokens().collect_vec();
+        assert_eq!(tokens[0].span.get_content_string(&charslice), "Header");
+        assert_eq!(tokens[2].span.get_content_string(&charslice), "Paragraph");
+
+        assert!(matches!(
+            &token_kinds.as_slice(),
+            &[
+                TokenKind::Word(_),
+                TokenKind::Newline(1),
+                TokenKind::Word(_)
+            ]
+        ))
+    }


2 concerns:

Are all the leading spaces in the "= Header\n Paragraph" string ignored by Harper?

The LF between two text lines is treated as a space (Space: "\n") in Typst's AST and is rendered as whitespace, so that you can wrap long paragraphs at column 80. But in this case, the first line is a heading, which means that there is heading() + par(), and Newline(1) seems appropriate in this specific use case. So will it still emit Newline for a wrapped long paragraph? And most importantly, what exactly does Newline mean to Harper?

> Are all the leading spaces in the "= Header\n Paragraph" string ignored by Harper?

I believe leading spaces are ignored, yes. @elijah-potter correct me if I'm wrong.

> what exactly does Newline mean to Harper?

TokenKind::Newline(count) represents count \n newline characters.

I believe they act to break up sequences of tokens between lines. In this case, for instance, you don't want Harper to consider the text to be Header Paragraph, you want it to be Header and Paragraph separately.
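As a rough illustration of that last point, the sketch below groups words into chunks that end at each Newline token. The `Tok` enum and `chunks_between_newlines` are hypothetical names invented for this example; they are not part of `harper_core`, and this is not a claim about how Harper's linters actually consume Newline tokens.

```rust
/// Simplified stand-in for Harper's tokens (assumption: the real types live
/// in `harper_core` and carry spans and metadata rather than owned strings).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Tok {
    Word(String),
    Space(usize),
    Newline(usize),
}

/// Group words into chunks, starting a new chunk at every `Newline` token.
/// Chunks that end up empty (e.g. consecutive newlines) are discarded.
fn chunks_between_newlines(tokens: &[Tok]) -> Vec<Vec<&str>> {
    let mut chunks = vec![Vec::new()];
    for tok in tokens {
        match tok {
            // `chunks` always holds at least one element, so `unwrap` is safe.
            Tok::Word(w) => chunks.last_mut().unwrap().push(w.as_str()),
            Tok::Space(_) => {}
            Tok::Newline(_) => chunks.push(Vec::new()),
        }
    }
    chunks.retain(|c| !c.is_empty());
    chunks
}
```

For the tokens `Word("Header"), Newline(1), Word("Paragraph")` this returns two chunks, `["Header"]` and `["Paragraph"]`, rather than a single `["Header", "Paragraph"]`.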

Andrew15-5 · 2025-01-15T22:19:07Z

harper-typst/src/lib.rs

+    #[test]
+    fn label_unlintable() {
+        let source = r"= Header
+                       <label>
+                       Paragraph";
+
+        let document = Document::new_curated(source, &Typst);
+        let token_kinds = document.tokens().map(|t| t.kind).collect_vec();
+        dbg!(&token_kinds);
+
+        assert!(matches!(
+            &token_kinds.as_slice(),
+            &[
+                TokenKind::Word(_),
+                TokenKind::Newline(1),
+                TokenKind::Unlintable,
+                TokenKind::Newline(1),
+                TokenKind::Word(_),
+            ]
+        ))
+    }


Similar to above, the LF after the label won't add any new line to the output, as labels by themselves aren't visible (usually they are attached to some element in markup mode). So this again raises the concern of whether the Newline token is really the hard newline I'm thinking of (one that breaks text into separate paragraphs or something similar).

Andrew15-5 · 2025-01-15T22:22:07Z

harper-typst/src/lib.rs

+        let source = r#"group’s
+writing"#;


Why is this written differently from the previous examples? Why not write:

let source = r#"
    group's
    writing
"#;

There is the dedent crate, which can prettify this into "group's\nwriting", but that's a new dependency.

This is because this test exists to test the offsetting functionality, converting between byte spans and char spans. I wanted to keep the string as similar as possible to the string that had issues, hence why I didn't change the formatting at all.
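For readers unfamiliar with the byte-versus-char distinction mentioned here, the following is a minimal sketch of such a conversion. The function name `byte_range_to_char_range` is hypothetical and is not taken from harper-typst; the actual offsetting code in the parser may be structured differently. The typographic apostrophe `’` (U+2019) occupies three bytes in UTF-8 but counts as one `char`, which is why the two kinds of offsets diverge after it.

```rust
use std::ops::Range;

/// Convert a byte range in `source` into the matching range of `char` indices.
/// Returns `None` if the range is reversed or if either end does not fall on
/// a UTF-8 character boundary (which also covers out-of-bounds offsets).
fn byte_range_to_char_range(source: &str, bytes: Range<usize>) -> Option<Range<usize>> {
    if bytes.start > bytes.end
        || !source.is_char_boundary(bytes.start)
        || !source.is_char_boundary(bytes.end)
    {
        return None;
    }
    // Count the chars before the span, then the chars inside it.
    let start = source[..bytes.start].chars().count();
    let len = source[bytes.start..bytes.end].chars().count();
    Some(start..start + len)
}
```

For the test string `group’s\nwriting`, the word `writing` sits at bytes 10..17 but at chars 8..15, because the apostrophe contributes three bytes and only one char.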

Andrew15-5 · 2025-01-15T22:44:29Z

harper-typst/tests/test_sources/complex_typst.typ

I'm not sure whether an overly complex example is the intent here. If it isn't, I would like to simplify it so that it both follows best practices and is more practical. The way the sugar syntax for emph is used is really cursed and isn't the intended way. I would also fix some bugs that went uncaught because the rendered output was never inspected.

Simplified example

#let template(
  title: "Default Title",
  authors: ("Author 1", "Author 2"),
  abstract: [*This is content*],
  body,
) = {
  set par(justify: true)
  set page(
    paper: "us-letter",
    columns: 2,
    number-align: top,
    numbering: (..n) => if n.pos().first() > 1 {
      n.pos().map(str).join(" of ") + h(1fr) + title
    },
  )

  place(
    top + center,
    float: true,
    scope: "parent",
    clearance: 2em,
  )[
    #show heading: set text(17pt)

    = #title

    #let authors-line = if authors.len() > 3 {
      // "et al." isn't parsed properly, but this isn't the fault of the Typst
      // parser.
      // authors-max3.push("et al.")
      authors => authors.join(", ")
    } else {
      authors => authors.join(", ", last: ", and ")
    }
    #emph(authors-line(authors.slice(0, calc.min(authors.len(), 3))))

    #par(justify: false)[
      *Abstract* \
      #abstract
    ]
  ]

  body
}

#show: template.with(
  title: "A fluid dynamic model for glacier flow",
  authors: ("Grant Lemons", "John Doe", "Jane Doe"),
  abstract: lorem(80),
)

= Introduction
#lorem(300)

= Related Work
#lorem(200)

Yes, the intent was to make it rather complicated, to see if any issues would arise from nesting statements and such. If you'd like to add more examples that demonstrate a more idiomatic typst document, that would be fantastic.

elijah-potter · 2025-01-15T23:00:51Z

> Eyyy! It was already merged, kewl. I guess I'm late to the party, but I still have a few things I want to discuss about the final patch.

Sorry about that! I'm sure @grantlemons would be happy to open a new PR once he has time to look at your comments. Thanks for catching the stuff you've pointed out! I've been quite busy with web stuff, so I admittedly rushed this one.

I see that you're a significant contributor on Typst. I'm impressed. It would be cool to get a Harper plugin made for upstream.

Andrew15-5 · 2025-01-15T23:20:44Z

> Sorry about that!

If you thought it was a negative "Eyyy!", it wasn't; the hint is "kewl". So it's fine, no need to apologize. I also wouldn't want to delay the merge just because I'm busy and can't reply for a long time.

> I'm sure @grantlemons would be happy to open a new PR once he has time to look at your comments.

Actually, perhaps I can open the new PR. I'll add the Typst fixes, then anyone can request changes or open a PR into my PR, and I will accept it (provided we agree on the changes).

> Thanks for catching the stuff you've pointed out!

You're welcome! As a perfectionist, this is what I do, hehe. And I'm really interested in both Typst and a tool that can deliver the best spellcheck experience in my editor. I use Neovim btw.

> I see that you're a significant contributor on Typst. I'm impressed.

Thank you! 💚 Actually, I'm not really a significant contributor yet (most of the changes so far were docs changes), but I'm intending to fix that. I have some things that are piling up, so I just need to finish other stuff so I can focus on making Typst even better! 🚀

> It would be cool to get a Harper plugin made for upstream.

Huh? Plugin? What do you mean by that? Harper is already a "plugin" by itself that just needs to be added to an IDE. I'm open to collaboration.

P.S. The biggest blocker for me for a long time has been finishing the NixOS config so I can ditch Pop!_OS 22.04 (with an utterly broken GNOME, at times). So this is my top priority right now.

grantlemons · 2025-01-15T23:37:40Z

@Andrew15-5 I've created a new PR, #391. Let me know if you have any further suggestions. (Idiomatic tests would probably be best handled by you creating a new PR, though.)

This was referenced Dec 4, 2024

feat: Typst Language Support #289

Closed

feat: add typst to the list of languages harper supports neovim/nvim-lspconfig#3478

Closed

feat: Span visualization command #303

Merged

grantlemons force-pushed the typst-support branch from 9a7ae81 to 6b5d78a Compare December 4, 2024 20:19

grantlemons added 16 commits December 9, 2024 11:48

feat(Automattic#230): map basic typst expressions to tokens

bed51f7

feat(Automattic#230): change recursive shorthand from macro to function

7201575

feat(Automattic#230): flesh out more complicated typst syntax parsing

d880613

feat(Automattic#230): delegate typst files to parser in harper-cli and harper-ls

9376e71

fix(Automattic#230): fix offset update after delegating parser

749e6dd

fix(Automattic#230): ParBreak to ParBreak, not two Newlines

ba3c307

feat(Automattic#230): remove offset variable, and just use the start of an environment's span

c6a4d05

feat(Automattic#230): parse numbers properly and add test for numbers

0425110

feat(Automattic#230): consolidate words separated by apostrophes into possessives or conjunctions

1f43b27

fix(clippy): satisfy clippy

835c396

feat(Automattic#230): simplify possessive-conjunction logic and add respective tests

4dbc264

feat(Automattic#230): create additional parsers for complex dictionary parsing

f3eda92

feat(Automattic#230): add some tests for dictionary parsing, and improve dict parsing to fit better

24e0551

fix(Automattic#230): fix dict parsing by manually getting document content in span so quotes aren't escaped

c63d41a

fix(Automattic#230): remove debug print of typst ast in test

f57d6c2

style(Automattic#230): expand explainer on str parsing

550cf20

grantlemons force-pushed the typst-support branch from 6b5d78a to 550cf20 Compare December 9, 2024 18:49

Andrew15-5 reviewed Dec 18, 2024

harper-core/src/parsers/typst.rs (outdated)

feat(Automattic#230): remove quotes from Str parsing

7cd135f

Andrew15-5 reviewed Dec 20, 2024

harper-core/src/parsers/typst.rs (outdated)

grantlemons added 4 commits December 20, 2024 15:07

fix(Automattic#230): remove improper test case

54418ff

Merge remote-tracking branch 'upstream/master' into typst-support

935c85d

tests(Automattic#230): add test using unicode apostrophe

5336778

refactor(Automattic#230): simplify parsing by moving some helper functions inside function

60bb986

elijah-potter mentioned this pull request Jan 3, 2025

feat: more languages supported #79

Open

Merge branch 'master' into typst-support

32da8f8

mattfbacon reviewed Jan 4, 2025

harper-core/tests/run_tests.rs (outdated)

grantlemons added 4 commits January 4, 2025 10:48

fix: add newline to eof

df7652b

refactor: change harper_ls language_id if chain to match statement

6dcf841

refactor: move typst parsing to a new crate

1672de3

refactor: add many comments to typst parser

ada56a0

grantlemons requested a review from elijah-potter January 4, 2025 20:48

grantlemons added 4 commits January 6, 2025 12:15

refactor: undo changes to test runner so it is all markdown

d638a28

fix: add crate info to Cargo.toml

cfb468c

Merge remote-tracking branch 'upstream/master' into typst-support

0f55ab1

Merge remote-tracking branch 'upstream/master' into typst-support

fcd965b

- Bump version to v0.15.0

elijah-potter reviewed Jan 10, 2025

grantlemons added 4 commits January 10, 2025 10:57

fix(Automattic#230): remove feature flags

911bcdb

fix(Automattic#230): use document in tests to handle contractions

79e743e

Merge remote-tracking branch 'upstream/master' into typst-support

290a08a

fix(Automattic#230): remove pattern previously used for contractions

ec307ab

grantlemons requested a review from elijah-potter January 12, 2025 04:43

fix(core): delete erroneous file that was breaking the build

6a16626

elijah-potter approved these changes Jan 13, 2025

elijah-potter merged commit 6d8904c into Automattic:master Jan 13, 2025
17 checks passed

elijah-potter mentioned this pull request Jan 14, 2025

Harper doesn't work on text files? #149

Open

Andrew15-5 reviewed Jan 15, 2025

BrewTestBot mentioned this pull request Jan 15, 2025

harper 0.16.0 Homebrew/homebrew-core#204400

Merged

grantlemons mentioned this pull request Jan 15, 2025

Typst Test Fixes #391

Open
