Harper doesn't work on text files? #149

Open
sbromberger opened this issue Sep 7, 2024 · 15 comments

Comments

@sbromberger

I've got harper configured with helix, and it works really well for markdown files. However, it doesn't seem to want to work on text files (or any other non-markdown file I've tried). Is this an inherent limitation?

Helix allows specifying per-file-type language servers, so it would be really useful to be able to remove this limitation from the lsp server (if it indeed exists).

@elijah-potter
Collaborator

I've never used Helix, so I really can't speak to how to set it up with Harper. I know there are others who use Harper with Helix who can perhaps help.

However, Harper supports a number of programming languages and works correctly in both Visual Studio Code and Neovim. If you haven't already, try opening up a JavaScript buffer.

@sbromberger
Author

sbromberger commented Sep 8, 2024

So, it works very well with Helix for markdown. The specific Helix configuration isn't an issue; I know how to get the language servers to run in Helix.

But even if we configure Helix to use harper-ls for .txt files, the language server doesn't detect misspellings or otherwise respond (I can confirm it launched, though). Is this intentional / a restriction that's built into the language server? Should harper-ls work on text files?

@sbromberger changed the title from "Harper only works on markdown files?" to "Harper doesn't work on text files?" on Sep 8, 2024
@anarcat

anarcat commented Sep 8, 2024

I am having similar issues making the LS work in emacs, see #150

In my case, it actually fails with an error, but from a primitive, user perspective, it doesn't find any errors either. :)

@sbromberger
Author

sbromberger commented Sep 9, 2024

Just an idea: why limit harper based on filetype at all? It probably should be up to the user to determine what kind of files to use - that is, if I have a typst, or LaTeX, or .myfunkymarkdown file, harper shouldn't really care, right? If I try to use harper on a binary file, well, that's on me.

This would allow users to use harper for things like git commits, emails written in AERC or himalaya, scratch notes with weird extensions, etc.

Alternately, if you MUST check file type (to do per-language comment checking, I guess?), allow a default / catchall.

Bottom line: this is a brilliant piece of code and I want to use it everywhere I can.

@masriomarm

@elijah-potter it does normally run with Helix on multiple programming languages, Rust at least.
It also works well with Markdown.
But it doesn't seem to respond to normal text files or git commit messages.

Is this intentional behavior?
Thanks in advance.

@anarcat

anarcat commented Sep 9, 2024 via email

@elijah-potter
Collaborator

Internally, Harper has a number of parsers whose principal job is to convert an arbitrary character sequence into a Document.

There are tree-sitter-based parsers for grabbing the comments out of programming languages and wrappers around other things like pulldown-cmark. Most of these parsers, however, wrap around the single PlainEnglish parser.

Input to the PlainEnglish parser is assumed to be just that: plain English. Any tokens relevant to a markup language (like Typst) will be included in the linted Document, leading to spurious errors. Harper is not a simple regex engine, so shoving arbitrary text through it isn't a solution.

What we could do in harper-ls (and what it sounds like you are suggesting) is this: in cases where Harper cannot find a parser for the provided buffer's language, we simply fall back to the PlainEnglish parser.

Does that sound correct?
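
For illustration, the fallback proposed above might look roughly like the sketch below. Every name in it (`ParserKind`, `parser_for`, `select_parser`, the `fallback_to_plain` switch) is hypothetical and not taken from the Harper codebase; the only language ID drawn from this thread is "mail".

```rust
// A minimal sketch with hypothetical names; this is not Harper's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ParserKind {
    Markdown,
    Comments, // stands in for the tree-sitter-based comment parsers
    PlainEnglish,
}

/// Map an editor-reported language ID to a parser, if one is known.
fn parser_for(language_id: &str) -> Option<ParserKind> {
    match language_id {
        "markdown" => Some(ParserKind::Markdown),
        "rust" | "javascript" => Some(ParserKind::Comments),
        "mail" => Some(ParserKind::PlainEnglish),
        _ => None,
    }
}

/// Pick a parser. Unknown languages fall back to PlainEnglish only when
/// `fallback_to_plain` is set; otherwise they are left unchecked.
fn select_parser(language_id: &str, fallback_to_plain: bool) -> Option<ParserKind> {
    parser_for(language_id).or(fallback_to_plain.then_some(ParserKind::PlainEnglish))
}

fn main() {
    for id in ["markdown", "mail", "eml"] {
        println!("{id}: {:?}", select_parser(id, true));
    }
}
```

Returning `None` when the fallback is disabled keeps today's behavior for unknown files, so the fallback could be opt-in.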

Side Note:

This would allow users to use harper for things like emails written in AERC or himalaya

You should already be able to use Harper in these situations, as per #92

@sbromberger
Author

sbromberger commented Sep 10, 2024

@elijah-potter - Thank you for the great explanation of how harper works behind the scenes.

What we could do in harper-ls (and what it sounds like you are suggesting) is this: in cases where Harper cannot find a parser for the provided buffer's language, we simply fall back to the PlainEnglish parser.

Yes - or, more precisely, allow this to be an option so that folks who don't want harper to do anything for unknown files can continue to have the same behavior they have now, but those of us who want plain text to be the default can do so.

For helix, this is a really good proposal, since we can choose to start specific language servers for specific languages detected by helix - so I can add harper-ls to my "plain text" language detection section in my Helix config, and it will only launch when Helix detects plain text.

Hopefully that's not too confusing. Thank you for your consideration of this idea.

EDIT:

This would allow users to use harper for things like emails written in AERC or himalaya

You should already be able to use Harper in these situations, as per #92

It doesn't work with himalaya: for example, himalaya message write opens up $EDITOR with a file with an eml extension but harper-ls doesn't check it (even though I can confirm it's launched, since I configured Helix to launch harper-ls whenever a .eml file extension is opened).

@sbromberger
Author

Another option I just thought of - allow users to override the "detected file" via command-line option: that way I can start up something like harper-ls --stdio --parser=PlainEnglish for the specific file types I have.
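
To make the idea concrete, here is a rough sketch of how such an override could be read and applied. The `--parser=` flag is only the proposal from this comment and is not an existing harper-ls option as far as this thread shows; `parser_override` and `effective_language` are made-up helper names.

```rust
/// Extract the value of a hypothetical `--parser=<name>` flag, if present.
fn parser_override<I, S>(args: I) -> Option<String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    args.into_iter()
        .find_map(|arg| arg.as_ref().strip_prefix("--parser=").map(str::to_owned))
}

/// The override, when given, wins over the language ID reported by the editor.
fn effective_language(override_name: Option<&str>, language_id: &str) -> String {
    override_name.unwrap_or(language_id).to_owned()
}

fn main() {
    // Skip the program name, then look for the hypothetical flag.
    let chosen = parser_override(std::env::args().skip(1));
    println!("{}", effective_language(chosen.as_deref(), "eml"));
}
```

With this shape, the editor config decides per file type which parser name to pass, and the server never has to guess.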

@lukasmwerner
Contributor

I will say I've been using aerc-mail with neovim + harper and I haven't encountered any issues so far. Granted, my implementation in Harper for email files is a bit naive, but it "works" well enough. What I really need to do is sit down and read how the email spec works to find a way to ignore quoted replies.

@bcspragu

So I see #93 makes files with a language_id of 'mail' use the PlainEnglish parser, which is great, but similar to @sbromberger, I'm not seeing harper-ls offer any suggestions on .eml files in aerc, even though I know it's configured correctly (in Helix) because it works for Markdown and I see it start up when the .eml file opens.

I tried to figure out where the 'mail' string comes from, but didn't see it in the LSP spec, or tower-lsp, or in the Harper codebase, though I'm sure it's around here somewhere 🤷

What I really need to do is sit down and read how the email spec works to find a way to ignore quoted replies

Another option, which perhaps requires less reading, is to make a parser based on PlainEnglish that just ignores lines starting with >
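
A rough sketch of that pre-filter, assuming a line counts as quoted whenever it starts with `>`. The function name `unquoted_lines` is made up for this example, and nothing here reflects how Harper's email handling is actually implemented.

```rust
/// Return (zero-based line index, text) pairs for lines that are not quoted.
/// Keeping the original index lets a caller map lint results back to the
/// right line in the buffer.
fn unquoted_lines(body: &str) -> Vec<(usize, &str)> {
    body.lines()
        .enumerate()
        .filter(|(_, line)| !line.starts_with('>'))
        .collect()
}

fn main() {
    let body = "Hi Bob,\n> an older mesage\nThanks for the note.\n>> even older\n";
    for (index, line) in unquoted_lines(body) {
        println!("{index}: {line}");
    }
}
```

Only the surviving lines would then be handed to the plain-English parser. The filter cannot tell a quoted reply from a block quote the author typed deliberately, since both start with `>`.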

@lukasmwerner
Contributor

I got the language ID from Neovim: when I open aerc and compose an email, it shows mail in my Neovim status bar.

The same goes for :LspInfo when I'm composing.

I also considered using a custom parser that ignores lines starting with >. However, I also get and write emails that use that syntax for block quotes, so it feels like a bit of a compromise.

@bcspragu

Ah, that was the piece I was missing: it's the editor that sets this LSP metadata! From there, I found that Helix supports a language-id parameter when configuring a language, so my languages.toml includes:

[[language]]
name = "eml"
language-id = "mail" # this is what I needed to add
scope = "source.eml"
roots = []
file-types = ["eml"]
injection-regex = "eml"
text-width = 80
soft-wrap = { enable = true, wrap-at-text-width = true }
language-servers = [ "harper-ls" ]

This doesn't answer the larger "arbitrary text file" discussion of course, but solved the email case for me, thanks!

@pinpox

pinpox commented Oct 17, 2024

I'm trying to figure out how to run harper on typst files, a markup-based typesetting language, similar to LaTeX. Is there any way to get it working? I'm using neovim and the configured language server works for other (code) files, but not in typst.

@elijah-potter
Collaborator

I'm trying to figure out how to run harper on typst files, a markup-based typesetting language, similar to LaTeX. Is there any way to get it working? I'm using neovim and the configured language server works for other (code) files, but not in typst.

Sorry to get back to you so late on this. We've just merged Typst support in #302, so it should be out in the next release.

If the plaintext config option is still important to you all, I can get it pushed out pretty easily.
