Add lint for broken doc links #13696

maxcnunes · 2024-11-16T16:23:31Z

changelog: [doc_broken_link]: Add pedantic lint to catch broken doc links that won't produce a link tag by rustdoc.

rustbot · 2024-11-16T16:23:36Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Jarcho (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, to keep review lag to a minimum, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review or S-waiting-on-author) stays updated, invoking these commands when appropriate:

@rustbot author: the review is finished, PR author should check the comments and take action accordingly
@rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

maxcnunes · 2024-11-16T16:58:14Z

Just noticed there are tests failing due to a false positive like this:

/// Referencing an slice [T]

This gets flagged as a broken link, although it isn't one. I guess that to fix it I can no longer rely on checking the fake value from pulldown_cmark::Parser::new_with_broken_link_callback, since it doesn't provide the raw text needed to tell why the link was considered broken. I will try to work on a different solution, and I'd appreciate any suggestions on how to achieve it.

maxcnunes · 2024-11-16T21:51:28Z

I will try an approach similar to the one used in https://github.com/rust-lang/rust-clippy/blob/master/clippy_lints/src/tabs_in_doc_comments.rs

bors · 2024-11-21T22:12:56Z

☔ The latest upstream changes (presumably 8298da7) made this pull request unmergeable. Please resolve the merge conflicts.

Jarcho

Sorry for the long wait. Left a couple of specific comments.

One thing you'll need to do is take the text input from the markdown parser. Currently you'll be linting inside code and html sections. A doc attribute can also contain multiple lines.

clippy_lints/src/doc/broken_link.rs

maxcnunes · 2024-12-07T21:48:12Z

@Jarcho thanks for taking the time to review this.

I applied two of your code improvement suggestions. I am a bit confused, though, about these other comments:

One thing you'll need to do is take the text input from the markdown parser.

Do you mean this lint should run on the parser's output rather than before that step, directly against the Rust AST attributes? I tried that as my first approach, but the new_with_broken_link_callback we use sanitizes broken links, replacing them with fake text and link values, which makes it impossible to run any of this lint's logic.

Currently you'll be linting inside code and html sections.

It is applied only to AttrKind::DocComment attributes. Doesn't that guarantee that only code doc comments are covered by this lint, which I imagine is what we want?

A doc attribute can also contain multiple lines.

The current logic already checks for broken links across multiple lines, so I am not sure what you mean by this one.

maxcnunes · 2024-12-07T21:48:55Z

@rustbot review

Jarcho · 2024-12-08T00:44:46Z

It is applied only to AttrKind::DocComment attributes. Doesn't that guarantee that only code doc comments are covered by this lint, which I imagine is what we want?

Yes. And markdown can contain code and html sections, neither of which will try to parse links.

The current logic already checks for broken links across multiple lines, so I am not sure what you mean by this one.

Right now you're assuming that each attribute contains a single line. This is normally true, but isn't guaranteed.
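To illustrate the point about attributes that hold more than one line, here is a minimal sketch of three ways doc text can be attached to an item. The function names are made up for illustration. Each `///` line becomes its own doc attribute, while a block doc comment or an explicit `#[doc = "..."]` attribute is a single attribute whose value can span several lines.

```rust
/// One line per attribute: each `///` line becomes its own doc attribute.
/// This second line lives in a second attribute.
pub fn one_line_per_attribute() {}

/** A block doc comment is a single attribute
whose value spans several lines. */
pub fn block_comment() {}

// An explicit attribute can embed a newline in its string value.
#[doc = "First line of a single attribute.\nSecond line of the same attribute."]
pub fn explicit_attribute() {}
```

The last two forms could serve as test inputs for the multi-line case, since neither relies on consecutive `///` lines.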

maxcnunes · 2024-12-08T13:19:57Z

@Jarcho So, to make sure I got it right:

Yes. And markdown can contain code and html sections, neither of which will try to parse links.

Are you saying that AttrKind::DocComment attributes are markdown, which can include code and html sections, and that we should not try to parse links inside those sections, applying the lint only to the document's real content? If so, is your suggestion to follow the same approach as doc/mod.rs, which uses pulldown_cmark to parse the doc comments? In that case, though, we could not use new_with_broken_link_callback to properly handle those broken links since, as mentioned before, it replaces them with fake values.

About this other one:

Right now you're assuming that each attribute contains a single line. This is normally true, but isn't guaranteed.

Would attributes with multiple lines be represented with \n, so that I should handle that case, or are multiple lines represented in a different format? Also, is there a way to reproduce those multiple lines without actually adding \n to comments, so I can write proper tests for that case?

Jarcho · 2024-12-11T02:13:37Z

Are you saying that AttrKind::DocComment attributes are markdown, which can include code and html sections, and that we should not try to parse links inside those sections, applying the lint only to the document's real content? If so, is your suggestion to follow the same approach as doc/mod.rs, which uses pulldown_cmark to parse the doc comments? In that case, though, we could not use new_with_broken_link_callback to properly handle those broken links since, as mentioned before, it replaces them with fake values.

I don't see why that would stop this from working. You can use the span given to the callback to know where to start parsing the input string for the link's destination.
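The suggestion above could be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not the code from this PR: `raw_destination_after` is a hypothetical helper, and it assumes the span handed to the callback is a byte range covering the bracketed reference (for example `[docs]`) in the original input string.

```rust
use std::ops::Range;

/// Hypothetical helper: given the full doc text and the byte range of a
/// bracketed reference such as `[docs]`, return the raw text found between
/// the parentheses that immediately follow it, if there are any.
fn raw_destination_after(doc: &str, reference: Range<usize>) -> Option<&str> {
    // Everything after the closing `]` of the reference.
    let rest = doc.get(reference.end..)?;
    // An inline destination has to start right away with `(`.
    let rest = rest.strip_prefix('(')?;
    // Take everything up to the first `)`; nested parentheses are ignored here.
    let close = rest.find(')')?;
    Some(&rest[..close])
}
```

A lint built on this could then inspect the returned text, for instance checking whether it contains a line break. A reference such as `[T]` with nothing after it yields `None`, so it would not be reported.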

Would attributes with multiple lines be represented with \n, so that I should handle that case, or are multiple lines represented in a different format? Also, is there a way to reproduce those multiple lines without actually adding \n to comments, so I can write proper tests for that case?

rustdoc joins all the doc attributes together with \n as the separator and then passes that to the markdown parser.
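As a simplified model of that joining step, the sketch below concatenates doc fragments with `\n` and maps a byte offset in the joined text back to the fragment it came from. Both function names are hypothetical, and the model deliberately ignores any other processing rustdoc may apply to the text before parsing.

```rust
/// Join doc fragments into one string, with `\n` between fragments.
fn join_doc_fragments(fragments: &[&str]) -> String {
    fragments.join("\n")
}

/// Map a byte offset in the joined text back to
/// (fragment index, offset inside that fragment).
fn locate(fragments: &[&str], mut offset: usize) -> Option<(usize, usize)> {
    for (index, fragment) in fragments.iter().enumerate() {
        // An offset equal to the fragment length points at its trailing separator.
        if offset <= fragment.len() {
            return Some((index, offset));
        }
        // Skip this fragment plus the one-byte `\n` separator.
        offset -= fragment.len() + 1;
    }
    None
}
```

A mapping like this is one way to turn a parser-reported position back into the attribute that should carry the lint's span.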

rustbot assigned Jarcho Nov 16, 2024

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Nov 16, 2024

Add lint for broken doc links

6ff53c7

maxcnunes force-pushed the lint-doc-broken-links branch from 8d859c9 to 6ff53c7 Compare November 16, 2024 16:34

maxcnunes force-pushed the lint-doc-broken-links branch from fc34f5d to 640f282 Compare November 18, 2024 00:24

Fix false positives on broken link detection

8deb383

maxcnunes force-pushed the lint-doc-broken-links branch from 640f282 to 8deb383 Compare November 18, 2024 00:27

maxcnunes added 9 commits November 17, 2024 21:32

Refactor variable names

853e785

Fix doc comment about broken link lint

83b3de7

Refactor, remove not used variable

bc3ff86

Improve broken link to catch more cases and span point to whole link

ed3271d

Include reason why a link is considered broken

bbfa2f2

Drop some checker because rustdoc already warn about them

d4472d6

Refactor to use a single enum instead of multiple bool variables

d5d9213

Fix lint warnings

53c2aaa

Rename function to collect broken links

2e8f2dc

Jarcho requested changes Dec 4, 2024

clippy_lints/src/doc/broken_link.rs (outdated, resolved)

clippy_lints/src/doc/broken_link.rs (outdated, resolved)

Jarcho added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels Dec 6, 2024

maxcnunes added 2 commits December 7, 2024 17:44

Warn directly instead of collecting all entries first

f45b5db

Iterate directly rather than collecting

f29b4e3

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties and removed S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) labels Dec 7, 2024
