[Perf] Optimize documentation lints **a lot** (1/2) (18% -> 10%) #14693
Conversation
Turns out that `doc_markdown` uses a non-cheap rustdoc function to convert from markdown ranges into source spans. And it was using it a lot (about once every 18 lines of documentation on `tokio`, which ends up being about 1800 times). This ended up being about 18% of the total Clippy runtime, as discovered by `lintcheck --perf` in docs-heavy crates. This PR optimizes one of the cases in which Clippy calls the function, and a future PR will be opened once pulldown-cmark/pulldown-cmark#1034 is merged. This PR lands the use of the function into the single-digit zone. Note that not all crates were affected by this change equally: those with more docs are affected far more than lighter ones.
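The core idea of the optimization can be sketched without the rustc internals: instead of asking rustdoc to map every word's markdown range back to a source span, resolve the enclosing doc fragment's span once and derive each word's span by byte arithmetic. This is a minimal sketch; `Span` here is a hypothetical two-field stand-in (the real `rustc_span::Span` also carries a syntax context and parent), and `word_span` is an illustrative helper, not the PR's actual function.

```rust
/// Hypothetical stand-in for a resolved source span: a byte range into the
/// source file. The real `rustc_span::Span` carries more information.
#[derive(Debug, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Derive a word's span from the enclosing fragment's span plus the word's
/// byte offset inside the fragment text -- cheap arithmetic, no markdown
/// range-to-span resolution per word.
fn word_span(fragment_span: &Span, fragment_offset: usize, word: &str) -> Span {
    Span {
        lo: fragment_span.lo + fragment_offset,
        hi: fragment_span.lo + fragment_offset + word.len(),
    }
}

fn main() {
    // Suppose the doc fragment starts at byte 100 of the source file and a
    // lintable word was found 25 bytes into the fragment text.
    let fragment = Span { lo: 100, hi: 160 };
    let span = word_span(&fragment, 25, "TokioRuntime");
    assert_eq!(span, Span { lo: 125, hi: 137 });
}
```

The review discussion below is about exactly when this arithmetic is valid: it assumes the fragment text's byte offsets line up one-to-one with the source text.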
```rust
let Some(fragment_span) = fragments.span(cx, range.clone()) else {
    return ControlFlow::Break(());
};

let span = Span::new(
    fragment_span.lo() + BytePos::from_usize(fragment_offset),
    fragment_span.lo() + BytePos::from_usize(fragment_offset + word.len()),
    fragment_span.ctxt(),
    fragment_span.parent(),
);
```
Should you not be adjusting the range before creating the span? `fragment_offset` looks like it's an offset in the markdown text.
I'm not sure I understand this comment correctly. This snippet is taken as-is from `check`, with variable names fixed (`check` -> `offset`); it didn't really care about the markdown text.
`fragment_offset` looks like it's an offset into the cooked doc string. It can't be used as an offset for a span, since that doesn't always line up perfectly with the source text.
After testing this out, `text_to_check` only contains text; it doesn't contain links, bold text, etc. And `fragment_offset` is reset for each one of those `text`s. I can add a debug assertion to future-proof this, though.
A text fragment can still contain escape sequences, e.g. `#[doc = "docs with unicode \u{xxxxxx}"]`. The string the fragments work on is the cooked version of the doc string, not the source form. Multiline comments (`/** */`) might also have issues; I don't know how those are presented.
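The escape-sequence concern can be made concrete: in the source file, `\u{...}` occupies the bytes of the escape sequence itself, but in the cooked string it collapses into the UTF-8 encoding of the character, so byte offsets past the escape diverge between the two forms. A small demonstration (the example text and codepoint are illustrative, not from the PR):

```rust
fn main() {
    // Source form: the escape sequence appears literally, 9 bytes of
    // `\u{1F60A}` text (raw string keeps the backslash as-is).
    let source_form = r"docs with unicode \u{1F60A}";
    // Cooked form: the compiler resolves the escape into one character,
    // which is 4 bytes of UTF-8.
    let cooked_form = "docs with unicode \u{1F60A}";

    assert_eq!(source_form.len(), 27); // 18 bytes of prefix + 9-byte escape
    assert_eq!(cooked_form.len(), 22); // 18 bytes of prefix + 4-byte char

    // Any byte offset computed in the cooked string that lands past the
    // escape no longer points at the corresponding byte in the source.
}
```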
```diff
@@ -117,6 +134,17 @@ fn check_word(cx: &LateContext<'_>, word: &str, span: Span, code_level: isize, b
     // try to get around the fact that `foo::bar` parses as a valid URL
     && !url.cannot_be_a_base()
 {
+    let Some(fragment_span) = fragments.span(cx, range.clone()) else {
+        return ControlFlow::Break(());
```
This seems wrong. One spot failing to get a span doesn't mean all the others will.
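The reviewer's point is about the semantics of `ControlFlow::Break` in a visitor: breaking on one failed span lookup aborts the whole traversal, so every remaining word goes unchecked. A hypothetical reduction of the per-word loop (the `check_words` helper and the `"unspannable"` sentinel are made up for illustration):

```rust
// Hypothetical reduction of the lint's per-word loop. One word failing to
// resolve a span stops the traversal entirely, so later words are skipped
// even though their spans would have resolved fine.
fn check_words(words: &[&str]) -> Vec<String> {
    let mut checked = Vec::new();
    for word in words {
        // Stand-in for `fragments.span(cx, range)` returning `None`.
        if *word == "unspannable" {
            // Equivalent to `return ControlFlow::Break(())` in the visitor:
            // the whole check is abandoned, not just this word.
            break;
        }
        checked.push(word.to_string());
    }
    checked
}

fn main() {
    // "tokio" is never checked, even though nothing is wrong with it.
    assert_eq!(check_words(&["rustdoc", "unspannable", "tokio"]), ["rustdoc"]);
}
```

Returning the loop's equivalent of `Continue` on a failed lookup would skip only the offending word.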
```rust
let Some(fragment_span) = fragments.span(cx, range.clone()) else {
    return ControlFlow::Break(());
};

let span = Span::new(
    fragment_span.lo() + BytePos::from_usize(fragment_offset),
    fragment_span.lo() + BytePos::from_usize(fragment_offset + word.len()),
    fragment_span.ctxt(),
    fragment_span.parent(),
);
```
Same as the previous two comments.
```diff
 /// Checks if a string is upper-camel-case, i.e., starts with an uppercase and
 /// contains at least two uppercase letters (`Clippy` is ok) and one lower-case
-/// letter (`NASA` is ok).
+/// letter (`NASA` is ok).[
```
Accident?
changelog: [`clippy::doc_markdown`] has been optimized by 50%