Skip to content

syntax: better error message for missing repetition quantifier #582

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 11, 2019

Conversation

christianrondeau
Copy link
Contributor

@christianrondeau christianrondeau commented May 29, 2019

Hi, first contribution to the Rust ecosystem, hopefully I'll get it right.

This is to solve @CAD97 issue #545 which raised an uncleared error message when repetition quantifiers were not followed by a valid decimal (other languages would treat the {} as characters and this implementation would fail, but the message was "decimal literal empty" which was confusing)

The change is fairly simple, my main objective is to be able to contribute by starting small.

A few notes:

  • To reuse the Span and pattern instances, I match and return the same error values but with a different kind. This way we can keep current and future errors in parse_decimal untouched.
  • I also didn't extract a method for the match block since the code was small, but if you believe it makes sense, I could also make a change_error_kind(From, To, self.parse_decimal()) method or pass the desired error kind to parse_decimal(ErrorKindWhenMissing), but I felt the small duplication made it more obvious to read.
  • I used RepetitionQuantifierDecimalMissing even though it also throws when the decimal is invalid, because the current implementation doesn't differentiate missing and invalid decimals, and I didn't want to change the behavior. It is also valid to have a missing decimal in cases like x{1,} or x{,2}, so I'm sure there's a better name for it.

I also noticed there was no integration tests for error messages. I excluded it from this PR but if you believe it can be valuable (it helped me make sure that my change actually worked) I can include it in this PR as a separate commit (or a different PR if you prefer). The test looks like this:

fn repetition_quantifier_decimal_error() {
    assert_panic_message(r"\\u{[^}]*}", r#"
regex parse error:
    \\u{[^}]*}
        ^
error: decimal literal empty
"#);

I'm ready to do my homework if I missed something, so just let me know!

@christianrondeau christianrondeau changed the title tests: repetition quantifier decimal missing syntax: better error message for missing repetition quantifier May 30, 2019
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Overall, I think I like the approach here. I've left a few comments that we should address before merging though. Also:

I used RepetitionQuantifierDecimalMissing even though it also throws when the decimal is invalid, because the current implementation doesn't differentiate missing and invalid decimals

Could you say more about this? There is a DecimalInvalid error variant and it is used.

@@ -307,6 +310,9 @@ impl fmt::Display for ErrorKind {
write!(f, "invalid repetition count range, \
the start must be <= the end")
}
RepetitionQuantifierDecimalMissing => {
write!(f, "Repetition quantifier expects a valid decimal")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this match the style of surrounding style of error messages? i.e., Start with a lowercase letter.

@@ -1113,7 +1113,15 @@ impl<'s, P: Borrow<Parser>> ParserI<'s, P> {
ast::ErrorKind::RepetitionCountUnclosed,
));
}
let count_start = self.parse_decimal()?;
let count_start = match self.parse_decimal() {
Err(ast::Error {kind: ast::ErrorKind::DecimalEmpty, span, pattern}) => return Err(ast::Error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this code match the style of the surrounding code? i.e., Keep all lines limited to 79 columns inclusive. Please fix this for any other long lines introduced in this PR. (It is hard to see them in Github's PR diff view.)

It might be clearest to "fix" these long lines by extracting this into a separate method, as you suggested. Then you can reuse that below.

@@ -153,6 +153,9 @@ pub enum ErrorKind {
/// The range provided in a counted repetition operator is invalid. The
/// range is invalid if the start is greater than the end.
RepetitionCountInvalid,
/// An opening `{` was not followed by a valid decimal value.
/// For example, `x{}` or `x{]}` would fail.
RepetitionQuantifierDecimalMissing,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, it looks like DecimalEmpty is no longer used. This is a bit weird, but I see why it makes sense to do that. Could you update the docs of DecimalEmpty to say something like:

/// Note that this error variant is no longer used. Namely, a decimal
/// number can only appear as a repetition quantifier. When the number
/// in a repetition quantifier is empty, then it gets its own specialized
/// error, `RepetitionQuantifierDecimalMissing`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you name this RepetitionCountDecimalEmpty in order to be consistent with other error variants?

@christianrondeau
Copy link
Contributor Author

Changes:

  • Fixed upper case letter in error message (good catch)
  • Extracted a function for the error specialization. Called it specialize_err and made it generic, so it's simple to reuse and read. It's now under 79 columns wide.
  • Updated the doc of DecimalEmpty as asked (I think your text was perfect)
  • Renamed RepetitionQuantifierDecimalMissing to RepetitionCountDecimalEmpty
  • Since the code is already a little bit more modified than I initially wanted, I included my integration test, I it wil be easier for you to tell me if you wish to keep it or if I should remove it.

For your first question, the code currently doesn't handle a{} and a{]} differently. So "empty" is technically correct, though I thought "missing" was more precise since it "expected" one, but didn't found one (i.e. it's not empty, it's just not there). But my thought process is probably not valid for a future reader, so I used your change. Showing a different error message for empty or invalid decimal would require looking further just for the purposes of the error message, so I thought it wasn't worth the complexity.

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! Thanks so much! Improving error messages is great.

self.parse_decimal(),
ast::ErrorKind::DecimalEmpty,
ast::ErrorKind::RepetitionCountDecimalEmpty,
)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, this is much nicer. I like specialize_err. :-)

@BurntSushi BurntSushi merged commit 172898a into rust-lang:master Jun 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants