Skip to content

Fix parsing ambiguous abbreviations when lowercase #591

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 1, 2019

Conversation

angularsen
Copy link
Owner

Fixes #590

  • Add separate abbreviation-to-unit mapping that preserves case sensitivity
  • Fall back to case sensitive mapping if more than 1 unit is found
  • Add test

I'm sure we can optimize this with a single map, but I don't think we're talking any significant memory footprint here.

Copy link
Collaborator

@tmilnthorp tmilnthorp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know how this even slipped my mind. Yuck. I think it might be better to just revert the whole ignore case PR. Ambiguous between yotta/yocto, zetta/zepto, peta/pico, and mega/milli prefixes. Seems a bit silly to have to always specify case sensitivity for prefixed units, but not for others just in case case doesn't matter (if you even know it's a prefix, what if it's from user input?).

@angularsen
Copy link
Owner Author

Yeah, it feels yucky to have missed this and I also have second thoughts on the whole case insensitive support.

I think it's nice that you can parse both 5 PSI and 5 psi for free, without having to specify both abbreviations. On the other hand, adding some entries in JSON is a one-time effort and also not much work. Also, then we don't allow weird things like 5 PsI, which could potentially mean something else entirely and then they would get the wrong unit instead of UnitNotFoundException.

I'm a bit torn, on one hand we are in a good position to make it super easy to parse user input without being too strict on perfect casing that I think many often get wrong. It's hard to do this outside the library.

On the other hand, it feels a bit hacky, but in the implementation at least ensures there is no ambiguity by falling back to case sensitivity matching if more than one unit matches.

The only problem I can think of is when trying to parse a unit we haven't added yet, that happens to be identical to an existing unit with different casing and instead of getting UnitNotFoundException you get the wrong unit.

To counter this, I propose to make parsing methods case sensitive by default, with a new parameter ignoreCase. That should give us the best of both worlds, allowing users to explicitly parse with ignore case?

@angularsen
Copy link
Owner Author

I'm merging this now, just pushed a new nuget out and I don't want this bug in it. We can revise this later.

@angularsen angularsen merged commit ce8751c into master Feb 1, 2019
@angularsen angularsen deleted the fix-parsing-ambiguous-lowercase-units branch February 1, 2019 10:06
@angularsen
Copy link
Owner Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants