-
Notifications
You must be signed in to change notification settings - Fork 246
Wrap multiline string literals to line length. #792
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/// We want this to print as: | ||
/// """ | ||
/// some text this is exactly the right length | ||
/// """ | ||
/// and not: | ||
/// """ | ||
/// some text \ | ||
/// this is exactly the right length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since backslashes are used to suppress newlines, I think I agree with this choice. I'll admit it's somewhat opinionated though. I don't know if other folks would want to be able to choose where in a string they break it. If we made this configurable, we could allow that, but the problem with doing that is that once a string is broken in this way, it can never be automatically unbroken, so shifting indentation could cause a string to split awkwardly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would not automatically re-flow string literals. Most likely a user already put thought into the re-flow. If you do want to re-flow, I personally think it’s not too complicated to change the string literal to be on a single line again and then let swift-format break it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely a user already put thought into the re-flow
The main case that this isn't true is what @allevato mentioned, ie. format causes a split, now you indent for some other reason (nest in an if/whatever), and now never automatically reflow.
I personally think it’s not too complicated to change the string literal to be on a single line again
Could have a refactoring for this pretty easily too.
I'm somewhat split in my opinion as I can see benefits in both. I definitely think we need to have an option (probably defaulted to off) if we do reflow though. Maybe // swift-format-ignore
is enough for cases when it's on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added an option to configure multiline string reflow behavior. When false, existing escaped newlines are left in place and string reflow does not occur. When true, existing escaped newlines are removed and string reflows using the inserted .escaped
break tokens.
The option defaults to false, so reflow will not occur by default and will be opt-in. // swift-format-ignore
works regardless of this setting, so folks can turn on reflow and still keep escaped newlines for specific literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke with @allevato and extended the reflow config behavior to include 3 options: never
, onlyLinesOverLength
, and always
.
never
behaves the same as today, we never break up multiline strings.onlyLinesOverLength
respects existing escaped line breaks but will break up lines over the line length.always
does not respect existing escaped line breaks. It will remove existing escaped line breaks and reflow everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After chatting offline, we decided on the above and a couple other changes:
- Keep the default mode as
never
, to preserve existing behavior (don't break or reflow strings) - Even when reflowing is
onlyLinesOverLength
oralways
, only do it for regular multiline strings ("""
), never for raw multiline strings (#"""
, etc.)
The rationale for (2) is that it's highly likely that raw multiline strings are used for things like structured data, code, etc., whereas regular multiline strings are more likely to be simple prose.
That's not always the case, of course; swift-syntax and swift-format both have a mixture of code snippets in regular and raw multiline strings. But if we wanted to change the default behavior for this later, we could update those so that they format correctly in each case.
@ahoppen @bnbarham Do you think this is a reasonable landing spot for the time being?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me! Also CC @shawnhyam
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me. I'll go ahead and kick off CI, but let's wait for other folks to comment on it as well.
It looks like this is triggering a bunch of whitespace linter issues when it does the format pass over swift-syntax. Can you look at what it's doing there and see if those are expected? https://ci.swift.org/job/swift-syntax-PR-macOS/4470/console I suspect most of these are coming from the multiline strings that contain Swift code fragments, like this: https://github.com/swiftlang/swift-syntax/blob/f8b635392690fe59c74064ce222e134335cf22d0/CodeGeneration/Sources/generate-swift-syntax/templates/swiftsyntax/SyntaxRewriterFile.swift#L165 That's an interesting case that we need to consider, because this is definitely a scenario where we may want to control where newlines go via One idea would be to only reflow strings where every line ends in a |
Something to take into consideration is that ASCII spaces between words are not universal across scripts. Most (all?) Latin-based scripts use them, but Chinese and Japanese writing rarely do, and some scripts use alternate characters (interpuncts, etc.) instead of ASCII spaces. |
Good point. At a minimum, the logic on this line should preserve the original scalar detected by We could consider other kinds of characters to break on later; unfortunately, we didn't include any of the Unicode breaking properties in |
The case of preformatted code (or any preformatted text) came to mind for me -- this is possibly one solution, although I'm not convinced I would want it to reformat these blocks at all. I guess if I was used to it, it might be fine. But my feeling is this should be controlled by an option so it could at least be turned off if it's creating problems. |
Code in multi-line strings is the use case that stands out as most compelling to address to me. I'd love to see any examples that someone might have where they have non-code text that is split across multiple lines terminated by I guess one example would be other kinds of structured text: let x = """
value1: \(value1), \
value2: \(value2), \
value3: \(value3), \
value4: \(value4)
""" which could reflow like this, depending on line width: let x = """
value1: \(value1), value2: \(value2), value3: \
\(value3), value4: \(value4)
""" I think the argument could be made that the second form is worse. It should be fairly straightforward to make something like |
Well now we're getting dangerously close to including swift-syntax as a dependency just so we can format code-in-block-strings. 🙃 |
swift-evolution proposal to add descriptive labels to multiline strings, à la Markdown code fences? let code = """swift
func \(name)() -> \(result) {
}
""" 🤔 Of course, my preference would be to do something good here without modifying the language 😅 |
I mean, why not? Pitch the triple-backtick syntax! Do it! |
I took a look at a couple alternatives for respecting user inserted "\\n" while reflowing formatter inserted "\\n". I couldn't find a compelling solution. As a stopgap solution I've added support for |
d444d35
to
785b4d3
Compare
Change how PrettyPrinter emits StringSegmentSyntax's so that they can be broken up over multiple lines within multiline string literals. This does not change the behavior of single line string literals, which cannot contain newlines. This also does not change interpolations which are still emitted as verbatims and are never line broken by the pretty printer. Line breaking is done exclusively through escaped newlines. A literal containing escaped newlines will remove all escaped newlines and then reinsert them based on line length. Hard newlines are respected by the formatter and will not be moved, even if it causes short lines. Escaped newlines will be reformatted by the pretty printer so that lines ending in an escaped newline are at line length. Wrapping is implemented by introducing a new newlinebreak behavior `.escaped`. `.escaped` acts very similarly to `.elective` but has slightly different length calculation logic and printing behavior. An escaped newline is printed including it's preceeding whitespace followed by "\\\n". So a break of `.break(_, 2, .escaped)` will print as " \\\n". Because an escaped line break takes up characters when broken (unlike other breaks), length calculation must be handled differently for `.escaped` breaks.
785b4d3
to
705af09
Compare
Modify the pretty printer to insert escaped newlines "\\n" when a line would exceed configured line length. This only affects multi-line strings and not single-line strings. Previously string segements in multiline literals were treated as one large
.syntax
token. They are now broken up into multiple tokens to allow opportunities to line break.A string segement
"the quick fox"
will now be comprised of the tokens:we now break up the segment based on whitespace. After each series of whitespace, we insert the new
.escaped
break.A couple of odd things to note here:
.escaped
.syntax
and not a.space
.escaped
is very similar to an.elective
line break. It will only break when it exceed the line length, otherwise it will be printed assize
spaces. Unlike.elective
, when it is printed it will print as an escaped newline "\\n". On the implementation side we also have to calculate it's length differently than we do for.elective
.Because all whitespace in a string literal is significant (unlike whitespace in other swift syntax), it is emitted alongside the break. This is because, unlike other breaks, we always want to print the whitespace in a string literal. The choice to print the whitespace prior to the line break is cosmetic.
Whitespace is emitted as a
.syntax
to avoid special casing between.break
s and.space
s. The pretty printer will ignore spaces (or change them) around line breaks to make code look better. However this doesn't work for string literals where the whitespace is significant. Emitting.syntax
token ensures all whitespace in the literal shows up.For example, the literal:
will be formatted as:
Wrapping will reflow escaped newlines already present in the string literal. For example, with a line length of 30:
will be formatted as:
However hard line breaks are always respected, since they appear in the string literal itself not just the syntax. So:
will not be reformatted.
Breaks are only inserted after whitespace, so words longer than line length will not be wrapped (line length 30):
will be formatted as
even though the second line exceeds 30 characters.
Fixes #724