Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #34105, unescape triple-quoted strings after dedenting #35001

Merged
merged 1 commit into from
Mar 11, 2020

Conversation

JeffBezanson
Copy link
Member

It seems clear to me that triple-quoted strings should only specially handle whitespace that directly occurs in the source text, not including escape sequences. This also makes it easier to fix #34967, since if unescaping is done first then you get a UTF-8 error when we try to strip whitespace if you write invalid UTF-8 strings using escape sequences (which is allowed of course).
But, this could be breaking since it obviously changes the values of some string literals.

fix #34105

@JeffBezanson JeffBezanson added parser Language parsing and surface syntax triage This should be discussed on a triage call minor change Marginal behavior change acceptable for a minor release labels Mar 4, 2020
@Keno
Copy link
Member

Keno commented Mar 4, 2020

Yes, I'm in favor. I came across this when implementing the unindenting in one of the pure Julia parsers (can't remember which one) and thought it was an odd behavior. I concur with your assessment that this needs to happen in the opposite order.

@StefanKarpinski
Copy link
Member

While this is a behavior change, it's hard to argue that this wasn't a bug. 💯

@JeffBezanson JeffBezanson force-pushed the jb/triplequoteescapes branch from f854f9f to a723195 Compare March 10, 2020 20:22
@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Mar 10, 2020
@JeffBezanson JeffBezanson merged commit d5d71d7 into master Mar 11, 2020
@JeffBezanson JeffBezanson deleted the jb/triplequoteescapes branch March 11, 2020 18:47
@davidanthoff
Copy link
Contributor

@ZacLN probably something we need to replicate in CSTParser as well?

@GunnarFarneback
Copy link
Contributor

This PR changed the parsing of

"""\n
"""

from \n to \n\n.

Is this unavoidable collateral damage or something that should be fixed?

(Real code example that gets an extra newline: https://github.com/JuliaRegistries/RegistryTools.jl/blob/cbebbed17e13e7a18556a7dd62346d8610e867ea/src/types.jl#L126)

@StefanKarpinski
Copy link
Member

It's unclear to me why that had the \n after the opening """ in the first place.

@GunnarFarneback
Copy link
Contributor

GunnarFarneback commented Mar 29, 2020

It's unclear to me too, but one reason might be that

julia> """
       \na
       """
"       \na\n       "

before this PR. That's another thing that has changed, probably for the better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
minor change Marginal behavior change acceptable for a minor release parser Language parsing and surface syntax
Projects
None yet
5 participants