strconv: add QuotedPrefix #45033
Comments
I think this proposal makes sense; having to find the end of the quote manually means reimplementing a basic version of Unquote. I also wonder if the input should be a |
It may be that the operation to expose is "find me the end of this string" and not "also Unquote it". |
That sounds more flexible, with the downside that one would end up looping over the input bytes twice. That already happens today, so perhaps the cost isn't significant enough to matter. |
Given that many strings do not have any escape characters, the operation could report whether the string had any escape characters. If not, the user could trivially truncate off the leading and trailing quote characters and avoid a second pass. |
This proposal has been added to the active column of the proposals project |
So if we do the string-end-finder, I guess the API would be:
? Reporting the presence or absence of escape characters seems like premature optimization. |
Wouldn't you need to know where the prefixed quoted string ended so that the caller can start parsing what comes afterwards?
I agree. In all my use cases, performance was not paramount. Also, this loop does not seem hard to write in the extremely performance-sensitive cases. |
You can call the |
Ah yes, I apologize. I missed the part where it "returns the quoted string". |
I feel like an API that simply returns the length of a valid quoted string (and the entire length otherwise) is simpler to use. Suppose we had:
// QuotedPrefixLen returns the length of the quoted string (as defined by Unquote) at the prefix of s.
// If s does not start with a valid quoted string, QuotedPrefixLen returns len(s).
func QuotedPrefixLen(s string) int
For example, if I wanted to parse the following text:
the code would look like:
in = strings.TrimSpace(in)
for len(in) > 0 {
	n := strconv.QuotedPrefixLen(in)
	s, err := strconv.Unquote(in[:n])
	if err != nil {
		...
	}
	... // make use of s
	in = strings.TrimSpace(in[n:])
}
Alternatively, with the
in = strings.TrimLeft(in, " ")
for len(in) > 0 {
	s1, err := strconv.QuotedPrefix(in)
	if err != nil {
		...
	}
	s2, err := strconv.Unquote(s1)
	if err != nil {
		... // technically this will never happen, but it's not obvious to reviewers
	}
	... // make use of s2
	in = strings.TrimLeft(in[len(s1):], " ")
}
That said, I'm still okay with |
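For readers who want to try the second loop, here is a minimal runnable sketch. It assumes Go 1.17 or later, where strconv.QuotedPrefix is available with the signature func QuotedPrefix(s string) (string, error). The function name parseQuotedList and the sample input are hypothetical and are not taken from the thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseQuotedList is a hypothetical helper: it reads a sequence of
// space-separated Go quoted strings and returns their unquoted values.
func parseQuotedList(in string) ([]string, error) {
	var out []string
	in = strings.TrimLeft(in, " ")
	for len(in) > 0 {
		// q is the quoted literal (quotes included) at the start of in.
		q, err := strconv.QuotedPrefix(in)
		if err != nil {
			return nil, fmt.Errorf("no quoted string at %q: %w", in, err)
		}
		s, err := strconv.Unquote(q)
		if err != nil {
			return nil, err
		}
		out = append(out, s)
		// Skip past the literal and any separating spaces.
		in = strings.TrimLeft(in[len(q):], " ")
	}
	return out, nil
}

func main() {
	// Sample input (an assumption): two interpreted strings and one raw string.
	vals, err := parseQuotedList(`"hello" "wor\"ld" ` + "`raw`")
	fmt.Printf("%q %v\n", vals, err)
}
```

The caller advances by len(q), the length of the quoted literal, not by the length of the unquoted value, because escapes make the two differ.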
Would I guess if that's the case, you could still call |
I was torn between having it return 0 or len(s) in error situations. The advantage of returning len(s) is that naive slicing of the input would pass an invalid string to |
Have it return |
You can always wrap QuotedPrefix to get the single-result. |
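As a rough sketch of that wrapping, assuming Go 1.17 or later: the helper below is hypothetical (quotedPrefixLen is not part of strconv) and follows the len(s)-on-failure behaviour suggested earlier in the thread.

```go
package main

import (
	"fmt"
	"strconv"
)

// quotedPrefixLen is a hypothetical single-result wrapper around
// strconv.QuotedPrefix. It returns the length of the quoted string at the
// start of s, or len(s) if s does not start with a valid quoted string.
func quotedPrefixLen(s string) int {
	q, err := strconv.QuotedPrefix(s)
	if err != nil {
		return len(s)
	}
	return len(q)
}

func main() {
	fmt.Println(quotedPrefixLen(`"ab" rest`)) // 4: the literal "ab" including both quotes
	fmt.Println(quotedPrefixLen(`no quote`))  // 8: falls back to len(s)
}
```

With this fallback, passing s[:n] to strconv.Unquote on invalid input still yields an error, so the caller cannot silently skip malformed text.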
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
@rsc, To be clear, we're going with the API you proposed in #45033 (comment), right? |
I just took a look at the implementation of |
Change https://golang.org/cl/314775 mentions this issue: |
The existing Unquote function unescapes a Go string, assuming that the entire input string is the quoted string. However, in many parsing applications, we have a quoted string followed by some amount of unconsumed input. The presence of arbitrary characters after the quoted string currently breaks the Unquote function.
I propose adding:
This functionality may simplify a number of standard packages that implement their own logic to determine the end of a quoted string so that they can pass the correctly sized string to strconv.Unquote:
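To illustrate the limitation this proposal describes, the following sketch shows strconv.Unquote rejecting a quoted string that is followed by more input, and the same input handled with strconv.QuotedPrefix (available since Go 1.17). The helper unquotePrefix and the sample input are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strconv"
)

// unquotePrefix is a hypothetical helper: it unquotes the quoted string at
// the start of in and returns the value plus the unconsumed remainder.
func unquotePrefix(in string) (value, rest string, err error) {
	q, err := strconv.QuotedPrefix(in)
	if err != nil {
		return "", in, err
	}
	value, err = strconv.Unquote(q)
	if err != nil {
		return "", in, err
	}
	return value, in[len(q):], nil
}

func main() {
	in := `"key": 42`

	// Unquote requires the whole input to be one quoted string.
	_, err := strconv.Unquote(in)
	fmt.Println(err) // invalid syntax

	// QuotedPrefix finds where the literal ends, so the rest stays available.
	v, rest, err := unquotePrefix(in)
	fmt.Printf("%q %q %v\n", v, rest, err)
}
```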