Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: negative slice/array indexes #33359

Closed
pjebs opened this issue Jul 30, 2019 · 35 comments
Closed

proposal: Go 2: negative slice/array indexes #33359

pjebs opened this issue Jul 30, 2019 · 35 comments
Labels
dotdotdot ... FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal Proposal-FinalCommentPeriod v2 An incompatible library change
Milestone

Comments

@pjebs
Copy link
Contributor

pjebs commented Jul 30, 2019

Currently a negative index to a slice causes a panic.

This proposal doesn't want to change that behaviour.

...except at compiler level.

The compiler, when presented with a -x in the slice index pretends it is equivalent to len(s)-x.

It is essentially a macro (syntax sugar). It only applies when a human types it in code. It does not apply if you use an int variable of negative value.

@gopherbot gopherbot added this to the Proposal milestone Jul 30, 2019
@pjebs
Copy link
Contributor Author

pjebs commented Jul 30, 2019

v := m[-3]

// is equivalent to

v := m[len(m)-3]

@pjebs
Copy link
Contributor Author

pjebs commented Jul 30, 2019

The advantage is that when writing complex algorithms, it improves the readability for some people. It seems more intuitive for -2 to mean second last item in slice compared to len - 2 which requires an extra step in cognitive processing.

@mvdan
Copy link
Member

mvdan commented Jul 30, 2019

This has been proposed in multiple different ways in the past. How does this proposal compare to the earlier issues? For example:

#11245
#16231
#20176

I personally think that #25594 was the best of the group, but it was ultimately closed by the author as it didn't seem like a clear win.

@mvdan mvdan added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jul 30, 2019
@bcmills
Copy link
Contributor

bcmills commented Jul 30, 2019

What are the constraints on the x in -x? Must it be an integer literal? What about (typed or untyped) constants?

What happens if x is an int with a negative value? (Consider https://play.golang.org/p/234hf3Youmr.)

@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 30, 2019
@ianlancetaylor ianlancetaylor added v2 An incompatible library change LanguageChange Suggested changes to the Go language labels Jul 30, 2019
@ianlancetaylor ianlancetaylor changed the title Proposal: Go 2: negative index slices proposal: Go 2: negative index slices Jul 30, 2019
@pjebs
Copy link
Contributor Author

pjebs commented Jul 31, 2019

  1. You can't use a variable or constants.
  2. It's only for an index. Not an index range (but perhaps this is doable).

The key is to think of it as a macro.

@bcmills
Copy link
Contributor

bcmills commented Jul 31, 2019

Treating constants differently from literals seems very non-orthogonal.

Treating literals differently from variables seems surprising. Since the operation in question modifies the behavior of index expressions, I would expect it to be indicated by something that looks more like a variant of an index expression than an ordinary mathematical operator, especially for something that is merely syntactic sugar.

@bcmills
Copy link
Contributor

bcmills commented Jul 31, 2019

Here's an alternative to consider: if you want to reduce the stutter, consider the len([…]) expression to be elided. Normally, we indicate elision in Go using the _ character. So perhaps _ within an index expression could stand for len([…])?

x := someDescriptiveSliceName[_-3]

@pjebs
Copy link
Contributor Author

pjebs commented Aug 1, 2019

_ seems unnecessary to me.

@deanveloper
Copy link

This isn't purely at a compiler level because you can only know the value of an int at runtime. For instance:

func reversedIndex(slice []int, index int) int {
    return slice[-index] // we can't know if this is "-index" or "len(slice)-index" until runtime
}

Technicalities aside, I think that the necessity of having a special case for negative integers is a bit inconsistent, however I understand it's use because of backwards compatibility and out of necessity in order to flip the integer's sign before indexing normally into an array. But the necessary evil seems like an exception that makes the feature not feel... right.

@deanveloper
Copy link

deanveloper commented Aug 3, 2019

Sorry, I hadn't seen that. I thought I read the comments but didn't see that part.

The compiler, when presented with a -x in the slice index pretends it is equivalent to len(s)-x.
It is essentially a macro (syntax sugar). It only applies when a human types it in code. It does not apply if you use an int variable of negative value.

That part in the initial proposal seems to indicate that variables are allowed. Might want to reword that and explicitly state that variables are not allowed.

@ianlancetaylor
Copy link
Contributor

Another idea: permit len without parentheses, or len() with nothing in the parentheses, within an index operand (inside the square brackets). It would evaluate to the length of the slice. Then one could write s[len-1] to get the last element of a slice. I don't know if that is a good idea but it would be a minor simplification and would avoid the non-orthogonality of treating negative constants specially. And it seems clearer than ....

@griesemer griesemer changed the title proposal: Go 2: negative index slices proposal: Go 2: negative slice/array indexes Aug 27, 2019
@cespare
Copy link
Contributor

cespare commented Aug 28, 2019

@ianlancetaylor I like that idea. It doesn't help much if the name of the slice is s, but it makes a difference with longer names:

t := deferredTypeStack[len(deferredTypeStack)-1]

becomes

t := deferredTypeStack[len - 1]

@bcmills's idea seems reasonable as well:

t := deferredTypeStack[_ - 1]

A couple of questions that would need to be answered with either of these:

  1. Does len/_ work with arbitrary expressions, and it takes the value of the nearest surrounding indexing expression?
  2. Does it work for maps as well?

So for example, if the answer to both is "yes", then I suppose you could do something like this (not saying it's particularly good or useful):

m := make(map[string]int)
for item := range <-ch {
	m[fmt.Sprintf("channel item %d", len+1)] = item
}

@pjebs
Copy link
Contributor Author

pjebs commented Aug 28, 2019

s[len-1]

I don't think that suggestion is a big enough advantage in the real world. The governing idea of my proposal is to reduce the cognitive processing involved when reading complex algorithms.

When attempting to get a gist of what an algorithm does, seeing a -1 "clicks" conceptually more frictionlessly in my mind than seeing len-1.

Also note that using len is not backward compatible because someone might have redefined len to their own variable.

Python is a language that regularly uses negative indices and it helps with readability (for me, immensely). I've never seen a Python developer complain about negative indices, but I suspect they will miss the feature if it was removed.

I don't know if that is a good idea but it would be a minor simplification and would avoid the non-orthogonality of treating negative constants specially.

I don't know much about the actual developer time to implement my proposal but I suspect it would be minimal to treat it as a special case.

@deanveloper
Copy link

When attempting to get a gist of what an algorithm does, seeing a -1 "clicks" conceptually more frictionlessly in my mind than seeing len-1.

I disagree. A new user of Go would have much less trouble with len - 1 than -1. The only reason that -1 makes any sense at all is because you're already used to it from other languages (ie Python), however other people are not used to these same features.

Sure, python allows -1 to mean len - 1, but in JavaScript -1 just means the -1 index rather than len-1. In Java it would just throw an IndexOutOfBounds, and in C I think(?) the behavior would be undefined.

Go has always favored verbosity in order to promote clarity, and I think that len - 1 is a great way to do that. The benefit of the reduction of cognitive processing will be achieved once it is widely used and our eyes get used to it, just as it was in languages such as Python.

@ianlancetaylor
Copy link
Contributor

@pjebs Actual developer time isn't a concern here. The concern is lack of orthogonality in the language. It's confusing to add a complex new concept: an expression that can be any non-negative value, or can be a negative constant, but cannot be a negative value that is not a constant.

@ianlancetaylor
Copy link
Contributor

@cespare I think that len with no argument would implicitly take as an argument the immediately surrounding index expression, without re-evaluating it. It would be an error to use it outside of an index expression. I suppose it should work with maps as well although it's hard to imagine people using it much.

@cespare
Copy link
Contributor

cespare commented Aug 28, 2019

@ianlancetaylor when you write s[len-1], isn't len part of a subtraction expression, not (immediately) inside an index expression?

@ianlancetaylor
Copy link
Contributor

Yes, by "immediately" I meant to describe what should happen with s1[s2[len - 1]].

@ianlancetaylor
Copy link
Contributor

The most common case here is probably len(s) - 1. It's worth noting that if we had generics, we could have library functions like slices.Last(s) and slices.Pop(s) that would work with any slice type. That might remove many of the cases where this is valuable.

I agree with @deanveloper that s[-1] does not click for me.

@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Sep 10, 2019
@mdaliyan
Copy link

mdaliyan commented Oct 2, 2019

Here's an alternative to consider: if you want to reduce the stutter, consider the len([…]) expression to be elided. Normally, we indicate elision in Go using the _ character. So perhaps _ within an index expression could stand for len([…])?

x := someDescriptiveSliceName[_-3]

_ seems magic to me. It's something that needs explanation. Anything that makes me answer "it's magic" to a newcomer who asks me "what is this?" bothers me .

I prefer x[len-1] rather than x[_-1]

@ianlancetaylor
Copy link
Contributor

It seems that the current proposal is that len may appear in an index expression, without arguments or parentheses, to refer to the length of the innermost value (a slice, array, or map) being indexed.

The next question would be: how often does this case occur? It should be possible to search some corpus of Go code for how often we see x[...len(x)...]. Anybody want to write a little program to find expressions of this sort?

@IanTayler
Copy link

@ianlancetaylor I wrote a quick regex hack in python to find single-line examples of this pattern. Found 799 examples in 379 files in the /src/ directory of this repo (master branch as of commit d2c039f).

The command I ran was python path/to/lenslice.py $(find path/to/go/src -type f | grep "go$")

It only finds single-line examples of this and it can have some false-positives (although in a quick manual scan of the results I only found a couple of false positives, both as commented code). It may still help us get a general idea of how common (or not) this pattern is.

Here is the quick hack as a gist: https://gist.github.com/IanTayler/1db93fe558134c7b5f540ede13045725

And here are the results of running the command for find src -type f | grep "go$" files in this repo, as a pastebin: https://pastebin.com/8tK0bg2j

@ianlancetaylor
Copy link
Contributor

@IanTayler Thanks for doing that.

One thing I see right off the bat is that quite a few of the expressions have multiple len calls, such as the very first one which says lastRepeat[:len(lastRepeat)-len(after)]. With this proposal that would become lastRepeat[:len-len(after)]. Now I am much less comfortable with this idea.

@cespare
Copy link
Contributor

cespare commented Nov 4, 2019

@ianlancetaylor to me, that seems an argument in favor of @bcmills's _ proposal.

lastRepeat[:_-len(after)]

@ianlancetaylor
Copy link
Contributor

To me _ means "ignore the value", so I'm not sure I buy @bcmills 's suggestion that we should treat it as "elide the value." I would be slightly happier with ..., which at least already stands in for the length when used as [...]int{}.

@mdaliyan
Copy link

mdaliyan commented Nov 5, 2019

To me _ means "ignore the value", so I'm not sure I buy @bcmills 's suggestion that we should treat it as "elide the value." I would be slightly happier with ..., which at least already stands in for the length when used as [...]int{}.

Agreed with the first part that says _ means ignore in Go already. As I said before, _ is something that needs explanation to a newcomer.

@alanfo
Copy link

alanfo commented Nov 5, 2019

Well, len still looks better to me than a symbol though I agree that ... would be more appropriate than _.

However, I still have doubts about whether it's worth doing anything here. In cases where a slice has a long name or its length is used more than once in a piece of code, I often cache the length in a simple variable and then use that instead. This is, in effect, what len (or a symbol) would be doing - albeit implicitly - but has the advantage that the variable can be used anywhere and not just in an index expression.

Also, as @ianlancetaylor intimated earlier, generic library functions would allow many of these cases to be expressed in a simple intuitive way. The example we're stuck on at the moment could be solved with slices.Drop(lastRepeat, len(after)).

@bcmills
Copy link
Contributor

bcmills commented Nov 5, 2019

That's a good point about '_' meaning “ignore” rather than “infer”. I agree that ... is a better fit.

@IanTayler
Copy link

IanTayler commented Nov 5, 2019

... does not immediately suggest "length of the surrounding slice" to me, in the way len does. I agree with [:len-len(after)] being a bit troublesome, but I also don't like ... or _. Both seem like the kind of thing I hate seeing when I'm learning a new language.

So my vote considering the current options (negative indices, _, len and ...) is that we're better off as we are now.

Maybe len() is okay. That one wouldn't bother me as much.

@jimmyfrasche
Copy link
Member

If something like this goes in, I'm 👍 specifically on len() as in s[len()-1].

It's clearer than s[len-1] or s[len(...)-1] imo.

(It's been mentioned a couple times but always in posts with other proposals so just 👍'ing those is ambiguous.)

@alanfo
Copy link

alanfo commented Nov 6, 2019

If something like this goes in, I'm +1 specifically on len() as in s[len()-1].

Yes, I agree with that.

Although I prefer len to using a symbol, it seems incomplete (it is after all a function) and len() therefore looks better. len(...) is too long.

@ianlancetaylor
Copy link
Contributor

The examples shown in #33359 (comment) show that using len with no arguments can be rather confusing in real code. Using len() is not much better, as it still unclear at first glance what length is being taken. Symbols like _ and ... are not particularly clear. Nothing seems like a real improvement on the current code.

For these reasons this is a likely decline. Leaving open for four weeks for final comments.

-- for @golang/proposal-review

@donutloop
Copy link
Contributor

The most common case here is probably len(s) - 1. It's worth noting that if we had generics, we could have library functions like slices.Last(s) and slices.Pop(s) that would work with any slice type. That might remove many of the cases where this is valuable.

I agree with @deanveloper that s[-1] does not click for me.

In my point of view. This slices.Last(s) and slices.Pop(s) is very clean and good understandable. It be could be even just Last(s)and Pop(s). Of course! The old behavior might still cause some confusion. Honestly... I didn’t even know that this is possible with slices.

@ianlancetaylor
Copy link
Contributor

No change in consensus.

@pjebs
Copy link
Contributor Author

pjebs commented Apr 6, 2020

I implemented it in v1.0.3 of https://github.com/rocketlaunchr/igo Go transpiler

@golang golang locked and limited conversation to collaborators Apr 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
dotdotdot ... FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal Proposal-FinalCommentPeriod v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests