Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: treat s[-1] as equivalent to s[len(s)-1] #25594

Open
josharian opened this issue May 27, 2018 · 23 comments
Open

proposal: spec: treat s[-1] as equivalent to s[len(s)-1] #25594

josharian opened this issue May 27, 2018 · 23 comments
Labels
LanguageChange Suggested changes to the Go language LanguageChangeReview Discussed by language change review committee Proposal
Milestone

Comments

@josharian
Copy link
Contributor

Overview

This is a backwards-compatible language change proposal. It will not be as thorough as it could be, since I do not think it should be adopted. I am writing it up merely for future reference.

I propose to treat negative constant literals in slicing and indexing expressions as offsets from the end of the slice or array. For example:

  • s[-1] is equivalent to s[len(s)-1]
  • s[:-1] is equivalent to s[:len(s)-1]
  • s[:-2:-1] is equivalent to s[:len(s)-2:len(s)-1]

The motivation is to improve readability. Slice and index expressions like this occur commonly when treating slices as stacks.

Consider this code from the compiler:

func (f *Func) newPoset() *poset {
	if len(f.Cache.scrPoset) > 0 {
		po := f.Cache.scrPoset[len(f.Cache.scrPoset)-1]
		f.Cache.scrPoset = f.Cache.scrPoset[:len(f.Cache.scrPoset)-1]
		return po
	}
	return newPoset()
}

Using the proposed syntactic sugar, it would read:

func (f *Func) newPoset() *poset {
	if len(f.Cache.scrPoset) > 0 {
		po := f.Cache.scrPoset[-1]
		f.Cache.scrPoset = f.Cache.scrPoset[:-1]
		return po
	}
	return newPoset()
}

Scope

The proposed sugar would only apply to negative constant literals.

var b [10]byte
var v int = -1

const c = -1

var (
	_ = b[v]  // unchanged: results in runtime panic
	_ = b[c]  // unchanged: results in compile time error: index bounds out of range
	_ = b[-1] // new: evaluates to b[len(b)-1]
)

The rationale for b[v] to panic at runtime is that negative indices often arise due to overflow and programmer error.

The same rationale discourages allowing constant expressions such as b[c] as syntactic sugar. Constant expressions can be complex, non-local, and build-tag controlled.

However, a constant literal displays clear, obvious, local intent, and overflow-free, and affords little room for programmer error.

Order of evaluation and side-effects

Currently, the slice/array is evaluated before any index or slice indices. See https://play.golang.org/p/kTr9Az5HoDj.

This allows a natural order of evaluation for the proposal. Given expr[-1] or expr[:-1], expr is evaluated exactly once, and its length is used in subsequent len(expr)-c calculations.

Data

A quick-and-dirty AST parsing program examining slice expressions suggests that such expressions occur, but not with particularly high frequency.

Running over GOROOT yields that 2.51% of slice expressions could be rewritten to use this syntactic sugar. Running over my GOPATH yields 3.17%. Searching for candidate index expressions yields 0.35% and 0.91% respectively.

This is an underestimate. In many cases, for clarity, the code will already have been rewritten like:

n := len(s)-1
x = s[n]
s = s[:n]

And all index expressions were considered candidates for this analysis, which includes map accesses and assignments.

Nevertheless, this analysis suggests that this sugar is unlikely to be transformative in how Go code is written, and therefore probably does not pull its weight.

@josharian josharian added LanguageChange Suggested changes to the Go language v2 An incompatible library change Proposal labels May 27, 2018
@josharian josharian added this to the Go2 milestone May 27, 2018
@mvdan
Copy link
Member

mvdan commented May 27, 2018

Similar proposal in the past: #20176

@ianlancetaylor
Copy link
Member

Also #11245, but this variant is, on the one hand, better because because it doesn't accept variables, but, on the other hand, worse because why should we only accept constants?

@josharian
Copy link
Contributor Author

@mvdan yes, this proposal was follow-up to a suggestion mooted during the conversation in #20176. I should have noted that.

@ianlancetaylor indeed. The best argument I see for the discrepancy is one of practicality: variables (and constant expressions) and constant literals should be treated differently because doing so affords increased readability without increasing the probability of silent bugs. But I agree that that isn’t a particularly compelling argument, given the small proportion of code that matches the pattern.

@cznic
Copy link
Contributor

cznic commented May 27, 2018

How does adding an exception to an expression (-1 in some cases is to be read as len(foo)-1) improve readability? Please explain, for me it's quite the opposite.

@josharian
Copy link
Contributor Author

@gopherbot please close this issue in 48 hours.

(That obviously doesn’t work. But I wish it did. I’ll be the bot.)

@josharian
Copy link
Contributor Author

@cznic the same way all syntactic sugar does: By providing a mental shortcut for common uses. Instead of having to parse out everything and confirm equivalence of the inner/outer expressions, I see [-1] and think “final element”.

@as
Copy link
Contributor

as commented May 27, 2018

That would be more natural if the indices of the slice s were treated as elements in a ring of order len(s).

@DeedleFake
Copy link

I don't think is usually that much of a big deal usually, but I've wanted this feature sometimes for cases where I'm indexing into a slice that isn't bound to a variable. For example, given func Example() []string, I've occasionally wanted to do something like Example()[:-1] for various reasons. Similarly, if the slice has a really long identifier for some reason, it definitely improves readability if using the name twice on the same line would make it annoyingly long. Of course, in either of these cases you could probably just stick it in a variable most of the time, but there are occasionally instances where that's kind of annoying.

Not the best examples, perhaps, but I think that it could be useful in these cases.

@josharian
Copy link
Contributor Author

I'd like to re-open this. I wrote it six years ago(!) only to dismiss it. But over the years, talking to gophers, this comes up over and over as a common ergonomics complaint, like min and max (which I'm thrilled to have).

I propose the narrowest possible addition: special handling for the literal -1 in indexing and slicing. This gets the vast majority of the value, while leaving lots of options for future expansion (to other negative literals, to constant expressions).

@josharian josharian reopened this Aug 7, 2024
@jimmyfrasche
Copy link
Member

Now that there's a slices package we could have something like slices.At(s, -1) which is a bit longer but has the benefit of not requiring a language change and being easier to spot

@josharian
Copy link
Contributor Author

Or slices.Last(s). But then you also want slices.Pop(s) for the indexing. But should that do cap as well or just length? Should it return the final element as well as reslicing it? The space of useful operations is quite large here, which is why having access to raw indexing/slicing operators is valuable; they provide readable, orthogonal ways to express all of them.

@josharian
Copy link
Contributor Author

Lots of relevant discussion at #53510 (thanks to @dsnet for pointing this out).

@golang golang unlocked this conversation Aug 7, 2024
@seankhliao seankhliao added LanguageChangeReview Discussed by language change review committee and removed v2 An incompatible library change FrozenDueToAge labels Aug 7, 2024
@dsnet
Copy link
Member

dsnet commented Aug 7, 2024

To summarize some of my concerns with slices.Last:

  • It introduces an inconsistency where the first element is obtained with s[0], while the last is obtained with slices.Last(s).
  • It's unclear how to handle mutability. If slices.Last returns just E, then this is more natural to operate with, but if it returns *E, then you can mutate the element by storing into the pointer, but always handling a pointer can be fairly unnatural.

I support a language change that allows indexing with negative literals. Restricting it to just literals avoid any possibility of arithmetic bugs since there's no arithmetic involved. The presence of s[-1] is clear user intent to get the last element.

@dsnet
Copy link
Member

dsnet commented Aug 7, 2024

Copying over some of my usage analytics from #53510:

Frequency of constant slice indexes relative to start or end:

Index Occurrences Percentage
0 4953854 41.684%
1 2192516 18.449%
2 1135707  9.556%
3 1370480 11.532%
4 521199  4.386%
other 1391696 11.710%
-5 196  0.002%
-4 443  0.004%
-3 1682  0.014%
-2 10336  0.087%
-1 306168  2.576%

Even though obtaining the last element is an order-of-magnitude less frequent than getting the first element, there are still ~300k occurences.

Also, getting the last element is still 1/4 as common as arbitrary indexing expressions.

@atdiar
Copy link

atdiar commented Aug 7, 2024

I'm a bit on the fence on this.
It makes sense to me especially if we consider the wrapping behavior of uints.
But people are probably used to seeing len(s) - 1 anyway.

My worry is the curse of knowledge.

Is there a way to check that someone who has never coded in go, looking at a codebase, can intuit what the negative slice index does?

Other than that, if it's only literals as proposed, I guess that could work. But that's never something that truly bit me in my own code. Or at least I don't remember.

@zigo101
Copy link

zigo101 commented Aug 8, 2024

Should slicing syntax also support negative indexes? Such as aSlice[0:-1]?

@dsnet
Copy link
Member

dsnet commented Aug 8, 2024

I believe yes.

The original proposal said that:

s[:-1] is equivalent to s[:len(s)-1]

and I would assume aSlice[0:-1] is equivalent to aSlice[:-1].

Truncating off the last element is also a fairly common operation that I've personally needed to do (and yet another reason why slices function probably can't easily help with).

@doggedOwl
Copy link

Is there a way to check that someone who has never coded in go, looking at a codebase, can intuit what the negative slice index does?

it's familiar enough from other languages like python, ruby etc and it's one the main question for "why doesn't support" in languages that don't.
It's easy to remember and explain if you think of it as wrapping around.
and while i think this particular one is easy to intuit not everything should necessarily be easy to intuit, it's enough that it is clear and one search/spec consultation away.

@atdiar
Copy link

atdiar commented Aug 8, 2024

@doggedOwl not convinced. Searching python negative index and the first result is a stackoverflow question:
https://stackoverflow.com/questions/11367902/negative-list-index

In this day and age of advanced code completion, we could avoid having to explain this. len(s) - 1 is a common idiom for both beginners and more advanced programmers and does not require more mental resources to think in terms of wrapping.
Regardless of what other languages may have implemented.

Then again, just like how we have any instead of interface{}, that may work. But the latter is a bit more specific maybe.

@jimmyfrasche
Copy link
Member

I thought about what the semantics should be for various library functions then realized I've written these things enough times and threw it all in a repo so I can use it later https://pkg.go.dev/github.com/jimmyfrasche/sidx I wouldn't expect most of that to be in std.

@Owyn
Copy link

Owyn commented Aug 19, 2024

Golang is supposed to be simple and concise, that was the idea as far as I know,
so I was rather baffled when I learned that it's not up to par with Python in the area of indices and still uses some low-level stuff like s[len(s)-1] instead of a simple s[-1] (which was super obvious and pretty handy in Python)

And about stackoverflow - I was using it pretty actively first year or so after I discovered programming as a kid, but now there's nothing to ask there for me, not because I know everything (😅), but because I discovered google and search function on that very stackoverflow,

and now looking at the absolute most of questions there it seems obvious to me that those questions are asked by people who haven't even touched any kind of a "tour" for the language they are asking their questions for,

and I doubt that after learning something new and handy - they were disappointed about the learning curve or anything like that, because personally when I discovered negative indices in Python I was happy to learn about them and then use them instead of cumbersome low-level constructions of calculating length minus 1

@seankhliao
Copy link
Member

note this was declined by review in #33359

@apparentlymart
Copy link

apparentlymart commented Dec 2, 2024

One interesting thing about the slices.At variant is that it could be defined as resolving the index modulo the length, so that indices greater than the length of the slice would "wrap around" back to the start again. That sort of function is useful if you want to distribute a smaller number of things somewhat-evenly over a larger number of things.

Negative indices are therefore just the extension of that principle to lower numbers, wrapping around to the end of the slice.

However, I suppose to be an adequate answer to the original use case it would also need something like slices.SliceAt for taking a sub-slice with similar handling of the indices.

package slices

// At returns the element from s at the given index modulo the slice length.
func At[Slice ~[]E, E any](s Slice, index int) E

// SliceAt returns a subslice of s starting at start modulo the slice length and ending at
// start+length modulo the slice length.
//
// If the requested subslice would cross from the end of s back to the start of s then
// the result is a slice over a newly-allocated backing array. Otherwise the result
// shares the backing array of s.
func SliceAt[Slice ~[]E, E any](s Slice, start, length int) Slice

// (no variation for the three-operand slice because it doesn't really make sense
// to specify capacity modulo length.)

Both of these functions effectively treat the slice as an infinite series containing repetitions of slice s in both directions.

The quirk of SliceAt sometimes causing a new allocation and copying but sometimes behaving like a normal slice is admittedly awkward. Perhaps it would be better to require (start % len(s)) <= ((start + length) % len(s)) and length <= len(s), panicking if not, so that the result can be guaranteed to always share the same backing array, but that then adds an additional panicking case to carefully avoid, and cancels the mental model of an infinitely-repeating sequence. 🤷‍♂️

I will pre-concede that earlier comments suggest that taking the final element is considerably more common than any other variation that isn't an already in-bounds index and so this generality might not actually be justified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LanguageChange Suggested changes to the Go language LanguageChangeReview Discussed by language change review committee Proposal
Projects
None yet
Development

No branches or pull requests