Documentation for out-of-bound accesses with single index vs slices #943

Closed
Poshi opened this issue Feb 11, 2022 · 5 comments

Comments

@Poshi
Contributor

Poshi commented Feb 11, 2022

According to the documentation:

out-of-bounds index accesses are errors, but out-of-bounds slice accesses result in trimming the indices, resulting in a short string or even the empty string

This is somewhat ambiguous: if out-of-bounds index accesses are errors, it is surprising that a slice access does not trigger the same error. Would it not make sense to return the same as in the slice access: an absent value?

Also, the examples below show:

mlr -n put '
  end {
    x = "abcde";
    print x[1:2];
    print x[1:6];
    print x[10:20];
  }
'
ab


If we accept that out-of-bounds slice accesses can result in short strings, shouldn't the second access (x[1:6]) result in the short string "abcde"? (It is short because the slice asks for a 6-character string but the result contains only 5 characters.)

@johnkerl
Owner

johnkerl commented Feb 11, 2022

@Poshi you're right, the documentation is not clear.

I very intentionally imitated Python's behavior here, but didn't document that fact ...
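
For reference, here is a minimal sketch of the Python behavior being imitated: an out-of-bounds single index raises an error, while an out-of-bounds slice is trimmed. It uses only built-in string indexing; the variable names are illustrative.

```python
x = "abcde"

# Out-of-bounds single index: Python raises IndexError.
try:
    x[10]
    index_result = "no error"
except IndexError:
    index_result = "IndexError"

# Out-of-bounds slices: Python trims the bounds instead of raising.
partly_out = x[0:6]   # upper bound one past the end
fully_out = x[9:20]   # both bounds past the end

print(index_result)      # IndexError
print(repr(partly_out))  # 'abcde'
print(repr(fully_out))   # ''
```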

@johnkerl johnkerl changed the title from "Out of bound accesses" to "Documentation for out-of-bound accesses with single index vs slices" Feb 11, 2022
@johnkerl
Owner

OK, I didn't even entirely mimic Python's behavior as intended. :(

Keeping in mind Python indices are 0-up, with lower-inclusive-upper-exclusive, while Miller indices are 1-up, with lower-inclusive-upper-inclusive:

$ python
>>> x = "abcde"

>>> x[0:2]
'ab'

>>> x[0:5]
'abcde'

>>> x[0:6]
'abcde'

>>> x[9:20]
''

$ mlr repl
[mlr] x = "abcde"

[mlr] x[1:2]
"ab"

[mlr] x[1:5]
"abcde"

[mlr] x[1:6]
""

[mlr] x[10:20]
""

So for the first thing

Would it not make sense to return the same as in the slice access: an absent value?

-- I stand by my intention to mimic Python here.

For the second thing

If we accept that out-of-bounds slice accesses can result in short strings, shouldn't the second access (x[1:6]) result in the short string "abcde"?

-- you are quite right, and I'm embarrassed to realize I missed this. :(
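
The intended semantics discussed above (1-up, inclusive bounds, with out-of-bounds slice indices trimmed to the ends of the string) can be sketched in Python. This is only an illustration, not Miller's implementation: `slice_1up_inclusive` is a hypothetical helper, and it handles positive indices only.

```python
def slice_1up_inclusive(seq, lo, hi):
    """Slice seq using 1-up, inclusive bounds, trimming out-of-bounds indices.

    Hypothetical helper for illustration; negative indices are not handled.
    """
    n = len(seq)
    lo = max(lo, 1)  # trim the lower bound to the first element
    hi = min(hi, n)  # trim the upper bound to the last element
    if lo > hi:
        return seq[0:0]  # nothing left after trimming: empty result
    # Convert 1-up inclusive bounds to Python's 0-up half-open bounds.
    return seq[lo - 1:hi]


x = "abcde"
print(repr(slice_1up_inclusive(x, 1, 2)))    # 'ab'
print(repr(slice_1up_inclusive(x, 1, 6)))    # 'abcde'
print(repr(slice_1up_inclusive(x, 10, 20)))  # ''
```

Under this sketch, x[1:6] yields the short string "abcde" rather than the empty string, which is the result asked for in the original question.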

@johnkerl johnkerl added the bug label Feb 21, 2022
@johnkerl
Owner

For arrays, I did what I intended:

[mlr] x = [1,2,3,4,5]

[mlr] x[1:2]
[1, 2]

[mlr] x[1:5]
[1, 2, 3, 4, 5]

[mlr] x[1:6]
[1, 2, 3, 4, 5]

[mlr] x[10:20]
[]

@johnkerl
Owner

@Poshi please let me know what you think of #960.

@Poshi
Contributor Author

Poshi commented Feb 22, 2022

@Poshi please let me know what you think of #960.

Just checked the three commits. Looks good! The new code seems OK. I'm not sure whether some checks can be reduced, but they look fine. (I was also lost for a couple of minutes when the index went down to zero, until I realized the difference between "index" and "Zindex".)
And regarding the documentation, adding quotes to the examples makes the empty strings stand out! Good!
