Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for chunks_exact/_mut; slice chunks with exact size #47115

Closed
1 task done
sdroege opened this issue Jan 2, 2018 · 61 comments · Fixed by #55178
Closed
1 task done

Tracking issue for chunks_exact/_mut; slice chunks with exact size #47115

sdroege opened this issue Jan 2, 2018 · 61 comments · Fixed by #55178
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@sdroege
Copy link
Contributor

sdroege commented Jan 2, 2018

This is inspired by ndarray and generally seems to allow llvm to remove more bounds checks in the code using the iterator (because the slices will always be exactly the requested size), and doesn't require the caller to add additional checks.

A PR adding these for further discussion will come in a bit.

Open questions:

  • Should the new iterators panic if the slice is not divisible by the chunk_size, or omit any leftover elements.
    The latter is implemented right now and very similar to how zip works and @shepmaster even argues that without this, this iterator is kind of useless and the optimization should be implemented as part of the normal chunks iterator (which seems non-trivial, see ).
    Omission of leftover elements is also how this iterator is implemented in ndarray (but far more general).
A function for getting access to the remainder exists on the iterator, similar to how `slice::Iter` and `slice::IterMut` give access to the tail (note: the remainder are the odd elements that don't completely fill a chunk, it's not the tail!)
**The majority of people (who spoke up here) seem to prefer the non-panicking behaviour**

- [x] Should it be called `exact_chunks` or `chunks_exact`? The former is how it's called in `ndarray`, the latter is [potentially more discoverable](https://github.com/rust-lang/rust/issues/47115#issuecomment-403090815) in e.g. IDEs.
**It was renamed to `chunks_exact`**
@sdroege
Copy link
Contributor Author

sdroege commented Jan 2, 2018

Example can be found here

The relevant part with differences in the assembly is

before:

.LBB4_24:
  cmp r11, 4
  mov eax, 4
  cmovb rax, r11
  test rbx, rbx
  je .LBB4_18
  cmp rbx, 4
  mov edx, 4
  cmovb rdx, rbx
  test r13, r13
  je .LBB4_18
  mov qword ptr [rbp - 96], rax
  mov qword ptr [rbp - 48], rsi
  mov qword ptr [rbp - 56], r9
  cmp r11, 3
  jbe .LBB4_27
  mov qword ptr [rbp - 96], rdx
  mov qword ptr [rbp - 48], rsi
  mov qword ptr [rbp - 56], r9
  cmp rbx, 3
  jbe .LBB4_29
  cmp rax, 1
  je .LBB4_39
  lea r10, [r13 + rax]
  sub r11, rax
  lea r12, [r15 + rdx]
  sub rbx, rdx
  cmp rax, 3
  jb .LBB4_41
  je .LBB4_42
  movzx r14d, byte ptr [r13]
  movzx r8d, byte ptr [r13 + 1]
  movzx eax, byte ptr [r13 + 2]
  imul r13d, eax, 19595
  imul edi, r8d, 38470
  imul eax, r14d, 7471
  add eax, edi
  add eax, r13d
  shr eax, 16
  mov byte ptr [r15], al
  cmp rdx, 1
  je .LBB4_44
  mov byte ptr [r15 + 1], al
  cmp rdx, 3
  jb .LBB4_45
  mov byte ptr [r15 + 2], al
  je .LBB4_46
  mov byte ptr [r15 + 3], 0
  test r11, r11
  mov r15, r12
  mov r13, r10
  jne .LBB4_24

after:

.LBB5_18:
  test rsi, rsi
  je .LBB5_20
  add rdx, -4
  movzx r10d, byte ptr [rsi]
  movzx eax, byte ptr [rsi + 1]
  movzx ebx, byte ptr [rsi + 2]
  lea rsi, [rsi + 4]
  imul r13d, ebx, 19595
  imul eax, eax, 38470
  imul ebx, r10d, 7471
  add ebx, eax
  add ebx, r13d
  shr ebx, 16
  mov byte ptr [rcx], bl
  mov byte ptr [rcx + 1], bl
  mov byte ptr [rcx + 2], bl
  mov byte ptr [rcx + 3], 0
  lea rcx, [rcx + 4]
  cmp rdx, 4
  jae .LBB5_18

@bluss
Copy link
Member

bluss commented Jan 2, 2018

I don't want to derail your discussion too much. Const generics and value-level chunks both have their uses. I'm reminded of this existing implementation of the "const" kind of chunking, in this case in an iterator that actually allows access to the whole blocks and then the uneven tail at the end: BlockedIter. Note that a Block<Item=T> is an array of T.

@sdroege
Copy link
Contributor Author

sdroege commented Jan 2, 2018

Interesting, thanks for mentioning that. For my use case that would probably work more or less the same way, but it's slightly different indeed.

@bluss
Copy link
Member

bluss commented Jan 2, 2018

BlockedIter was developed while looking at exactly the hand off between the blocks and the elementwise tail; the idea was to avoid some of the loss that otherwise shows up in code that converts between slices and slice iterators. In this case it's the same pointer being bumped through the whole iteration.

@bluss
Copy link
Member

bluss commented Jan 2, 2018

The chunks iterators are a good candidate for zip specialization (TrustedRandomAccess trait)

@sdroege
Copy link
Contributor Author

sdroege commented Jan 2, 2018

True. I'll add that in a bit, as a separate PR for the existing chunked iterators and as a separate commit for the new ones.

@sdroege
Copy link
Contributor Author

sdroege commented Jan 2, 2018

I forgot to add some benchmark results earlier. This is with the code from #47115 (comment) and running on a 1920*1080*4 byte slice. Basically 2.46x as fast.

running 2 tests
test tests::bench_with_chunks       ... bench:   7,702,902 ns/iter (+/- 177,747)
test tests::bench_with_exact_chunks ... bench:   3,132,468 ns/iter (+/- 202,032)

kennytm added a commit to kennytm/rust that referenced this issue Jan 4, 2018
…s, r=kennytm

Implement TrustedRandomAccess for slice::{Chunks, ChunksMut, Windows}

As suggested by @bluss in rust-lang#47115 (comment)
bors added a commit that referenced this issue Jan 5, 2018
Implement TrustedRandomAccess for slice::{Chunks, ChunksMut, Windows}

As suggested by @bluss in #47115 (comment)
@bluss bluss added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Jan 13, 2018
@bluss bluss changed the title Add slice::ExactChunks and ::ExactChunksMut that always returns slices with the requested size Tracking issue for exact_chunks/_mut; slice chunks with exact size Jan 13, 2018
sdroege added a commit to sdroege/rust that referenced this issue Jan 13, 2018
These guarantee that always the requested slice size will be returned
and any leftoever elements at the end will be ignored. It allows llvm to
get rid of bounds checks in the code using the iterator.

This is inspired by the same iterators provided by ndarray.

See rust-lang#47115
@shepmaster
Copy link
Member

The latter is implemented right now and very similar to how zip works and @shepmaster even argues that without this, this iterator is kind of useless and the optimization should be implemented as part of the normal chunks iterator

Thinking of this from another direction, if either Rust or LLVM were to magically figure out how to perform this optimization in the case of chunks but chunks_exact didn't perform the length truncation, we'd then have two functions in the standard library that did the exact same thing. This would be annoying as a consumer.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Jan 14, 2018
Add slice::ExactChunks and ::ExactChunksMut iterators

These guarantee that always the requested slice size will be returned
and any leftoever elements at the end will be ignored. It allows llvm to
get rid of bounds checks in the code using the iterator.

This is inspired by the same iterators provided by ndarray.

Fixes rust-lang#47115

I'll add unit tests for all this if the general idea and behaviour makes sense for everybody.
Also see rust-lang#47115 (comment) for an example what this improves.
kennytm added a commit to kennytm/rust that referenced this issue Jan 15, 2018
Add slice::ExactChunks and ::ExactChunksMut iterators

These guarantee that always the requested slice size will be returned
and any leftoever elements at the end will be ignored. It allows llvm to
get rid of bounds checks in the code using the iterator.

This is inspired by the same iterators provided by ndarray.

Fixes rust-lang#47115

I'll add unit tests for all this if the general idea and behaviour makes sense for everybody.
Also see rust-lang#47115 (comment) for an example what this improves.
@sdroege
Copy link
Contributor Author

sdroege commented Jan 15, 2018

This shouldn't have been closed by the merge, can someone reopen it? I don't have the permissions for that it seems

@SimonSapin
Copy link
Contributor

The libs team discussed this and the consensus was to stabilize this with the methods panicking before returning an iterator if the slice’s length is not a multiple of the requested chunk size. This is consistent with e.g. [T]::copy_from_slice panicking on unequal sizes. Callers can take a sub-slice before calling these methods if they wish to ignore extra items.

@rfcbot fcp merge

@sdroege sdroege changed the title Tracking issue for exact_chunks/_mut; slice chunks with exact size Tracking issue for chunks_exact/_mut; slice chunks with exact size Sep 25, 2018
pietroalbini added a commit to pietroalbini/rust that referenced this issue Sep 25, 2018
@sdroege
Copy link
Contributor Author

sdroege commented Sep 26, 2018

With that done, how do we go from here to stabilization?

@alexcrichton
Copy link
Member

The next step would be FCP, but this issue is already in FCP. It has a blocking concern that would need to be resolved by @SimonSapin to make progress (assuming there's consensus about what to do with panicking)

@alexcrichton
Copy link
Member

Ok we discussed this a bit at libs triage, and the conclusion is that we'd like to recheck that the concerns with the original panicking API are resolved with today's implementation. To recap, today's implementation (on nightly) doesn't panic if the slice doesn't have an exact multiple of the length, but rather it's silently ignored. There are inherent methods on each iterator, though, to pull out the remainder.

@shepmaster does this resolve your original concern?

Or others as well, any opposition to having the current semantics be stabilized?

@shepmaster
Copy link
Member

@shepmaster does this resolve your original concern?

I am happy with the current external behavior of the function.

@alexcrichton
Copy link
Member

Ok great! @SimonSapin can you @rfcbot resolve panicking ?

@sdroege
Copy link
Contributor Author

sdroege commented Oct 17, 2018

Are there any new concerns? @SimonSapin

@SimonSapin
Copy link
Contributor

@rfcbot resolve panicking

@SimonSapin
Copy link
Contributor

Ping checkbox @aturon, @sfackler, or @withoutboats

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Oct 17, 2018
@rfcbot
Copy link

rfcbot commented Oct 17, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@alexcrichton
Copy link
Member

Ok! It's been quite awhile here so I think it's ok to shirt circuit the FCP slightly, @sdroege want to send the stabilization PR?

@SimonSapin
Copy link
Contributor

Should we consider #54580 to be "trivially enough" similar to this to stabilize at the same time?

@sdroege
Copy link
Contributor Author

sdroege commented Oct 18, 2018

@sdroege want to send the stabilization PR?

Yeah, preparing a PR now. I'll add rchunks in a separate commit into the same PR if it's agreed that it should be part of that. But I guess first of all a review of #54580 should be done.

@sdroege
Copy link
Contributor Author

sdroege commented Oct 18, 2018

There's now #55178 for the stabilization of this here (but not rchunks).

bors added a commit that referenced this issue Oct 18, 2018
Add slice::rchunks(), rchunks_mut(), rchunks_exact() and rchunks_exact_mut()

These work exactly like the normal chunks iterators but start creating
chunks from the end of the slice.

----

The new iterators were motivated by a [comment](#47115 (comment)) by @DutchGhost.

~~~This currently includes the commits from #54537 to not have to rename things twice or have merge conflicts. I'll force-push a new version of the branch ones those are in master.~~~

Also the stabilization tracking issue is just some number right now. I'll create the corresponding issue once this is reviewed and otherwise mergeable.

cc @DutchGhost
sdroege added a commit to sdroege/rust that referenced this issue Oct 18, 2018
kennytm added a commit to kennytm/rust that referenced this issue Oct 19, 2018
…lexcrichton

Stabilize slice::chunks_exact(), chunks_exact_mut(), rchunks(), rchunks_mut(), rchunks_exact(), rchunks_exact_mut()

Fixes rust-lang#47115, rust-lang#55177
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Oct 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet