Skip to content

Commit

Permalink
Rollup merge of rust-lang#119917 - Zalathar:split-off, r=cuviper
Browse files Browse the repository at this point in the history
Remove special-case handling of `vec.split_off(0)`

rust-lang#76682 added special handling to `Vec::split_off` for the case where `at == 0`. Instead of copying the vector's contents into a freshly-allocated vector and returning it, the special-case code steals the old vector's allocation, and replaces it with a new (empty) buffer with the same capacity.

That eliminates the need to copy the existing elements, but comes at a surprising cost, as seen in rust-lang#119913. The returned vector's capacity is no longer determined by the size of its contents (as would be expected for a freshly-allocated vector), and instead uses the full capacity of the old vector.

In cases where the capacity is large but the size is small, that results in a much larger capacity than would be expected from reading the documentation of `split_off`. This is especially bad when `split_off` is called in a loop (to recycle a buffer), and the returned vectors have a wide variety of lengths.

I believe it's better to remove the special-case code, and treat `at == 0` just like any other value:
- The current documentation states that `split_off` returns a “newly allocated vector”, which is not actually true in the current implementation when `at == 0`.
- If the value of `at` could be non-zero at runtime, then the caller has already agreed to the cost of a full memcpy of the taken elements in the general case. Avoiding that copy would be nice if it were close to free, but the different handling of capacity means that it is not.
- If the caller specifically wants to avoid copying in the case where `at == 0`, they can easily implement that behaviour themselves using `mem::replace`.

Fixes rust-lang#119913.
  • Loading branch information
matthiaskrgr authored Jan 26, 2024
2 parents a5b60c9 + a655558 commit 772e80a
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 14 deletions.
8 changes: 0 additions & 8 deletions library/alloc/src/vec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2204,14 +2204,6 @@ impl<T, A: Allocator> Vec<T, A> {
assert_failed(at, self.len());
}

if at == 0 {
// the new vector can take over the original buffer and avoid the copy
return mem::replace(
self,
Vec::with_capacity_in(self.capacity(), self.allocator().clone()),
);
}

let other_len = self.len - at;
let mut other = Vec::with_capacity_in(other_len, self.allocator().clone());

Expand Down
24 changes: 18 additions & 6 deletions library/alloc/tests/vec.rs
Original file line number Diff line number Diff line change
Expand Up @@ -958,23 +958,35 @@ fn test_append() {
#[test]
fn test_split_off() {
let mut vec = vec![1, 2, 3, 4, 5, 6];
let orig_ptr = vec.as_ptr();
let orig_capacity = vec.capacity();
let vec2 = vec.split_off(4);

let split_off = vec.split_off(4);
assert_eq!(vec, [1, 2, 3, 4]);
assert_eq!(vec2, [5, 6]);
assert_eq!(split_off, [5, 6]);
assert_eq!(vec.capacity(), orig_capacity);
assert_eq!(vec.as_ptr(), orig_ptr);
}

#[test]
fn test_split_off_take_all() {
let mut vec = vec![1, 2, 3, 4, 5, 6];
// Allocate enough capacity that we can tell whether the split-off vector's
// capacity is based on its size, or (incorrectly) on the original capacity.
let mut vec = Vec::with_capacity(1000);
vec.extend([1, 2, 3, 4, 5, 6]);
let orig_ptr = vec.as_ptr();
let orig_capacity = vec.capacity();
let vec2 = vec.split_off(0);

let split_off = vec.split_off(0);
assert_eq!(vec, []);
assert_eq!(vec2, [1, 2, 3, 4, 5, 6]);
assert_eq!(split_off, [1, 2, 3, 4, 5, 6]);
assert_eq!(vec.capacity(), orig_capacity);
assert_eq!(vec2.as_ptr(), orig_ptr);
assert_eq!(vec.as_ptr(), orig_ptr);

// The split-off vector should be newly-allocated, and should not have
// stolen the original vector's allocation.
assert!(split_off.capacity() < orig_capacity);
assert_ne!(split_off.as_ptr(), orig_ptr);
}

#[test]
Expand Down

0 comments on commit 772e80a

Please sign in to comment.