Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add RawVec to unify raw Vecish code #26955

Merged
merged 2 commits into from
Jul 18, 2015
Merged

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Jul 10, 2015

Per the top level comment:

A low-level utility for more ergonomically allocating, reallocating, and deallocating a
a buffer of memory on the heap without having to worry about all the corner cases
involved. This type is excellent for building your own data structures like Vec and VecDeque.
In particular:

  • Produces heap::EMPTY on zero-sized types
  • Produces heap::EMPTY on zero-length allocations
  • Catches all overflows in capacity computations (promotes them to "capacity overflow" panics)
  • Guards against 32-bit systems allocating more than isize::MAX bytes
  • Guards against overflowing your length
  • Aborts on OOM
  • Avoids freeing heap::EMPTY
  • Contains a ptr::Unique and thus endows the user with all related benefits

This type does not in anyway inspect the memory that it manages. When dropped it will
free its memory, but it won't try to Drop its contents. It is up to the user of RawVec
to handle the actual things stored inside of a RawVec.

Note that a RawVec always forces its capacity to be usize::MAX for zero-sized types.
This enables you to use capacity growing logic catch the overflows in your length
that might occur with zero-sized types.

However this means that you need to be careful when roundtripping this type
with a Box<[T]>: cap() won't yield the len. However with_capacity,
shrink_to_fit, and from_box will actually set RawVec's private capacity
field. This allows zero-sized types to not be special-cased by consumers of
this type.

Edit:
fixes #18726 and fixes #23842

@Gankra
Copy link
Contributor Author

Gankra commented Jul 10, 2015

CC @rust-lang/libs

@rust-highfive
Copy link
Collaborator

r? @huonw

(rust_highfive has picked a reviewer for you, use r? to override)

#[deprecated(since = "1.2.0",
reason = "replaced with deref coercions or Borrow")]
#[allow(deprecated)]
pub struct DerefString<'a> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I deleted this stuff since it's been deprecated for a while and porting it to the new stuff would be more effort than it's worth.

@reem
Copy link
Contributor

reem commented Jul 10, 2015

cc me

// this is still enough for space for what the user requested, because otherwise
// the computation of `2 * needed_size` would have overflowed `usize::MAX`.

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears to be dead code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I should probably just remove this...

@Gankra
Copy link
Contributor Author

Gankra commented Jul 11, 2015

Bench results:
https://gist.github.com/Gankro/4eed4b4842045114333f

Vec seems to be uniformly like 10% slower maybe? Our benches are so noisy...

if old_size >= MAX_MEMORY_SIZE { panic!("capacity overflow") }
let mut size = max(old_size, 2 * mem::size_of::<T>()) * 2;
if old_size > size || size > MAX_MEMORY_SIZE {
size = MAX_MEMORY_SIZE;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon reflection, this is just wrong... MAX_MEMORY_SIZE is in no way a multiple of size_of<T>? Also the capacity isn't updated...? Also this seems to be relying on * 2 wrapping?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that mem::size_of::<T>() <= (~0)/4 and MAX_MEMORY_SIZE-1 <= (~0)/2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arielb1 You still can't just set size to MAX_MEMORY_SIZE, right?

@Gankra
Copy link
Contributor Author

Gankra commented Jul 11, 2015

Added some extra micro-optimization in the same vein as Vec used to have. (updated the gist)

Now it looks like it's 10% faster than the old impl... but it looks like everything looked about that much faster this time (even unrelated stuff).

Benches. How do they work?

Still, this is fixing some soundness issues with the current implementation, and now it looks like it's basically a performance wash?

@bluss
Copy link
Member

bluss commented Jul 11, 2015

Even unrelated stuff? Cpu frequency scaling often ruins it for me. Hm I thought this looked so good.

But you're right, if I look at for example the LinkedList iteration benchmark as control, then it looks like the benchmark runs happened at different cpu speeds basically.

dealloc(self.allocation, self.cap);
}
}
for _x in self.by_ref() {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this compiles nowadays if you just remove the .by_ref()?

@petrochenkov
Copy link
Contributor

@gankro
I think all vectors, including zero-sized, should have maximum capacity of isize::MAX... just in case.
(Something like this is what I have in mind.)
Nobody will use huge vectors of zero-sized types anyway.

@Gankra
Copy link
Contributor Author

Gankra commented Jul 11, 2015

@petrochenkov we definitely support them today; this seems like a needless regression for a hypothetical future change. If that future change does in fact occur we could consider making the change then.

@petrochenkov
Copy link
Contributor

It's easier to make things stricter now, than it will be in the future. Besides, this restriction gives some potentially useful guarantees, for example, index difference never overflows. The restriction could be easily relaxed in the future if a single motivational example for large zero-sized vectors is found.
Anyway, this is not a big deal.

@Gankra
Copy link
Contributor Author

Gankra commented Jul 11, 2015

Oh I forgot to mention, this finally fixes #18726 (assuming someone fixed arrays).

@Gankra
Copy link
Contributor Author

Gankra commented Jul 12, 2015

@petrochenkov I've never seen "blind" index differences (where you want a signed result) before. What's the usecase look like?

@bors
Copy link
Contributor

bors commented Jul 12, 2015

☔ The latest upstream changes (presumably #26985) made this pull request unmergeable. Please resolve the merge conflicts.

}
}

/// Creates a RawVec with exactly the capacity and alignment requirements for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate "for" in this sentence.

@bors
Copy link
Contributor

bors commented Jul 14, 2015

☔ The latest upstream changes (presumably #27002) made this pull request unmergeable. Please resolve the merge conflicts.

@Gankra Gankra force-pushed the raw-vec branch 4 times, most recently from 48ca988 to d8a862c Compare July 14, 2015 20:14
@Manishearth
Copy link
Member

This breaks the debuginfo tests, I think the pretty printers need to be updated so that they can understand the new Vec/String.

cc @michaelwoerister

@Gankra
Copy link
Contributor Author

Gankra commented Jul 16, 2015

@bors r-

Oh shoot! I forgot that that was broken! I asked on IRC about that, didn't get a response, and then got distracted and forgot.

@Gankra
Copy link
Contributor Author

Gankra commented Jul 17, 2015

I've fixed the pretty printer; need someone to review that change.

@alexcrichton
Copy link
Member

pretty printer changes look fine to me

@Gankra
Copy link
Contributor Author

Gankra commented Jul 17, 2015

@bors r=bluss,alexcrichton

@bors
Copy link
Contributor

bors commented Jul 17, 2015

📌 Commit b0ee1eb has been approved by bluss,alexcrichton

@bors
Copy link
Contributor

bors commented Jul 17, 2015

⌛ Testing commit b0ee1eb with merge de860d5...

@alexcrichton
Copy link
Member

@bors: retry force

@bors
Copy link
Contributor

bors commented Jul 17, 2015

⌛ Testing commit b0ee1eb with merge af1f70a...

@bors
Copy link
Contributor

bors commented Jul 17, 2015

💔 Test failed - auto-mac-32-opt

@Gankra
Copy link
Contributor Author

Gankra commented Jul 17, 2015

@bors retry

Seems spurious...?

bors added a commit that referenced this pull request Jul 17, 2015
Per the top level comment:

A low-level utility for more ergonomically allocating, reallocating, and deallocating a
a buffer of memory on the heap without having to worry about all the corner cases
involved. This type is excellent for building your own data structures like Vec and VecDeque.
In particular:

* Produces heap::EMPTY on zero-sized types
* Produces heap::EMPTY on zero-length allocations
* Catches all overflows in capacity computations (promotes them to "capacity overflow" panics)
* Guards against 32-bit systems allocating more than isize::MAX bytes
* Guards against overflowing your length
* Aborts on OOM
* Avoids freeing heap::EMPTY
* Contains a ptr::Unique and thus endows the user with all related benefits

This type does not in anyway inspect the memory that it manages. When dropped it *will*
free its memory, but it *won't* try to Drop its contents. It is up to the user of RawVec
to handle the actual things *stored* inside of a RawVec.

Note that a RawVec always forces its capacity to be usize::MAX for zero-sized types.
This enables you to use capacity growing logic catch the overflows in your length
that might occur with zero-sized types.

However this means that you need to be careful when roundtripping this type
with a `Box<[T]>`: `cap()` won't yield the len. However `with_capacity`,
`shrink_to_fit`, and `from_box` will actually set RawVec's private capacity
field. This allows zero-sized types to not be special-cased by consumers of
this type.

Edit: 
fixes #18726 and fixes #23842
@bors
Copy link
Contributor

bors commented Jul 17, 2015

⌛ Testing commit b0ee1eb with merge e58601a...

@bors
Copy link
Contributor

bors commented Jul 18, 2015

💔 Test failed - auto-mac-64-opt

@Gankra
Copy link
Contributor Author

Gankra commented Jul 18, 2015

@bors retry

@bors
Copy link
Contributor

bors commented Jul 18, 2015

@bors bors merged commit b0ee1eb into rust-lang:master Jul 18, 2015
@Gankra
Copy link
Contributor Author

Gankra commented Jul 18, 2015

yusssss

@nox
Copy link
Contributor

nox commented Sep 30, 2020

This probably doesn't matter in real life, but I just realised this PR regressed Cow::deref because prior to this PR, both variants of Cow<str> had (ptr, len) in that order, meaning Cow::deref just returned that pair in both paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet