BTree: remove Ord bound from new #88040

nbdd0121 · 2021-08-15T00:50:18Z

K: Ord bound is unnecessary on BTree{Map,Set}::new and their Default impl. No elements exist so there are nothing to compare anyway, so I don't think "future proof" would be a blocker here. This is analogous to HashMap::new not having a K: Eq + Hash bound.

#79245 originally does this and for some reason drops the change to new and Default. I can see why changes to other methods like entry or symmetric_difference need to be careful but I couldn't find out any reason not to do it on new.

Removing the bound also makes the stabilisation of const fn new not depending on const trait bounds.

cc @steffahn who suggests me to make this PR.

r? @dtolnay

dtolnay

No elements exist so there are nothing to compare anyway

I don't think "future proof" would be a blocker here

I couldn't find out any reason not to do it

This is not as clear cut as the PR description implies.

I am hesitant about this because this API change makes it impossible to selectively tackle some of the code bloat caused by monomorphizations inside the internals of these collections. According to https://www.stroustrup.com/SCARY.pdf such optimizations deliver 1.2x to 2.1x faster performance and 1x to 25x smaller code size in real applications.

I pasted below a sketch of the kind of restructure that this change hinders.

Clearly we still have bounds available via the insert and contains methods etc, so a comparator could be passed down from there as a function argument to every places that needs it, but the extra argument passing could negate a chunk of the performance benefit relative to stashing a comparator at suitable places inside of the data structure up front.

@rust-lang/libs-api @rust-lang/libs

use self::private::{Element, TypeErasedSet};
use std::cmp::Ordering;
use std::marker::PhantomData;
use std::mem;

pub struct Set<T> {
    imp: TypeErasedSet,
    marker: PhantomData<T>,
}

impl<T> Set<T> {
    pub fn new() -> Self
    where
        T: Ord,
    {
        unsafe fn erased_cmp<T: Ord>(a: &dyn Element, b: &dyn Element) -> Ordering {
            T::cmp(&*(a as *const _ as *const _), &*(b as *const _ as *const _))
        }
        Set {
            imp: unsafe { TypeErasedSet::new(erased_cmp::<T>) },
            marker: PhantomData,
        }
    }

    pub fn insert(&mut self, value: T)
    where
        T: Ord,
    {
        let boxed = Box::new(value);
        unsafe {
            let erased = mem::transmute::<Box<dyn Element>, Box<dyn Element + 'static>>(boxed);
            self.imp.insert(erased)
        }
    }

    pub fn contains(&self, value: &T) -> bool
    where
        T: Ord,
    {
        unsafe { self.imp.contains(value) }
    }
}

mod private {
    use std::cmp::Ordering;

    pub(super) trait Element {}
    impl<Sized> Element for Sized {}

    type Comparator = unsafe fn(&dyn Element, &dyn Element) -> Ordering;

    pub(super) struct TypeErasedSet {
        // Obviously you could build a more efficient representation than this.
        // Just illustrating that the concept works without a monomorphized T.
        repr: Vec<Box<dyn Element>>,
        cmp: Comparator,
    }

    impl TypeErasedSet {
        pub(super) unsafe fn new(cmp: Comparator) -> Self {
            TypeErasedSet {
                repr: Vec::new(),
                cmp,
            }
        }

        pub(super) unsafe fn insert(&mut self, value: Box<dyn Element>) {
            match self.find(&*value) {
                Ok(present) => self.repr[present] = value,
                Err(absent) => self.repr.insert(absent, value),
            }
        }

        pub(super) unsafe fn contains(&self, value: &dyn Element) -> bool {
            self.find(value).is_ok()
        }

        unsafe fn find(&self, value: &dyn Element) -> Result<usize, usize> {
            self.repr
                .binary_search_by(|elem| unsafe { (self.cmp)(&**elem, value) })
        }
    }
}

fn main() {
    let mut set = Set::new();
    set.insert(5);
    set.insert(2);
    set.insert(9);
    set.insert(4);
    set.insert(9);
    println!("{}", set.contains(&4));
    println!("{}", set.contains(&3));
}

nbdd0121 · 2021-08-15T02:17:44Z

Clearly we still have bounds available via the insert and contains methods etc, so a comparator could be passed down from there as a function argument to every places that needs it, but the extra argument passing could negate a chunk of the performance benefit relative to stashing a comparator at suitable places inside of the data structure up front.

I suppose this wouldn't be a big deal for BTreeMap, since when searching the tree we currently only take NodeRef without having the references to the BTreeMap itself. So:

If the comparator is to be stored in BTreeMap but not within the nodes, then extra argument is needed anyway. This would mean that it doesn't make sense to store comparator in the BTreeMap.
If the comparator is to be stored within nodes, then the comparator is not needed to create an empty BTreeMap because it starts with no nodes (root = None).

steffahn · 2021-08-15T08:44:01Z

Storing a comparator in the data structure might kill covariance and ignoring that issue might be unsound. I will investigate more on that later.

Passing a comparator with every call (to insert, etc) seems more reasonable to me. Especially since you'd need to "pass down" the function anyway if it's stored at top level. Whether it originates from a read operation or from a compile-time constant passed down from one layer above in the call hierarchy doesn't seem like the most significant difference. But I don't even know where the proclaimed performance benefit in that "SCARY" paper comes from (it's a bit lengthy, havent read through much of it yet) so I can't be sure to have the full picture. (The code size reduction is obvious on the other hand.)

Passing the comparator every call also allows for more flexible API, where you could offer a version of BTreeMap (using the same underlying erased data structure) that supports a comparator dependent on short-term borrows of another data structure. E. g. a BTreeSet of indices into another Vec would be possible. (Sorted by the Order of the Vec elements).

steffahn · 2021-08-15T11:45:58Z

@dtolnay

[…] ignoring that issue might be unsound. I will investigate more on that later.

Here’s a demonstration using your code example.

use std::any::Any;
use once_cell::sync::OnceCell;

fn main() {
    let s = Set::<Wrapper<for<'a> fn(&'a str)>>::new();
    let mut s: Set<Wrapper<fn(&'static str)>> = s;
    static S: OnceCell<&'static str> = OnceCell::new();
    let f = |s| {
        let _ = S.try_insert(s);
    };
    s.insert(Wrapper(f));
    s.contains(&Wrapper(f));
    println!("{}", S.get().unwrap());
}

// for the record, BTreeSet is indeed covariant, too
use std::collections::BTreeSet;
fn _b_tree_set_is_covariant<'a: 'b, 'b, T>(x: BTreeSet<&'a T>) -> BTreeSet<&'b T> {
    x
}

struct Wrapper<T>(T);

impl<T: 'static> Ord for Wrapper<T> {
    fn cmp(&self, _other: &Self) -> Ordering {
        if let Some(f) = (self as &dyn Any).downcast_ref::<Wrapper<for<'a> fn(&'a str)>>() {
            f.0(&String::from("Hello world!"));
        }
        Ordering::Equal
    }
}
impl<T: 'static> PartialOrd<Self> for Wrapper<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T: 'static> Eq for Wrapper<T> {}
impl<T: 'static> PartialEq<Self> for Wrapper<T> {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other).is_eq()
    }
}

(in the playground)
Example output:

���)

This unsoundness is not a small bug in your code, but an inherent problem of storing a comparison function in a covariant collection. Since BTreeSet will never lose its covariance, I don’t think your argument against removing K: Ord bounds from BTreeMap::new can be salvaged.

Amanieu · 2021-08-15T12:25:25Z

Incidentally hashbrown already does something similar internally (passing the comparator as an argument) but for different reasons:

It makes RawTable independent of the details of how a value is hashed or compared.
It makes the RawTable API more flexible since the comparator closure can contain borrowed data that isn't tied to the lifetime of the table.

steffahn · 2021-08-15T12:38:38Z

Incidentally hashbrown already does something similar internally (passing the comparator as an argument)

It’s a very slight similarity only when considering the motivation discussed here (i.e. less monomorphization), since the hasher arguments are generic (impl Fn(&T) -> u64), so at run-time there’s no “passing” but instead still full monomorphization if the calls come from a wrapping HashMap (the passed function items are zero-sized types, not function pointers).

dtolnay · 2021-08-15T14:07:34Z

@rfcbot fcp merge

#88040 (comment) is persuasive as far as the Ord impl which exists during new being totally useless at any point after new has returned, so I am on board with this.

rfcbot · 2021-08-15T14:07:35Z

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

steffahn · 2021-08-15T14:32:31Z

@ anyone reviewing this, FYI:

this suggestion

cc @steffahn who suggests me to make this PR.

happened in a (short) topic on Zulip which can be found here (or in the archive).

nbdd0121 · 2021-08-15T17:51:41Z

~~Surprised to see that covariance are not asserted, filed #88058~~

EDIT: It is asserted just using a different name from the other containers.

bors · 2021-08-17T02:02:39Z

☔ The latest upstream changes (presumably #88094) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2021-08-18T01:28:30Z

☔ The latest upstream changes (presumably #86808) made this pull request unmergeable. Please resolve the merge conflicts.

rfcbot · 2021-08-18T13:24:15Z

🔔 This is now entering its final comment period, as per the review above. 🔔

stlankes · 2021-08-26T13:00:42Z

Naive question... Would this a solution for issue #88244 ?

nbdd0121 · 2021-08-27T07:53:35Z

Naive question... Would this a solution for issue #88244 ?

Yes

steffahn · 2021-08-27T08:01:28Z

I don’t think that this PR has anything to do with resolving or not resolving #88244. While it makes the particular code example compile again, there’s no reason why other trait bounds on other const functions shouldn’t have the same problem (w.r.t. usability in associated consts for trait impls). Especially the fact that it’s a change/regression in compiler behavior means that either #88244 would be closed as “working as intended” (breaking changes to unstable features are okay) or it needs a bugfix.

rfcbot · 2021-08-28T13:26:29Z

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

m-ou-se · 2021-08-31T20:36:19Z

@bors r+

bors · 2021-08-31T20:36:20Z

📌 Commit f33f266 has been approved by m-ou-se

Rollup of 9 pull requests Successful merges: - rust-lang#86376 (Emit specific warning to clarify that `#[no_mangle]` should not be applied on foreign statics or functions) - rust-lang#88040 (BTree: remove Ord bound from new) - rust-lang#88053 (Fix the flock fallback implementation) - rust-lang#88350 (add support for clobbering xer, cr, and cr[0-7] for asm! on OpenPower/PowerPC) - rust-lang#88410 (Remove bolding on associated constants) - rust-lang#88525 (fix(rustc_typeck): produce better errors for dyn auto trait) - rust-lang#88542 (Use the return value of readdir_r() instead of errno) - rust-lang#88548 (Stabilize `Iterator::intersperse()`) - rust-lang#88551 (Stabilize `UnsafeCell::raw_get()`) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup

rust-highfive assigned dtolnay Aug 15, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 15, 2021

dtolnay reviewed Aug 15, 2021

View reviewed changes

This comment has been minimized.

Sign in to view

dtolnay added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Aug 15, 2021

rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Aug 15, 2021

nbdd0121 force-pushed the btreemap branch from 0016639 to 5ad40fa Compare August 15, 2021 20:35

nbdd0121 force-pushed the btreemap branch from 5ad40fa to 1455958 Compare August 17, 2021 02:06

BTree: remove Ord bound from new

f33f266

nbdd0121 force-pushed the btreemap branch from 1455958 to f33f266 Compare August 18, 2021 02:55

rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Aug 18, 2021

steffahn mentioned this pull request Aug 27, 2021

False negative for "the trait bound K: Ord is not satisfied" #88244

Closed

rfcbot added the finished-final-comment-period The final comment period is finished for this PR / Issue. label Aug 28, 2021

rfcbot added to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Aug 28, 2021

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 31, 2021

m-ou-se mentioned this pull request Sep 1, 2021

Rollup of 9 pull requests #88556

Merged

bors merged commit 5878780 into rust-lang:master Sep 1, 2021

rustbot added this to the 1.56.0 milestone Sep 1, 2021

nbdd0121 deleted the btreemap branch September 4, 2021 02:53

apiraino removed the to-announce Announce this issue on triage meeting label Sep 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BTree: remove Ord bound from new #88040

BTree: remove Ord bound from new #88040

nbdd0121 commented Aug 15, 2021 •

edited

Loading

dtolnay left a comment

nbdd0121 commented Aug 15, 2021

steffahn commented Aug 15, 2021 •

edited

Loading

steffahn commented Aug 15, 2021 •

edited

Loading

Amanieu commented Aug 15, 2021

steffahn commented Aug 15, 2021 •

edited

Loading

This comment has been minimized.

dtolnay commented Aug 15, 2021

rfcbot commented Aug 15, 2021 •

edited by joshtriplett

Loading

steffahn commented Aug 15, 2021

nbdd0121 commented Aug 15, 2021 •

edited

Loading

bors commented Aug 17, 2021

bors commented Aug 18, 2021

rfcbot commented Aug 18, 2021

stlankes commented Aug 26, 2021 •

edited

Loading

nbdd0121 commented Aug 27, 2021

steffahn commented Aug 27, 2021 •

edited

Loading

rfcbot commented Aug 28, 2021

m-ou-se commented Aug 31, 2021

bors commented Aug 31, 2021

BTree: remove Ord bound from new #88040

BTree: remove Ord bound from new #88040

Conversation

nbdd0121 commented Aug 15, 2021 • edited Loading

dtolnay left a comment

Choose a reason for hiding this comment

nbdd0121 commented Aug 15, 2021

steffahn commented Aug 15, 2021 • edited Loading

steffahn commented Aug 15, 2021 • edited Loading

Amanieu commented Aug 15, 2021

steffahn commented Aug 15, 2021 • edited Loading

This comment has been minimized.

dtolnay commented Aug 15, 2021

rfcbot commented Aug 15, 2021 • edited by joshtriplett Loading

steffahn commented Aug 15, 2021

nbdd0121 commented Aug 15, 2021 • edited Loading

bors commented Aug 17, 2021

bors commented Aug 18, 2021

rfcbot commented Aug 18, 2021

stlankes commented Aug 26, 2021 • edited Loading

nbdd0121 commented Aug 27, 2021

steffahn commented Aug 27, 2021 • edited Loading

rfcbot commented Aug 28, 2021

m-ou-se commented Aug 31, 2021

bors commented Aug 31, 2021

nbdd0121 commented Aug 15, 2021 •

edited

Loading

steffahn commented Aug 15, 2021 •

edited

Loading

steffahn commented Aug 15, 2021 •

edited

Loading

steffahn commented Aug 15, 2021 •

edited

Loading

rfcbot commented Aug 15, 2021 •

edited by joshtriplett

Loading

nbdd0121 commented Aug 15, 2021 •

edited

Loading

stlankes commented Aug 26, 2021 •

edited

Loading

steffahn commented Aug 27, 2021 •

edited

Loading