Skip to content
This repository has been archived by the owner on Nov 5, 2018. It is now read-only.

Allocate smaller nodes in unbounded channels #81

Closed
ghost opened this issue Jul 28, 2018 · 5 comments
Closed

Allocate smaller nodes in unbounded channels #81

ghost opened this issue Jul 28, 2018 · 5 comments

Comments

@ghost
Copy link

ghost commented Jul 28, 2018

If an unbounded channel is created for sending only one message, 31 of the initially allocated 32 slots won't be used at all.

The situation can be improved by starting with a single-slot node in the beginning and allocating exponentially larger nodes for further messages.

Perhaps we should also avoid allocating anything at all in the channel constructor. This might give us some performance wins in cases where no messages are sent through the channel.

Regarding interaction with #80, only nodes of size 32 should be recycled.

@ghost
Copy link
Author

ghost commented Sep 16, 2018

Note to self: we should probably grow nodes up to size 64 or 128, which improves performance quite a bit under high concurrency.

@kleimkuhler
Copy link
Contributor

kleimkuhler commented Sep 25, 2018

I have an initial implementation for growing the Block slot count: https://github.com/kleimkuhler/crossbeam-channel/commit/caf6c945e5399ca2ed3647da33b1edddd53d38c7

slots.len() starts out at 1, and doubles for each fn install_new_block until it reaches a size of 32. It still uses the const BLOCK_CAP as its check for the cap. As discussed, it is also not currently optimized because it allocates slots separately on the heap.

It calculates the slot_count based off the current Block's slots.len() it is looking at (head or tail).

Currently some tests do not finish (probably should add a cut-off time). I'll need to learn more about debugging in Rust in a multi-threaded environment to figure out what is going on.

I committed it to a branch in case you have an idea for the failures. I can update here as I get a better handle on debugging!

edit: Updated commit: https://github.com/kleimkuhler/crossbeam-channel/commit/7a6929c75bbd009e9f959de2b0f537f28abc659d... thank you @cynecx

@ghost
Copy link
Author

ghost commented Sep 25, 2018

Oh, looks like we've been hit by undefined behavior again. :( Last time that happened was here: rust-lang/rfcs#1892 (comment)

This is how I discovered the bug:

  • Run cargo test --test mpsc normal::channel_tests::drop_full_shared, which seems to get stuck into an infinite loop.
  • Run the test in gdb and hit Ctrl-C. Inspect the stacktrace.
  • Looks like tail_slot_count is zero.
  • Turns out, if you call Block::new(0, 1), the intermediate Vec of slots will be empty.
  • This can only be explained as UB, and mem::zeroed() is really fishy.

The problem is that we create values of type T using mem::zeroed() and then copy them from the iterator into Vec and then into Box<[]>. Reading such invalid values is instant UB.

So, mem::zeroed() and mem::uninitialized() are footguns that should be avoided at all costs. MaybeUninit will be a much safer alternative in the future but until then, we need to find a different solution.

I suggest we change the message type from UnsafeCell<ManuallyDrop<T>> to UnsafeCell<Option<T>>. Then we can add this constructor for Slot<T>:

impl Slot<T> {
    fn empty() -> Slot<T> {
        Slot {
            msg: UnsafeCell::new(None),
            ready: AtomicBool::new(false),
        }
    }
}

We can safely initialize Blocks:

Block {
    start_index,
    slots: (0..slot_count)
        .map(|_| Slot::empty())
        .collect::<Vec<_>>()
        .into_boxed_slice(),
    next: Atomic::null(),
}

Some other parts of list.rs will need to be modified as well, but it's not a big deal.

@kleimkuhler
Copy link
Contributor

I have a new commit up (https://github.com/kleimkuhler/crossbeam-channel/commit/ba6b35ec4eef10c9b2cf41e65e8fc6aacbe2f7b7) after working out the changes required for avoiding the UB introduced by mem::zeroed() (thanks again for the links; they provided a lot of helpful information about this problem).

The issue I am addressing now is the failing drops test (for each of the wrappers). It appears I am getting roughly ~2x the number of drop counts compared to what is expected. The assertion it is failing on is here:

assert_eq!(DROPS.load(Ordering::SeqCst), steps);

It can be replicated with $ cargo test --test list normal::drops

I have tried a lot of different ways of dropping the msg of each Slot<T>, but I consistently get those results back. I think msg is in fact being properly dropped, but the counts are now different and I need to figure out why that is.

Note: Block<T> slots type is now slots: Box<[Slot<T>]> instead of slots: Box<[UnsafeCell<Slot<T>>]>. I don't run into an issue with that, and it is required to fit your recommendation of Block initialization above. However, without changing slots type, and thus initializing a Block with

Block {
    start_index,
    slots: (0..slot_count)
        .map(|_| UnsafeCell::new(Slot::empty()))
        .collect::<Vec<_>>()
        .into_boxed_slice(),
    next: Atomic::null(),
}

still suffers from the same test failure.

@ghost
Copy link
Author

ghost commented Sep 26, 2018

I left some comments in the commit you linked. Since Option<T> is not wrapped in a ManuallyDrop anymore, we don't need to use ptr::read and ptr::write to read and write into it. Note that these Option<T>s will be dropped when the Slot<T> is dropped so better to leave them None in the end.

@kleimkuhler kleimkuhler mentioned this issue Oct 3, 2018
3 tasks
@ghost ghost closed this as completed in #101 Oct 4, 2018
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

1 participant