Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[proposal] Add Deque struct to the stdlib #2925

Open
wants to merge 1 commit into
base: nightly
Choose a base branch
from

Conversation

gabrieldemarmiesse
Copy link
Contributor

@gabrieldemarmiesse gabrieldemarmiesse commented Jun 2, 2024

@JoeLoser I did the API design proposal as requested. We should be able to easily discuss the api with this format.

If you want to look at the rendered markdown, you can read it here:
https://github.com/gabrieldemarmiesse/mojo/blob/add_deque_proposal/proposals/add-deque-struct.md

Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
Copy link
Contributor

@laszlokindrat laszlokindrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up, excellent doc! My main quibble is calling the data structure you propose a Deque. The two most important examples for us to follow are those of CPP and Python, and both of them make different decisions from what you propose here, which is why I think our double ended queue should follow those. That being said, I'm not against having a data structure like you describe in the stdlib, but I would prefer we just call it something explicit like RingBuffer or RingQueue or something like that.

Comment on lines +34 to +35
4) Can be efficiently converted from and into a `List`.
5) Has fast access to its elements (cache-friendly).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO 4) should not be a goal for the data structure we will call deque (or Deque). Neither CPP or Python makes this promise, and those are the languages/stdlibs closest to Mojo, and compatibility with Python is a stated goal (although, I admit, the level of this compatibility is not well defined). CPP does promise 5), and I think it's reasonable to aim for it (but maybe not right away).


- Slow indexing in the middle
- Not cache friendly
- Thread safe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be qualified: Thread safety is promised for pop/append on the ends, but not in the middle: https://docs.python.org/3/library/collections.html#collections.deque

Comment on lines +86 to +88
While we use Rust's implementation of the double-ended queue, that does not mean
that we need to follow Rust's API or that we can't provide an API fammiliar to
Python and Mojo's users. Here is the proposed API, which can be used as roadmap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mention above, Python's collections.deque promises O(1) insertion on the ends. This is part of the API contract, so you cannot fully decouple implementation for API. But either way, I would strongly argue for just following Python's deque to begin with. This is what Python developers first reach for when they want a queue, and CPP makes similar tradeoffs while also mitigating the cache problems you mention above.

- Efficient indexing
- Cache friendly
- Not thread safe
- Needs to move all existing elements when growing the buffer, O(N) worst case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would mean that the contents need to be Movable. This could be a reasonable restriction in the short term, but eventually queues of non-movable types should be supported as well (when we enable in-place construction).

@gabrieldemarmiesse
Copy link
Contributor Author

gabrieldemarmiesse commented Jun 8, 2024

@laszlokindrat Overall, I don't have a strong opinion on the internal implementation. Python and C++ use a double-linked list of arrays and Rust uses a ring buffer. I chose the ring buffer because it would be simpler as a first step, but I don't mind starting directly with the double-linked list of arrays. If possible I would prefer to avoid having to do both 😅 I don't think we need both in the stdlib. Someone else can always add a lib with the other.

Indeed if we drop the fast access to middle elements requirement, then we can easily switch to the other implementation.

If we decide that the arrays in the double-linked-list do not have a size known at compile-time, we can easily steal pointers from other data structures to append to the deque or initialize the deque with.

@JoeLoser JoeLoser added the needs-discussion Need discussion in order to move forward label Jun 10, 2024
@JoeLoser JoeLoser removed the needs-discussion Need discussion in order to move forward label Jun 10, 2024
@laszlokindrat laszlokindrat removed their assignment Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants