feat: add `Allocator` type param to `MutableBuffer` #6336
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
return;
}

#[cfg(feature = "allocator_api")]
The entire `MutableBuffer` has been gated under `allocator_api`. Do we still need it?
You may have misread this. Only the new public APIs are behind the feature gate.
@@ -28,6 +31,23 @@ use crate::{

use super::Buffer;

#[cfg(not(feature = "allocator_api"))]
pub trait Allocator: private::Sealed {}
At first glance, it's somewhat confusing to define a sealed trait here. Maybe add some comments here?
Added, please take a look
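For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of a sealed placeholder trait on stable Rust. The names `Allocator`, `Global`, and `private::Sealed` mirror the diff, but the bodies and the helper `allocator_size` are assumptions made for illustration, not the code in this PR.

```rust
// Hypothetical stand-ins; the names follow the diff, the bodies are assumed.
mod private {
    /// Downstream crates cannot name this trait, so they cannot implement
    /// any trait that lists it as a supertrait.
    pub trait Sealed {}
}

/// Placeholder for `std::alloc::Allocator`, which requires nightly Rust.
pub trait Allocator: private::Sealed {}

/// Placeholder for `std::alloc::Global`.
#[derive(Debug, Clone, Copy, Default)]
pub struct Global;

impl private::Sealed for Global {}
impl Allocator for Global {}

/// Illustrative helper: accepts only types implementing the placeholder
/// trait and reports the size of the allocator handle in bytes.
pub fn allocator_size<A: Allocator>(_allocator: &A) -> usize {
    std::mem::size_of::<A>()
}
```

Sealing keeps the placeholder trait from becoming part of the stable public contract: outside code can use `Global` as a type argument but cannot add its own implementations until the real trait is available.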
@@ -28,6 +31,23 @@ use crate::{

use super::Buffer;

#[cfg(not(feature = "allocator_api"))]
I'm not sure if it's a good idea to introduce allocator-api2 as a compatibility layer?
Thanks for the information. It looks like `allocator-api2` provides many things we don't need, and I'm hesitant to add a new dependency for the normal case. We can add it in the future if it turns out that `allocator-api2` suits our use case well. What we need here is:
- Placeholders for standard library items that are not available without unstable features (`Allocator` and `Global`)
- Wrapped `alloc` and `dealloc` methods
For now we can define them ourselves in a few lines.
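As a rough illustration of the "wrapped `alloc` and `dealloc`" idea, the following stable-Rust sketch forwards to the free functions in `std::alloc`. The `Global` struct here is a hypothetical placeholder, and the method names `allocate` and `deallocate` plus the helper `demo_round_trip` are assumptions chosen for this example; the PR's actual definitions may differ.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Hypothetical placeholder for `std::alloc::Global` on stable Rust.
#[derive(Debug, Clone, Copy, Default)]
pub struct Global;

impl Global {
    /// Allocates memory for `layout` through the process-wide allocator.
    pub fn allocate(&self, layout: Layout) -> NonNull<u8> {
        if layout.size() == 0 {
            // `alloc` must not be called with a zero-sized layout, so hand
            // back a dangling but well-aligned pointer instead.
            // SAFETY: an alignment is never zero.
            return unsafe { NonNull::new_unchecked(layout.align() as *mut u8) };
        }
        // SAFETY: `layout` has a non-zero size.
        let ptr = unsafe { alloc(layout) };
        NonNull::new(ptr).unwrap_or_else(|| handle_alloc_error(layout))
    }

    /// # Safety
    /// `ptr` must have been returned by `allocate` with the same `layout`.
    pub unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        if layout.size() != 0 {
            // SAFETY: guaranteed by the caller.
            unsafe { dealloc(ptr.as_ptr(), layout) };
        }
    }
}

/// Writes one byte into a fresh 64-byte-aligned allocation and reads it back.
pub fn demo_round_trip() -> u8 {
    let layout = Layout::from_size_align(64, 64).expect("valid layout");
    let ptr = Global.allocate(layout);
    // SAFETY: `ptr` is valid for 64 bytes and is freed with the same layout.
    unsafe {
        ptr.as_ptr().write(7);
        let value = ptr.as_ptr().read();
        Global.deallocate(ptr, layout);
        value
    }
}
```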
Thank you very much for your work. Overall, it looks good to me.
By the way, I still believe introducing `allocator_api2` is a good idea, as it eliminates the need for a feature that requires nightly Rust.
The decision is up to you.
This PR also makes me wonder if we can introduce an `allocator` for all similar APIs.
Thanks for your review @Xuanwo
I plan to add this for |
Thank you @waynexia and @Xuanwo for the reviews
I think this idea is quite cool 🙏 However, in order to get it ready to merge a few more things are needed:
- Documentation (specifically document the new feature here https://crates.io/crates/arrow)
- Tests
In terms of testing, what I think would be the most useful would be an "end to end" test / example of how to use this feature. For example, create a `MutableBuffer` with a custom allocator and report on the allocations performed, or something similar.
Perhaps we could add such an example to the examples directory https://github.com/apache/arrow-rs/tree/master/arrow/examples and add a pointer to that in the documentation?
@@ -61,8 +61,8 @@ jobs:
submodules: true
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
- name: Test arrow-buffer with all features
run: cargo test -p arrow-buffer --all-features
- name: Test arrow-buffer
I found it confusing at first why this line doesn't follow the model of the rest of the tests in this workflow, which run with `--all-features`.
I am also concerned this may remove coverage for some of the other features such as prettyprint, ffi and pyarrow.
`arrow-buffer` only has one feature gate, `allocator_api`, from this PR. This specific change won't affect other tests like `prettyprint`, as the command `-p arrow-buffer` only runs on this sub-crate. But I agree it's easy to forget when a new feature comes in the future... Sadly, cargo doesn't seem to support opting out of a single feature on the CLI 😢
Ahh, the same problem applies to the main `arrow` lib because I added a delegate feature `allocator_api = ["arrow-buffer/allocator_api"]`. Looks like we have to enumerate all available features except `allocator_api` in CI?
Thanks for reviewing @alamb. I've created a new example
I tried to use |
arrow-buffer/src/buffer/mutable.rs
#[cfg(not(feature = "allocator_api"))]
allocator: Global,
#[cfg(feature = "allocator_api")]
allocator: *value.allocator(),
The `From` impl would benefit from having a type parameter for the allocator, otherwise the allocator here would always implicitly be the global one. That probably requires two separate impl blocks depending on the feature flag.
The same would be useful for `Buffer::from_vec(Vec)`.
It will default to `Global` without `allocator_api`, and inherit the allocator from the vector via the `Vec::allocator()` API when the feature is enabled. Is this behavior ideal?
I see what you mean: https://github.com/apache/arrow-rs/pull/6336/files#diff-371342744df1b634b0bd9d90f4fe38c1eb0096df322fd3cc2fbc513f3428046cR692-R696
We should indeed have two impl blocks for the different type parameters.
A minimal allocator implementation that tracks the memory usage shouldn't be too big to include directly in the example, without dependencies.
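To give a sense of how small such a tracker can be, here is a dependency-free sketch. It is an assumption-laden stand-in: it implements the stable `GlobalAlloc` trait rather than the nightly `Allocator` trait this PR targets, and `TrackingAllocator`, `live_bytes`, and `demo_tracking` are hypothetical names that do not come from the PR's example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical tracker: forwards to `System` and counts live bytes.
pub struct TrackingAllocator {
    live_bytes: AtomicUsize,
}

impl TrackingAllocator {
    pub const fn new() -> Self {
        Self { live_bytes: AtomicUsize::new(0) }
    }

    /// Bytes currently allocated through this tracker.
    pub fn live_bytes(&self) -> usize {
        self.live_bytes.load(Ordering::Relaxed)
    }
}

// SAFETY: every request is forwarded unchanged to `System`; the counter is
// bookkeeping only and does not affect the returned memory.
unsafe impl GlobalAlloc for TrackingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            self.live_bytes.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.live_bytes.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

/// Returns (bytes live while the block is allocated, bytes live after freeing).
pub fn demo_tracking() -> (usize, usize) {
    let tracker = TrackingAllocator::new();
    let layout = Layout::from_size_align(1024, 64).expect("valid layout");
    // SAFETY: the layout is non-zero-sized and the pointer is freed with the
    // same layout it was allocated with.
    unsafe {
        let ptr = tracker.alloc(layout);
        assert!(!ptr.is_null());
        let during = tracker.live_bytes();
        tracker.dealloc(ptr, layout);
        (during, tracker.live_bytes())
    }
}
```

With the nightly `Allocator` trait the same counting logic would wrap `allocate` and `deallocate` instead, and the tracker could then be passed per container rather than installed process-wide.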
I am depressed about the large review backlog in this crate. We are looking for more help from the community reviewing PRs -- see #6418 for more
I plan to look into this in more detail as it might be useful in our codebase, which makes good use of a custom allocator to track memory usage. A unit test with a custom allocator, to be run with miri, would be very nice. I have a concern that after turning the |
Good catch 👍 I've limited those
New test |
Which issue does this PR close?
Part of #3960

Rationale for this change
I found #3960 for the same reason: tracking the memory consumption of `Array`s. The most straightforward way is to use the unstable `Allocator` API.
By adding a decoration layer to `Allocator`, we can keep track of the real-time memory consumption of each container instance. There is a tiny tool that implements this: https://github.com/waynexia/unkai

What changes are included in this PR?
The most important changes in this PR are:
- A new feature gate `allocator_api`, with the same name as the unstable Rust feature
- A new type parameter `A: Allocator = Global` on `MutableBuffer`, like `Vec`
Since this requires an unstable Rust feature, `allocator_api` can only be used with the nightly toolchain. To keep this library working on the stable toolchain when the feature gate is disabled, this PR defines dummy items such as `Global` and `Allocator` to substitute for those from the standard library, as they cannot be referenced without the unstable feature.
When `allocator_api` is enabled, users can either specify their `Allocator` through the new APIs `new_in` and `with_capacity_in`, or use `From<Vec<T, A>>`, which inherits the allocator from the vec.
This PR is tested with MIRI:

Are there any user-facing changes?
New APIs as described above
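The API shape described above (a defaulted allocator type parameter plus `new_in` and `with_capacity_in` constructors) can be sketched on stable Rust roughly as follows. Everything here is hypothetical: `SketchBuffer`, `Custom`, and the local `Allocator` trait with its `name` method are illustration-only stand-ins, and the buffer is backed by a plain `Vec<u8>` instead of allocating through `A` as the real `MutableBuffer` would.

```rust
/// Hypothetical local trait; not `std::alloc::Allocator` and not arrow-rs's.
pub trait Allocator {
    fn name(&self) -> &'static str;
}

/// Stand-in for the default allocator.
#[derive(Debug, Clone, Copy, Default)]
pub struct Global;

impl Allocator for Global {
    fn name(&self) -> &'static str {
        "global"
    }
}

/// Stand-in for a user-supplied allocator.
#[derive(Debug, Clone, Copy, Default)]
pub struct Custom;

impl Allocator for Custom {
    fn name(&self) -> &'static str {
        "custom"
    }
}

/// Simplified buffer with a defaulted allocator parameter, as `Vec` has.
pub struct SketchBuffer<A: Allocator = Global> {
    data: Vec<u8>,
    allocator: A,
}

// Constructors without an allocator argument exist only for the default.
impl SketchBuffer<Global> {
    pub fn with_capacity(capacity: usize) -> Self {
        Self::with_capacity_in(capacity, Global)
    }
}

// The `_in` constructors accept any allocator.
impl<A: Allocator> SketchBuffer<A> {
    pub fn new_in(allocator: A) -> Self {
        Self::with_capacity_in(0, allocator)
    }

    pub fn with_capacity_in(capacity: usize, allocator: A) -> Self {
        Self { data: Vec::with_capacity(capacity), allocator }
    }

    pub fn allocator(&self) -> &A {
        &self.allocator
    }

    pub fn capacity(&self) -> usize {
        self.data.capacity()
    }
}
```

Because of the default, existing code that writes `SketchBuffer` without a type argument keeps compiling, which is the same backward-compatibility property the PR relies on for `MutableBuffer`.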