perf: Reduce memcopy in parquet #19350

Merged
merged 8 commits into from
Oct 21, 2024

Conversation

nameexhaustion
Collaborator

@nameexhaustion nameexhaustion commented Oct 21, 2024

It should be sound since it matches what we did a long time ago

Ok(mmap::ColumnStore::Local(self.0.deref()))

@github-actions github-actions bot added the internal (An internal refactor or improvement) and rust (Related to Rust Polars) labels Oct 21, 2024

codecov bot commented Oct 21, 2024

Codecov Report

Attention: Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.21%. Comparing base (8a76dad) to head (45f1690).
Report is 70 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-io/src/mmap.rs 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19350      +/-   ##
==========================================
- Coverage   80.24%   80.21%   -0.04%     
==========================================
  Files        1523     1523              
  Lines      209545   209790     +245     
  Branches     2434     2434              
==========================================
+ Hits       168148   168281     +133     
- Misses      40842    40954     +112     
  Partials      555      555              


@nameexhaustion nameexhaustion changed the title from "refactor(rust): Try reduce memcopy" to "refactor(rust): Reduce memcopy in parquet" Oct 21, 2024
@nameexhaustion nameexhaustion marked this pull request as ready for review October 21, 2024 11:00
match self {
ReaderBytes::Borrowed(v) => MemSlice::from_static(v),
ReaderBytes::Owned(v) => MemSlice::from_vec(v),
ReaderBytes::Mapped(v, _) => MemSlice::from_mmap(Arc::new(v)),
ReaderBytes::Owned(v) => MemSlice::from_static(unsafe {
Member

This isn't safe if self is dropped before the MemSlice.

I see that MemSlice has an inner type that can have a bytes::Bytes variant. If we have a ReaderBytes::Owned, we can move that Vec into the MemSlice.

I think we should also change the type from Vec<u8> to Bytes so we can freely clone it.
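
The review point above can be illustrated with a minimal sketch. It is not the Polars implementation: `SharedSlice` and `OwnedOrStatic` are hypothetical, simplified stand-ins for `MemSlice` and `ReaderBytes`, and `Arc<Vec<u8>>` is used in place of `bytes::Bytes` so the example needs only the standard library. The idea is the same: move the owned buffer into a reference-counted holder instead of handing out a `'static` reference created with `unsafe`, so clones keep the buffer alive and no bytes are copied.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical stand-in for `MemSlice`: a cheaply clonable byte slice.
#[derive(Clone, Debug)]
enum SharedSlice {
    /// Data that really does live for the whole program.
    Static(&'static [u8]),
    /// Data kept alive by a reference count (assumption: `Arc<Vec<u8>>`
    /// stands in for `bytes::Bytes` here).
    Owned(Arc<Vec<u8>>),
}

impl Deref for SharedSlice {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            SharedSlice::Static(s) => *s,
            SharedSlice::Owned(v) => v.as_slice(),
        }
    }
}

/// Hypothetical, simplified stand-in for `ReaderBytes`.
enum OwnedOrStatic {
    Borrowed(&'static [u8]),
    Owned(Vec<u8>),
}

impl OwnedOrStatic {
    /// Consumes `self`, so the owned buffer is moved rather than borrowed.
    /// No `unsafe` lifetime extension is needed, and the heap buffer is
    /// not copied: only the `Vec` handle moves into the `Arc`.
    fn into_shared(self) -> SharedSlice {
        match self {
            OwnedOrStatic::Borrowed(s) => SharedSlice::Static(s),
            OwnedOrStatic::Owned(v) => SharedSlice::Owned(Arc::new(v)),
        }
    }
}

/// Returns true if a clone of the converted slice still points at the
/// original heap buffer after the first handle has been dropped.
fn shares_buffer_after_move() -> bool {
    let data = vec![1u8, 2, 3, 4];
    let original_ptr = data.as_ptr();

    let shared = OwnedOrStatic::Owned(data).into_shared();
    let clone = shared.clone();
    drop(shared); // the clone keeps the buffer alive

    clone.as_ptr() == original_ptr && clone.to_vec() == vec![1u8, 2, 3, 4]
}
```

Because `into_shared` takes `self` by value, the situation described in the review (the source being dropped while a slice still points into it) cannot arise in this sketch: the buffer is freed only when the last clone goes away.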

@nameexhaustion nameexhaustion marked this pull request as draft October 21, 2024 11:40
@nameexhaustion nameexhaustion marked this pull request as ready for review October 21, 2024 12:22
@ritchie46 ritchie46 merged commit 81154ed into pola-rs:main Oct 21, 2024
19 of 20 checks passed
@ritchie46 ritchie46 changed the title from "refactor(rust): Reduce memcopy in parquet" to "perf: Reduce memcopy in parquet" Oct 21, 2024
@ritchie46 ritchie46 added the python (Related to Python Polars) and performance (Performance issues or improvements) labels Oct 21, 2024
@nameexhaustion nameexhaustion deleted the reduce-memcopy branch October 28, 2024 04:54
@c-peters c-peters added the accepted Ready for implementation label Oct 28, 2024
Labels
accepted Ready for implementation internal An internal refactor or improvement performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars
3 participants