Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add z-order optimize #1429

Merged
merged 6 commits into from
Jun 6, 2023
Merged

feat: add z-order optimize #1429

merged 6 commits into from
Jun 6, 2023

Conversation

wjones127
Copy link
Collaborator

@wjones127 wjones127 commented Jun 3, 2023

Description

Implements Z-order in Rust. This is a very basic version that requires loading the whole partition into memory for sorting. In the future, we can implement a DataFusion-based code path that allows sorting with spilling to disk for even larger optimize jobs.

The Z-order function here is based on Arrow's row format. We truncate to take the first 16 bytes of data (padding with zeros at end as necessary). So for variable-width columns like strings, this means that we are only using the first 15 bytes of the string (the first byte is used to differentiate null and empty strings, see row format docs). If a user has a string column where they all share the same prefix, this z-order function won't work well for them. But in many common cases it will work.

We'll also expose this in Python as a follow up.

Related Issue(s)

Documentation

@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate rust labels Jun 3, 2023
@github-actions github-actions bot removed the binding/python Issues for the Python package label Jun 3, 2023
@roeap
Copy link
Collaborator

roeap commented Jun 3, 2023

took the liberty of resolving the merge conflicts I caused :)

roeap
roeap previously approved these changes Jun 4, 2023
Copy link
Collaborator

@roeap roeap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice ❤️ - folks are going to be quite excited about this ...

Left a few minor comments, mainly to also better understand the inner workings :).

Comment on lines 363 to 369
read_stream: impl Future<
Output = Result<
BoxStream<'static, Result<RecordBatch, ParquetError>>,
DeltaTableError,
>,
> + Send
+ 'static,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this be defined as a separate type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK what you think of the type I added.

@@ -414,6 +439,56 @@ impl MergePlan {
Ok((partial_actions, partial_metrics))
}

/// Creates a stream of batches that are zordered
///
/// Currently requires loading all the data into memory.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds a bit like loading all data from the table - for a partitioned table, this would be loading a single partition?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if it's non-partitioned, it's everything. If it's partitioned then only max_concurrent_tasks number of partitions are loaded at once.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment for this.

.as_slice(),
)
.unwrap();
let indices = arrow::compute::lexsort_to_indices(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be completely off here, but we may just get away with sorting an enumerate()'ed vector of values here. IIUC, we should not have null values at this point since this is already the encoded array and they byte values can be compared using the native methods. If that is true, we could avoid the non-trivial internals of the lexsort_to_indices function.

I guess the optimizations on the CPU level mentioned in the arrow row blog posts should take effekt either way since we are comparing bytes nonetheless. So leave it to your expertise if this is worthwhile :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think you are right! I'll give that a shot.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely simplified :)

Copy link
Collaborator

@roeap roeap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate rust
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Z-order sorting option in Optimize operation
2 participants