Commit
parquet Statistics - deprecate `has_*` APIs and add `_opt` functions that return `Option<T>` (#6216)

* update public api Statistics::min to return an option.

  I first renamed the existing method to `min_unchecked` and made it internal to the crate. I then added a `pub min(&self) -> Option<&T>` method. I figure we can first change the public API before deciding what to do about internal usage.

  Ref: #6093

* update public api Statistics::max to return an option.

  I first renamed the existing method to `max_unchecked` and made it internal to the crate. I then added a `pub max(&self) -> Option<&T>` method. I figure we can first change the public API before deciding what to do about internal usage.

  Ref: #6093

* cargo fmt

* remove Statistics::has_min_max_set from the public api

  Ref: #6093

* update impl HeapSize for ValueStatistics to use new min and max api

* migrate all tests to new Statistics min and max api

* make Statistics::null_count return Option<u64>

  This removes ambiguity around whether all values are non-null or just that the null count stat is missing.

  Ref: #6215

* update expected metadata memory size tests

  Changing null_count from u64 to Option<u64> increases the memory size and changes the layout of the metadata. I included these tests as a separate commit to call extra attention to it.

* add TODO question on is_min_max_backwards_compatible

* Apply suggestions from code review

  Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* update ValueStatistics::max docs

* rename new optional ValueStatistics::max to max_opt

  Per PR review, we will deprecate the old API instead of introducing a breaking change.

  Ref: #6216 (review)

* rename new optional ValueStatistics::min to min_opt

* add Statistics::{min,max}_bytes_opt

  This adds the API and migrates all of the test usage. The old APIs will be deprecated next.

* update make_stats_iterator macro to use *_opt methods

* deprecate non *_opt Statistics and ValueStatistics methods

* remove stale TODO comments

* remove has_min_max_set check from make_decimal_stats_iterator

  The check is unnecessary now that the stats funcs return Option<T> when unset.

* deprecate has_min_max_set

  An internal version was also created because it is used so extensively in testing.

* switch to null_count_opt and reintroduce deprecated null_count and has_nulls

* remove redundant test assertions of stats._internal_has_min_max_set

  This removes the assertion from any test that subsequently unwraps both min_opt and max_opt.

* replace negated test assertions of stats._internal_has_min_max_set with assertions on min_opt and max_opt

  This removes all use of Statistics::_internal_has_min_max_set from the code base, and so it is also removed.

* Revert changes to parquet writing, update comments

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
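The pattern this commit message describes (keep the old accessors but mark them deprecated, and add `*_opt` accessors that return `Option`) can be sketched roughly as follows. This is a simplified, self-contained model and not the crate's code: `ValueStats`, `min_max`, and `describe_nulls` are hypothetical names invented for illustration, and only the method names `min_opt`, `max_opt`, `null_count_opt`, and `has_min_max_set` are taken from the message above.

```rust
/// Hypothetical stand-in for a statistics type; NOT the parquet crate's
/// `ValueStatistics<T>`. Each field is `None` when the stat was not written.
#[derive(Debug, Clone, PartialEq)]
pub struct ValueStats<T> {
    min: Option<T>,
    max: Option<T>,
    null_count: Option<u64>,
}

impl<T> ValueStats<T> {
    pub fn new(min: Option<T>, max: Option<T>, null_count: Option<u64>) -> Self {
        Self { min, max, null_count }
    }

    /// Minimum value, or `None` if it is not set.
    pub fn min_opt(&self) -> Option<&T> {
        self.min.as_ref()
    }

    /// Maximum value, or `None` if it is not set.
    pub fn max_opt(&self) -> Option<&T> {
        self.max.as_ref()
    }

    /// Null count, or `None` if unknown. `Some(0)` means "known to have no nulls".
    pub fn null_count_opt(&self) -> Option<u64> {
        self.null_count
    }

    /// Old-style presence check, kept for compatibility but deprecated.
    #[deprecated(note = "use `min_opt` and `max_opt` instead")]
    pub fn has_min_max_set(&self) -> bool {
        self.min.is_some() && self.max.is_some()
    }
}

/// Hypothetical helper: returns both bounds only when both are present,
/// replacing a `has_min_max_set()` check followed by unchecked access.
pub fn min_max<T>(stats: &ValueStats<T>) -> Option<(&T, &T)> {
    Some((stats.min_opt()?, stats.max_opt()?))
}

/// Hypothetical helper: distinguishes "zero nulls" from "null count missing".
pub fn describe_nulls<T>(stats: &ValueStats<T>) -> String {
    match stats.null_count_opt() {
        Some(0) => "no nulls".to_string(),
        Some(n) => format!("{} nulls", n),
        None => "null count unknown".to_string(),
    }
}
```

With accessors shaped like this, a caller cannot read a bound without handling the unset case, which is presumably why the separate `has_*` checks became redundant.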
1 parent 2461a16, commit 69b17ad
Showing 9 changed files with 521 additions and 319 deletions.