Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: datafusion sanity checker passes when all files filtered out #2830

Conversation

adamfaulkner-at
Copy link
Contributor

@adamfaulkner-at adamfaulkner-at commented Aug 26, 2024

Description

Fix a bug in DeltaScanBuilder where query plans that excluded all files in the table would be rejected by Datafusion's sanity checker with this error message:

thread 'delta_datafusion::tests::parent_distribution_requirements_bug' panicked at crates/core/src/delta_datafusion/mod.rs:2623:14:
called `Result::unwrap()` on an `Err` value: Context("SanityCheckPlan", Plan("Child: [\"CoalesceBatchesExec: target_batch_size=8192\", \"  FilterExec: a@0 > s\", \"    DeltaScan\", \"      ParquetExec: file_groups={0 groups: []}, projection=[a], predicate=a@0 > s, pruning_predicate=CASE WHEN a_null_count@1 = a_row_count@2 THEN false ELSE a_max@0 > s END, required_guarantees=[]\"] does not satisfy parent distribution requirements: SinglePartition"))

This is arguably a bug in datafusion, but it was easy to fix it here.

Related Issue(s)

#2831

Documentation

apache/datafusion#11196

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Aug 26, 2024
Copy link

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@rtyler rtyler changed the title Failing test that demonstrates distribution requirements bug bug: f test that demonstrates distribution requirements bug Aug 26, 2024
@rtyler rtyler changed the title bug: f test that demonstrates distribution requirements bug bug: failing test that demonstrates distribution requirements bug Aug 26, 2024
@adamfaulkner-at adamfaulkner-at changed the title bug: failing test that demonstrates distribution requirements bug fix: Pass datafusion sanity checker when filtering out all files and sorting. Oct 2, 2024
@adamfaulkner-at adamfaulkner-at changed the title fix: Pass datafusion sanity checker when filtering out all files and sorting. fix: datafusion sanity checker passes when all files filtered out Oct 2, 2024
Copy link

codecov bot commented Oct 2, 2024

Codecov Report

Attention: Patch coverage is 92.59259% with 2 lines in your changes missing coverage. Please review.

Project coverage is 72.41%. Comparing base (8f4232f) to head (1408c7e).

Files with missing lines Patch % Lines
crates/core/src/delta_datafusion/mod.rs 92.59% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2830      +/-   ##
==========================================
- Coverage   72.41%   72.41%   -0.01%     
==========================================
  Files         131      131              
  Lines       40471    40496      +25     
  Branches    40471    40496      +25     
==========================================
+ Hits        29306    29324      +18     
- Misses       9291     9292       +1     
- Partials     1874     1880       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ion-elgreco ion-elgreco added this pull request to the merge queue Oct 3, 2024
Merged via the queue into delta-io:main with commit 8701046 Oct 3, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants