Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AggregationFuzzer error reporting #12832

Merged
merged 6 commits into from
Oct 15, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Oct 9, 2024

Which issue does this PR close?

Part of #12114

Rationale for this change

@Rachelint added an amazing fuzzer framework in #12667 that actually found bugs already ❤️

However, the messages when it fails are hard to interpret

What changes are included in this PR?

  1. Improve the error message reporting by returning DataFusionErrors and printing them rather than unwraping in a task

Are these changes tested?

I tested manually

Error before this PR:

called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(43), "should success to run sql: External(ArrowError(InvalidArgumentError(\"number of columns(1) must match number of fields(2) in schema\"), None))", ...)
thread 'tokio-runtime-worker' panicked at datafusion/core/tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs:243:14:
should success to run sql: External(ArrowError(InvalidArgumentError("number of columns(1) must match number of fields(2) in schema"), None))
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: core::result::unwrap_failed
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/result.rs:1679:5
   3: core::result::Result<T,E>::expect
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/result.rs:1059:23
   4: fuzz::fuzz_cases::aggregation_fuzzer::fuzzer::AggregationFuzzTestTask::run::{{closure}}
             at ./tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs:241:27
   5: fuzz::fuzz_cases::aggregation_fuzzer::fuzzer::AggregationFuzzer::run::{{closure}}::{{closure}}
             at ./tests/fuzz_cases/aggregation_fuzzer/fuzzer.rs:161:32
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/future/future.rs:123:9
   7: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:331:17
   8: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/loom/std/unsafe_cell.rs:16:9
   9: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/task/core.rs:320:13
  10: tokio::runtime::task::harness::poll_future::{{closure}}

Error after this PR:

##### AggregationFuzzer error report #####
### Sql:
SELECT b, max(a) FROM fuzz_table GROUP BY b
### Schema:
Field { name: "a", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "b", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }
### Session context params:
SessionContextParams { batch_size: 486, target_partitions: 4, sort_hint: false, skip_partial_params: SkipPartialParams { ratio_threshold: 0.0, rows_threshold: 0 } }
### Input:
+------------------------------------------------+----------------------+----------------------+
| a                                              | b                    | c                    |
+------------------------------------------------+----------------------+----------------------+
| 󄼃󭬗屝򅤃宒񖹴񙀇𘡹񀄼򫑚񵴺󁃧󘀴󄇮𑯎񛁔𚟄𻼘譝𻉹󳄱𿷇򋟢󩩜񀉒􍫑񂽚             |                      |                      |
| 𥦡񝙖󗭫𽈊𠋩ꠔ󕥘𿫮򎳊񩏽                                 | -4515507645800315940 |                      |
| 𢨭􁪯𯔽𮄲񷕉𔣔򱙖񵀕򄡢󎾦󽫉                                 | -1680924722960872900 |                      |
| 񑶷񕦙󩴋񧇙󍶦򏢙񞒞񫆟逞񧶸񍊩񨎈򄙖󞋔󤜒򔞋򨸩񗔤󷵆𶚻񦣰󑃘򎱷򺈩򪧽󶇧𰀸𽍯򸕜󉇳             | 2840913941112836491  |                      |
| 񓳋򪷀񌣷񤥅뙀񪾈󭯺􈉁𷻪𕿸󏠯􅃒󲭉󭧾𺋁󘽥򰚱򧨮񸡗𢴖󖞻񲜝񣌛􅹢񄁺󁰤񭰙񝅊񨁟򏠾󳨝𧈗𣉦򇇹       | 5960926486239415572  |                      |
...
| 򻇊󇺏󔄓񮊐񳅒򣟭򧢰򅝆𬼍𐪗                                    | -2146932088256327787 | 9126472890594735950  |
| 򫍠𯾣𺿆󿻋ﳲ򇾂ṷ􊍟򴖜煳􀫍񢣔󵯏늗򥡡󫁒񽏲󝰷󯻴𮕙𙪫თ􃅀𗛪򑸌鳘񀊒󫤑򑗽󟻬󧓡򊦉󺭑𰔯     | 4268632760573911042  | 9176925243680523636  |
| 􍩴࢓􁈇􊼟򨷲𥼔􅥠򪯗                                      | -4916135852127917200 | 9176967928481913325  |
+------------------------------------------------+----------------------+----------------------+

caused by
External error: Arrow error: Invalid argument error: number of columns(1) must match number of fields(2) in schema

Error!
thread 'fuzz_cases::aggregate_fuzz::test_basic_string_aggr_group_by_single_int64' panicked at datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs:723:9:
Error!
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: fuzz::fuzz_cases::aggregate_fuzz::unwrap_and_report
             at ./tests/fuzz_cases/aggregate_fuzz.rs:723:9
   3: fuzz::fuzz_cases::aggregate_fuzz::test_basic_string_aggr_group_by_single_int64::{{closure}}
             at ./tests/fuzz_cases/aggregate_fuzz.rs:272:5
   4: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/future/future.rs:123:9
   5: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/park.rs:281:63
   6: tokio::runtime::coop::with_budget
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:107:5
   7: tokio::runtime::coop::budget
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/coop.rs:73:5
   8: tokio::runtime::park::CachedParkThread::block_on
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/park.rs:281:31
   9: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/blocking.rs:66:9
  10: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/mod.rs:87:13
  11: tokio::runtime::context::runtime::enter_runtime
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/context/runtime.rs:65:16
  12: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/scheduler/multi_thread/mod.rs:86:9
  13: tokio::runtime::runtime::Runtime::block_on_inner
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/runtime.rs:363:45
  14: tokio::runtime::runtime::Runtime::block_on
             at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.40.0/src/runtime/runtime.rs:335:13
  15: fuzz::fuzz_cases::aggregate_fuzz::test_basic_string_aggr_group_by_single_int64
             at ./tests/fuzz_cases/aggregate_fuzz.rs:272:5
  16: fuzz::fuzz_cases::aggregate_fuzz::test_basic_string_aggr_group_by_single_int64::{{closure}}
             at ./tests/fuzz_cases/aggregate_fuzz.rs:239:56
  17: core::ops::function::FnOnce::call_once
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/ops/function.rs:250:5
  18: core::ops::function::FnOnce::call_once
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Oct 9, 2024
@@ -237,45 +252,53 @@ struct AggregationFuzzTestTask {
}

impl AggregationFuzzTestTask {
async fn run(&self) {
async fn run(&self) -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the key change is to return an actual DataFusionError here and then print it when looking at the task output

Comment on lines +293 to +295
### Sql:\n{}\n\
### Schema:\n{}\n\
### Session context params:\n{:?}\n\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we will also care about some information about reproducing like session context params, sql, when we encounter inconsistent results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that information is already included when inconsistent results are detected (in check_result) but perhaps I don't understand what you are proposing here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it is my mistake, I see it now.

Copy link
Contributor

@Rachelint Rachelint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@alamb
Copy link
Contributor Author

alamb commented Oct 11, 2024

Looks good to me.

THanks @Rachelint -- I am now just waiting for another committer to approve this PR so I can merge it

@jonahgao jonahgao merged commit 377a4c5 into apache:main Oct 15, 2024
24 checks passed
@alamb alamb deleted the alamb/better_fuzz_error branch October 15, 2024 10:43
@alamb
Copy link
Contributor Author

alamb commented Oct 15, 2024

Thank you @jonahgao

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants