This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Return empty value when select count(*) from empty table with extra RePartition after it #1205

Open
jackylee-ch opened this issue Jan 8, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@jackylee-ch
Contributor

Describe the bug
When a ShuffleExchange follows a HashAgg whose aggregate function is Count, and no input is passed to the HashAgg, gazelle returns an empty batch instead of a non-empty batch containing 0.

To Reproduce

spark.sql("select 1 as a").filter("a > 1").groupBy().count().repartition(10).explain(true)
@jackylee-ch jackylee-ch added the bug Something isn't working label Jan 8, 2023
@jackylee-ch
Contributor Author

jackylee-ch commented Jan 8, 2023

The main cause of this problem is that the iterator defined in ColumnarHashAggregateExec is invalid: its hasNext can return a different value when it is called twice without an intervening call to next. ColumnarShuffleWriteExec checks hasNext twice before calling next, so it hits exactly this case.
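
To illustrate the failure mode, here is a minimal sketch in plain Scala. It is not the gazelle code: `BrokenCountIterator`, `IdempotentCountIterator`, and `writeAll` are hypothetical stand-ins for the aggregate iterator and the shuffle writer, and the way the broken iterator mutates state in hasNext is only an assumption about how such a bug can arise.

```scala
object HasNextDemo {
  // Hypothetical stand-in for the aggregate output iterator.
  // It should emit exactly one row (the count), even for empty input.
  final class BrokenCountIterator(input: Iterator[Int]) extends Iterator[Long] {
    private var checked = false
    // Bug: hasNext mutates state, so a second call before next() returns false.
    override def hasNext: Boolean =
      if (checked) false
      else { checked = true; true }
    override def next(): Long = input.size.toLong
  }

  // Fixed variant: hasNext is a pure read; only next() advances the state.
  final class IdempotentCountIterator(input: Iterator[Int]) extends Iterator[Long] {
    private var emitted = false
    override def hasNext: Boolean = !emitted
    override def next(): Long = {
      if (emitted) throw new NoSuchElementException("count already emitted")
      emitted = true
      input.size.toLong
    }
  }

  // Hypothetical consumer that, like the shuffle writer described above,
  // calls hasNext twice before the first call to next().
  def writeAll(it: Iterator[Long]): List[Long] = {
    val out = List.newBuilder[Long]
    if (it.hasNext) {        // first check
      while (it.hasNext) {   // second check, still before next()
        out += it.next()
      }
    }
    out.result()
  }
}
```

With an empty input, `HasNextDemo.writeAll` yields an empty list for the broken iterator and a single `0` for the idempotent one, which mirrors the empty batch versus the expected batch containing 0. Keeping hasNext free of side effects is what the `Iterator` contract expects, so any consumer can call it repeatedly.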
