bug: groupby result not merged in cluster mode #8290

Closed
1 of 2 tasks
FANNG1 opened this issue Oct 18, 2022 · 2 comments · Fixed by #8309
Assignees
Labels
C-bug Category: something isn't working

Comments


FANNG1 commented Oct 18, 2022

Search before asking

  • I had searched in the issues and found no similar issues.

Version

xx

What's Wrong?

Running the SQL below on a 3-node cluster produces a result like:

20221010, 10
20221010, 20
20221010, 30

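The correct result is a single row: 20221010, 60. It seems the data is distributed by p_date and author_id, so rows with the same p_date can land on different nodes and the outer GROUP BY is only computed per node. A final merge should be added in the root fragment.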

sql

SELECT a.d_f_date,
       sum(a.m_live_pay_unrisk_order_amt) AS sqlrewriter0
FROM
  (SELECT `p_date` AS `d_f_date`,
          author_id,
          coalesce(SUM(`pay_unrisk_order_amt`), 0) AS `m_live_pay_unrisk_order_amt`
   FROM tableA
   WHERE p_date BETWEEN '20211001' AND '20211115'
     AND pay_unrisk_order_amt < 2000000
     AND pay_unrisk_order_amt > 0
   GROUP BY `p_date`,
            author_id) AS a
GROUP BY a.d_f_date
ORDER BY a.d_f_date,
         sqlrewriter0
LIMIT 100
;

Fragment 0:
  DataExchange: Shuffle
  Exchange Sink: fragment id: [1]
    Aggregate(Partial): group items: [199, 0], aggregate functions: [SUM(179)]
      Filter: [>=($1, 20211001), <=($1, 20211115), <($2, 2000000), >($2, 0)]
        TableScan: [tableA]

Fragment 1:
  DataExchange: Merge
  Exchange Sink: fragment id: [2]
    EvalScalar: [$0]
      Aggregate(Final): group items: [199], aggregate functions: [sum(200)]
        Aggregate(Partial): group items: [199], aggregate functions: [sum(200)]
          EvalScalar: [multi_if(is_not_null($0), assume_not_null($0), true, 0, NULL)]
            Aggregate(Final): group items: [199, 0], aggregate functions: [SUM(179)]
              Exchange Source: fragment id: [0]

Fragment 2:
  Limit: [100], Offset: [0]
    Sort: [199 ASC, 204 ASC], Limit: [100]
      Exchange Source: fragment id: [1]
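
To make the suspected cause concrete, here is a minimal Python sketch that mimics the plan above on three made-up rows. Everything in it is an assumption for illustration: the sample rows, the helper names (shuffle, sum_by, buggy_plan, fixed_plan), and the use of author_id modulo the node count as a stand-in for hashing (p_date, author_id). It is not Databend code, and it is not necessarily how #8309 fixed the problem.

```python
from collections import defaultdict

# Hypothetical rows: (p_date, author_id, pay_unrisk_order_amt).
ROWS = [
    ("20221010", 1, 10),
    ("20221010", 2, 20),
    ("20221010", 3, 30),
]
NUM_NODES = 3


def shuffle(rows, key_fn, num_nodes=NUM_NODES):
    """Route each row to the node selected by its (integer) shuffle key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[key_fn(row) % num_nodes].append(row)
    return nodes


def sum_by(rows, key_fn, value_fn):
    """Group rows by key and sum the values, like SUM(...) GROUP BY key."""
    acc = defaultdict(int)
    for row in rows:
        acc[key_fn(row)] += value_fn(row)
    return sorted(acc.items())


def buggy_plan(rows):
    """Mimic Fragments 0-2: shuffle on (p_date, author_id), aggregate locally."""
    # Stand-in for hashing (p_date, author_id): use author_id directly.
    nodes = shuffle(rows, lambda r: r[1])
    result = []
    for local in nodes:
        # Inner query: GROUP BY p_date, author_id.
        inner = sum_by(local, lambda r: (r[0], r[1]), lambda r: r[2])
        # Outer query: GROUP BY p_date, but only over this node's rows.
        outer = sum_by(inner, lambda kv: kv[0][0], lambda kv: kv[1])
        # The Merge exchange just concatenates the per-node outputs.
        result.extend(outer)
    return sorted(result)


def fixed_plan(rows):
    """Same as above, plus a final merge by p_date in the root fragment."""
    partials = buggy_plan(rows)
    return sum_by(partials, lambda kv: kv[0], lambda kv: kv[1])


if __name__ == "__main__":
    print(buggy_plan(ROWS))
    print(fixed_plan(ROWS))
```

With these sample rows, buggy_plan returns one row per node for 20221010, while fixed_plan combines them into a single row. An alternative with the same effect would be to re-shuffle on p_date alone before the outer aggregation.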

How to Reproduce?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
FANNG1 added the C-bug (Category: something isn't working) label on Oct 18, 2022
@sundy-li
Member

Related to distributed pipeline builder. cc @leiysky


FANNG1 commented Oct 19, 2022

@leiysky, could you explain how you fixed #8290 and #8292?


3 participants