Incorrect calculation of the total number of estimated / real rows in some parallel plans #604

Open
yhuelf opened this issue Jul 19, 2023 · 3 comments

Comments

@yhuelf

yhuelf commented Jul 19, 2023

In this plan, the inner side of the join is executed in full by each worker and by the leader, so every process builds its own private copy of the hash table. Multiplying the number of rows by "loops" is therefore inappropriate in this case (nodes 5 and 6): it counts the same rows once per process.

See the PostgreSQL documentation on parallel joins for further details: https://www.postgresql.org/docs/current/parallel-plans.html#PARALLEL-JOINS
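To make the expected arithmetic concrete, here is a minimal sketch of one possible heuristic, not the tool's actual code. It assumes the plan is available as the JSON output of EXPLAIN (ANALYZE, FORMAT JSON), where "Actual Rows" is the per-loop average and "Parallel Aware" marks parallel-aware nodes. The function names, the `processes` parameter (workers plus leader) and the row counts are made up for illustration, and the heuristic assumes each process runs a non-partitioned subtree the same number of times.

```python
def has_parallel_aware_node(node: dict) -> bool:
    """True if this node or any descendant is marked "Parallel Aware"."""
    if node.get("Parallel Aware", False):
        return True
    return any(has_parallel_aware_node(child) for child in node.get("Plans", []))


def total_rows(node: dict, processes: int) -> int:
    """Hypothetical helper: total rows to display for a node below a Gather.

    processes = number of workers launched + 1 for the leader (assumption).
    """
    rows = node["Actual Rows"]    # average rows per loop
    loops = node["Actual Loops"]  # summed over all processes
    if has_parallel_aware_node(node):
        # Rows are split across processes, so the sum over all loops is the total.
        return rows * loops
    # Every process ran this subtree in full: count only one process's loops.
    return rows * max(1, loops // processes)


# Non-parallel Hash over a plain Seq Scan, run by 2 workers + leader (made-up numbers).
hash_node = {
    "Node Type": "Hash", "Parallel Aware": False,
    "Actual Rows": 1000, "Actual Loops": 3,
    "Plans": [{"Node Type": "Seq Scan", "Parallel Aware": False,
               "Actual Rows": 1000, "Actual Loops": 3}],
}

# Parallel Hash over a Parallel Seq Scan for the same table (made-up numbers).
parallel_hash_node = {
    "Node Type": "Hash", "Parallel Aware": True,
    "Actual Rows": 333, "Actual Loops": 3,
    "Plans": [{"Node Type": "Seq Scan", "Parallel Aware": True,
               "Actual Rows": 333, "Actual Loops": 3}],
}

print(total_rows(hash_node, processes=3))
print(total_rows(parallel_hash_node, processes=3))
```

With these numbers the non-parallel hash reports 1000 rows instead of 3000, while the parallel hash still reports the sum over all processes (999, because the per-loop average is rounded).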

@yhuelf
Author

yhuelf commented Jul 19, 2023

Compare with the parallel hash join for the same query.

The only difference from the previous plan is a RESET enable_parallel_hash;

https://explain.dalibo.com/plan/f11gg33e19adf0dh

@yhuelf
Author

yhuelf commented Jul 19, 2023

The same problem occurs with a merge join, of course, as per the documentation:

https://explain.dalibo.com/plan/56a23c086073a315

@MatteoGioioso

I noticed the same: when workers are present, the rows shown in the plan are the average returned per worker, regardless of the number of loops.


2 participants