-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Implement join #559
[WIP] Implement join #559
Conversation
} | ||
|
||
async fn execute(&self) -> Result<SendableDataBlockStream> { | ||
let read_left_task = tokio::task::spawn(Self::read_from_sink(self.left.clone())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio:task::spawn will push the task to the global tokio runtime
.
In order to limit the resources of one session, we should use context::execute_task
:
https://github.com/datafuselabs/datafuse/blob/master/fusequery/query/src/pipelines/processors/processor_merge.rs#L64
It will bind all the threads used by the processors to the context's runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reminding me about this.
I was intending to implement a really simple version at first to help me understand the whole procedure, but so far I still haven't find a proper way to build join transform into Pipeline
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
planner -> transform:
https://github.com/datafuselabs/datafuse/blob/master/fusequery/query/src/pipelines/processors/pipeline_builder.rs
For a sql, steps to a pipeline:
- parser to planner
- optimizer
- planner -> transform by pipeline builder
- executor the pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BohuTANG Yes, but it seems fusequery
currently has strong consumption that a PlanNode
only has one input
, which doesn't fit join well.
I'm finding a way to handle this properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can extend it to many if need, processor now has many inputs.
946bd8b
to
78b5503
Compare
78b5503
to
4766d2d
Compare
4766d2d
to
b653b0b
Compare
Thanks for the contribution! Please review the labels and make any necessary changes. |
I will finish the prequisite of Join in other PRs first, so this PR won't be ready for review within a predictable period of time. |
Summary
Initial tasks to implement join:
Pipeline
Changelog
Related Issues
Related to #319
Test Plan
Unit Tests
Stateless Tests