feat: Partitioned execution #409
Labels
enhancement
New feature or request
Comments
#407 was a refactoring to move `ScalarValue` into a location more accessible to the physical plans. For now, the intention is to use `ScalarValue` within the physical plan to represent literal values. In the long term we may want to switch to an encoding that is better aligned with logical plans, but we can revisit that once the basic plumbing is laid out.
bjchambers
added a commit
that referenced
this issue
Jun 2, 2023
This is part of #409. Introduces `Pipeline` information to the physical plan. This indicates which steps are part of a linear sequence and should (ideally) be executed together. Also implements a pipeline "scheduler" that determines the pipeline for each step, in a new `sparrow-backend` crate. As the physical plan is built up, the code should go in this "compiler backend" package, which can own optimization and the conversion of logical plans to physical plans.
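To make the idea of grouping steps into linear pipelines concrete, here is a minimal sketch. It is not the `sparrow-backend` implementation: the `Step` struct, the `assign_pipelines` function, and the rule used (a step joins its input's pipeline only when it has exactly one input and that input has exactly one consumer) are assumptions chosen for illustration.

```rust
/// Hypothetical physical plan step (not the actual sparrow type).
/// Steps are assumed to be topologically ordered: inputs precede consumers.
struct Step {
    name: &'static str,
    /// Indices of the steps this step reads from.
    inputs: Vec<usize>,
}

/// Assign a pipeline id to every step.
///
/// Assumed rule: a step continues its input's pipeline when it has exactly
/// one input and that input has exactly one consumer. Otherwise it starts
/// a new pipeline.
fn assign_pipelines(steps: &[Step]) -> Vec<usize> {
    // Count how many steps consume each step's output.
    let mut consumers = vec![0usize; steps.len()];
    for step in steps {
        for &input in &step.inputs {
            consumers[input] += 1;
        }
    }

    let mut pipelines: Vec<usize> = Vec::with_capacity(steps.len());
    let mut next_pipeline = 0usize;
    for step in steps {
        let pipeline = match step.inputs.as_slice() {
            // Linear link: stay in the same pipeline as the single input.
            [input] if consumers[*input] == 1 => pipelines[*input],
            // Source, merge of several inputs, or fan-out: new pipeline.
            _ => {
                let id = next_pipeline;
                next_pipeline += 1;
                id
            }
        };
        pipelines.push(pipeline);
    }
    pipelines
}

fn main() {
    let steps = vec![
        Step { name: "scan_a", inputs: vec![] },
        Step { name: "project_a", inputs: vec![0] },
        Step { name: "scan_b", inputs: vec![] },
        Step { name: "select_b", inputs: vec![2] },
        Step { name: "merge", inputs: vec![1, 3] },
        Step { name: "project_merged", inputs: vec![4] },
    ];
    for (step, pipeline) in steps.iter().zip(assign_pipelines(&steps)) {
        println!("{} -> pipeline {}", step.name, pipeline);
    }
}
```

Under this assumed rule, each scan and the transform directly after it share a pipeline, and the merge starts a third pipeline that the final projection joins.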
bjchambers
added a commit
that referenced
this issue
Jun 3, 2023
This is part of #409. Introduces `Pipeline` information to the physical plan. This indicates which steps are part of a linear sequence and should (ideally) be executed together. Also implements a pipeline "scheduler" that determines the pipeline for each step, in a new `sparrow-backend` crate. As the physical plan is built up, the code should go in this "compiler backend" package, which can own optimization and the conversion of logical plans to physical plans.
bjchambers
changed the title
feat: Introduce physical plans suitable for partitioned & distributed execution
feat: Partitioned execution
Jul 21, 2023
This was referenced Jul 21, 2023
github-merge-queue bot
pushed a commit
that referenced
this issue
Jul 25, 2023
This introduces the key components of partitioned execution.

- `sparrow-scheduler` provides functionality for managing the separate pipelines within the query plan and morsel-driven parallelism. It manages a thread pool of workers pinned to specific CPUs, each pulling tasks from a local queue.
- `sparrow-transforms` will provide implementations of the "transforms" (project, select, etc.) and a pipeline for executing the transforms.
- `sparrow-execution` will pull everything together to provide partitioned execution.

This is part of #409.
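For readers unfamiliar with morsel-driven parallelism, the worker and local-queue arrangement described above can be sketched roughly as follows. This is an illustration using only the standard library, not the `sparrow-scheduler` API: `Morsel` and `run_morsels` are hypothetical names, CPU pinning is omitted (the standard library has no portable way to do it), and the fallback of taking work from other workers' queues is an assumption about how idle workers might behave.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical morsel: a small batch of rows processed as one task.
struct Morsel {
    rows: Vec<i64>,
}

/// Process all morsels on `workers` threads and return the sum of all rows.
/// Summing stands in for running a pipeline's transforms on the batch.
fn run_morsels(morsels: Vec<Morsel>, workers: usize) -> i64 {
    assert!(workers > 0, "at least one worker is required");

    // One local queue per worker.
    let queues: Vec<Arc<Mutex<VecDeque<Morsel>>>> = (0..workers)
        .map(|_| Arc::new(Mutex::new(VecDeque::new())))
        .collect();

    // Distribute morsels round-robin before any worker starts.
    for (i, morsel) in morsels.into_iter().enumerate() {
        queues[i % workers].lock().unwrap().push_back(morsel);
    }

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let queues = queues.clone();
            thread::spawn(move || {
                let mut total = 0i64;
                loop {
                    // Try the local queue first (offset 0), then the others.
                    let task = (0..queues.len())
                        .map(|offset| (id + offset) % queues.len())
                        .find_map(|q| queues[q].lock().unwrap().pop_front());
                    match task {
                        Some(morsel) => total += morsel.rows.iter().sum::<i64>(),
                        // No tasks are added after startup in this sketch,
                        // so empty queues everywhere means the work is done.
                        None => break,
                    }
                }
                total
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let morsels = vec![
        Morsel { rows: vec![1, 2, 3] },
        Morsel { rows: vec![4, 5] },
        Morsel { rows: vec![6] },
    ];
    println!("total = {}", run_morsels(morsels, 2));
}
```

A real scheduler would also need to handle tasks that produce further tasks and to use something cheaper than a mutex per queue; the sketch only shows the basic shape of workers draining local queues of small batches.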
Summary
The current "execution plan" is close to, but not quite suited for, describing the steps necessary to perform a computation.
Ideally, we would introduce a physical plan that is more similar to those of relational query engines, allowing us to leverage existing techniques and creating the option to run on existing systems.
For now, the plan is to introduce these physical plans and move execution towards running them directly, and then (separately) work towards compiling queries directly to physical plans.