Holocene

Holocene is a follow-up to eocene: a vectorized, push-based query engine that uses Arrow as its data format.

Push-based Vectorized Execution

In the context of database workloads, vectorized execution means processing batches of records. Most often the term refers to a variation of the Volcano model in which next() returns multiple records instead of a single one.
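As a minimal sketch of that idea (not the engine's actual code), the pull-based operators below return a whole batch from each call. The names `Operator`, `next_batch`, `Scan`, `Filter`, and `drain` are hypothetical, and a plain `Vec<i64>` stands in for an Arrow record batch.

```rust
/// Stand-in for a columnar batch; a real engine would use an Arrow batch here.
type Batch = Vec<i64>;

/// Volcano-style pull operator, except each call yields many records.
/// (Hypothetical trait, for illustration only.)
trait Operator {
    fn next_batch(&mut self) -> Option<Batch>;
}

/// Reads an in-memory table `batch_size` records at a time.
struct Scan {
    data: Vec<i64>,
    pos: usize,
    batch_size: usize,
}

impl Operator for Scan {
    fn next_batch(&mut self) -> Option<Batch> {
        if self.pos >= self.data.len() {
            return None; // input exhausted
        }
        // Guard against a zero batch size so the scan always makes progress.
        let step = self.batch_size.max(1);
        let end = (self.pos + step).min(self.data.len());
        let batch = self.data[self.pos..end].to_vec();
        self.pos = end;
        Some(batch)
    }
}

/// Pulls a batch from its child and keeps the records matching `predicate`.
struct Filter<C: Operator> {
    child: C,
    predicate: fn(i64) -> bool,
}

impl<C: Operator> Operator for Filter<C> {
    fn next_batch(&mut self) -> Option<Batch> {
        let batch = self.child.next_batch()?;
        Some(batch.into_iter().filter(|v| (self.predicate)(*v)).collect())
    }
}

/// Drives the plan from the root by pulling until the input is exhausted.
fn drain(mut root: impl Operator) -> Vec<Batch> {
    let mut out = Vec::new();
    while let Some(batch) = root.next_batch() {
        out.push(batch);
    }
    out
}
```

Calling `drain` on a `Filter` wrapped around a `Scan` pulls one batch per call through the whole plan, so the per-call overhead is paid once per batch rather than once per record.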

Actual vectorization, in the sense of SIMD instructions, is sometimes used to implement faster compute kernels. That does not mean the entire query plan is vectorized, although the plan can still be executed in parallel.
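To illustrate what a compute kernel can look like, here is a hedged sketch of a sum kernel written so that the compiler has a chance to auto-vectorize it. The name `sum_kernel` and the lane count are assumptions made for this example; whether SIMD instructions are actually emitted depends on the compiler, optimization level, and target CPU.

```rust
/// Number of independent accumulators; an assumed value chosen for illustration.
const LANES: usize = 8;

/// Sums a column using LANES independent accumulators.
/// The fixed-width inner loop has no cross-iteration dependency between lanes,
/// which is the shape compilers can typically turn into SIMD additions.
fn sum_kernel(values: &[i64]) -> i64 {
    let mut acc = [0i64; LANES];
    let chunks = values.chunks_exact(LANES);
    // Elements that do not fill a whole chunk are handled separately.
    let tail = chunks.remainder();
    for chunk in chunks {
        for lane in 0..LANES {
            acc[lane] += chunk[lane];
        }
    }
    acc.iter().sum::<i64>() + tail.iter().sum::<i64>()
}
```

The kernel only speeds up one operator over one batch; it says nothing about how the rest of the plan is scheduled.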

Push-based, in this context, describes a paradigm different from the Volcano model: instead of being asked for data by their consumers, operators push their results down the pipeline. The benefit of this approach is that the query plan becomes a DAG that can be executed in parallel, except at pipeline breakers, which can be seen as join points.
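The inversion of control can be sketched as follows. This is an illustrative example under assumed names (`PushOperator`, `Filter`, `Projection`, `Collect`, `run_source`), not the engine's real API, and `Vec<i64>` again stands in for an Arrow batch.

```rust
/// Stand-in for a columnar batch.
type Batch = Vec<i64>;

/// Push-style operator: it receives a batch and forwards its result downstream.
/// (Hypothetical trait, for illustration only.)
trait PushOperator {
    fn push(&mut self, batch: Batch);
}

/// Keeps matching records and pushes them to `next`.
struct Filter<N: PushOperator> {
    predicate: fn(i64) -> bool,
    next: N,
}

impl<N: PushOperator> PushOperator for Filter<N> {
    fn push(&mut self, batch: Batch) {
        let kept: Batch = batch.into_iter().filter(|v| (self.predicate)(*v)).collect();
        if !kept.is_empty() {
            self.next.push(kept);
        }
    }
}

/// Applies `map` to every record and pushes the result to `next`.
struct Projection<N: PushOperator> {
    map: fn(i64) -> i64,
    next: N,
}

impl<N: PushOperator> PushOperator for Projection<N> {
    fn push(&mut self, batch: Batch) {
        self.next.push(batch.into_iter().map(self.map).collect());
    }
}

/// Terminal operator that gathers everything pushed into it.
struct Collect {
    rows: Vec<i64>,
}

impl PushOperator for Collect {
    fn push(&mut self, batch: Batch) {
        self.rows.extend(batch);
    }
}

/// The source drives execution: it splits the input into batches and pushes
/// each one into the first operator of the pipeline.
fn run_source<N: PushOperator>(data: &[i64], batch_size: usize, first: &mut N) {
    for chunk in data.chunks(batch_size.max(1)) {
        first.push(chunk.to_vec());
    }
}
```

Here the source, not the root of the plan, decides when work happens, which is what makes it natural to hand different batches to different workers.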

Combining the two ideas, vectorized and push-based execution, yields a model that is very well suited to OLAP workloads.

Parallelizable part of the pipeline
each step pushes multiple records
down the pipeline
                                                       Pipeline breaking, since LIMIT
                                                       will be applied over all records
  +--------+                                                             |
  | Batch  |    +--------+    +------+    +--------+    +------------+   |    +-------+
  +--------+--->| Source |--->| Scan |--->| Filter |--->| Projection |   |    |       |
                +--------+    +------+    +--------+    +------------+   |    |       |
  +--------+                                                             |    |       |
  | Batch  |    +--------+    +------+    +--------+    +------------+   |    |       |
  +--------+--->| Source |--->| Scan |--->| Filter |--->| Projection |---+--->| Limit |
                +--------+    +------+    +--------+    +------------+   |    |       |
  +--------+                                                             |    |       |
  | Batch  |    +--------+    +------+    +--------+    +------------+   |    |       |
  +--------+--->| Source |--->| Scan |--->| Filter |--->| Projection |   |    |       |
                +--------+    +------+    +--------+    +------------+   |    +-------+
                                                                         |
  +--------+                                                             |
  | Batch  |                                                             |
  +--------+                                                             |
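
The diagram above could be approximated with threads and a channel, as in the sketch below. It assumes one worker thread per batch and a hypothetical `pipeline` function standing in for the scan, filter, and projection steps; `run_with_limit` plays the role of the Limit operator. Because workers finish in arbitrary order, which records survive the limit is not deterministic.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for a columnar batch.
type Batch = Vec<i64>;

/// Parallelizable part of the plan: filter, then project.
/// (Hypothetical stand-in for the Scan -> Filter -> Projection steps.)
fn pipeline(batch: Batch) -> Batch {
    batch
        .into_iter()
        .filter(|v| v % 2 == 0)
        .map(|v| v * 10)
        .collect()
}

/// Runs one worker per batch and applies LIMIT at the pipeline breaker,
/// where the results of all workers meet.
fn run_with_limit(batches: Vec<Batch>, limit: usize) -> Vec<i64> {
    let (tx, rx) = mpsc::channel::<Batch>();

    let workers: Vec<_> = batches
        .into_iter()
        .map(|batch| {
            let tx = tx.clone();
            thread::spawn(move || {
                // Sending fails only after the limit was reached and the
                // receiver was dropped; the result is then simply discarded.
                let _ = tx.send(pipeline(batch));
            })
        })
        .collect();
    // Drop the original sender so the channel closes once all workers finish.
    drop(tx);

    let mut out = Vec::new();
    for batch in rx {
        let remaining = limit - out.len();
        out.extend(batch.into_iter().take(remaining));
        if out.len() == limit {
            break; // limit satisfied, stop consuming
        }
    }

    for worker in workers {
        let _ = worker.join();
    }
    out
}
```

Everything before the channel runs independently per batch; only the limit needs a global view of the records, which is why it breaks the pipeline.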
