ref: create an Exprs #412

Merged: 1 commit, Jun 2, 2023
29 changes: 29 additions & 0 deletions crates/sparrow-physical/src/expr.rs
@@ -2,6 +2,35 @@ use std::borrow::Cow;

use arrow_schema::DataType;

+/// Represents 1 or more values computed by expressions.
+#[derive(Debug, serde::Serialize, serde::Deserialize)]
+pub struct Exprs {
+    /// The expressions computing the intermediate values.
+    pub exprs: Vec<Expr>,
+    /// The indices of columns to output.
+    pub outputs: Vec<ExprId>,
+}
+
+impl Exprs {
Review comment from the PR author (a collaborator), on `impl Exprs {`:

If we add a lot, I may move this to a separate exprs.rs file... but that can wait.

+    /// Create expressions computing the value of the last expression.
+    pub fn singleton(exprs: Vec<Expr>) -> Self {
+        let output = exprs.len() - 1;
+        Self {
+            exprs,
+            outputs: vec![output.into()],
+        }
+    }
+
+    pub fn is_singleton(&self) -> bool {
+        self.outputs.len() == 1
+    }
+
+    /// Return the number of outputs produced by these expressions.
+    pub fn output_len(&self) -> usize {
+        self.outputs.len()
+    }
+}
+
/// The identifier (index) of an expression.
#[derive(Debug, serde::Serialize, serde::Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(transparent)]
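
To make the behavior of `Exprs::singleton` concrete, here is a minimal self-contained sketch. `Expr` and `ExprId` below are simplified stand-ins, since their real definitions are not shown in this diff; the `name` field is hypothetical, and the `From<usize>` impl on `ExprId` is an assumption inferred from the `output.into()` call. One thing the sketch makes visible: `exprs.len() - 1` underflows on an empty vector (with default profiles, that subtraction panics in debug builds and wraps in release builds), so callers presumably need to pass at least one expression.

```rust
// Simplified stand-ins: the real `Expr` and `ExprId` are defined in
// crates/sparrow-physical/src/expr.rs and are not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
pub struct Expr {
    /// Hypothetical field: a label for the operation this expression applies.
    pub name: &'static str,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExprId(usize);

// Assumed from the `output.into()` call in `singleton`.
impl From<usize> for ExprId {
    fn from(index: usize) -> Self {
        ExprId(index)
    }
}

#[derive(Debug)]
pub struct Exprs {
    pub exprs: Vec<Expr>,
    pub outputs: Vec<ExprId>,
}

impl Exprs {
    /// Mirrors the method in the diff: the last expression is the only output.
    /// Expects a non-empty vector; `len() - 1` underflows otherwise.
    pub fn singleton(exprs: Vec<Expr>) -> Self {
        let output = exprs.len() - 1;
        Self {
            exprs,
            outputs: vec![output.into()],
        }
    }

    pub fn is_singleton(&self) -> bool {
        self.outputs.len() == 1
    }

    pub fn output_len(&self) -> usize {
        self.outputs.len()
    }
}

fn main() {
    // Three intermediate expressions; only the last one is an output.
    let exprs = Exprs::singleton(vec![
        Expr { name: "column" },
        Expr { name: "literal" },
        Expr { name: "gt" },
    ]);
    assert!(exprs.is_singleton());
    assert_eq!(exprs.output_len(), 1);
    assert_eq!(exprs.outputs, vec![ExprId(2)]);
}
```

The intermediate expressions stay in `exprs`, while `outputs` selects which of them are exposed, which is what lets the step definitions drop their separate index vectors.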
22 changes: 11 additions & 11 deletions crates/sparrow-physical/src/step.rs
@@ -1,5 +1,7 @@
use arrow_schema::SchemaRef;

+use crate::Exprs;
+
/// The identifier (index) of a step.

#[derive(Debug, serde::Serialize, serde::Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
@@ -41,27 +43,25 @@ pub enum StepKind {
     /// The output includes the same rows as the input, but with columns
     /// projected as configured.
     Project {
-        /// Expressions to apply to compute additional input columns.
-        exprs: Vec<crate::Expr>,
-        /// Indices of expressions to use for the output.
+        /// Expressions to compute the projection.
         ///
-        /// The length should be the same as the number of fields in the schema.
-        outputs: Vec<usize>,
+        /// The length of the outputs should be the same as the fields in the schema.
+        exprs: Exprs,
     },
     /// Filter the results based on a boolean predicate.
     Filter {
         /// Expressions to apply to compute the predicate.
         ///
-        /// The last expression should be the boolean predicate.
-        exprs: Vec<crate::Expr>,
+        /// There should be a single output producing a boolean value.
+        exprs: Exprs,
     },
     /// A step that repartitions the output.
     Repartition {
         num_partitions: usize,
-        /// Expressions to apply to compute columns which may be referenced by `keys`.
-        exprs: Vec<crate::Expr>,
-        /// Indices of expression columns representing the keys.
-        keys: Vec<usize>,
+        /// Expressions to compute the keys.
+        ///
+        /// Each output corresponds to a part of the key.
+        keys: Exprs,
     },
     Error,
 }
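
The doc comments on `Project` and `Filter` state invariants that the types alone do not enforce. As a rough sketch of how they could be checked, the snippet below uses reduced stand-ins for `Exprs` and `StepKind`; `check_step` and its `num_fields` parameter (standing in for the number of fields in the step's schema) are hypothetical and not part of this PR. No invariant is documented for `Repartition`, so the sketch accepts it unconditionally.

```rust
/// Reduced stand-in for `Exprs`: only the output indices matter for these checks.
pub struct Exprs {
    pub outputs: Vec<usize>,
}

impl Exprs {
    pub fn is_singleton(&self) -> bool {
        self.outputs.len() == 1
    }

    pub fn output_len(&self) -> usize {
        self.outputs.len()
    }
}

/// Reduced stand-in for `StepKind` with the three variants touched by the diff.
#[allow(dead_code)]
pub enum StepKind {
    Project { exprs: Exprs },
    Filter { exprs: Exprs },
    Repartition { num_partitions: usize, keys: Exprs },
}

/// Hypothetical helper (not part of the PR): check the documented invariants.
/// `num_fields` stands in for the number of fields in the step's schema.
pub fn check_step(kind: &StepKind, num_fields: usize) -> Result<(), String> {
    match kind {
        // Project: one output per field in the schema.
        StepKind::Project { exprs } => {
            if exprs.output_len() == num_fields {
                Ok(())
            } else {
                Err(format!(
                    "project: {} outputs but {} schema fields",
                    exprs.output_len(),
                    num_fields
                ))
            }
        }
        // Filter: exactly one output (the boolean predicate).
        StepKind::Filter { exprs } => {
            if exprs.is_singleton() {
                Ok(())
            } else {
                Err(format!(
                    "filter: expected 1 output, found {}",
                    exprs.output_len()
                ))
            }
        }
        // Repartition: the diff documents no length constraint on the keys.
        StepKind::Repartition { .. } => Ok(()),
    }
}

fn main() {
    let filter = StepKind::Filter { exprs: Exprs { outputs: vec![2] } };
    assert!(check_step(&filter, 3).is_ok());

    let project = StepKind::Project { exprs: Exprs { outputs: vec![0, 1] } };
    assert!(check_step(&project, 3).is_err());
}
```

The boolean type of the filter predicate is not checked here, since that would need the expression result types, which this diff does not show.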