Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEAT] sql image_decode #2757

Merged

Conversation

universalmind303
Copy link
Contributor

  • adds sql image_decode function.

  • moves image_decode out of daft-dsl and into daft-functions.

  • some small changes to SQLFunction so that it can work with ScalarUDF

@github-actions github-actions bot added the enhancement New feature or request label Aug 28, 2024
Copy link

codspeed-hq bot commented Aug 28, 2024

CodSpeed Performance Report

Merging #2757 will degrade performances by 16.54%

Comparing universalmind303:sql-image-functions (7e85278) with main (dee1af8)

Summary

❌ 1 regressions
✅ 15 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark main universalmind303:sql-image-functions Change
test_count[1 Small File] 20.5 ms 24.6 ms -16.54%

/// ```sql
/// SELECT concat('hello', 'world');
/// ```
/// this will be converted to: [lit("hello"), lit("world")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to differentiate between function args that take as. input expressions vs functions that mandate that we can only take only take in scalar inputs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand the question.

are you asking if function args can accept exprs as inputs as well?
such as some_function(column1, arg0 := column2)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I know what Jay is saying. For example, round takes in one argument decimals that can only be a literal integer value. In those cases, I believe it is already structured so that the to_expr methods in the SQL modules include logic to make those checks before creating an expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or another example is the discussion we had a few months ago regarding seed?

src/daft-sql/src/modules/image.rs Show resolved Hide resolved
Copy link
Member

@kevinzwang kevinzwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some questions about how the SQL functions are structured

/// ```sql
/// SELECT concat('hello', 'world');
/// ```
/// this will be converted to: [lit("hello"), lit("world")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I know what Jay is saying. For example, round takes in one argument decimals that can only be a literal integer value. In those cases, I believe it is already structured so that the to_expr methods in the SQL modules include logic to make those checks before creating an expression.

Comment on lines +30 to +62
match arg {
FunctionArg::Named {
name,
arg,
operator: FunctionArgOperator::Assignment,
} => match name.value.as_ref() {
"mode" => {
let arg = planner.try_unwrap_function_arg_expr(arg)?;
mode = Some(match arg.as_ref() {
Expr::Literal(LiteralValue::Utf8(s)) => s.parse()?,
_ => unsupported_sql_err!("Expected mode to be a string"),
});
}
"on_error" => {
let arg = planner.try_unwrap_function_arg_expr(arg)?;

raise_on_error = match arg.as_ref() {
Expr::Literal(LiteralValue::Utf8(s)) => match s.as_ref() {
"raise" => true,
"null" => false,
_ => unsupported_sql_err!(
"Expected on_error to be 'raise' or 'null'"
),
},
_ => unsupported_sql_err!("Expected raise_on_error to be a boolean"),
};
}
name => unsupported_sql_err!("Unexpected argument: '{name}'"),
},

other => {
unsupported_sql_err!("Invalid arguments for image_decode: '{}'", other)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great idea of how to do this at the moment, but I feel like there is a decent amount of boilerplate that will be shared across different functions to find and validate arguments, and we can maybe provide some common functions for simplifying that. Not blocking but a TODO for the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree, I feel like this could be solved a lot cleaner through serde, procedural macros or something else...

Comment on lines +79 to +90
match inputs {
[input] => {
let input = planner.plan_function_arg(input)?;
Ok(decode(input, None))
}
[input, args @ ..] => {
let input = planner.plan_function_arg(input)?;
let args = ImageDecode::from_sql(args, planner)?;
Ok(decode(input, Some(args)))
}
_ => unsupported_sql_err!("Invalid arguments for image_decode: '{inputs:?}'"),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also maybe a little messy to do part of the argument validation here and part in ImageDecode::from_sql

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think this is our usage of literals for some values, then exprs for other values surfacing. If we passed everything in as exprs, we could leave it up to ImageDecode to handle everything, but now we need to split it earlier to separate the exprs from the literals

@universalmind303 universalmind303 merged commit 734c13f into Eventual-Inc:main Sep 4, 2024
32 of 33 checks passed
universalmind303 added a commit that referenced this pull request Sep 4, 2024
@universalmind303 universalmind303 deleted the sql-image-functions branch January 23, 2025 06:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants