Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Send + Sync traits for Datum #5587

Closed
wants to merge 1 commit into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Apr 3, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

While working on apache/iceberg-rust#295, I faced an issue that a Datum (it is a Scalar behind the trait) cannot be moved into a closure.

error[E0277]: `dyn arrow_array::Datum` cannot be sent between threads safely
   --> crates/iceberg/src/arrow/reader.rs:451:17
    |
449 |               PredicateOperator::NotEq => Ok(Box::new(ArrowPredicateFn::new(
    |                                                       --------------------- required by a bound introduced by this call
450 |                   self.projection_mask.clone(),
451 | /                 move |batch| {
452 | |                     let left = batch.column(term_index);
453 | |                     neq(left, literal.as_ref())
454 | |                 },
    | |_________________^ `dyn arrow_array::Datum` cannot be sent between threads safely
    |

Because the function to return Datum may return Result to propagate error, I cannot put it into the closure.

I'm wondering if we can make Datum also Send + Sync?

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 3, 2024
@tustvold tustvold added the api-change Changes to the arrow API label Apr 3, 2024
@tustvold
Copy link
Contributor

tustvold commented Apr 3, 2024

Are you able to just add these constraints where necessary? This would avoid making a breaking change

@viirya
Copy link
Member Author

viirya commented Apr 3, 2024

Are you able to just add these constraints where necessary? This would avoid making a breaking change

Hmm, but the Datum is basically Scalar. It is a struct not a trait and it is not implemented with Send. Can I add Send as constraint for it outside this crate? I think it doesn't work if I add it like Datum + Send?

@tustvold
Copy link
Contributor

tustvold commented Apr 4, 2024

You can add constraints for marker traits at point of usage

@viirya
Copy link
Member Author

viirya commented Apr 4, 2024

Sorry, I still don't know how can I add Send as a constraint to Scalar outside this crate. 🤔

I can let the function return Datum + Send so Send is a constraint of the returned type. But Scalar doesn't implement Send, how can I make it implement Send outside this crate?

@tustvold
Copy link
Contributor

tustvold commented Apr 5, 2024

I'm confused, Scalar is just a wrapper around a T: Array that implements Datum, whether or not Datum has additional constraints or not has no bearing on whether Scalar is Send?

Perhaps you could provide a small code reproducer?

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

I'm confused, Scalar is just a wrapper around a T: Array that implements Datum, whether or not Datum has additional constraints or not has no bearing on whether Scalar is Send?

Perhaps you could provide a small code reproducer?

Hmm, do you mean to return Array (a Send) instead of Scalar?

Because Scalar is not a Send, although it is simply a wrapper of Array, I guess it cannot be used as Send automatically? I think it cannot be but let me try it first. Thanks.

@tustvold
Copy link
Contributor

tustvold commented Apr 5, 2024

Send is an auto-trait that will be automatically derived if the contents of the type are themselves Send. The same is true for Sync.

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

I just tried it. Looks like it works.

Because Scalar is not a Send, although it is simply a wrapper of Array, I guess it cannot be used as Send automatically? I think it cannot be but let me try it first. Thanks.

Interesting. So even Scalar is not a Send, if Rust compiler knows it is a wrapper of Array, it will be automatically a Send?

fn get_arrow_datum(datum: &Datum) -> Result<Box<dyn ArrowDatum + Send>> {
    match datum.literal() {
        PrimitiveLiteral::Boolean(value) => Ok(Box::new(BooleanArray::new_scalar(*value))),
        PrimitiveLiteral::Int(value) => Ok(Box::new(Int32Array::new_scalar(*value))),
        PrimitiveLiteral::Long(value) => Ok(Box::new(Int64Array::new_scalar(*value))),
        PrimitiveLiteral::Float(value) => Ok(Box::new(Float32Array::new_scalar(value.as_f32()))),
        PrimitiveLiteral::Double(value) => Ok(Box::new(Float64Array::new_scalar(value.as_f64()))),
        l => Err(Error::new(
            ErrorKind::DataInvalid,
            format!("Unsupported literal type: {:?}", l),
        )),
    }
}

fn some_func(...) {
    let literal = get_arrow_datum(&literal)?;

    ....
    move |batch| {
        let left = batch.column(term_index);
         lt(left, literal.as_ref())
    },
    ....
}

@viirya
Copy link
Member Author

viirya commented Apr 5, 2024

Send is an auto-trait that will be automatically derived if the contents of the type are themselves Send. The same is true for Sync.

Ah, got it. I thought it needs to explicitly implement it. Learn it. Thank you @tustvold !

@viirya viirya closed this Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change Changes to the arrow API arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants