PyO3 bridge for pyarrow interoperability / fix arrow integration test #691
    }
}

impl PyArrowConvert for ArrayData {
Implementing the conversion traits on either ArrayRef or T: Array is not possible, so I stuck with ArrayData, but implemented the PyArrowConvert trait on the array types and ArrayRef for convenience.
Codecov Report
@@ Coverage Diff @@
## master #691 +/- ##
==========================================
- Coverage 82.43% 82.39% -0.05%
==========================================
Files 168 168
Lines 47340 47396 +56
==========================================
+ Hits 39027 39050 +23
- Misses 8313 8346 +33
impl<'a> IntoPy<PyObject> for $typ {
    fn into_py(self, py: Python) -> PyObject {
        self.to_pyarrow(py).unwrap()
We need to unwrap here because IntoPy is infallible; see PyO3/pyo3#1813.
pub trait PyArrowConvert: Sized {
    fn from_pyarrow(value: &PyAny) -> PyResult<Self>;
    fn to_pyarrow(&self, py: Python) -> PyResult<PyObject>;
Perhaps we should split it into two conversion traits: TryFromPyArrow and TryIntoPyArrow?
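A minimal sketch of what such a split could look like, using only the standard library. The trait names come from the suggestion above; everything else (`FakePyObject`, `ConvertError`, the method names, and the choice of `i64` and `String` as implementors) is a hypothetical stand-in for the PyO3 and arrow-rs types, not the API that was merged.

```rust
/// Hypothetical stand-in for a Python object handle (the PR itself uses PyO3 types).
#[derive(Debug, Clone, PartialEq)]
enum FakePyObject {
    Int(i64),
    Str(String),
}

/// Hypothetical error type standing in for a Python exception.
#[derive(Debug, PartialEq)]
struct ConvertError(String);

/// Fallible conversion from the Python side into a Rust value.
trait TryFromPyArrow: Sized {
    fn try_from_pyarrow(value: &FakePyObject) -> Result<Self, ConvertError>;
}

/// Fallible conversion from a Rust value to the Python side.
trait TryIntoPyArrow {
    fn try_into_pyarrow(&self) -> Result<FakePyObject, ConvertError>;
}

// A type may implement both directions...
impl TryFromPyArrow for i64 {
    fn try_from_pyarrow(value: &FakePyObject) -> Result<Self, ConvertError> {
        match value {
            FakePyObject::Int(v) => Ok(*v),
            other => Err(ConvertError(format!("expected Int, got {:?}", other))),
        }
    }
}

impl TryIntoPyArrow for i64 {
    fn try_into_pyarrow(&self) -> Result<FakePyObject, ConvertError> {
        Ok(FakePyObject::Int(*self))
    }
}

// ...or only one of them, which a single combined trait cannot express.
impl TryIntoPyArrow for String {
    fn try_into_pyarrow(&self) -> Result<FakePyObject, ConvertError> {
        Ok(FakePyObject::Str(self.clone()))
    }
}
```

With this shape, `i64::try_from_pyarrow(&obj)` and `value.try_into_pyarrow()` both return a `Result`, so callers decide how to surface a failed conversion instead of unwrapping inside the trait.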
I feel I could review the Rust part of this code for basic hygiene (and what I looked at looks good) but I don't understand the nuances with
@jorgecarleitao I addressed the review comments and fixed the Python integration test.
LGTM. Thanks @kszucs!
Thanks Jorge! Merging it then.
@@ -50,6 +50,8 @@ chrono = "0.4"
flatbuffers = { version = "=2.0.0", optional = true }
hex = "0.4"
comfy-table = { version = "4.0", optional = true, default-features = false }
prettytable-rs = { version = "0.8.0", optional = true }
@kszucs Was this (re)added by mistake or is it intentional?
Looks like it? prettytable-rs should have been replaced by prettyprint.
Mistake, sorry about that! Created a PR to remove it #737
@@ -138,7 +138,7 @@ pub(super) fn list_equal<T: OffsetSizeTrait>(
         child_rhs_nulls.as_ref(),
         lhs_offsets[lhs_start].to_usize().unwrap(),
         rhs_offsets[rhs_start].to_usize().unwrap(),
-        (lhs_offsets[len] - lhs_offsets[lhs_start])
+        (lhs_offsets[lhs_start + len] - lhs_offsets[lhs_start])
The bug in question appears to have been introduced in apache/arrow#8541, so it is already present in arrow 5.0 and the fix does not need to be backported directly into arrow 5.x. I was worried that I had backported a bug into arrow 5.3 that would then have needed a fix backported anyway.
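To make the offset arithmetic in the list_equal hunk concrete, here is a small standalone sketch. It assumes the usual Arrow list layout, where slot i spans the child range offsets[i]..offsets[i + 1]. The helper names `child_len` and `child_len_old` are hypothetical and do not appear in arrow-rs.

```rust
use std::convert::TryFrom;

/// Number of child values covered by `len` list slots starting at slot `start`.
/// The end of the range is the offset at `start + len`, not at `len`.
fn child_len(offsets: &[i32], start: usize, len: usize) -> Option<usize> {
    usize::try_from(offsets[start + len] - offsets[start]).ok()
}

/// The pre-fix expression, kept only for comparison: it reads the end offset
/// at index `len`, which is only correct when `start == 0`.
fn child_len_old(offsets: &[i32], start: usize, len: usize) -> Option<usize> {
    usize::try_from(offsets[len] - offsets[start]).ok()
}
```

For offsets `[0, 2, 5, 9, 12]` (four lists of lengths 2, 3, 4 and 3), comparing two slots starting at slot 2 should cover 12 - 5 = 7 child values, while the old expression yields 5 - 5 = 0. With `start == 0` both expressions agree, so the difference only shows up when the comparison starts at a non-zero slot.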
Which issue does this PR close?
Closes #.
Rationale for this change
Motivation comes from apache/datafusion#856 (comment)
PyO3 provides conversion traits between Rust and Python types. Using the arrow-rs types in the datafusion Python bindings required a lot of boilerplate, even though we could simply annotate the right type in the function signature and let PyO3 do the majority of the work.
The error handling should be improved.
What changes are included in this PR?
Are there any user-facing changes?