pyo3-arrow docs edits (#123)

kylebarron · Aug 13, 2024 · 95d9952
1 parent f69a65b
Showing 1 changed file with 11 additions and 1 deletion.
diff --git a/pyo3-arrow/README.md b/pyo3-arrow/README.md
@@ -128,6 +128,7 @@ You must depend on the `arro3-core` Python package; then you can use the `to_arr
 | `PyField`             | `arro3.core.Field`             |
 | `PySchema`            | `arro3.core.Schema`            |
 | `PyArray`             | `arro3.core.Array`             |
+| `PyArrayReader`       | `arro3.core.ArrayReader`       |
 | `PyRecordBatch`       | `arro3.core.RecordBatch`       |
 | `PyChunkedArray`      | `arro3.core.ChunkedArray`      |
 | `PyTable`             | `arro3.core.Table`             |
@@ -149,6 +150,8 @@ In this case, you must depend on `pyarrow` and you can use the `to_pyarrow` meth
 | `PyTable`             | `pyarrow.Table`             |
 | `PyRecordBatchReader` | `pyarrow.RecordBatchReader` |
 
+`pyarrow` does not have the equivalent of a `PyArrayReader`, but if the materialized data fits in memory, you can convert a `PyArrayReader` to a `PyChunkedArray` and pass that to `pyarrow`.
+
 #### Using `nanoarrow`
 
 [`nanoarrow`](https://arrow.apache.org/nanoarrow/latest/index.html) is an alternative Python library for working with Arrow data. It's similar in goals to arro3, but is written in C instead of Rust. Additionally, it has a smaller type system than `pyarrow` or `arro3`, with logical arrays and record batches both represented by the `nanoarrow.Array` class.
@@ -161,10 +164,18 @@ In this case, you must depend on `nanoarrow` and you can use the `to_nanoarrow`
 | `PySchema`            | `nanoarrow.Schema`      |
 | `PyArray`             | `nanoarrow.Array`       |
 | `PyRecordBatch`       | `nanoarrow.Array`       |
+| `PyArrayReader`       | `nanoarrow.ArrayStream` |
 | `PyChunkedArray`      | `nanoarrow.ArrayStream` |
 | `PyTable`             | `nanoarrow.ArrayStream` |
 | `PyRecordBatchReader` | `nanoarrow.ArrayStream` |
 
+## Version compatibility
+
+| pyo3-arrow | pyo3 | arrow-rs |
+| ---------- | ---- | -------- |
+| 0.1        | 0.21 | 52       |
+| 0.2        | 0.21 | 52       |
+
 ## Why not use arrow-rs's Python integration?
 
 arrow-rs has [some existing Python integration](https://docs.rs/arrow/latest/arrow/pyarrow/index.html), but there are a few reasons why I created `pyo3-arrow`:
@@ -173,4 +184,3 @@ arrow-rs has [some existing Python integration](https://docs.rs/arrow/latest/arr
 - arrow-rs's Python FFI integration does not support Arrow extension types, because it omits field metadata when constructing an `Arc<dyn Array>`. pyo3-arrow gets around this by storing both an `ArrayRef` (`Arc<dyn Array>`) and a `FieldRef` (`Arc<Field>`) in a `PyArray` struct.
 - arrow-rs has no ability to work with an Arrow stream of bare arrays that are not record batches, and so it has no way to interop with a `pyarrow.ChunkedArray` or `polars.Series`.
 - In my opinion arrow-rs is too tightly connected to pyo3 and pyarrow. pyo3 releases don't line up with arrow-rs's release cadence, which means it could be a bit of a wait to use the latest pyo3 version with arrow-rs, especially with arrow-rs [waiting longer to release breaking changes](https://github.com/apache/arrow-rs#release-versioning-and-schedule).
-