50.0.0 (2024-01-08)
Breaking changes:
- Make regexp_match take scalar pattern and flag #5245 [arrow] (viirya)
- Use Vec in ColumnReader (#5177) #5193 [parquet] (tustvold)
- Remove SIMD Feature #5184 [arrow] (tustvold)
- Use Total Ordering for Aggregates and Refactor for Better Auto-Vectorization #5100 [arrow] (jhorstmann)
- Allow the zip compute function to operate on Scalar values via Datum #5086 [arrow] (Nathan-Fenner)
- Improve C Data Interface and Add Integration Testing Entrypoints #5080 [arrow] (pitrou)
- Parquet: read/write f16 for Arrow #5003 [parquet] (Jefffrey)
Implemented enhancements:
- Support get offsets or blocks info from arrow file. #5252 [arrow]
- Make regexp_match take scalar pattern and flag #5246 [arrow]
- Cannot access pen state website on arrow-row #5238 [arrow]
- RecordBatch with_schema's error message is hard to read #5227 [arrow]
- Support cast between StructArray. #5219 [arrow]
- Remove nightly-only simd feature and related code in ArrowNumericType #5185 [arrow]
- Use Vec instead of Slice in ColumnReader #5177 [parquet]
- Request to Memmap Arrow IPC files on disk #5153 [arrow]
- GenericColumnReader::read_records Yields Truncated Records #5150 [parquet]
- Nested Schema Projection #5148 [parquet] [arrow]
- Support specifying quote and escape in CsvWriterBuilder #5146 [arrow]
- Support casting of Float16 with other numeric types #5138 [arrow]
- Parquet: read parquet metadata with page index in async and with size hints #5129 [parquet]
- Cast from floating/timestamp to timestamp/floating #5122 [arrow]
- Support Casting List To/From LargeList in Cast Kernel #5113 [arrow]
- Expose a path for converting bytes::Bytes into arrow_buffer::Buffer without copy #5104 [arrow]
- API inconsistency of ListBuilder make it hard to use as nested builder #5098 [arrow]
- Parquet: don't truncate min/max statistics for float16 and decimal when writing file #5075 [parquet]
- Parquet: derive boundary order when writing columns #5074 [parquet]
- Support new Arrow PyCapsule Interface for Python FFI #5067 [arrow]
- 48.0.1 arrow patch release #5050 [parquet] [arrow]
- Binary columns do not receive truncated statistics #5037 [parquet]
- Re-evaluate Explicit SIMD Aggregations #5032 [arrow]
- Min/Max Kernels Should Use Total Ordering #5031 [arrow]
- Allow zip compute kernel to take Scalar / Datum #5011 [arrow]
- Add Float16/Half-float logical type to Parquet #4986 [parquet]
- feat: cast (Large)List to FixedSizeList #5081 [arrow] (wjones127)
- Update Parquet Encoding Documentation #5051 [parquet]
Fixed bugs:
- json schema inference can't handle null field turned into object field in subsequent rows #5215 [arrow]
- Invalid trailing content after Z in timezone is ignored #5182 [arrow]
- Take panics on a fixed size list array when given null indices #5169 [arrow]
- EnabledStatistics::Page does not take effect on ByteArrayEncoder #5162 [parquet]
- Parquet: ColumnOrder not being written when writing parquet files #5152 [parquet]
- Parquet: Interval columns shouldn't write min/max stats #5145 [parquet]
- cast Utf8 to decimal failure #5127 [arrow]
- coerce_primitive not honored when decoding from serde object #5095 [arrow]
- Unsound MutableArrayData Constructor #5091 [arrow]
- RowGroupReader.get_row_iter() fails with Path ColumnPath not found #5064 [parquet]
- cast format 'yyyymmdd' to Date32 gives an error #5044 [arrow]
Performance improvements:
Closed issues:
- Working example of list_flights with ObjectStore #5116
- (object_store) Error broken pipe on S3 multipart upload #5106
Merged pull requests:
- Update parquet object_store dependency to 0.9.0 #5290 [parquet] (tustvold)
- Update proc-macro2 requirement from =1.0.75 to =1.0.76 #5289 [arrow] [arrow-flight] (dependabot[bot])
- Enable JS tests again #5287 (domoritz)
- Update proc-macro2 requirement from =1.0.74 to =1.0.75 #5279 [arrow] [arrow-flight] (dependabot[bot])
- Update proc-macro2 requirement from =1.0.73 to =1.0.74 #5271 [arrow] [arrow-flight] (dependabot[bot])
- Update proc-macro2 requirement from =1.0.71 to =1.0.73 #5265 [arrow] [arrow-flight] (dependabot[bot])
- Update docs for datatypes #5260 [arrow] (Jefffrey)
- Don't suppress errors in ArrowArrayStreamReader #5256 [arrow] (tustvold)
- Add IPC FileDecoder #5249 [arrow] (tustvold)
- optimize the next function of ArrowArrayStreamReader #5248 [arrow] (doki23)
- ci: Fail Miri CI on first failure #5243 (Jefffrey)
- Remove 'unwrap' from Result #5241 [parquet] (zeevm)
- Update arrow-row docs URL #5239 [arrow] (thomas-k-cameron)
- Improve regexp kernels performance by avoiding cloning Regex #5235 [arrow] (viirya)
- Update proc-macro2 requirement from =1.0.70 to =1.0.71 #5231 [arrow] [arrow-flight] (dependabot[bot])
- Minor: Improve comments and errors for ArrowPredicate #5230 [parquet] (alamb)
- Bump actions/upload-pages-artifact from 2 to 3 #5229 (dependabot[bot])
- make with_schema's error more readable #5228 [arrow] (shuoli84)
- Use try_new when casting between structs to propagate error #5226 [arrow] (viirya)
- feat(cast): support cast between struct #5221 [arrow] (my-vegetable-has-exploded)
- Add entries to MapBuilder to return both key and value array builders #5218 [arrow] (viirya)
- fix(json): fix inferring object after field was null #5216 [arrow] (kskalski)
- Support MapBuilder in make_builder #5210 [arrow] (viirya)
- impl From<OffsetBuffer<T>> for ScalarBuffer<T> #5203 [arrow] (mbrobbel)
- impl From<BufferBuilder<T>> for Buffer #5202 [arrow] (mbrobbel)
- impl From<BufferBuilder<T>> for ScalarBuffer<T> #5201 [arrow] (mbrobbel)
- feat: Support quote and escape in Csv WriterBuilder #5196 [arrow] (my-vegetable-has-exploded)
- chore: simplify cast_string_to_interval #5195 [arrow] (jackwener)
- Clarify interval comparison behavior with documentation and tests #5192 [arrow] (alamb)
- Add BooleanArray::into_parts method #5191 [arrow] (mbrobbel)
- Fix deprecated note for Buffer::from_raw_parts #5190 [arrow] (mbrobbel)
- Fix: Ensure Timestamp Parsing Rejects Characters After 'Z' #5189 [arrow] (razeghi71)
- Simplify parquet statistics generation #5183 [parquet] (tustvold)
- Parquet: Ensure page statistics are written only when configured from the Arrow Writer #5181 [parquet] (AdamGS)
- Blockwise IO in IPC FileReader (#5153) #5179 [arrow] (tustvold)
- Replace ScalarBuffer in Parquet with Vec (#1849) (#5177) #5178 [parquet] (tustvold)
- Bump actions/setup-python from 4 to 5 #5175 (dependabot[bot])
- Add LargeListBuilder to make_builder #5171 [arrow] (viirya)
- fix: ensure take_fixed_size_list can handle null indices #5170 (westonpace)
- Removing redundant as casts in parquet #5168 [parquet] (psvri)
- Bump actions/labeler from 4.3.0 to 5.0.0 #5167 (dependabot[bot])
- improve: make RunArray displayable #5166 [arrow] (yukkit)
- ci: Add cargo audit CI action #5160 [arrow] (Jefffrey)
- Parquet: write column_orders in FileMetaData #5158 [parquet] (Jefffrey)
- Adding is_null datatype shortcut method #5157 [arrow] (comphead)
- Parquet: don't truncate f16/decimal min/max stats #5154 [parquet] (Jefffrey)
- Support nested schema projection (#5148) #5149 [arrow] (tustvold)
- Parquet: omit min/max for interval columns when writing stats #5147 [parquet] (Jefffrey)
- Deprecate Fields::remove and Schema::remove #5144 [arrow] (tustvold)
- Support casting of Float16 with other numeric types #5139 [arrow] (viirya)
- Parquet: Make MetadataLoader public #5137 [parquet] (AdamGS)
- Add FileReaderBuilder for arrow-ipc to allow reading large no. of column files #5136 [arrow] (Jefffrey)
- Parquet: clear metadata and project fields of ParquetRecordBatchStream::schema #5135 [parquet] (Jefffrey)
- JSON: write struct array nulls as null #5133 [arrow] (Jefffrey)
- Update proc-macro2 requirement from =1.0.69 to =1.0.70 #5131 [arrow] [arrow-flight] (dependabot[bot])
- Fix negative decimal string #5128 [arrow] (viirya)
- Cleanup list casting and support nested lists (#5113) #5124 [arrow] (tustvold)
- Cast from numeric/timestamp to timestamp/numeric #5123 [arrow] (viirya)
- Improve cast docs #5114 [arrow] (tustvold)
- Update prost-build requirement from =0.12.2 to =0.12.3 #5112 [arrow] [arrow-flight] (dependabot[bot])
- Parquet: derive boundary order when writing #5110 [parquet] (Jefffrey)
- Implementing ArrayBuilder for Box<dyn ArrayBuilder> #5109 [arrow] (viirya)
- Fix 'ColumnPath not found' error reading Parquet files with nested REPEATED fields #5102 [parquet] (mmaitre314)
- fix: coerce_primitive for serde decoded data #5101 [arrow] (fansehep)
- Extend aggregation benchmarks #5096 [arrow] (jhorstmann)
- Expand parquet crate overview doc #5093 [parquet] (mmaitre314)
- Ensure arrays passed to MutableArrayData have same type (#5091) #5092 [arrow] (tustvold)
- Update prost-build requirement from =0.12.1 to =0.12.2 #5088 [arrow] [arrow-flight] (dependabot[bot])
- Add FFI from_raw #5082 [arrow] (tustvold)
- [fix #5044] Support converting 'yyyymmdd' format to date #5078 [arrow] (Tangruilin)
- Enable truncation of binary statistics columns #5076 [parquet] (emcake)
- IPC writer truncated sliced list/map values #5071 [arrow] (Jefffrey)
- Implement Arrow PyCapsule Interface #5070 [arrow] (kylebarron)
- Remove ByteBufferPtr and replace with Bytes #5055 [parquet] (Jefffrey)
- Support multiple GZip members in parquet page #4951 [parquet] (tustvold)