This crate contains the official Native Rust implementation of Apache Parquet, which is part of the Apache Arrow project.
See the API documentation for examples and the full API.
The API documentation for the most recent, unreleased code is available here.
This crate is tested with the latest stable version of Rust. We do not currently test against other, older versions of the Rust compiler.
The parquet crate follows the SemVer standard defined by Cargo and works well within the Rust crate ecosystem. See the repository README for more details on the release schedule, version and deprecation policy.
Note that for historical reasons, this crate uses major version numbers greater than 0 (e.g. 19.0.0), unlike many other crates in the Rust ecosystem, which spend extended time releasing 0.x versions to signal planned ongoing API changes. Minor releases contain only compatible changes, while major releases may contain breaking API changes.
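As a rough illustration of the versioning rule above, the following standard-library-only sketch models how a default (caret) Cargo requirement treats minor and major releases. The `Version` type and the `parse_version` and `is_compatible_upgrade` helpers are hypothetical names introduced here, not part of the parquet crate or of Cargo, and the version numbers other than 19.0.0 are made up for the example. Pre-release tags and 0.x versions are deliberately not modeled.

```rust
/// A plain `major.minor.patch` version. Pre-release and build metadata
/// are ignored in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse a string of the form `major.minor.patch`.
/// Returns `None` for anything else.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.trim().split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(Version { major, minor, patch })
}

/// Returns true if `candidate` could satisfy a caret requirement on `required`.
/// Assumption: `required.major >= 1`, which holds for this crate's versions;
/// 0.x versions follow stricter rules that are not modeled here.
fn is_compatible_upgrade(required: Version, candidate: Version) -> bool {
    required.major >= 1 && candidate.major == required.major && candidate >= required
}

fn main() {
    let required = parse_version("19.0.0").expect("valid version");
    // 19.1.0 and 20.0.0 are hypothetical versions used only for illustration.
    for candidate in ["19.0.0", "19.1.0", "20.0.0"] {
        let v = parse_version(candidate).expect("valid version");
        println!(
            "{candidate} satisfies ^19.0.0: {}",
            is_compatible_upgrade(required, v)
        );
    }
}
```

In this model a minor bump within the same major version is accepted automatically, while a major bump has to be opted into by editing the requirement, which is where breaking API changes may appear.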
The parquet crate provides the following features, which may be enabled in your Cargo.toml:
- arrow (default) - support for reading / writing arrow arrays to / from Parquet
- async - support for async APIs for reading Parquet
- json - support for reading / writing json data to / from Parquet
- brotli (default) - support for Parquet using brotli compression
- flate2 (default) - support for Parquet using gzip compression
- lz4 (default) - support for Parquet using lz4 compression
- zstd (default) - support for Parquet using zstd compression
- snap (default) - support for Parquet using snappy compression
- cli - parquet CLI tools
- crc - enables functionality to automatically verify checksums of each page (if present) when decoding
- experimental - Experimental APIs which may change, even between minor releases
- simdutf8 (default) - Use the simdutf8 crate for SIMD-accelerated UTF-8 validation
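As a sketch of how such a feature selection might look, the Cargo.toml fragment below disables the default features and opts back in to a small subset. The version number is only a placeholder borrowed from the 19.0.0 example earlier in this document, and the particular feature combination is an arbitrary illustration; check the API documentation for the features available in the release you depend on.

```toml
[dependencies]
# Placeholder version: substitute the release you actually target.
# default-features = false turns off everything marked "(default)" above,
# so only the features listed here are compiled in.
parquet = { version = "19.0.0", default-features = false, features = ["arrow", "snap", "zstd"] }
```

Trimming the default features this way is mainly useful for reducing compile time and dependency count when only one or two compression codecs are needed.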
Parquet feature status:
- All encodings supported
- All compression codecs supported
- Read support
  - Primitive column value readers
  - Row record reader
  - Arrow record reader
  - Async support (to Arrow)
- Statistics support
- Write support
  - Primitive column value writers
  - Row record writer
  - Arrow record writer
  - Async support
- Predicate pushdown
- Parquet format 4.0.0 support
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.