From dc99f2c3d30bc7d306bd607a7122531595827c1a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?=
Date: Tue, 28 Jan 2025 15:22:04 +0100
Subject: [PATCH 1/2] MINOR: Add links to projects

---
 README.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 3437b4645..ad024ec6b 100644
--- a/README.md
+++ b/README.md
@@ -43,11 +43,15 @@ Parquet is built to be used by anyone. The Hadoop ecosystem is rich with data pr
 
 ## Modules
 
-The `parquet-format` project contains format specifications and Thrift definitions of metadata required to properly read Parquet files.
+The [parquet-format] project contains format specifications and Thrift definitions of metadata required to properly read Parquet files.
 
-The `parquet-java` project contains multiple sub-modules, which implement the core components of reading and writing a nested, column-oriented data stream, map this core onto the parquet format, and provide Hadoop Input/Output Formats, Pig loaders, and other java-based utilities for interacting with Parquet.
+The [parquet-java] project contains multiple sub-modules, which implement the core components of reading and writing a nested, column-oriented data stream, map this core onto the parquet format, and provide Hadoop Input/Output Formats, Pig loaders, and other java-based utilities for interacting with Parquet.
 
-The `parquet-compatibility` project contains compatibility tests that can be used to verify that implementations in different languages can read and write each other's files.
+The [parquet-testing] project contains compatibility tests that can be used to verify that implementations in different languages can read and write each other's files.
+
+[parquet-format]: https://github.com/apache/parquet-format
+[parquet-java]: https://github.com/apache/parquet-java
+[parquet-testing]: https://github.com/apache/parquet-testing
 
 ## Building
 

From c2da2f8fb85699c7e9b658a6ffd55d1cadd295c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?=
Date: Mon, 3 Feb 2025 12:37:10 +0100
Subject: [PATCH 2/2] Remove duplicated parquet-testing reference on README

---
 README.md | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/README.md b/README.md
index ad024ec6b..df0ac73a3 100644
--- a/README.md
+++ b/README.md
@@ -47,7 +47,7 @@ The [parquet-format] project contains format specifications and Thrift definitio
 
 The [parquet-java] project contains multiple sub-modules, which implement the core components of reading and writing a nested, column-oriented data stream, map this core onto the parquet format, and provide Hadoop Input/Output Formats, Pig loaders, and other java-based utilities for interacting with Parquet.
 
-The [parquet-testing] project contains compatibility tests that can be used to verify that implementations in different languages can read and write each other's files.
+The [parquet-testing] project contains a set of files that can be used to verify that implementations in different languages can read and write each other's files.
 
 [parquet-format]: https://github.com/apache/parquet-format
 [parquet-java]: https://github.com/apache/parquet-java
@@ -299,10 +299,6 @@ There are many places in the format for compatible extensions:
 Parquet Thrift IDL reserves field-id `32767` of every Thrift struct for extensions.
 The (Thrift) type of this field is always `binary`.
 
-## Testing
-
-The [apache/parquet-testing](https://github.com/apache/parquet-testing) contains a set of Parquet files for testing purposes.
-
 ## Contributing
 Comment on the issue and/or contact [the parquet-dev mailing list](http://mail-archives.apache.org/mod_mbox/parquet-dev/) with your questions and ideas.
 Changes to this core format definition are proposed and discussed in depth on the mailing list. You may also be interested in contributing to the Parquet-Java subproject, which contains all the Java-side implementation and APIs. See the "How To Contribute" section of the [Parquet-Java project](https://github.com/apache/parquet-java#how-to-contribute)