ARROW-87: [C++] Add all four possible ways to encode Decimals in Parquet to schema conversion #48
    parquet_fields.push_back(
        PrimitiveNode::Make("int64-decimal", Repetition::OPTIONAL,
            parquet_cpp::Type::INT64,
            parquet_cpp::LogicalType::DECIMAL, -1, 8, 4));
Aside, it might be nice to improve this API in parquet-cpp -- the -1
is a bit of an eyesore
Looks good. I can merge after you rebase -- we need to get the Parquet tests running in Travis CI. We might look into caching Ubuntu 14.04 library artifacts someplace for parquet-cpp to cut down on build times (e.g. building Thrift takes a little while in the Parquet builds).
Rebased.
Thank you. +1
We should perhaps separate compression and decompression code (as in Impala) as gzip is more stateful than the other compressors. Closes apache#11 when merged.

Author: Wes McKinney <wes@cloudera.com>
Author: Konstantin Knizhnik <knizhnik@garret.ru>

Closes apache#48 from wesm/PARQUET-456 and squashes the following commits:

5aeba2a [Wes McKinney] Comment typo
8e1f8f2 [Wes McKinney] Move test run to shell script and enable OS X
633fd71 [Wes McKinney] Port gzip codec code from Impala, expand tests, get them to pass
a8d3c11 [Wes McKinney] Add compression round-trip test, gzip needs a bunch more work though
0bc8cf7 [Wes McKinney] Fix PATH_SUFFIXES for zlib
69548c9 [Konstantin Knizhnik] Add zlib to thirdparty build toolchain for compression codec

Change-Id: Iecab77a0000259634ec68b11fa4c73b45ddf794f
This PR enables tests for `ARROW_COMPUTE`, `ARROW_DATASET`, `ARROW_FILESYSTEM`, `ARROW_HDFS`, `ARROW_ORC`, and `ARROW_IPC` (default on). #7131 enabled a minimal set of tests as a starting point.

I confirmed that these tests pass locally with the current master. In the current TravisCI environment, we cannot see this result due to a lot of error messages in `arrow-utility-test`.

```
$ git log | head -1
commit ed5f534
% ctest
...
      Start  1: arrow-array-test
 1/51 Test  #1: arrow-array-test ..................... Passed 4.62 sec
      Start  2: arrow-buffer-test
 2/51 Test  #2: arrow-buffer-test .................... Passed 0.14 sec
      Start  3: arrow-extension-type-test
 3/51 Test  #3: arrow-extension-type-test ............ Passed 0.12 sec
      Start  4: arrow-misc-test
 4/51 Test  #4: arrow-misc-test ...................... Passed 0.14 sec
      Start  5: arrow-public-api-test
 5/51 Test  #5: arrow-public-api-test ................ Passed 0.12 sec
      Start  6: arrow-scalar-test
 6/51 Test  #6: arrow-scalar-test .................... Passed 0.13 sec
      Start  7: arrow-type-test
 7/51 Test  #7: arrow-type-test ...................... Passed 0.14 sec
      Start  8: arrow-table-test
 8/51 Test  #8: arrow-table-test ..................... Passed 0.13 sec
      Start  9: arrow-tensor-test
 9/51 Test  #9: arrow-tensor-test .................... Passed 0.13 sec
      Start 10: arrow-sparse-tensor-test
10/51 Test #10: arrow-sparse-tensor-test ............. Passed 0.16 sec
      Start 11: arrow-stl-test
11/51 Test #11: arrow-stl-test ....................... Passed 0.12 sec
      Start 12: arrow-concatenate-test
12/51 Test #12: arrow-concatenate-test ............... Passed 0.53 sec
      Start 13: arrow-diff-test
13/51 Test #13: arrow-diff-test ...................... Passed 1.45 sec
      Start 14: arrow-c-bridge-test
14/51 Test #14: arrow-c-bridge-test .................. Passed 0.18 sec
      Start 15: arrow-io-buffered-test
15/51 Test #15: arrow-io-buffered-test ............... Passed 0.20 sec
      Start 16: arrow-io-compressed-test
16/51 Test #16: arrow-io-compressed-test ............. Passed 3.48 sec
      Start 17: arrow-io-file-test
17/51 Test #17: arrow-io-file-test ................... Passed 0.74 sec
      Start 18: arrow-io-hdfs-test
18/51 Test #18: arrow-io-hdfs-test ................... Passed 0.12 sec
      Start 19: arrow-io-memory-test
19/51 Test #19: arrow-io-memory-test ................. Passed 2.77 sec
      Start 20: arrow-utility-test
20/51 Test #20: arrow-utility-test ...................***Failed 5.65 sec
      Start 21: arrow-threading-utility-test
21/51 Test #21: arrow-threading-utility-test ......... Passed 1.34 sec
      Start 22: arrow-compute-compute-test
22/51 Test #22: arrow-compute-compute-test ........... Passed 0.13 sec
      Start 23: arrow-compute-boolean-test
23/51 Test #23: arrow-compute-boolean-test ........... Passed 0.15 sec
      Start 24: arrow-compute-cast-test
24/51 Test #24: arrow-compute-cast-test .............. Passed 0.22 sec
      Start 25: arrow-compute-hash-test
25/51 Test #25: arrow-compute-hash-test .............. Passed 2.61 sec
      Start 26: arrow-compute-isin-test
26/51 Test #26: arrow-compute-isin-test .............. Passed 0.81 sec
      Start 27: arrow-compute-match-test
27/51 Test #27: arrow-compute-match-test ............. Passed 0.40 sec
      Start 28: arrow-compute-sort-to-indices-test
28/51 Test #28: arrow-compute-sort-to-indices-test ... Passed 3.33 sec
      Start 29: arrow-compute-nth-to-indices-test
29/51 Test #29: arrow-compute-nth-to-indices-test .... Passed 1.51 sec
      Start 30: arrow-compute-util-internal-test
30/51 Test #30: arrow-compute-util-internal-test ..... Passed 0.13 sec
      Start 31: arrow-compute-add-test
31/51 Test #31: arrow-compute-add-test ............... Passed 0.12 sec
      Start 32: arrow-compute-aggregate-test
32/51 Test #32: arrow-compute-aggregate-test ......... Passed 14.70 sec
      Start 33: arrow-compute-compare-test
33/51 Test #33: arrow-compute-compare-test ........... Passed 7.96 sec
      Start 34: arrow-compute-take-test
34/51 Test #34: arrow-compute-take-test .............. Passed 4.80 sec
      Start 35: arrow-compute-filter-test
35/51 Test #35: arrow-compute-filter-test ............ Passed 8.23 sec
      Start 36: arrow-dataset-dataset-test
36/51 Test #36: arrow-dataset-dataset-test ........... Passed 0.25 sec
      Start 37: arrow-dataset-discovery-test
37/51 Test #37: arrow-dataset-discovery-test ......... Passed 0.13 sec
      Start 38: arrow-dataset-file-ipc-test
38/51 Test #38: arrow-dataset-file-ipc-test .......... Passed 0.21 sec
      Start 39: arrow-dataset-file-test
39/51 Test #39: arrow-dataset-file-test .............. Passed 0.12 sec
      Start 40: arrow-dataset-filter-test
40/51 Test #40: arrow-dataset-filter-test ............ Passed 0.16 sec
      Start 41: arrow-dataset-partition-test
41/51 Test #41: arrow-dataset-partition-test ......... Passed 0.13 sec
      Start 42: arrow-dataset-scanner-test
42/51 Test #42: arrow-dataset-scanner-test ........... Passed 0.20 sec
      Start 43: arrow-filesystem-test
43/51 Test #43: arrow-filesystem-test ................ Passed 1.62 sec
      Start 44: arrow-hdfs-test
44/51 Test #44: arrow-hdfs-test ...................... Passed 0.13 sec
      Start 45: arrow-feather-test
45/51 Test #45: arrow-feather-test ................... Passed 0.91 sec
      Start 46: arrow-ipc-read-write-test
46/51 Test #46: arrow-ipc-read-write-test ............ Passed 5.77 sec
      Start 47: arrow-ipc-json-simple-test
47/51 Test #47: arrow-ipc-json-simple-test ........... Passed 0.16 sec
      Start 48: arrow-ipc-json-test
48/51 Test #48: arrow-ipc-json-test .................. Passed 0.27 sec
      Start 49: arrow-json-integration-test
49/51 Test #49: arrow-json-integration-test .......... Passed 0.13 sec
      Start 50: arrow-json-test
50/51 Test #50: arrow-json-test ...................... Passed 0.26 sec
      Start 51: arrow-orc-adapter-test
51/51 Test #51: arrow-orc-adapter-test ............... Passed 1.92 sec

98% tests passed, 1 tests failed out of 51

Label Time Summary:
arrow-tests      = 27.38 sec (27 tests)
arrow_compute    = 45.11 sec (14 tests)
arrow_dataset    = 1.21 sec (7 tests)
arrow_ipc        = 6.20 sec (3 tests)
unittest         = 79.91 sec (51 tests)

Total Test time (real) = 79.99 sec

The following tests FAILED:
     20 - arrow-utility-test (Failed)
Errors while running CTest
```

Closes #7142 from kiszk/ARROW-8754

Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
[C++] fix code style
* ARROW-11960: [C++][Gandiva] Support escape in LIKE

  Add gdv_fn_like_utf8_utf8_int8 function in Gandiva to support escape char in LIKE. An escape char is stored in an int8 type which is compatible with char type in C++.

  Closes apache#9700 from Crystrix/arrow-11960

  Authored-by: crystrix <chenxi.li@live.com>
  Signed-off-by: Sutou Kouhei <kou@clear-code.com>

* ARROW-12567: [C++][Gandiva] Implement ILIKE SQL function

  Closes apache#10179 from jvictorhuguenin/feature/implement-sql-ilike and squashes the following commits:

  f160880 <frank400> Optimize holder constructor call
  97e6e2d <frank400> Remove unnecessary Make method
  c2363b1 <frank400> Disable TryOptimize for ilike
  a484149 <frank400> Fix checkstyle on cmake file
  c6a8372 <frank400> Delete unnecessary holder
  4be6cc6 <frank400> Fix redefined function
  b78085a <frank400> Fix miss include
  2efd43e <frank400> Implement ilike function

  Authored-by: frank400 <j.victorhuguenin2018@gmail.com>
  Signed-off-by: Praveen <praveen@dremio.com>

* ARROW-12410: [C++][Gandiva] Implement regexp_replace function on Gandiva

  Closes apache#10059 from rodrigojdebem/feature/implement-regexp-replace and squashes the following commits:

  baf2778 <rodrigojdebem> Add implementation for REGEXP_REPLACE

  Authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
  Signed-off-by: Praveen <praveen@dremio.com>

Co-authored-by: crystrix <chenxi.li@live.com>
Co-authored-by: frank400 <j.victorhuguenin2018@gmail.com>
Co-authored-by: rodrigojdebem <rodrigodebem1@gmail.com>
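
To illustrate what an escape character in LIKE means, here is a minimal sketch that translates a SQL LIKE pattern into a regular expression. It is not the Gandiva implementation: `LikePatternToRegex` and `Like` are hypothetical names, and the use of `std::regex` is an assumption made only to keep the example self-contained. The escape is passed as a single `char`, mirroring the commit's note that it is stored in an int8.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Gandiva code): translate a SQL LIKE pattern into
// an ECMAScript regex. '%' matches any run of characters, '_' matches one
// character, and a character preceded by `escape` is taken literally.
std::string LikePatternToRegex(const std::string& pattern, char escape) {
  static const std::string kRegexSpecial = R"re(.*+?|()[]{}\^$)re";
  std::string out;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    char c = pattern[i];
    bool literal = false;
    if (c == escape) {
      if (i + 1 == pattern.size()) {
        throw std::invalid_argument("LIKE pattern ends with escape character");
      }
      c = pattern[++i];  // the escaped character loses its wildcard meaning
      literal = true;
    }
    if (!literal && c == '%') {
      out += ".*";
    } else if (!literal && c == '_') {
      out += '.';
    } else {
      // Backslash-quote characters that are special in the regex syntax.
      if (kRegexSpecial.find(c) != std::string::npos) out += '\\';
      out += c;
    }
  }
  return out;
}

// Hypothetical helper: true if `value` matches the LIKE `pattern` in full.
bool Like(const std::string& value, const std::string& pattern, char escape) {
  return std::regex_match(value, std::regex(LikePatternToRegex(pattern, escape)));
}
```

With `'#'` as the escape, `Like("100%", "100#%", '#')` matches only the literal percent sign, whereas the unescaped pattern `100%` would also match `1000`.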
See also: https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#decimal
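
As a rough illustration of the four encodings named in the PR title, the sketch below lists which Parquet physical types can hold a decimal of a given precision, following the limits in the linked spec (INT32 up to precision 9, INT64 up to 18, FIXED_LEN_BYTE_ARRAY bounded by its length, BYTE_ARRAY unbounded). All names here (`PhysicalType`, `DecimalEncoding`, `CandidateEncodings`, and so on) are hypothetical and do not come from parquet-cpp; the `-1` length for the non-fixed-length types simply mirrors the placeholder seen in the diff above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types for illustration only; not the parquet-cpp API.
enum class PhysicalType { INT32, INT64, FIXED_LEN_BYTE_ARRAY, BYTE_ARRAY };

struct DecimalEncoding {
  PhysicalType type;
  int type_length;  // bytes; only meaningful for FIXED_LEN_BYTE_ARRAY, else -1
};

// Largest decimal precision that fits in a signed two's-complement value of
// `num_bytes` bytes: floor(log10(2^(8 * num_bytes - 1) - 1)). Because a power
// of two is never a power of ten, this equals floor((8n - 1) * log10(2)).
int MaxPrecisionForBytes(int num_bytes) {
  assert(num_bytes > 0);
  return static_cast<int>(std::floor((8 * num_bytes - 1) * std::log10(2.0)));
}

// Smallest FIXED_LEN_BYTE_ARRAY length able to store `precision` digits.
int MinBytesForPrecision(int precision) {
  assert(precision >= 1);
  int num_bytes = 1;
  while (MaxPrecisionForBytes(num_bytes) < precision) ++num_bytes;
  return num_bytes;
}

// All physical encodings that can represent a decimal of this precision.
std::vector<DecimalEncoding> CandidateEncodings(int precision) {
  assert(precision >= 1);
  std::vector<DecimalEncoding> out;
  if (precision <= 9) out.push_back({PhysicalType::INT32, -1});
  if (precision <= 18) out.push_back({PhysicalType::INT64, -1});
  out.push_back({PhysicalType::FIXED_LEN_BYTE_ARRAY,
                 MinBytesForPrecision(precision)});
  out.push_back({PhysicalType::BYTE_ARRAY, -1});
  return out;
}
```

For the precision of 8 used in the test field above, all four encodings apply and the smallest fixed-length array is 4 bytes; at precision 19 only the two byte-array encodings remain.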