ARROW-16510: [R] Add bindings for GCS filesystem #13404

Merged Jun 26, 2022, 43 commits (diff below shows changes from 42 commits)

Commits
2a7353a
Basic wiring for GCS in R
nealrichardson Jun 18, 2022
b203c88
Compiles now but symbol not found
nealrichardson Jun 18, 2022
f3c6e86
Update absl cmake for latest version in order to fix undefined symbol
nealrichardson Jun 19, 2022
ca19afb
GCS needs curl and openssl like S3
nealrichardson Jun 21, 2022
db82303
Add some basic tests that exercise the bindings, no actual or mock GC…
nealrichardson Jun 21, 2022
14b8ecd
Move type forwarding to cpp
nealrichardson Jun 21, 2022
d5fa958
Add ARROW_GCS wherever ARROW_S3 is mentioned in linux builds
nealrichardson Jun 21, 2022
71f9f6a
Turn on (bundled) ARROW_GCS in mac and win packages
nealrichardson Jun 21, 2022
bb76790
Try updating abseil deps for google-cloud-cpp from upstream
nealrichardson Jun 21, 2022
d4bcfdb
Add curl to PKGBUILD
nealrichardson Jun 21, 2022
d5b2b85
Try to define all the symbols
nealrichardson Jun 21, 2022
bfde6b5
absl::memory is header only
nealrichardson Jun 22, 2022
015560b
More header only
nealrichardson Jun 22, 2022
2533eb2
See if this gets us closer
nealrichardson Jun 22, 2022
f75253d
Add more absl to bundled libs
nealrichardson Jun 22, 2022
21e4c0b
Add more recursive dependencies of abseil libs
nealrichardson Jun 22, 2022
49b56bc
base_internal must be header only
nealrichardson Jun 22, 2022
d354704
whackamole
nealrichardson Jun 22, 2022
43a9201
Use pkg-config to determine dependencies
nealrichardson Jun 22, 2022
b51fe74
Add non-abseil libs
nealrichardson Jun 22, 2022
58d0b57
sigh
nealrichardson Jun 22, 2022
fbdbec0
Add jira issues to TODOs
nealrichardson Jun 22, 2022
be7acfd
Try -DCURL_STATICLIB for windows packages
nealrichardson Jun 22, 2022
3f9793d
Add Kou's patch
nealrichardson Jun 23, 2022
54e7d4e
Back out -DCURL_STATICLIB from configure.win
nealrichardson Jun 23, 2022
789988e
Turn on ARROW_VERBOSE_THIRDPARTY_BUILD to see if CURL_STATICLIB is be…
nealrichardson Jun 23, 2022
86539ec
:facepalm:
nealrichardson Jun 23, 2022
44f8419
Patch to google-cloud-cpp for -DCURL_STATICLIB on Windows
kou Jun 24, 2022
8627bb0
Add more GCS_LIBS
kou Jun 24, 2022
92d0918
Fix lint
kou Jun 24, 2022
52f6b45
Fix order
kou Jun 24, 2022
598c665
Add missing library
kou Jun 24, 2022
aa059b3
Add license header
kou Jun 24, 2022
d843122
Increase timeout
kou Jun 24, 2022
3192b0a
Remove needless CURL_STATICLIB check
kou Jun 24, 2022
1f9f9de
Upgrade google-cloud-cpp to 1.42.0 to resolve mingw issues
nealrichardson Jun 25, 2022
2dbed38
Add comment for google-cloud-cpp patch
nealrichardson Jun 25, 2022
ee4e8e6
Turn ARROW_GCS back off in mingw C++ workflow
nealrichardson Jun 25, 2022
226c31b
Fixes for failing nightly builds
nealrichardson Jun 25, 2022
c7a5f12
Try this for brew
nealrichardson Jun 25, 2022
a2a5e87
Add TODO
nealrichardson Jun 25, 2022
5dad768
Back out brew job change and note TODO. This will pass once it is mer…
nealrichardson Jun 25, 2022
ea76c70
Swap order in bundled static libs
nealrichardson Jun 26, 2022
8 changes: 6 additions & 2 deletions .github/workflows/cpp.yml
@@ -276,8 +276,12 @@ jobs:
ARROW_DATASET: ON
ARROW_FLIGHT: ON
ARROW_GANDIVA: ON
# google-could-cpp uses _dupenv_s() but it can't be used with msvcrt.
# We need to use ucrt to use _dupenv_s().
# With GCS on,
# * MinGW 32 build OOMs (maybe turn off unity build?)
# * MinGW 64 fails to compile the GCS filesystem tests, some conflict
# with boost. First error says:
# D:/a/_temp/msys64/mingw64/include/boost/asio/detail/socket_types.hpp:24:4: error: #error WinSock.h has already been included
# TODO(ARROW-16906)
# ARROW_GCS: ON
ARROW_HDFS: OFF
ARROW_HOME: /mingw${{ matrix.mingw-n-bits }}
2 changes: 1 addition & 1 deletion .github/workflows/r.yml
@@ -165,7 +165,7 @@ jobs:
name: AMD64 Windows C++ RTools ${{ matrix.config.rtools }} ${{ matrix.config.arch }}
runs-on: windows-2019
if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
- timeout-minutes: 60
+ timeout-minutes: 90
strategy:
fail-fast: false
matrix:
5 changes: 5 additions & 0 deletions ci/scripts/PKGBUILD
@@ -25,6 +25,7 @@ arch=("any")
url="https://arrow.apache.org/"
license=("Apache-2.0")
depends=("${MINGW_PACKAGE_PREFIX}-aws-sdk-cpp"
+ "${MINGW_PACKAGE_PREFIX}-curl" # for google-cloud-cpp bundled build
"${MINGW_PACKAGE_PREFIX}-libutf8proc"
"${MINGW_PACKAGE_PREFIX}-re2"
"${MINGW_PACKAGE_PREFIX}-thrift"
@@ -79,11 +80,13 @@ build() {
export PATH="/C/Rtools${MINGW_PREFIX/mingw/mingw_}/bin:$PATH"
export CPPFLAGS="${CPPFLAGS} -I${MINGW_PREFIX}/include"
export LIBS="-L${MINGW_PREFIX}/libs"
+ export ARROW_GCS=OFF
export ARROW_S3=OFF
export ARROW_WITH_RE2=OFF
# Without this, some dataset functionality segfaults
export CMAKE_UNITY_BUILD=ON
else
+ export ARROW_GCS=ON
export ARROW_S3=ON
export ARROW_WITH_RE2=ON
# Without this, some compute functionality segfaults in tests
@@ -101,6 +104,7 @@ build() {
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-DARROW_FILESYSTEM=ON \
+ -DARROW_GCS="${ARROW_GCS}" \
-DARROW_HDFS=OFF \
-DARROW_JEMALLOC=OFF \
-DARROW_JSON=ON \
@@ -112,6 +116,7 @@ build() {
-DARROW_SNAPPY_USE_SHARED=OFF \
-DARROW_USE_GLOG=OFF \
-DARROW_UTF8PROC_USE_SHARED=OFF \
+ -DARROW_VERBOSE_THIRDPARTY_BUILD=ON \
-DARROW_WITH_LZ4=ON \
-DARROW_WITH_RE2="${ARROW_WITH_RE2}" \
-DARROW_WITH_SNAPPY=ON \
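
The PKGBUILD hunks above follow one pattern: an environment variable is set to ON or OFF in a branch, then forwarded to CMake as a `-D` definition. A minimal sketch of that pattern follows. The `legacy_toolchain` argument and both function names are hypothetical; the real condition sits outside the lines shown in the diff, so it is not reproduced here.

```shell
#!/usr/bin/env bash
# Sketch only: `legacy_toolchain` is a hypothetical stand-in for the
# condition that the PKGBUILD hunk does not show.
set_optional_features() {
  legacy_toolchain="$1"
  if [ "$legacy_toolchain" = "yes" ]; then
    # Older toolchain: leave the optional features off.
    ARROW_GCS=OFF
    ARROW_WITH_RE2=OFF
  else
    ARROW_GCS=ON
    ARROW_WITH_RE2=ON
  fi
}

# Print the CMake definitions derived from the variables set above.
optional_cmake_args() {
  printf '%s\n' "-DARROW_GCS=${ARROW_GCS} -DARROW_WITH_RE2=${ARROW_WITH_RE2}"
}

set_optional_features yes
optional_cmake_args   # prints: -DARROW_GCS=OFF -DARROW_WITH_RE2=OFF
set_optional_features no
optional_cmake_args   # prints: -DARROW_GCS=ON -DARROW_WITH_RE2=ON
```

Keeping the decision in one branch and the forwarding in one place means a new feature flag such as `ARROW_GCS` only needs a line in each branch plus one `-D` argument, which matches the shape of the additions in this file.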
6 changes: 3 additions & 3 deletions ci/scripts/r_windows_build.sh
@@ -87,7 +87,7 @@ if [ -d mingw64/lib/ ]; then
# These may be from https://dl.bintray.com/rtools/backports/
cp $MSYS_LIB_DIR/mingw64/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/x64
# These are from https://dl.bintray.com/rtools/mingw{32,64}/
- cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
+ cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
fi

# Same for the 32-bit versions
@@ -97,15 +97,15 @@ if [ -d mingw32/lib/ ]; then
mkdir -p $DST_DIR/lib/i386
mv mingw32/lib/*.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
cp $MSYS_LIB_DIR/mingw32/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
- cp $MSYS_LIB_DIR/mingw32/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/i386
+ cp $MSYS_LIB_DIR/mingw32/lib/lib{zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/i386
fi

# Do the same also for ucrt64
if [ -d ucrt64/lib/ ]; then
ls $MSYS_LIB_DIR/ucrt64/lib/
mkdir -p $DST_DIR/lib/x64-ucrt
mv ucrt64/lib/*.a $DST_DIR/lib/x64-ucrt
- cp $MSYS_LIB_DIR/ucrt64/lib/lib{thrift,snappy,zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/x64-ucrt
+ cp $MSYS_LIB_DIR/ucrt64/lib/lib{thrift,snappy,zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/x64-ucrt
fi

# Create build artifact
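
The `cp` lines changed above combine brace expansion with a glob: `lib{curl,ss*}.a` first expands to `libcurl.a libss*.a`, and the shell then matches `libss*.a` against the directory. A small sketch of that behaviour follows. The function name `list_bundled_libs` and the file names in the demo directory are made up for illustration and are not the real MSYS2 library listing.

```shell
#!/usr/bin/env bash
# Sketch only: the file names below are illustrative, not taken from MSYS2.
list_bundled_libs() {
  # Brace expansion is a bash feature, so the pattern is run under bash
  # explicitly; "$1" is the directory to expand the pattern in.
  bash -c 'cd "$1" && printf "%s\n" lib{crypto,curl,ss*,utf8proc}.a' _ "$1"
}

demo_dir="$(mktemp -d)"
touch "$demo_dir"/libcrypto.a "$demo_dir"/libcurl.a "$demo_dir"/libssl.a \
      "$demo_dir"/libssh2.a "$demo_dir"/libutf8proc.a "$demo_dir"/libz.a

# Lists every file selected by the pattern; libz.a is not selected.
list_bundled_libs "$demo_dir"
```

Because `ss*` is a glob, it picks up every static library whose name starts with `libss` in that directory, so the exact set copied depends on what the toolchain has installed.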
31 changes: 31 additions & 0 deletions cpp/build-support/google-cloud-cpp-curl-static-windows.patch
@@ -0,0 +1,31 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

diff -ru google_cloud_cpp_ep.orig/cmake/FindCurlWithTargets.cmake google_cloud_cpp_ep/cmake/FindCurlWithTargets.cmake
--- google_cloud_cpp_ep.orig/cmake/FindCurlWithTargets.cmake 2022-04-05 06:00:53.000000000 +0900
+++ google_cloud_cpp_ep/cmake/FindCurlWithTargets.cmake 2022-06-24 10:06:00.177969962 +0900
@@ -68,6 +68,10 @@
TARGET CURL::libcurl
APPEND
PROPERTY INTERFACE_LINK_LIBRARIES crypt32 wsock32 ws2_32)
+ set_property(
+ TARGET CURL::libcurl
+ APPEND
+ PROPERTY INTERFACE_COMPILE_DEFINITIONS "CURL_STATICLIB")
endif ()
if (APPLE)
set_property(