[C++] Add azure-sdk-for-cpp to ThirdpartyToolchain #29847
Comments
It looks like the Azure file system skeleton has been merged (#35903), so I think this is probably the next step. #12914 did a lot of the work, but when I came to use it I had to change a few things:
I will rebase my changes described above onto main and open a draft PR. I have basically no experience with C++, but I have built a Python wheel using these changes that we've been using in production for a while now. If someone who knows what they are talking about can point me in the right direction, maybe I can get something ready for review.
I think that preparing Azurite is the next step, because we can't run our tests without Azurite.
I opened the draft PR #36835. This PR does actually include the Azurite setup and a simple test that writes to Azurite and reads the data back. I just used the Azurite changes directly from #12914 and they seem to work great. If we want to split the Azurite setup into a different PR, though, that makes sense to me.
Yes. We should create a separate PR for Azurite.
I've opened a PR for review: #36835. I don't have experience with C++, so hopefully what I've done makes sense. I've just tried to copy the patterns I can see in the repo and debug when I hit problems.
### Rationale for this change

We want to use the Azure SDK for C++ to read from and write to Azure Blob Storage. This is a prerequisite for building an `AzureFileSystem`.

### What changes are included in this PR?

Builds the relevant parts of the Azure SDK as a CMake external project. Adds a couple of simple tests that just assert that the Azure SDK is working, and a couple of lines in `AzureFileSystem` that initialise the blob storage client to ensure the build is working correctly in all environments.

I started with the build setup from #12914 but I did make a few changes.

1. Although it's atypical for this project, we chose to switch from CMake's `ExternalProject` to `FetchContent`. `FetchContent` is recommended by the Azure docs: https://github.com/Azure/azure-sdk-for-cpp#cmake-project--fetch-content. It also solves a few problems, including automatically linking the system curl and ssl instead of bootstrapping vcpkg and installing curl and ssl from there.
2. Only build one version of the Azure SDK for C++, because it contains all the components. Previously we were unnecessarily building 5 different versions of the whole thing on top of each other. This created race conditions over which version each component came from.
3. We are using `azure-core_1.10.2`, which is a very recent version. There are a couple of important reasons for this:
   1. [an important managed identity fix](Azure/azure-sdk-for-cpp#4723),
   2. [fixed support for curl versions < 7.71.0](Azure/azure-sdk-for-cpp#4792).

There will be follow-up PRs to enable Azure in the manylinux builds. We need to update `vcpkg` first so we can get a version of the Azure SDK which contains [an important managed identity fix](Azure/azure-sdk-for-cpp#4723).

### Are these changes tested?

Yes. There is a simple test that just runs the Azure client against Azurite. Additionally, just initialising the client in `AzureFileSystem` goes a long way towards ensuring the build is working.

### Are there any user-facing changes?

No

* Closes: #29847

Lead-authored-by: Thomas Newton <thomas.w.newton@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: shefali singh <shefalisingh@microsoft.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
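
For readers unfamiliar with the `FetchContent` approach described in the PR, here is a minimal sketch of pulling in the Azure SDK for C++ at the `azure-core_1.10.2` tag. It is not the actual change made to Arrow's ThirdpartyToolchain. The content name `azure_sdk`, the target `my_azure_app`, and the source file `main.cc` are hypothetical, and the linked target names assume the alias targets that the SDK's own CMake files export.

```cmake
cmake_minimum_required(VERSION 3.14)
project(azure_fetchcontent_sketch CXX)

include(FetchContent)

# Declare a single checkout of the SDK repository. One tag provides every
# component (core, identity, storage), so only one version is ever built.
FetchContent_Declare(
  azure_sdk  # hypothetical content name
  GIT_REPOSITORY https://github.com/Azure/azure-sdk-for-cpp.git
  GIT_TAG azure-core_1.10.2)

# Download the sources and add them to this build. The SDK then locates
# curl and ssl through its own find_package calls.
FetchContent_MakeAvailable(azure_sdk)

# Hypothetical consumer; main.cc is a placeholder source file.
add_executable(my_azure_app main.cc)

# Assumed target names, based on the aliases defined by the SDK.
target_link_libraries(my_azure_app
                      PRIVATE Azure::azure-storage-blobs Azure::azure-identity)
```

Because the SDK is built as part of the same CMake invocation, its dependencies resolve against whatever the host environment provides, which matches the PR's stated reason for avoiding a vcpkg bootstrap.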
This is a requirement to be able to make progress on ARROW-9611
Reporter: Yesh
Related issues:
Note: This issue was originally created as ARROW-14270. Please see the migration documentation for further details.