diff --git a/r/vignettes/developers/setup.Rmd b/r/vignettes/developers/setup.Rmd
index 479af577aa848..de33e72407792 100644
--- a/r/vignettes/developers/setup.Rmd
+++ b/r/vignettes/developers/setup.Rmd
@@ -38,50 +38,32 @@ set -e
 set -x
 ```
 
-
-```{bash, save=run & windows, hide=TRUE}
-# For some reason CRAN Mirror goes missing in CI
-echo 'options(repos=structure(c(CRAN="https://cloud.r-project.org")))' > $HOME/.Rprofile
-```
-
-Windows and macOS users who wish to contribute to the R package and
-don't need to alter libarrow (Arrow's C++ library) may be able to obtain a
-recent version of the library without building from source.
-
-### Linux
-
-On Linux, you can download a .zip file containing libarrow from the
-[nightly repository](https://nightlies.apache.org/arrow/r/libarrow/bin/).
-
-The directory names correspond to the OpenSSL version the binaries built with:
-- "linux-openssl-1.0" (OpenSSL 1.0)
-- "linux-openssl-1.1" (OpenSSL 1.1)
-- "linux-openssl-3.0" (OpenSSL 3.0)
-
-Version numbers in that repository correspond to dates.
-
-You'll need to create a `libarrow` directory inside the R package directory and unzip the zip file containing the compiled libarrow binary files into it.
-
-### macOS
-On macOS, you can install libarrow using [Homebrew](https://brew.sh/):
-
-```bash
-# For the released version:
-brew install apache-arrow
-# Or for a development version, you can try:
-brew install apache-arrow --HEAD
-```
-
-### Windows
-
-On Windows, you can download a .zip file containing libarrow from the
-[nightly repository](https://nightlies.apache.org/arrow/r/libarrow/bin/windows/).
-
-Version numbers in that repository correspond to dates.
-
-You can set the `RWINLIB_LOCAL` environment variable to point to the zip file containing libarrow before installing the arrow R package.
-
-## R and C++
+The Arrow R package differs from other R packages that you may have
+contributed to because it builds on top of the large and feature-rich Arrow C++
+implementation. Because the R package integrates tightly with Arrow C++,
+it typically requires a dedicated copy of the library (i.e., it is usually
+not possible to link to a system version of libarrow during development).
+
+## Option 1: Use nightly libarrow binaries
+
+On Linux, macOS, and Windows, you can use the same workflow you might use for another
+package that contains compiled code (e.g., `R CMD INSTALL .` from
+a terminal, `devtools::load_all()` from an R prompt, or `Install & Restart` from
+RStudio). If the `arrow/r/libarrow` directory is not populated, the configure script will
+attempt to download the latest nightly libarrow binary, extract it to the
+`arrow/r/libarrow` directory (macOS, Linux) or `arrow/r/windows`
+directory (Windows), and continue building the R package as usual.
+
+Most of the time, you won't need to update your version of libarrow because
+the R package rarely changes with updates to the C++ library; however, if you
+start to get errors when rebuilding the R package, you may have to remove the
+`libarrow` directory (macOS, Linux) or `windows` directory (Windows)
+and do a "clean" rebuild. You can do this from a terminal with
+`R CMD INSTALL . --preclean`, from RStudio using the "Clean and Install"
+option from the "Build" tab, or using `make clean` if you are using the `Makefile`
+located in the root of the R package.
+
+## Option 2: Use a local Arrow C++ development build
 
 If you need to alter both libarrow and the R package code, or if you can't get a binary version of the latest libarrow elsewhere, you'll need to build it from source. This section discusses how to set up a C++ libarrow build configured to work with the R package. For more general resources, see the [Arrow C++ developer guide](https://arrow.apache.org/docs/developers/cpp/building.html).
 
@@ -103,43 +85,6 @@ sudo apt install -y cmake libcurl4-openssl-dev libssl-dev
 brew install cmake openssl
 ```
 
-#### Windows
-
-The package can be built on Windows using [RTools 4](https://cran.r-project.org/bin/windows/Rtools/). It can be built for mingw32 (i386), mingw64 (x64), or ucrt64 (UCRT x64). mingw64 is the recommended 64-bit installation.
-
-Open the corresponding RTools Bash, for example "Rtools MinGW 64-bit" for mingw64.
-
-Install CMake, ccache, and Ninja with:
-
-```{bash, save=run & windows}
-pacman --sync --refresh --noconfirm \
-  ${MINGW_PACKAGE_PREFIX}-{ccache,cmake,ninja,openssl}
-export CMAKE_GENERATOR=Ninja
-```
-
-You will need to add R to your path. For a user-level installation, R will be at something like `~/Documents/R/R-4.1.2/bin`. For a global installation, R will be at something like `/c/Program\ Files/R/R-4.1.2/bin`. The R on your path needs to match the architecture you are compiling for, so if you are compiling on 32-bit specify `.../bin/i386` instead of `.../bin/x64`.
-
-```{bash}
-export PATH=~/Documents/R/R-4.1.2/bin/x64:$PATH
-```
-
-You can install additional dependencies like so. Note that you are limited to the packages in [the RTools repo](https://github.com/r-windows/rtools-packages), which does not contain every dependency used by Arrow.
-
-```{bash, save=run & windows}
-pacman --sync --refresh --noconfirm \
-  ${MINGW_PACKAGE_PREFIX}-boost \
-  ${MINGW_PACKAGE_PREFIX}-brotli \
-  ${MINGW_PACKAGE_PREFIX}-lz4 \
-  ${MINGW_PACKAGE_PREFIX}-protobuf \
-  ${MINGW_PACKAGE_PREFIX}-snappy \
-  ${MINGW_PACKAGE_PREFIX}-thrift \
-  ${MINGW_PACKAGE_PREFIX}-zlib \
-  ${MINGW_PACKAGE_PREFIX}-zstd \
-  ${MINGW_PACKAGE_PREFIX}-aws-sdk-cpp \
-  ${MINGW_PACKAGE_PREFIX}-re2 \
-  ${MINGW_PACKAGE_PREFIX}-libutf8proc
-```
-
 ### Step 2 - Configure the libarrow build
 
 We recommend that you configure libarrow to be built to a user-level directory rather than a system directory for your development work. This is so that the development version you are using doesn't overwrite a released version of libarrow you may already have installed, and so that you are also able work with more than one version of libarrow (by using different `ARROW_HOME` directories for the different versions).
@@ -158,13 +103,6 @@ export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
 echo "export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH" >> ~/.bash_profile
 ```
 
-_Special instructions on Windows:_ You will need to add `$ARROW_HOME/bin` to your `PATH` if you are using dynamic libraries (which is recommended).
-
-```{bash, save=run & windows}
-export PATH=$ARROW_HOME/bin:$PATH
-echo "export PATH=\"$ARROW_HOME/bin:$PATH\"" >> ~/.bash_profile
-```
-
 Start by navigating in a terminal to the arrow repository. You will need to create a directory into which the C++ build will put its contents. We recommend that you make a `build` directory inside of the `cpp` directory of the Arrow git repository (it is git-ignored, so you won't accidentally check it in). Next, change directories to be inside `cpp/build`:
 
 ```{bash, save=run & !sys_install}
@@ -197,32 +135,10 @@ cmake \
   ..
 ```
 
-##### Windows
-
-```{bash, save=run & !sys_install & windows}
-cmake \
-  -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-  -DCMAKE_INSTALL_LIBDIR=lib \
-  -DARROW_COMPUTE=ON \
-  -DARROW_CSV=ON \
-  -DARROW_DATASET=ON \
-  -DARROW_EXTRA_ERROR_CONTEXT=ON \
-  -DARROW_FILESYSTEM=ON \
-  -DARROW_MIMALLOC=ON \
-  -DARROW_JSON=ON \
-  -DARROW_PARQUET=ON \
-  -DARROW_WITH_SNAPPY=OFF \
-  -DARROW_WITH_ZLIB=ON \
-  ..
-```
-
 #### {-}
 
 `..` refers to the C++ source directory: you're in `cpp/build` and the source is in `cpp`.
 
-**For Windows**: some options, including `-DARROW_JEMALLOC`, are not supported on Windows.
-
-
 ```{bash, save=run & !sys_install, hide=TRUE}
 # For testing purposes, build with only shared libraries
 cmake \