Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-3366: [R] Dockerfile for docker-compose setup #2770

Closed
wants to merge 13 commits into from
Closed

ARROW-3366: [R] Dockerfile for docker-compose setup #2770

wants to merge 13 commits into from

Conversation

jameslamb
Copy link
Contributor

Thank you to @romainfrancois and others who have pushed forward the R side of this project!

This PR is my attempt to address ARROW-3336, providing a testing container for the R package.

This follows up on work done by @kszucs in #2572 in an R-specific way.

NOTE: This PR is a WIP. R CMD INSTALL currently fails because it cannot find wherever I installed arrow to. But I felt that this is far enough along to put up for review.

@romainfrancois
Copy link
Contributor

I don't really speak docker, but that LGTM

@kszucs kszucs self-requested a review October 16, 2018 08:20
@kszucs
Copy link
Member

kszucs commented Oct 16, 2018

@jameslamb great, thanks for implementing it!

I've recently changed the semantics of the docker-compose containers to optimize the build times, We need to adjust the image a little bit.

r/Dockerfile Outdated Show resolved Hide resolved
r/Dockerfile Outdated Show resolved Hide resolved
r/Dockerfile Outdated Show resolved Hide resolved
@wesm
Copy link
Member

wesm commented Oct 25, 2018

Is this ready to go?

@jameslamb
Copy link
Contributor Author

@wesm @romainfrancois sorry for the delay, no this is not ready to go yet.

I will update it this weekend and @ you when it's ready. Hit a weird problem with how R's search path for packages works, I think I know how to resolve it.

@jameslamb
Copy link
Contributor Author

@kszucs I got past the install issue (by updating LD_LIBRARY_PATH) but hitting a new error.

Running

docker-compose build cpp
docker-compose build r
docker-compose run r

Makes it through building the C++ package and compiling the source code in the R package, but fails to install the R library with this error:

installing to /usr/local/lib/R/site-library/arrow/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/arrow/libs/arrow.so':
  /usr/local/lib/R/site-library/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed
* removing '/usr/local/lib/R/site-library/arrow'

I found the same problem reported in #1319 so going to look into that diff.

@jameslamb
Copy link
Contributor Author

Oh also I rebased to master as of an hour ago, so this is reflective of the most recent changes to the project.

@codecov-io
Copy link

codecov-io commented Oct 27, 2018

Codecov Report

Merging #2770 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2770      +/-   ##
==========================================
+ Coverage   86.63%   86.63%   +<.01%     
==========================================
  Files         491      491              
  Lines       69438    69438              
==========================================
+ Hits        60157    60158       +1     
+ Misses       9189     9188       -1     
  Partials       92       92
Impacted Files Coverage Δ
cpp/src/plasma/thirdparty/ae/ae.c 72.03% <0%> (-0.95%) ⬇️
cpp/src/arrow/csv/column-builder.cc 97.31% <0%> (+2.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29657d8...9fcebe5. Read the comment docs.

@jameslamb
Copy link
Contributor Author

I put the R build instructions (like R CMD INSTALL and R CMD check calls) into a ci/docker_build_r.sh for consistency with how other languages set up their stuff.

But still stuck on this undefined symbol error. Any advice would be appreciated!

@wesm
Copy link
Member

wesm commented Oct 28, 2018

@jameslamb it looks like one part of the toolchain used -D_GLIBCXX_USE_CXX11_ABI=0 and another didn't

@kszucs
Copy link
Member

kszucs commented Oct 28, 2018

R CMD check runs now, however a couple of tests are failing.

docker-compose run r recompiles the R package every time, is it possible to cache that in the mounted /build/r directory?

@jameslamb
Copy link
Contributor Author

@kszucs thanks for the help with this!

IMHO it wouldn't be worthwhile to cache the built R package, since checking whether it can be recompiled (given some change in the arrow C++ lib) is an important piece of testing it.

As for the test errors...I can replicate them inside the container but they don't show up on my Mac or on Travis, so definitely something I need to look into. I don't think we need to worry about the portability warning in the Makefile right now (since this project has a way to go before it is CRAN-ready anyway) but I do think those test failures should block this PR. Going to look into them.

@jameslamb
Copy link
Contributor Author

I have add #2888 to deal with one problem with the R package (undeclared dependency) but I'm still struggling. Current error I get when trying to install the package:

** testing if installed package can be loaded
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/arrow/r/arrow.Rcheck/arrow/libs/arrow.so':
  /arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Error: loading failed
Execution halted
ERROR: loading failed

I'm actually able to install the package with

R CMD INSTALL arrow_0.0.0.9000.tar.gz

And can technically library it in

Rscript -e "library(arrow)"

But that undefined symbol error comes up when I run devtools::load_all(), devtools::test(), or R CMD check arrow_0.0.0.9000.tar.gz --as-cran.

I'm not sure what to do next. I've looked through the Travis setup for R (was succeeding at least as of this build) but nothing in there is striking me as materially different from what I've done.

@jameslamb
Copy link
Contributor Author

coming back to this tonight. I'm still stuck on that undefined symbol error. Will report back if I make progress.

@wesm
Copy link
Member

wesm commented Nov 8, 2018

You have a build toolchain problem (compiler flags). One component is using the gcc5 ABI while another is not.

@wesm
Copy link
Member

wesm commented Nov 8, 2018

In Travis CI I think we are using gcc < 5 so this can't happen there

@jameslamb
Copy link
Contributor Author

oh interesting ok! Looking into it.

@jameslamb
Copy link
Contributor Author

ahhhh I see this in .travis.yml

MATRIX_EVAL="CC=gcc-4.9 && CXX=g++-4.9"

And confirmed I'm running gcc 7.x in the container. Ok this is progress!

@wesm
Copy link
Member

wesm commented Nov 8, 2018

See my comment above for the flag you need to pass to the part that is being built with the newer ABI

@jameslamb
Copy link
Contributor Author

I'm already using this:

-D_GLIBCXX_USE_CXX11_ABI=0 -Wabi-tag

And I can tell from the R logs that R is respecting that flag. Is that the flag you're talking about?

The symbol it's complaining about is in c_glib/example/lua/stream-to-torch-tensor.lua.

Is there some other environment variable I need to be setting to avoid trying to build the torch stuff? It's not obvious to me looking at .travis.yml.

@wesm
Copy link
Member

wesm commented Nov 8, 2018

So in the error message

/arrow/r/arrow.Rcheck/arrow/libs/arrow.so: undefined symbol: _ZN5arrow9timestampENS_8TimeUnit4typeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

This means that whatever shared library is linking to libarrow.so has inlined some functions from the Arrow headers with the gcc5 ABI. Unmangled the symbol is

arrow::timestamp(arrow::TimeUnit::type, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)

so in some compilation unit the CXXFLAGS isn't being propagated

@wesm
Copy link
Member

wesm commented Nov 14, 2018

Here's what I'm seeing running this locally:

https://gist.github.com/wesm/ac8f6f8072620e0bb4c9de07f7deb5f3

So the extension appears to link fine. In this part

* checking whether package 'arrow' can be installed ... OK
* checking installed package size ... NOTE
  installed size is  9.5Mb
  sub-directories of 1Mb or more:
    R      1.6Mb
    libs   7.8Mb
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
Files 'README.md' or 'NEWS.md' cannot be checked without 'pandoc' being installed.

"checking whether package 'arrow' can be installed ... OK" took a LONG time. Is it compiling the extension again? I speculate that it's at this point that the CXXFLAGS in the environment is being cleansed. It seems logical to me that R CMD CHECK would want to be robust to snowflake-y environment variables

@romainfrancois is there a way to force a certain value in CXXFLAGS to be appended without relying on an environment variable?

@jameslamb
Copy link
Contributor Author

@wesm I'm going to try setting it directly in r/src/Makevars.in. I bet that's it.

Will report back shortly

@wesm
Copy link
Member

wesm commented Nov 14, 2018

That wouldn't be a long term solution (since this variable needs to match libarrow.so, which might have been built with the newer ABI), but at least it will unblock us here for now

@jameslamb
Copy link
Contributor Author

yeah I mean at least if that works, we've at least isolated the problem and I can open an issue documenting it for someone to pick up in the future.

@jameslamb
Copy link
Contributor Author

I ALMOST GOT IT!!!!

I think I'm one failing unit test away from this working. This test (https://github.com/apache/arrow/blob/master/r/tests/testthat/test-RecordBatch.R#L140) on RecordBatch is breaking the build. Somehow testthat::expect_error is not trapping the error, and the std::bad_alloc that gets thrown is breaking the tests.

I have a few ideas, testing now.

(also, btw, I force pushed because I rebased to as-of-5-minutes-ago master).

Updates to follow.

@@ -137,7 +137,6 @@ test_that("read_record_batch handles various streams (ARROW-3450, ARROW-3505)",
batch4 <- read_record_batch(mmap_file)
batch5 <- read_record_batch(bytes)
batch6 <- read_record_batch(buf_reader)
expect_error(read_record_batch(buf_reader))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romainfrancois I believe this PR is now working! However, I want to talk about this test.

I had to remove this to get R CMD CHECK or devtools::test() to complete. I think the issue is that expect_error cannot trap C++ exceptions. When running with this test restored, testing immediately stops with error:

terminate called after throwing an instance of 'std::bad_alloc'
    what():  std::bad_alloc
  Aborted

Have you ever seen that issue with expect_error() before? I tried googling around and nothing obvious jumped out to me. I have no idea how this could be working on Travis but not in my setup on docker :/.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question is why an exception is being thrown at all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s what Rcpp::stop does, but this is supposed to be caught by the generated code.

@romainfrancois
Copy link
Contributor

I’ll have a look but this should not happen.

Copy link
Member

@kszucs kszucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks @jameslamb !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants