Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New CRAN issues #1308

Closed
eddelbuettel opened this issue Jun 3, 2024 · 16 comments
Closed

New CRAN issues #1308

eddelbuettel opened this issue Jun 3, 2024 · 16 comments

Comments

@eddelbuettel
Copy link
Member

Per incoming email

With current R-devel, we are seeing new segfaults and corresponding
UBSAN problems for packages

VAJointSurv fixest kdevine maxnodf o2geosocial openxlsx reservr vapour

with fixest and openxlsx showing

#0 0x7fd6892ef904 in Rcpp::traits::r_vector_cache<14, Rcpp::PreserveStorage>::ref(long) /data/gannet/ripley/R/test-dev/Rcpp/include/Rcpp/vector/traits.h:46
#1 0x7fd6892ef904 in Rcpp::Vector<14, Rcpp::PreserveStorage>::operator[](long) /data/gannet/ripley/R/test-dev/Rcpp/include/Rcpp/vector/Vector.h:340

and the others similarly. Can you please have a look?

[....] tells me that there are also new UBSAN issues for

BART FLOPART FLSSS LOPAR aum lme4 magi nanotime ondisc pedmod
rsparse supc xrnet

most of which point to Rcpp (lme4 seems from Eigen).

with short followup

Apparently this is from

r86629 | luke | 2024-05-27 00:29:48 +0200 (Mon, 27 May 2024) | 3 lines
Hide some more functions in inlined.c.
Also define STRICT_TYPECHECK when compiling inlined.c.

@kevinushey
Copy link
Contributor

I was able to reproduce this locally on macOS; I'll see if I can learn more.

@eddelbuettel
Copy link
Member Author

Which one did you try? openxlsx may be the smaller of the two.

@kevinushey
Copy link
Contributor

Indeed, I tried with openxlsx. Next step is to try and narrow down a reproducible example further...

@kevinushey
Copy link
Contributor

This seems like potentially a bug in openxlsx? Here's a more minimal example:

library(openxlsx)
trace(openxlsx:::read_workbook, quote(print(utils::ls.str())))
xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
read.xlsx(xlsxFile, 2, startRow = 3, colNames = FALSE, detectDates = TRUE)

The function read_workbook gets called with:

Browse[1]> ls.str()
clean_names : function (x, schar)  
cols_in :  int [1:88] 1 2 3 4 5 6 7 8 9 1 ...
hasColNames :  logi FALSE
hasSepNames :  chr "."
is_date :  logi [1:88] FALSE FALSE FALSE FALSE FALSE FALSE ...
nRows :  int 32
rows_in :  int [1:88] 3 3 3 3 3 3 3 3 3 4 ...
skipEmptyCols :  logi TRUE
skipEmptyRows :  logi TRUE
string_inds :  int(0) 
v :  chr [1:88] "1" "2" "3" "4" "5" "6" "7" "8" "9" "1" "2" "3" "4" "5" "6" "7" "8" "8" "2" "2" "3" "4" "4" ...

Note that string_inds is an empty vector. That eventually gets read with code in read_workbook.cpp:503:

  st_inds0[0] = string_inds[0];

So they're attempting to read from a zero-length integer vector, which fails. I'm not sure why that would've snuck by tests in the past, though.

@eddelbuettel
Copy link
Member Author

eddelbuettel commented Jun 3, 2024

We were just talking to @JanMarvin the other day in another issue so let's call him. I am sure he heard from CRAN too ...

Any insight from your end?

@kevinushey
Copy link
Contributor

It looks like vapour might be running into something similar? I'm not positive, but there is code like:

  GDALDatasetH ds = gdalraster::gdalH_open_dsn(dsource[0],   0);  

where gdalH_open_dsn has the definition:

// open the DSN, open a subdataset if sds > 0 (else you get the outer shell)
inline GDALDatasetH gdalH_open_dsn(const char * dsn, IntegerVector sds) {
  GDALDatasetH DS; 
  DS = GDALOpen(dsn, GA_ReadOnly);
  if (!DS) return nullptr; 
  if (sds[0] > 0 && gdal_has_subdataset((GDALDataset*) DS)) {
    CharacterVector sdsnames = gdal_subdataset_1((GDALDataset*)DS, sds[0]);
    if (sdsnames.length() > 0 && !sdsnames[0].empty()) {
      GDALClose((GDALDataset*) DS);
      DS = GDALOpen(sdsnames[0], GA_ReadOnly);
    }
    
  }
  return DS;
}

That is, an IntegerVector is being constructed implicitly from the int 0, which gives a zero-length integer vector rather than an integer vector of length 1 with value 0 (which the author probably expected / assumed). That implies the sds[0] check in that function will then give an out-of-bounds access into a length-zero vector.

@kevinushey
Copy link
Contributor

I also took a look at fixest and it looks like another variation on the same issue -- an attempt to read / write into a zero-length vector. Here's a reproducible example (distilled from the failing example code)

library(fixest)

set.seed(0)

base = iris
names(base) = c("y", "x1", "x2", "x3", "fe1")
base$fe2 = rep(1:5, 30)
base$y[1:5] = NA
base$x1[4:8] = NA
base$x2[4:21] = NA
base$x3[110:111] = NA
base$fe1[110:118] = NA
base$fe2[base$fe2 == 1] = 0
base$fe3 = sample(letters[1:5], 150, TRUE)
base$period = rep(1:50, 3)
base$x_cst = 1

res = feols(c(y, x1) ~ 1 | fe1 | x2 ~ x3, base)

And indeed, the failure occurs at code like the following:

    XtX(0, 0) = value;

where XtX is a 0 x 0 matrix. It seems like this is the common thread between the various issues.

All that said, I'm not sure what we should do:

  1. Inform package authors that the issue most likely lies in their code, and ask them to review the issues individually?
  2. Include some code in Rcpp to help alleviate these issues, e.g. by giving an R warning or error on an out-of-bounds access to an Rcpp vector / matrix?

@JanMarvin
Copy link

Hi @eddelbuettel I'm poking at it in ycphs/openxlsx#472 (but afaik nobody is actively working on openxlsx and I'm not its maintainer, I'm just following Rcpp 😄 )

@eddelbuettel
Copy link
Member Author

@JanMarvin Whoops. My bad.

@kevinushey We do it some places (vector/Subsetter.h) and it might be hard to do it "universally" :-/

@lrberge
Copy link

lrberge commented Jun 11, 2024

On the fixest bug. Thanks @kevinushey for looking into it!
It was definitely a bug. Now fixed.

@AshesITR
Copy link

reservr suffered from the same (UB working with length-0 input). Thank you for providing the hints here.

@tdhock
Copy link

tdhock commented Jun 19, 2024

Hi! I maintain 3 packages which are affected, all using Rcpp with the same pattern.

  • I defined a C++ function which inputs a Rcpp::NumericVector or IntegerVector called some_data
  • I call a C function and one of the arguments is &some_data[0] which I am using to get the pointer to the first element in the array

is that ok? If so I guess the fix will be provided in Rcpp?
If not, I can update my packages. is there another/better way to get a pointer to the first element in the array?

I don't think my issues involve length-0 vectors since in that case my code stops with an error before trying to get the pointer, for example https://github.com/tdhock/FLOPART/blob/main/src/interface.cpp

  int data_count = data_vec.size();
  if(data_count < 2){
    Rcpp::stop("need at least two data points");
  }

@Enchufa2
Copy link
Member

@tdhock The bug is most probably in your package. Rcpp can just warn (for now) or error (in a future release) so that you are aware of these bugs before CRAN UBSAN/ASAN checks are reported.

Note that the UBSAN/ASAN errors I see on CRAN are not on line 55, but on line 57, where you do this:

&label_type_vec[0], &label_start_vec[0], &label_end_vec[0], label_count,

And I don't see checks for length-0 vectos for these ones.

@eddelbuettel
Copy link
Member Author

@tdhock You could consider doing what I just did: reducing the test surface by not running the tests offending UBSAN.

@tdhock
Copy link

tdhock commented Jun 19, 2024

hi Inaki and Dirk thanks for the feedback, that is helpful.
indeed I will update my packages to avoid doing &some_vec[0] on zero-length vectors, which actually can be provided by the user/tests sometimes. In the example above, my suggested fix is:

  const int *label_type_ptr=0, *label_start_ptr=0, *label_end_ptr=0;
  if(0 < label_count){
    label_type_ptr = &label_type_vec[0];
    label_start_ptr = &label_start_vec[0];
    label_end_ptr = &label_end_vec[0];
  }
  int status = FLOPART
    (&data_vec[0], &weight_vec[0],
     data_count, penalty,
     label_type_ptr, label_start_ptr, label_end_ptr, label_count,

@eddelbuettel
Copy link
Member Author

This can be closed now. The issues appear to all be addressed based on checking a bit over half the packages listed.

Should this, or a related issue arise this can of course be reopened.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants