RcppStdVector

The std::vector container is the cornerstone of the C++ Standard Library. When used with Rcpp however, there is an inherent cost in using std::vector as memory must be copied both from and to R internal data structures. The RcppStdVector package allows users to write Rcpp code using the Standard Library std::vector container without incurring an extra container copy operation. Consider the following:

std::vector<double>
my_func(std::vector<double>& x) { // copy from REALSXP to std::vector here
  ...
  return x; // additional copy from std::vector to REALSXP here
}

For large vectors, this entails a large overhead. Rcpp native vectors avoid these copies by holding an SEXP, which will be only copied on demand. Recall however that unless an Rcpp vector is duplicated, functions can modify the vector in-place, which violates the no-side-effects rule. This can be useful or problematic in cases where the SEXP is aliased to several R variables. Most modifying Rcpp functions will therefore usually implement cloning the input vector. Similarly, using RcppStdVector, we can write:

RcppStdVector::std_vec_real
my_func(RcppStdVector::std_vec_real& x) { // copy here from REALSXP to std::vector
  ...
  return x; // zero-overhead, no-copy return of REALSXP
}

The zero-overhead return is achieved by using a custom allocator that allocates an R vector instead of general heap memory. This R vector is simply retrieved from the allocator at the end. Special consideration is taken when the capacity of the std::vector is greater than its size so that there are no dangling elements. It is furthermore possible to construct a standard vector in-place over an R vector via placement new.

void my_func(RcppStandardVector::std_ivec_int& x) { // no copy
  // modify elements directly
}

The caveat to this is that any operation that causes allocation could lead to undefined behavior as the in-place allocator does not actually allocate and delete. Operations that grow the vector will throw an error. Provided overloads for assignment and copy construction will also throw.

There are several reasons one might wish to use std::vector over the native Rcpp vector types. First, as part of the Standard Library, std::vector is simple and well-documented. Compare this to this. Second, using std::vector everywhere minimizes code rewrites when moving between other C++ codebases and Rcpp. Third, std::vector supports amortized-constant appending elements. This is a very common idiom in C++ coding and its absense in Rcpp is a source of frustration. Consider the following functions:

SEXP test_copy_chr(Rcpp::CharacterVector& x) { // no copy
  RcppStdVector::std_vec_chr y;     // this IS a std::vector
  auto oi = std::back_inserter(y); // amortized-constant insertion
  std::copy(x.begin(), x.end(), oi);
  return Rcpp::wrap(y);  // no copy
}

SEXP test_copy_rcpp_chr(Rcpp::CharacterVector& x) { // no copy
  Rcpp::CharacterVector y;    // Rcpp type
  auto oi = std::back_inserter(y); // copies entire vector on each insertion
  std::copy(x.begin(), x.end(), oi);
  return Rcpp::wrap(y); // no copy
}

On my system, I get:

> microbenchmark(r1 <- test_copy_chr(x), r2 <- test_copy_rcpp_chr(x))
Unit: microseconds
                        expr        min         lq         mean      median           uq         max
      r1 <- test_copy_chr(x)     94.811    102.117     124.4515    125.6335     129.7775     218.154
 r2 <- test_copy_rcpp_chr(x) 935634.546 975367.489 1012093.6482 996092.4005 1023481.6770 1228233.548

where x contains 1e4 character strings. The sole drawback of this idiom is that up to 50% of the memory allocated to the std::vector may be unused.

RcppStdVector supports list and string-list types. For example the following copies a list of elements and duplicates each.

template <int RTYPE>
void dup_vec(SEXP& s) {
  auto y = RcppStdVector::from_sexp<RTYPE>(s);
  auto oi = std::back_inserter(y);
  y.reserve(2 * y.size());
  std::copy(RcppStdVector::begin<RTYPE>(s),
            RcppStdVector::end<RTYPE>(s), oi);
  s = PROTECT(Rcpp::wrap(y));
}

// [[Rcpp::export]]
SEXP test_double_it(RcppStdVector::std_vec_sxp& x) {
  for (auto s = x.begin(); s != x.end(); ++s) {
    switch(TYPEOF(*s)) {
    case REALSXP: dup_vec<REALSXP>(*s); break;
    case INTSXP:   dup_vec<INTSXP>(*s); break;
    case LGLSXP:   dup_vec<LGLSXP>(*s); break;
    case STRSXP:   dup_vec<STRSXP>(*s); break;
    case VECSXP:   dup_vec<VECSXP>(*s); break;
    default: Rcpp::stop("Unsupported column type");
    }
  }
  UNPROTECT(x.size());
  return Rcpp::wrap(x);
}

The reason for calling PROTECT is because when the vector y goes out of scope, its internal SEXP will be unprotected. I get the following:

> test_double_it(list(1:3, letters[1:3], (1:3) > 2, pi))
[[1]]
[1] 1 2 3 1 2 3

[[2]]
[1] "a" "b" "c" "a" "b" "c"

[[3]]
[1] FALSE FALSE  TRUE FALSE FALSE  TRUE

[[4]]
[1] 3.141593 3.141593

The following code demonstrates placement new construction of a std::vector over the memory owned by an R vector.

void test_inplace(RcppStdVector::std_ivec_int& x) {
  std::transform(std::begin(x), std::end(x),
                 std::begin(x), [](int a){ return 2 * a; });
}

This function doubles each element of an R vector without ever allocating.

> x <- 1:10
> test_inplace(x)
> x
 [1]  2  4  6  8 10 12 14 16 18 20
>

The basic idea of this experiment is to use the standard library components by adapting them to work on top of R’s memory system. This is what I always wanted from Rcpp: a thin lightweight layer that enables use of standard C++.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
R		R
inst/include		inst/include
man		man
src		src
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.Rmd		README.Rmd
README.md		README.md
RcppStdVector.Rproj		RcppStdVector.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

RcppStdVector

About

Licenses found

Releases

Packages

Languages

License

Licenses found

thk686/RcppStdVector

Folders and files

Latest commit

History

Repository files navigation

RcppStdVector

About

Resources

License

Licenses found

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages