write_lines() now accepts a list of raw vectors (#594)
Fixes #542
jimhester authored Feb 8, 2017
1 parent 3526cc5 commit 20bf3a8
Showing 7 changed files with 55 additions and 8 deletions.
1 change: 1 addition & 0 deletions NEWS.md
@@ -3,6 +3,7 @@
* Long spec declarations now print properly (#597).
* `read_table()` can now handle files with many lines of leading comments (#563).
* Whole number doubles are no longer written with a trailing `.0` decimal (#526).
* `write_lines()` now accepts a list of raw vectors (#542).

* parsing problems in `read_delim()` and `read_fwf()` when columns are skipped using col_types now report the correct column name (#573, @cb4ds)

4 changes: 4 additions & 0 deletions R/RcppExports.R
@@ -81,6 +81,10 @@ write_lines_ <- function(lines, path, na, append = FALSE) {
  invisible(.Call('readr_write_lines_', PACKAGE = 'readr', lines, path, na, append))
}

write_lines_raw_ <- function(x, path, append = FALSE) {
  invisible(.Call('readr_write_lines_raw_', PACKAGE = 'readr', x, path, append))
}

write_file_raw_ <- function(x, path, append = FALSE) {
  invisible(.Call('readr_write_file_raw_', PACKAGE = 'readr', x, path, append))
}
14 changes: 8 additions & 6 deletions R/lines.R
@@ -3,8 +3,8 @@
#' `read_lines()` reads up to `n_max` lines from a file. New lines are
#' not included in the output. `read_lines_raw()` produces a list of raw
#' vectors, and is useful for handling data with unknown encoding.
#' `write_lines()` takes a character vector, appending a new line
#' after each entry.
#' `write_lines()` takes a character vector or list of raw vectors, appending a
#' new line after each entry.
#'
#' @inheritParams datasource
#' @inheritParams read_delim
@@ -51,11 +51,13 @@ read_lines_raw <- function(file, skip = 0, n_max = -1L, progress = show_progress
#' @export
#' @rdname read_lines
write_lines <- function(x, path, na = "NA", append = FALSE) {
  x <- as.character(x)

  path <- normalizePath(path, mustWork = FALSE)

  write_lines_(x, path, na, append)
  if (is.list(x) && all(vapply(x, inherits, logical(1), "raw"))) {

jimhester Feb 10, 2017

Author Collaborator

It turns out this check is quite slow: for an example 14-million-line file, 3/4 of the total time for write_lines() is spent here. I think

  is_raw <- is.list(x) && inherits(x[[1]], "raw")

should give us the same answer the vast majority of the time without the cost.

    write_lines_raw_(x, path, append)
  } else {
    x <- as.character(x)
    write_lines_(x, path, na, append)
  }

  invisible(x)
}
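
The trade-off raised in the review comment (scan every element versus inspect only the first) can be sketched outside R. The following is a minimal C++ illustration and not code from readr: `Element`, `all_raw`, and `first_is_raw` are hypothetical names, and `Element` is an assumed stand-in for one entry of an R list.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one entry of an R list: it is either a raw
// vector (is_raw == true) or some other type.
struct Element {
  bool is_raw;
  std::vector<unsigned char> bytes;
};

// Full scan, analogous to all(vapply(x, inherits, logical(1), "raw")).
// Cost grows linearly with the number of entries.
bool all_raw(const std::vector<Element>& x) {
  return std::all_of(x.begin(), x.end(),
                     [](const Element& e) { return e.is_raw; });
}

// Shortcut, analogous to inherits(x[[1]], "raw"): constant cost, but it
// trusts that the remaining entries have the same type as the first.
// Returning false for an empty input is an assumption of this sketch.
bool first_is_raw(const std::vector<Element>& x) {
  return !x.empty() && x.front().is_raw;
}
```

The two checks agree on homogeneous input and differ only when the first entry is raw and a later one is not. In R, `x[[1]]` on an empty list is an error, so the shortcut would likely need a length guard; the sketch models that guard by returning false for empty input.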
4 changes: 2 additions & 2 deletions man/read_lines.Rd

12 changes: 12 additions & 0 deletions src/RcppExports.cpp
@@ -284,6 +284,18 @@ BEGIN_RCPP
    return R_NilValue;
END_RCPP
}
// write_lines_raw_
void write_lines_raw_(List x, const std::string& path, bool append);
RcppExport SEXP readr_write_lines_raw_(SEXP xSEXP, SEXP pathSEXP, SEXP appendSEXP) {
BEGIN_RCPP
    Rcpp::RNGScope rcpp_rngScope_gen;
    Rcpp::traits::input_parameter< List >::type x(xSEXP);
    Rcpp::traits::input_parameter< const std::string& >::type path(pathSEXP);
    Rcpp::traits::input_parameter< bool >::type append(appendSEXP);
    write_lines_raw_(x, path, append);
    return R_NilValue;
END_RCPP
}
// write_file_raw_
void write_file_raw_(RawVector x, const std::string& path, bool append);
RcppExport SEXP readr_write_file_raw_(SEXP xSEXP, SEXP pathSEXP, SEXP appendSEXP) {
18 changes: 18 additions & 0 deletions src/write.cpp
@@ -22,6 +22,24 @@ void write_lines_(const CharacterVector &lines, const std::string &path, const s
  return;
}

// [[Rcpp::export]]
void write_lines_raw_(List x, const std::string &path, bool append = false) {
  std::ofstream output(path.c_str(), std::ofstream::binary | (append ? std::ofstream::app : std::ofstream::trunc));

  if (output.fail()) {
    stop("Failed to open '%s'.", path);
  }

  std::ostream_iterator<char> out = std::ostream_iterator<char>(output);
  for (int i = 0;i < x.length();++i) {
    RawVector y = x.at(i);
    std::copy(y.begin(), y.end(), out);
    *out++ = '\n';
  }

  return;
}

// [[Rcpp::export]]
void write_file_raw_(RawVector x, const std::string &path, bool append = false) {
std::ofstream output(path.c_str(), std::ofstream::binary | (append ? std::ofstream::app : std::ofstream::trunc));
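
For readers without an Rcpp toolchain, the core of `write_lines_raw_()` can be approximated in plain C++. This is a sketch under stated assumptions and not the readr implementation: `RawLines` and `write_raw_lines` are hypothetical names, a vector of byte vectors stands in for the R list of raw vectors, and failure is reported through a `bool` return value where the diff calls `stop()`.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an R list of raw vectors.
typedef std::vector<std::vector<unsigned char> > RawLines;

// Writes each entry followed by '\n'. Returns false if the file cannot be
// opened or a write fails (the code in the diff calls stop() instead).
bool write_raw_lines(const RawLines& lines, const std::string& path,
                     bool append = false) {
  // Binary mode avoids newline translation on platforms that perform it.
  std::ofstream output(
      path.c_str(),
      std::ios::binary | (append ? std::ios::app : std::ios::trunc));
  if (output.fail()) {
    return false;
  }

  for (std::size_t i = 0; i < lines.size(); ++i) {
    const std::vector<unsigned char>& line = lines[i];
    if (!line.empty()) {
      // Copy the bytes verbatim; no encoding conversion takes place.
      output.write(reinterpret_cast<const char*>(&line[0]),
                   static_cast<std::streamsize>(line.size()));
    }
    output.put('\n');
  }
  output.flush();
  return output.good();
}
```

An empty entry produces an empty line, and calling the function again with `append` set to `true` adds the same lines after the existing content instead of truncating the file.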
10 changes: 10 additions & 0 deletions tests/testthat/test-write-lines.R
@@ -36,6 +36,16 @@ test_that("write_lines can append to a file", {
  expect_equal(read_lines(tmp), c("first", "last", "first", "last"))
})

test_that("write_lines accepts a list of raws", {
  x <- lapply(seq_along(1:10), function(x) charToRaw(paste0(collapse = "", sample(letters, size = sample(0:22, 1)))))
  tmp <- tempfile()
  on.exit(unlink(tmp))

  write_lines(x, tmp)

  expect_equal(read_lines(tmp), vapply(x, rawToChar, character(1)))
})

# write_file ------------------------------------------------------------------
test_that("write_file round trips", {
  tmp <- tempfile()