Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for feather files in POST #77

Closed
wants to merge 10 commits into from
Closed
7 changes: 4 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -23,8 +23,7 @@ Imports:
R6 (>= 2.0.0),
stringi (>= 0.3.0),
jsonlite (>= 0.9.16),
httpuv (>= 1.2.3),
crayon
httpuv (>= 1.2.3)
LazyData: TRUE
Suggests:
testthat (>= 0.11.0),
Expand All @@ -34,7 +33,9 @@ Suggests:
base64enc,
htmlwidgets,
visNetwork,
analogsea
analogsea,
feather,
webutils
Collate:
'content-types.R'
'cookie-parser.R'
Expand Down
3 changes: 2 additions & 1 deletion R/content-types.R
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,8 @@ knownContentTypes <- list(
docx='application/vnd.openxmlformats-officedocument.wordprocessingml.document',
dotx='application/vnd.openxmlformats-officedocument.wordprocessingml.template',
xlam='application/vnd.ms-excel.addin.macroEnabled.12',
xlsb='application/vnd.ms-excel.sheet.binary.macroEnabled.12')
xlsb='application/vnd.ms-excel.sheet.binary.macroEnabled.12',
feather='application/feather')

getContentType <- function(ext, defaultType='application/octet-stream') {
ct <- knownContentTypes[[tolower(ext)]]
Expand Down
44 changes: 42 additions & 2 deletions R/post-body.R
Original file line number Diff line number Diff line change
@@ -1,11 +1,21 @@
postBodyFilter <- function(req){
body <- req$rook.input$read_lines()
args <- parseBody(body)

if (!is.null(req$CONTENT_TYPE) && grepl("multipart/form-data; boundary=",
req$CONTENT_TYPE, fixed = TRUE)) {

boundary <- strsplit(req$CONTENT_TYPE, split = "=")[[1]][2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this vulnerable to having an = sign in the body of the message?

body <- req$rook.input$read()
args <- parseMultipart(body, boundary)
} else {
body <- req$rook.input$read_lines()
args <- parseBody(body)
}
req$postBody <- body
req$args <- c(req$args, args)
forward()
}


#' @importFrom utils URLdecode
#' @noRd
parseBody <- function(body){
Expand All @@ -24,3 +34,33 @@ parseBody <- function(body){
}
ret
}


#' @importFrom utils URLdecode
#' @noRd
parseMultipart <- function(body, boundary){
function(val, req, res, errorHandler){
tryCatch({
if (!requireNamespace("webutils", quietly = TRUE)) {
stop("The webutils package is not available but is required in order
to use the functionality to POST binary files",
call. = FALSE)
}
# Is there data in the request?
if (is.null(body) || length(body) == 0 || body == "") {
return(list())
}
parsed_binary <- webutils::parse_multipart(body, boundary)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there limits built in to any of these on how big of a file we'd attempt to parse? Anything to prevent a client from sending a 10GB file and running the server out of RAM?

file_name <- parsed_binary$myfile$filename
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a hard time finding where myfile comes from? Is that baked in to parse_multipart's return value?

file_data <- parsed_binary$myfile$value

tmpfile <- tempfile()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. It's unfortunate that we're not able to clean up these temp files when we're done. But if we added an on.exit here the file would be gone before we could access the file downstream. Maybe I need to add a hook to the response object so that we can clean up any associated resources when it's done.

writeBin(file_data, tmpfile)

ret <- NULL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this any different than ret <-list(name = file_name, data=tmpfile)?

It surprised me that you could assign to a field on a NULL objec.t

ret$name <- file_name
ret$data <- tmpfile
ret
})
}
}
32 changes: 32 additions & 0 deletions inst/examples/13-binary-upload/plumber.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
library(plumber)
library(feather)

# Illustration of binary uploads and post-processing of the file on receipt.
# As a proof of concept we use a feather file: test.feather


# To upload and use the content of the file in a function:
# curl -X POST http://localhost:9080/upload -F 'myfile=@test.feather'

#* @post /upload
function(name, data) {
# work with binary files after upload
# in this example a feather file
content <- read_feather(data)
return(content)
}


# To upload and use the properties of the file in a function:
# curl -X POST http://localhost:9080/inspect -F 'myfile=@test.feather'

#* @post /inspect
function(name, data) {
file_info <- data.frame(
filename = name,
mtime = file.info(data)$mtime,
ctime = file.info(data)$ctime
)

return(file_info)
}
Binary file added inst/examples/13-binary-upload/test.feather
Binary file not shown.
Binary file added tests/testthat/files/static/test.feather
Binary file not shown.
8 changes: 7 additions & 1 deletion tests/testthat/test-postbody.R
Original file line number Diff line number Diff line change
Expand Up @@ -6,5 +6,11 @@ test_that("JSON is consumed on POST", {

test_that("Query strings on post are handled correctly", {
expect_equivalent(parseBody("a="), list()) # It's technically a named list()
expect_equal(parseBody("a=1&b=&c&d=1"), list(a="1", d="1"))
expect_equal(parseBody("a=1&b=&c&d=1"), list(a = "1", d = "1"))
})

test_that("Multipart binary is consumed on POST", {
# TODO: I'm not sure how to read the file in such a way that it can be used as a
# multipart upload. I expected that copying the rawToChar(req$input.rook$read())
# output would do the trick, but it did not.
})