Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plumber() and Plumber$new() will read source file as UTF-8 encoded #312

Closed
wants to merge 3 commits into from

Conversation

shrektan
Copy link
Contributor

@shrektan shrektan commented Oct 5, 2018

Related to #296

DESCRIPTION

R uses the encoding "unknown" or "native" when reading text files by default. However, the fact that the default encoding varies (not UTF-8) on Windows, leads to a big headache for our multiple-bytes charset users on Windows, which is we have to store the source file as the native encoding in order to use plumber.

Thanks to the great work of RStudio (especially, @yihui), the common practice of handling this issue is to read the file as UTF-8 encoding by default(may provide a param "encoding" allowing users to change) and this PR is going to address this.

WHAT THIS PR DOES?

  • add the new param encoding for plumber() and Plumber$new()
  • enable the UTF-8 as the default

As mentioned in #296, there's another issue that the function input should be marked as UTF-8, which I would like file another PR after this being merged.

Thanks.

@CLAassistant
Copy link

CLAassistant commented Oct 5, 2018

CLA assistant check
All committers have signed the CLA.

@codecov-io
Copy link

codecov-io commented Oct 5, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@9327541). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #312   +/-   ##
=========================================
  Coverage          ?   90.04%           
=========================================
  Files             ?       25           
  Lines             ?     1145           
  Branches          ?        0           
=========================================
  Hits              ?     1031           
  Misses            ?      114           
  Partials          ?        0
Impacted Files Coverage Δ
R/plumber.R 84.86% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9327541...6c8908f. Read the comment docs.

@yihui
Copy link
Member

yihui commented Oct 5, 2018

YES! Use UTF-8, or we can't be friends.

Perhaps in knitr 2.0 and rmarkdown 2.0, I'll just remove the encoding arguments, and force UTF-8 like what Pandoc has been doing for years, which is something I have envied for long.

@schloerke
Copy link
Collaborator

@yihui Do you think it's a good idea remove the argument and just force UTF-8?

@schloerke schloerke added type: enhancement Adds a new, backwards-compatible feature priority: high Must be fixed before next release labels Oct 5, 2018
@shrektan
Copy link
Contributor Author

shrektan commented Oct 5, 2018

At least, it’s a good idea to me, actually. :P

@yihui
Copy link
Member

yihui commented Oct 5, 2018

@schloerke Yes. That is what I have been doing in bookdown (since 2016) and blogdown (2017). In future packages, I'll always do this. The encoding argument brings enormous pain to developers, whereas it only bugs users a tiny bit when they are asked to save a text file in UTF-8 (they can set this as the default in RStudio).

@schloerke
Copy link
Collaborator

@shrektan Mind removing it from the args and adding it as a global package variable w/ comment to this PR? (Feel free to copy from below)

# Hard code UTF-8 file encoding
# Removes encoding headache at minor cost of setting encoding in editor
# https://github.com/trestletech/plumber/pull/312
utf8Encoding <- "UTF-8"


plumb <- function(file, dir = "."){
# ...
x <- source(entrypoint, encoding = utf8Encoding)
# etc...
}

@shrektan shrektan force-pushed the add-param-encoding branch from 798ca94 to 731704e Compare October 5, 2018 15:32
@shrektan
Copy link
Contributor Author

shrektan commented Oct 5, 2018

@schloerke I have removed it.

R/plumber.R Outdated Show resolved Hide resolved
R/plumber.R Outdated Show resolved Hide resolved
R/plumber.R Outdated Show resolved Hide resolved
Copy link
Collaborator

@schloerke schloerke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. Should be last round of review

Thank you for the test file!

@shrektan shrektan changed the title add the encoding param for plumber() and Plumber$new() plumber() and Plumber$new() will read source file as UTF-8 encoded Oct 12, 2018
@shrektan
Copy link
Contributor Author

shrektan commented Oct 12, 2018

@schloerke I've made the changes you requested and add a NEWS file (I made a forced-push in order to have a clean history).

Can it be merged? It's very inconvenient to maintain/edit the file as encoding other than UTF-8. Moreover, it's consistent with shiny, which forces the UTF-8 encoding in the very early stage.

shiny 0.10.1

Added Unicode support for Windows. Shiny apps running on Windows must use the UTF-8 encoding for ui.R and server.R (also the optional global.R) if they contain non-ASCII characters. See this article for details and examples: http://shiny.rstudio.com/gallery/unicode-characters.html (#516)

In addition, there's a minor issue about using utf8Encoding to represent the literal string "UTF-8" (see the quote below). Although I made the change, I don't support it because it's not a "magic" value in R. As you can see in other packages like shiny or in pkgload, they almost always are written directly.

@shrektan Mind removing it from the args and adding it as a global package variable w/ comment to this PR? (Feel free to copy from below)

# Hard code UTF-8 file encoding
# Removes encoding headache at minor cost of setting encoding in editor
# https://github.com/trestletech/plumber/pull/312
utf8Encoding <- "UTF-8"


plumb <- function(file, dir = "."){
# ...
x <- source(entrypoint, encoding = utf8Encoding)
# etc...
}

@schloerke schloerke requested a review from jcheng5 October 19, 2018 16:32
@schloerke schloerke removed the request for review from jcheng5 October 26, 2018 18:38
@schloerke schloerke added the QA Submitted for QA label Oct 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: high Must be fixed before next release QA Submitted for QA type: enhancement Adds a new, backwards-compatible feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants