Add R - Jupyter - R Markdown - Data Science - Machine Learning DevContainer #1314
There are existing R, Anaconda, and Jupyter data science definitions in this repository: https://github.com/microsoft/vscode-dev-containers/tree/main/containers/r, https://github.com/microsoft/vscode-dev-containers/tree/main/containers/python-3-anaconda, and https://github.com/microsoft/vscode-dev-containers/tree/main/containers/jupyter-datascience-notebooks. Is there a reason we need to create a new one versus adapting what is there? //cc @dynamicwebpaige as well for feedback, along with @kmehant and @eitsupi on the existing definition. |
Since this is a pure extension of the R definition, I don't think it is appropriate to include it in this repository. |
@eitsupi yes yes. It is an extension of the R definition to include Jupyter Notebooks support. I had a really hard time trying to run Jupyter Notebooks on the existing R container. Perhaps I could do a PR directly into the R definition instead of an entirely new definition? |
@Chuxel, the reason for this definition is that we were in a situation where we wanted an out of the box container to run .R, .Rmd and Jupyter Notebooks for students without much fuss/tweaking of the existing definitions. Hence the reason for the PR. cc @leestott |
Like the definitions of other languages, the definition of R is intended to include only the bare essentials. For example, my personal preference is the tidyverse packages, so I use edited container definitions like the following for my own use. I think it is a good idea to create a template repository or maintain documentation on VSCode Remote-Containers. If my understanding is correct, the VSCode Remote-Containers team are currently working on a mechanism to easily download third party container definitions in https://github.com/microsoft/dev-container-spec. |
@Chuxel As @R-icntay mentioned, the reason for this definition is that we want an out-of-the-box container to run .R, .Rmd and Jupyter Notebooks for students and educator workshops. We are in the process of launching some R modules on MS Learn, along with a Learn R Jupyter Sandbox. Longer term, we want an R + Jupyter + Data Science + Machine Learning DevContainer image which will NOT require any tweaking of the existing definitions, as this is a request from the Edu Community to have a preconfigured R + Data Science + ML image for Tinyverse/Tidy Model etc. Hence the reason for the PR. cc @dynamicwebpaige @eitsupi |
Yeah, reading through more, this has quite a bit of opinion in it that makes it a bit more than a "definition" in this repository. For example, things like gitlens are quality of life extensions rather than something to enable a scenario. VS Code has settings sync to allow you to pull across these types of extensions from your personal preferences, so for a general scenario, including them in a definition is a bit counterproductive - and can actually irritate developers. e.g. ... this is a lot:
I understand that this could make sense if a particular curriculum drove the desire to have these present, but part of me wonders whether what you are describing is more along the lines of the A GitHub template repository can have much more opinion in it based on the assumptions about the curriculum than something designed to drop into an arbitrary project like these are intended to do. You can then click to create a repo with the opinion in it. No objections to R + Jupyter... more trying to dig into intent given this extensions list. Is there perhaps a happy medium here? Otherwise using a template repository could be as or more effective if the desire is to be very opinionated. |
Thank you for tagging me into this issue, @Chuxel! And thank you to @R-icntay for proposing an To @Chuxel's point, above: adding too many extensions might result in performance degradations, or conflicts for shortcuts / hotkeys when using VS Code. Would it be possible just to include the R-centric and data analysis extensions in this devcontainer (as an example, the extension for Shiny snippets, but to remove the A couple of additional questions:
cc: @tanmayeekamath, as this devcontainer might be useful for the genomics team to review, once it has been created. |
My understanding is that extensions like GitLens (which I also use all the time!) should be defined in However, I think many users are unaware of this feature and try to include everything in
I prefer RMarkdown to ipynb and haven't written R in Jupyter on VSCode for a long time, although I do use VSCode Jupyter when writing Python. |
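For the point about personal extensions such as GitLens earlier in this comment: the name of the setting being referred to did not survive in the text above. Assuming it is the Remote-Containers user setting `remote.containers.defaultExtensions` (an assumption, not something confirmed in this thread), a minimal sketch of keeping GitLens out of a shared definition might look like this:

```jsonc
// User-level settings.json, NOT the shared devcontainer.json.
// Assumption: "remote.containers.defaultExtensions" is the setting meant above.
{
	// Extensions listed here are installed into every container this user opens,
	// so they do not need to appear in any individual definition.
	"remote.containers.defaultExtensions": [
		"eamodio.gitlens"
	]
}
```

This keeps quality-of-life extensions tied to the person rather than to the definition, which is the split the earlier comments argue for.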
I tried installing Jupyter. It is enough to add the following to the existing Dockerfile:
RUN apt-get update && apt-get -y install \
    libzmq3-dev \
    && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/* \
    && install2.r --error --skipinstalled --ncpus -1 IRkernel \
    && rm -rf /tmp/downloaded_packages \
    && python3 -m pip --no-cache-dir install jupyter \
    && R --vanilla -s -e 'IRkernel::installspec(user = FALSE)'
How about adding this to the documentation? This is the first time I've touched R with Jupyter since the Jupyter extension became VSCode Native Notebook, and it was a pretty good experience. However, there seems to be a problem where R variables do not show up in either the Jupyter variables view or vscode-R's R workspace. (microsoft/vscode-jupyter#5264) @renkun-ken As the primary developer of vscode-R, do you have any thoughts? |
@Chuxel, @dynamicwebpaige. Thank you for getting back with great feedback on this. Yes, I must admit, the extensions were a bit of an overkill. It would be possible to remove the extensions suggested by @Chuxel and everything would work just fine. In the devcontainer.json, they had been commented as "// Other extensions that make life a little bit easier right off the bat". @dynamicwebpaige, we wanted to use the devcontainer for workshops relating to an upcoming R course on Microsoft Learn. Ideally we want to use VS Code and VS Code Notebooks (since .ipynb is supported on Microsoft Learn). From what we have gathered from the learner community, part of making it easier for learners to ramp up on R, and by extension R + VS Code, is an out-of-the-box environment where students can start running R code in no time. I only had a bit of a hiccup in setting up the R kernel for VS Code Notebooks in a devcontainer, but on a local machine it's really easy to do so. I use RStudio but am really enjoying VS Code. Tagging @leestott in case I missed anything. |
Thank you @eitsupi. Much neater implementation than the one I did. It would be great to have it documented somewhere. As a new user to docker and everything, it took a while to figure out how to make the R kernel visible to Jupyter. |
Note that the Rocker project has an image called This image is so huge that it may contain packages that are unnecessary for many users, but may be a good option for learning purposes. If you want to use this with VSCode Remote-Containers, I think you just need to rewrite cc @cboettig |
I merged in the updated comments in #1320 given the discussion here.
@R-icntay @eitsupi There is also a way to wire up an option for Notebook support that shows up in the VS Code "Add Dev Container Config UX", based on a comment in the Dockerfile. We want to formalize this a bit more as we move forward on some of the repository proposals mentioned above, but it's in heavy use in the repo. What you can do is the following:
# [Option] Enable Notebook support
ARG ENABLE_JUPYTER=false
RUN if [ "${ENABLE_JUPYTER}" = "true" ]; then \
    apt-get update && apt-get -y install libzmq3-dev \
    && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/* \
    && install2.r --error --skipinstalled --ncpus -1 IRkernel \
    && rm -rf /tmp/downloaded_packages \
    && python3 -m pip --no-cache-dir install jupyter \
    && R --vanilla -s -e 'IRkernel::installspec(user = FALSE)'; \
    fi
devcontainer.json then lists this as a build arg, and the UX will present the option and update it as appropriate.
You can also do this with the image - the Dockerfile could include the following:
The UX will update the "VARIANT" in devcontainer.json automatically based on what you pick. This is how all the version and image variants work for things like the Python definition: https://github.com/microsoft/vscode-dev-containers/blob/main/containers/python-3/.devcontainer/Dockerfile |
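As a rough illustration of the devcontainer.json side of this, here is a minimal sketch that passes both values as build args. The `name` and the `VARIANT` value are placeholders invented for illustration and are not taken from the actual R definition; only `ENABLE_JUPYTER` comes from the Dockerfile snippet in this comment.

```jsonc
// devcontainer.json (sketch only; placeholder values are marked below)
{
	"name": "R (example)", // hypothetical name
	"build": {
		"dockerfile": "Dockerfile",
		"args": {
			// Must match the ARG name declared in the Dockerfile.
			"ENABLE_JUPYTER": "true",
			// Placeholder variant; the real choices depend on what the Dockerfile declares.
			"VARIANT": "4"
		}
	}
}
```

Build args are passed to `docker build` as strings, so the Dockerfile's `[ "${ENABLE_JUPYTER}" = "true" ]` test compares against the literal text set here.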
Thank you for your suggestion. That sounds good! |
@R-icntay @leestott @dynamicwebpaige |
Thanks again for opening the PR and for the discussion so far. As a heads up, our team has been actively focused on an updated plan for community contributions and this repo moving forward, which we've now outlined in this issue: #1589. This includes moving to a couple of new repos for images (https://github.com/devcontainers/images) and Features (https://github.com/devcontainers/features). We anticipate having a similar repo and distribution process for templates/definitions. We'll keep everyone updated (likely via another issue in this repo or a comment on #1589) when our new templates repo is available and the process is defined. Please let me know if you have any questions, thank you! |
An environment to perform Data Science and Machine Learning in R with support for .R scripts, Jupyter Notebooks, and R Markdown Notebooks.