Best way for student to download PDF of Datahub's R ipynb to Submit On Gradescope #4112

Closed
sungsy12345 opened this issue Jan 16, 2023 · 18 comments

@sungsy12345

sungsy12345 commented Jan 16, 2023

Bug description

Hello. This is regarding EEP/IAS C118's use of Datahub for assignments. We need students to go from an R ipynb on Datahub to PDF so that they can submit copies of assignments on Gradescope.

(1) Ideally, we want them to use "File > Download as > PDF via html (.html)", as in the picture below.

image

However, we run into the following error. "500 : Internal Server Error. The error was: nbconvert failed: PyQtWebEngine is not installed to support Qt PDF conversion. Please install nbconvert[qtpdf] to enable."

image

Ideally, because our course's instructional video introducing students to Datahub and R recommends this method of going from ipynb to PDF, we would like this method to work rather than having to resort to another one.

(2) This is a more minor issue. In an R ipynb notebook, images show up correctly, as in the first picture below. However, when I do "File > Print Preview", none of the images show up, as in the second picture below. I am wondering if this is fixable, because a working preview would give us an alternative method for printing to PDF.

Image 1 of IPYNB with images displaying correctly
image

Image 2 after Print Preview - Images are gone
image

Thank you so much in advance.

Environment & setup

  • Hub:
  • Language:

How to reproduce

@ericvd-ucb
Contributor

@balajialg any ideas here - this was the default and worked for many semesters in a row... other classes are gonna have this problem

@ryanlovett
Collaborator

@sungsy12345 Does the last entry for "PDF via HTML" work for you? That is the "webpdf" action whereas the one you chose is "qtpdf". I agree that it is confusing -- there is an existing issue to eliminate this complexity.

@ryanlovett
Collaborator

Our issue is #1461 and it was tracking jupyter/nbconvert#1273. We should disable qtpdf since it is not functional.

@sungsy12345
Author

@ryanlovett Thank you so much for the support.

It looks like the options under "File" changed. Please see below pic.

image

I tried HTML, PDF, and Webpdf. They all mostly work, but none of them includes the images (png) embedded in the notebook.

@ryanlovett
Collaborator

@sungsy12345 Can you confirm that the last screenshot is from datahub and not from another jupyter service?

Can you provide a full screenshot, and specify the full URL?

@shaneknapp I've confirmed that setting c.QtPDFExporter.enabled = False in /opt/conda/etc/jupyter/jupyter_notebook_config.py disables the non-functioning item. I think z2jh prefers this in the deployment config though.
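
As a minimal sketch of applying that setting without duplicating it on repeated runs: the helper name `disable_qtpdf` is hypothetical, and only the `c.QtPDFExporter.enabled = False` line itself comes from the comment above. It assumes the config file is a Jupyter Python config file, where `c` is provided by the config loader.

```python
from pathlib import Path

# The line reported above as disabling the non-functional menu item.
SETTING = "c.QtPDFExporter.enabled = False"

def disable_qtpdf(config_path):
    """Append the setting to a Jupyter config file unless it is already there.

    Returns True if the file was changed, False if the line was already present.
    """
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    if SETTING in existing.splitlines():
        return False
    # Make sure the appended setting starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + SETTING + "\n")
    return True
```

Calling `disable_qtpdf("/opt/conda/etc/jupyter/jupyter_notebook_config.py")` during an image build would be one way to use it; a hub-level deployment config may still be the better home for the setting.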

@sungsy12345
Author

sungsy12345 commented Jan 17, 2023

@ryanlovett

Yes. I am pretty sure it is from UCB's datahub. Please see picture below.

If you have access to my folders, please feel free to access and work with the same file. I am working on outputting the following to PDF: "ENVECON-118-SP23(S)/1_CodingBootcamp/Coding Bootcamp Part 1.ipynb"

URL: https://r.datahub.berkeley.edu/user/sysung/retro/notebooks/ENVECON-118-SP23(S)/1_CodingBootcamp/Coding%20Bootcamp%20Part%201.ipynb

image

I am also adding the pdf file that gets outputted without the embedded images.

Coding Bootcamp Part 1 (3).pdf

@ryanlovett
Collaborator

Thanks, this clarifies that you're using retrolab, which behaves differently from classic notebook (the menu item labels of the export methods differ).

Someone will investigate why the webpdf exporter doesn't include images.

@sungsy12345
Author

Thank you so much.

@balajialg I believe, from our meeting, that we should be using retrolab, and the links we created for students with the nbgitpuller extension for GitHub assume retrolab. If you have additional advice, please feel free to let us know.

@ryanlovett
Collaborator

Just to add onto the findings, inserting images via markdown rather than html tags seems to work for PDF (latex) but not Webpdf. e.g. ![Datahub Folder](images/datahub_folder2023.png)
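
For notebooks with many html image tags, the switch to markdown syntax could be scripted. This is a rough sketch only: the function name `img_tags_to_markdown` is hypothetical, it handles only quoted `src` and `alt` attributes, and it drops any other attributes such as width or height.

```python
import re

# Matches a whole <img ...> tag, and individual quoted attributes inside it.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ATTR = re.compile(r"""(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def img_tags_to_markdown(text):
    """Replace <img src="..." alt="..."> tags with ![alt](src) markdown images."""
    def replace(match):
        attrs = {name.lower(): (dq or sq)
                 for name, dq, sq in ATTR.findall(match.group(0))}
        src = attrs.get("src")
        if not src:
            # Without a src there is nothing to point at, so keep the tag.
            return match.group(0)
        return f"![{attrs.get('alt', '')}]({src})"
    return IMG_TAG.sub(replace, text)
```

For example, passing `<img src="images/datahub_folder2023.png" alt="Datahub Folder">` through the function yields the markdown form shown above.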

@ryanlovett
Collaborator

ryanlovett commented Jan 17, 2023

Setting the full path breaks the view in the notebook, but causes it to work in the webpdf export.

![Datahub Folder](/home/jovyan/images/datahub_folder2023.png)

(this is from a server where the notebook and images/ folder are in /home/jovyan/. @sungsy12345 has them at a deeper level.)

@ryanlovett
Collaborator

This is happening because webpdf converts from a temporary html file in /tmp/. If the path to the image is image/foo, and if there is an image located in /tmp/image/foo, the exported PDF will contain the image.

If nbconvert is going to write the html file to /tmp/ then it should copy all assets there too (which is probably impossible). Otherwise it should just write to the cwd.
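
The path resolution described above can be illustrated with the standard library alone. The temporary file name and notebook location below are made up for the example; the point is only that a relative image reference resolves against whichever directory holds the html file, while an absolute path does not depend on it.

```python
from urllib.parse import urljoin

def resolve_image(html_file_url, image_src):
    """Resolve an image reference relative to the html file that contains it."""
    return urljoin(html_file_url, image_src)

# Hypothetical locations, for illustration only.
html_next_to_notebook = "file:///home/jovyan/notebook.html"
html_in_tmp = "file:///tmp/tmpabc123.html"

# Relative reference: the result depends on where the html file lives.
print(resolve_image(html_next_to_notebook, "images/datahub_folder2023.png"))
print(resolve_image(html_in_tmp, "images/datahub_folder2023.png"))

# Absolute reference: the result is the same from either location.
print(resolve_image(html_in_tmp, "/home/jovyan/images/datahub_folder2023.png"))
```

The second call resolves under /tmp/, where no images/ folder exists, which matches the missing images in the webpdf export.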

@ryanlovett
Collaborator

@sungsy12345 I've a workaround for the bug, though it is not the most elegant. Instead of referencing the path to the image, you can copy/paste an image into a markdown cell. For example if I open the image in Preview on my Mac, type Command-C to copy the image, then go to datahub and type Command-V into a markdown cell, it will be inserted as ![image.png](attachment:<long-string>.png). There is no menu item within the notebook application for this paste operation. The notebook will both look and export correctly, though it doesn't reference the files on the filesystem. The image becomes embedded in the notebook.
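
For a notebook with many images, the same embedding could in principle be scripted instead of pasted by hand. The sketch below is an assumption-laden illustration, not a tested tool: the function name `embed_images` is hypothetical, it assumes the notebook uses nbformat 4 cell attachments (a per-cell `attachments` mapping of file name to MIME type to base64 data), and it only handles markdown-style image links to local files.

```python
import base64
import json
import mimetypes
import re
from pathlib import Path

# Matches markdown images such as ![Datahub Folder](images/datahub_folder2023.png)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")

def embed_images(notebook_path, output_path):
    """Rewrite local markdown image links as embedded cell attachments."""
    notebook_path = Path(notebook_path)
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))

    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        # Cell source may be stored as a single string or a list of lines.
        source = cell["source"]
        text = "".join(source) if isinstance(source, list) else source
        attachments = cell.setdefault("attachments", {})

        def replace(match):
            alt, target = match.group(1), match.group(2)
            image_file = notebook_path.parent / target
            # Leave URLs, existing attachment: links, and missing files untouched.
            if ":" in target or not image_file.is_file():
                return match.group(0)
            mime = mimetypes.guess_type(image_file.name)[0] or "application/octet-stream"
            data = base64.b64encode(image_file.read_bytes()).decode("ascii")
            # Note: two images with the same file name in one cell would collide.
            attachments[image_file.name] = {mime: data}
            return f"![{alt}](attachment:{image_file.name})"

        # The source is written back as a single string.
        cell["source"] = IMAGE_PATTERN.sub(replace, text)
        if not attachments:
            del cell["attachments"]

    Path(output_path).write_text(json.dumps(notebook, indent=1), encoding="utf-8")
```

Writing to a separate output path keeps the original notebook and its images folder intact until the result has been checked in the notebook view and in an export.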

How do you feel about this workaround?

@sungsy12345
Author

@ryanlovett

Fantastic. I changed all the images to ![Datahub Folder](images/datahub_folder2023.png) and confirmed that the notebook exports to PDF correctly via Save and Export Notebook As > PDF.

Let me try the workaround you have just suggested. If that allows me to download in all formats, then I may go with this workaround. Am I right that there is no reason to expect pushing "Coding Bootcamp Part 1.ipynb" to git for students' datahub to cause errors (i.e. lose images in the notebook), even if we are not using a separate images folder?

Thanks again for all your help!

@ryanlovett
Collaborator

@sungsy12345 If you go with the copy/paste method and then upload the new notebook to git, it should work for everyone. You can even delete the images folder containing the actual images because they will have become embedded in the .ipynb file itself.

Copy/pasting makes it more difficult if a co-author of the .ipynb wants to make minor changes to the images. They'd have to download the image from the notebook, make the change, recopy/paste the image into the notebook, then upload the notebook to git.

@balajialg
Contributor

balajialg commented Jan 17, 2023

Thank you so much @ryanlovett for quickly resolving @sungsy12345's request. Appreciate it. I will add an entry in https://ds-modules.github.io/curriculum-guide/faq/troubleshoot.html for instructors so that they can copy/paste images instead of adding relative URLs in the markdown file (at least in the near future).

There is no issue tracking the relative URL problem in the Jupyter Notebook repository, as seen here - https://github.com/jupyter/notebook/issues?q=is%3Aissue+pdf+image. I am thinking of raising an issue just so that the developers are aware that such a bug exists. Is that the right place to raise this request, @ryanlovett?

@ryanlovett
Collaborator

@balajialg It is an issue with nbconvert rather than notebook. There appear to be several related issues, but I've opened jupyter/nbconvert#1938.

I'm not sure if the copy/paste method should be favored over the PDF (via LaTeX) method, just because I don't know what the downsides to it are. I also don't know what the equivalent way of copying images is on Windows. We should identify what the copy/paste keyboard shortcuts are on all platforms if we mention this in the docs.

We had been recommending the webpdf method because it was more reliable than latex, so this is frustrating.

Perhaps we haven't run into this before because most people who include images have been doing so from code output, which embeds the image in the ipynb.

@balajialg
Contributor

Thanks for raising the issue with nbconvert @ryanlovett!

For documentation, I would rather err on the side of warning instructors that the downsides of the copy-paste approach are not clear, and highlight that this approach is experimental (in the short term, until something gets fixed in nbconvert). I will list the copy-paste shortcuts for all platforms in the documentation (Ctrl C and Ctrl V remain the same on Windows).

@balajialg
Contributor

balajialg commented Jan 25, 2023

Updated the short-term resolution steps in the curricular documentation - https://ds-modules.github.io/curriculum-guide/faq/troubleshoot.html#other-hub-issues. Let's track the nbconvert issue created by @ryanlovett through issue #3696. Closing this issue since there is a short-term fix. @sungsy12345 Please feel free to reopen if you want to have a discussion.
