[WIP] Follow symlinks and exclude hidden directories #36

albertmichaelj · 2020-07-01T14:18:44Z

This pull request adds two things. First, it makes the archive download process follow symlinks. I user symlinks to give the users of my JupyterHub deployment shared folders, and I want them to easily be able to extract all symlinked folders as well. This is the default behavior of using the zip command line utility, so I think it should also be the default behavior for this extension (though I'm not sure about other compression utilities).

The second behavior is more controversial. I exclude hidden directories. I personally think this is improved behavior because I have students use nbgitpuller to pull down content. When they download that folder to their own computer, they don't need the git folders. However, I'm not sure that this should be the default behavior for all possible use cases.

Ideally, I think both of these behaviors should be a setting with the default behavior being follow symlinks and download hidden files and folders. However, while I understand the Python code well, I have no experience with javascript (typescript?), and I can't figure out how to add the settings. If someone could work with me to add the settings component (or give me some directions), I think that would make this pull request make the most sense.

If I had boolean variables follow_symlinks and download_hidden (or similar), my code could very easily be updated in order to respect those settings.

… skipped.

hadim · 2020-07-01T22:08:50Z

Thanks for the PR.

I am fine having the default behavior excluding hidden directories as long as there is an option to disable it. And same for following symlinks.

That being said I am trying to integrate this extension upstream (see jupyter-server/jupyter_server#251) and it will probably take some time since jupyter_server is still in the early stage.

In the meantime, I don't have a lot of bandwidth to help with this PR. I'm sorry about that.

I am also a Python dev (much more than a JS dev). If you have some time to dedicate to JLab extension development, the doc is very helpful: https://jupyterlab.readthedocs.io/en/stable/developer/extension_dev.html

albertmichaelj · 2020-07-02T00:00:08Z

This makes sense. I have also contributed to the NBGitPuller extension, and in that extension, they set environmental variables to determine settings. Given that these are settings that do not impact the user visible portion, does it make sense to do something similar and just set environmental variables to override the default behavior?

If not, could you confirm my understanding? Is the only way to get the config from the javascript portion to the Python portion via a url argument? If so, I can probably figure out some way to hack the javascript to get variables from the settings into the url that is triggered when the downloadArchiveRequest is called, but there is no guarantee that I will follow best practices. This is literally the first time I've written any javascript at all.

hadim · 2020-07-02T00:27:38Z

I am not sure env variables are the best way to do that since the jupyter server could be running somewhere else than the machine running the frontend (better ask a jupyter dev for this).

Yes, the only way to send a "config" (or parameters) to the backend from the frontend is using URL arguments. The extension already uses it to send the archive format and a token (see https://github.com/hadim/jupyter-archive/blob/77ee922b5daba406814a5c9f0fca69730f08e9ed/src/index.ts#L54).

If you want to give it a first try maybe you can hardcode the arguments in the JS at the moment and I'll help you add it properly as a JLab setting later:

fullurl.searchParams.append("archiveToken", token(20));
fullurl.searchParams.append("archiveFormat", archiveFormat);
fullurl.searchParams.append("followSymlinks", true);
fullurl.searchParams.append("downloadHidden", false);

Or you can even skip entirely the JS to start with and only work on the backend Python extension adding your logic there (see https://github.com/hadim/jupyter-archive/blob/77ee922b5daba406814a5c9f0fca69730f08e9ed/jupyter_archive/handlers.py#L95).

Good luck!

albertmichaelj · 2020-07-02T13:38:13Z

@hadim Thanks for working with me on this! I'll work on getting a more complete pull request together (though based on how far down the typescript rabbit hole I go, I'm not sure if I will figure out how to pull settings in). I'll at least get it to work with hard coded values, and I'll create schema for the new setting variables.

It may take a day or two to find the time.

…ork.

albertmichaelj · 2020-07-11T02:12:11Z

Sorry for the delay in getting back to this, but it took me a little longer than I expected to find the time. I think I have everything working correctly. I've created settings where you can turn on or off following symlinks and downloading hidden directories using the Advanced Settings Editor. It behaves exactly as expected.

I also got excited and I updated the test_download in the tests to test for these things. The tests seem to be working correctly.

I made the choice of throwing up a 400 error if a URL tries to pass invalid parameters for followSymlinks and downloadHidden. The only valid parameters are "true" and "false" (the settings editor checks to make sure it's one of those), but if someone hand crafted a URL, they would get a 400 error if they put something else in. In the python code, I set a default value of "true" for followSymlinks and "false" for downloadHidden. These default values are the default if they are not encoded in the URL string. I'm not sure if this is correct or not. If you want to require that the URL string always has these parameters (which it should given that the frontend will always send them with this code), then we can remove the default value.

I am not at all convinced that my code is following best practices, so please feel free to criticize and make suggestions. I am also happy to change whatever to get this included. This functionality is exactly what I've been needing for my use case for JupyterLab, and I'm very happy to push it forward.

hadim · 2020-07-11T13:30:43Z

Awesome work @albertmichaelj. I reviewed it and I everything looks ok to me. I am fine with the default you've made and the 404 error.

I think it's best to allow the URL args to NOT have those options and use default in that case.

I will merge this and release a new version. The conda package should come soon after that.

FYI: this extension might get integrated into jupyter_server: jupyter-server/jupyter_server#252. I'll try to add those options in it.

Thanks again!

hadim · 2020-07-11T13:31:18Z

And thanks for the tests too! This is very important and useful.

albertmichaelj · 2020-07-11T16:18:35Z

Wonderful! Thanks! Let me know if you need anything from me related to the jupyter_server stuff!

Made it so that symlinks are followed and hidden file/directories are…

e50b27d

… skipped.

Michael Albert added 9 commits July 10, 2020 21:20

Parameterized the archive_and_download function

c939330

Got arguments for symlinks and hidden files from url.

01a8874

Added new parameters.

9010883

Fixed indentation

0e93368

Added parameters to downloadArchiveRequest

c5d7dfc

Appended parameters

f243008

Added variables to get values from settings

be7620f

Add test of symlinks and hidden files/folders.

4a7e38b

Accidentally uploaded the intentionally incorrect test. This should w…

c3f986d

…ork.

hadim merged commit 2fa8904 into jupyterlab-contrib:master Jul 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Follow symlinks and exclude hidden directories #36

[WIP] Follow symlinks and exclude hidden directories #36

albertmichaelj commented Jul 1, 2020

hadim commented Jul 1, 2020

albertmichaelj commented Jul 2, 2020

hadim commented Jul 2, 2020

albertmichaelj commented Jul 2, 2020

albertmichaelj commented Jul 11, 2020

hadim commented Jul 11, 2020

hadim commented Jul 11, 2020

albertmichaelj commented Jul 11, 2020

[WIP] Follow symlinks and exclude hidden directories #36

[WIP] Follow symlinks and exclude hidden directories #36

Conversation

albertmichaelj commented Jul 1, 2020

hadim commented Jul 1, 2020

albertmichaelj commented Jul 2, 2020

hadim commented Jul 2, 2020

albertmichaelj commented Jul 2, 2020

albertmichaelj commented Jul 11, 2020

hadim commented Jul 11, 2020

hadim commented Jul 11, 2020

albertmichaelj commented Jul 11, 2020