[MRG] Allow absolute paths in build_script_files #681

Merged (1 commit) on Sep 7, 2019

Xarthisius
Contributor

@Xarthisius Xarthisius commented May 20, 2019

Fixes #673

This PR allows an absolute path to be given for source files in get_build_script_files. As a result, plugins can inject files from outside r2d's repository. If the source file path is relative, the old behavior is preserved, i.e. the path is assumed to be relative to the directory containing repo2docker/buildpacks/base.py.

@yuvipanda PTAL!
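
The path handling described in this PR can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `resolve_src_path` and its `base_dir` parameter are hypothetical names, with `base_dir` standing in for the directory that contains repo2docker/buildpacks/base.py.

```python
import os


def resolve_src_path(src_path, base_dir):
    """Return the host path for a build script source file.

    `base_dir` is a hypothetical stand-in for the directory that
    contains repo2docker/buildpacks/base.py.
    """
    if os.path.isabs(src_path):
        # Absolute paths are used as-is, so a plugin can point at a
        # file that lives outside the repo2docker package.
        return src_path
    # Relative paths keep the old behavior: resolve against base_dir.
    return os.path.join(base_dir, *src_path.split("/"))
```

For example, `resolve_src_path("conda/install.sh", base_dir)` joins the path onto `base_dir`, while an absolute path such as `/opt/plugin/run.sh` is returned unchanged.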

Collaborator

@yuvipanda yuvipanda left a comment

Heya! I've left some comments inline :) This would also need some tests, but the general approach is 👍

src_parts = src_path.split('/')
src_path = os.path.join(os.path.dirname(__file__), *src_parts)
BLOCKSIZE = 65536
hash_md5 = hashlib.md5()
Collaborator

I think we can get away with hashing the absolute path instead of having to hash the contents of the file. What do you think?

Collaborator

Also let's use sha256, it should be fast enough - especially if we're just hashing the filename.

Contributor Author

I assumed that build scripts are small enough so that it won't matter, but I'm happy to oblige.

Contributor Author

Changed to sha256 and computing slug using filepath only.
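
A minimal sketch of what hashing the path rather than the file contents could look like. `path_digest` is a hypothetical name used only for illustration; the call to `hashlib.sha256` is standard library.

```python
import hashlib


def path_digest(src_path):
    """Hex SHA-256 digest of the path string itself.

    The file is never opened, so the cost does not depend on file size.
    """
    return hashlib.sha256(src_path.encode("utf-8")).hexdigest()
```

The trade-off is that the digest identifies the location, not the contents: editing a script in place leaves the digest unchanged, while two different paths give different digests.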

while len(buf) > 0:
    hash_md5.update(buf)
    buf = fh.read(BLOCKSIZE)
return hash_md5.hexdigest(), src_path
Collaborator

We should probably include at least some part of the filename in the resulting filepath we return, to make debugging easier. How about something like assemble_files/<escaped-file-path-truncated>-<6-chars-of-hash>?

See https://github.com/jupyterhub/binderhub/blob/6b2908d7aaf4a7ec62beed0019de54db06494214/binderhub/builder.py#L127 where we do something like that, keeping in mind the 255-character Linux filename limit.
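
One way the proposed naming scheme could be sketched, under stated assumptions: `build_context_name`, `escape`, `SAFE_CHARS` and `MAX_NAME_LENGTH` are hypothetical names, and the `escape` helper here is a lossy standard-library stand-in, not the escaping function used in the PR or in binderhub. The directory is a parameter because its name changes later in this review.

```python
import hashlib
import string

SAFE_CHARS = set(string.ascii_letters + string.digits)
# Common per-component filename limit on Linux filesystems.
MAX_NAME_LENGTH = 255


def escape(text):
    """Stand-in escaper: replace every unsafe character with '-'."""
    return "".join(c if c in SAFE_CHARS else "-" for c in text)


def build_context_name(src_path, directory):
    """Return '<directory>/<escaped-truncated-path>-<6 hash chars>'."""
    digest = hashlib.sha256(src_path.encode("utf-8")).hexdigest()[:6]
    # Reserve room for the '-' separator and the 6 hash characters.
    max_slug_length = MAX_NAME_LENGTH - len(digest) - 1
    slug = escape(src_path)[:max_slug_length]
    return "{}/{}-{}".format(directory, slug, digest)
```

Because the stand-in escaper is lossy and the slug may be truncated, two paths can share a slug; the hash suffix is what keeps the names distinct, while the slug keeps them readable when debugging.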

Contributor Author

Fixed in 5447508

@@ -481,13 +482,38 @@ def render(self):
labels=self.get_labels(),
build_script_directives=build_script_directives,
assemble_script_directives=assemble_script_directives,
-build_script_files=self.get_build_script_files(),
+build_script_files={
Collaborator

Should probably move this to a temporary variable for clarity, and add a comment about what it's doing.
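
The kind of refactoring suggested here might look like the sketch below. It assumes the build script files are given as a mapping from source path on the host to destination path in the image; `context_name` and `render_kwargs` are hypothetical names, and `context_name` is a simplified stand-in for whatever helper names a file inside the build context.

```python
import hashlib


def context_name(src_path):
    """Hypothetical stand-in: name a source file inside the build context."""
    digest = hashlib.sha256(src_path.encode("utf-8")).hexdigest()[:6]
    slug = src_path.strip("/").replace("/", "-")
    return "build_script_files/{}-{}".format(slug, digest)


def render_kwargs(script_files):
    """Build the template arguments from a {source: destination} mapping."""
    # Map each file's location in the Docker build context to its
    # destination in the image, so the template can copy one to the other.
    build_script_files = {
        context_name(src): dest for src, dest in script_files.items()
    }
    return {"build_script_files": build_script_files}
```

Naming the intermediate mapping keeps the call that renders the template short and gives the explanatory comment an obvious place to live.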

@Xarthisius force-pushed the abspath_in_scripts branch from f4cf551 to 9f1fa93 on May 21, 2019 18:36
@Xarthisius
Contributor Author

@yuvipanda I believe I addressed all your inline comments and added tests.

return escapism.escape(s, safe=safe_chars, escape_char='-')

src_path_slug = escape(src_path)
filename = 'assemble_files/{name}-{hash}'
Collaborator

Suggested change:
-filename = 'assemble_files/{name}-{hash}'
+filename = 'build_script_files/{name}-{hash}'

Contributor Author

Updated

@yuvipanda
Collaborator

@Xarthisius one minor change I proposed (making up for an original mistake I made in my review!) otherwise LGTM.

The failing test is due to #684 I believe.

@yuvipanda
Collaborator

@Xarthisius I've now merged #684. Can you rebase to see if all the tests pass?

Thank you

@Xarthisius force-pushed the abspath_in_scripts branch from 9f1fa93 to a863949 on May 23, 2019 00:24
@Xarthisius force-pushed the abspath_in_scripts branch from a863949 to a7a9a76 on June 4, 2019 13:42
@Xarthisius force-pushed the abspath_in_scripts branch from a7a9a76 to 24234d9 on June 28, 2019 13:44
@Xarthisius changed the title from "[WIP] Allow absolute paths in build_script_files. Fixes #673" to "[MRG] Allow absolute paths in build_script_files. Fixes #673" on Jun 28, 2019
@willingc willingc requested review from betatim and minrk September 7, 2019 08:56
@minrk changed the title from "[MRG] Allow absolute paths in build_script_files. Fixes #673" to "[MRG] Allow absolute paths in build_script_files" on Sep 7, 2019
@minrk minrk merged commit 4f428c3 into jupyterhub:master Sep 7, 2019
markmo pushed a commit to markmo/repo2docker that referenced this pull request Jan 22, 2021
[MRG] Allow absolute paths in build_script_files. Fixes jupyterhub#673
Successfully merging this pull request may close these issues.

Add a hook for injecting files to tarfile with a Docker build context