feat: enable cache for VFS #209

Merged
npmacl merged 5 commits into mainline from npmac_enable_vfs_cache
Mar 22, 2024

Conversation

Contributor

@npmacl npmacl commented Mar 13, 2024

What was the problem/requirement? (What/Why)

  • A local cache feature was developed for the VFS: it saves files locally after the first access instead of downloading them from S3 again on subsequent accesses. However, the launch command does not pass the flag that enables this feature.

What was the solution? (How)

Change the launch command to enable the cache.

What is the impact of this change?

Depending on the job, this should improve performance.

How was this change tested?

  • Added unit tests; ran `hatch run all:test`, `hatch run lint`, and `hatch run fmt`.
  • Submitted some jobs to a worker and verified there were no issues.

Was this change documented?

no

Is this a breaking change?

no

@npmacl npmacl marked this pull request as ready for review March 13, 2024 20:01
@npmacl npmacl requested a review from a team as a code owner March 13, 2024 20:01
@jusiskin jusiskin marked this pull request as draft March 13, 2024 20:23
@npmacl npmacl marked this pull request as ready for review March 20, 2024 22:05
src/deadline/job_attachments/download.py (outdated)
asset_cache_hash_path = vfs_cache_dir / cas_prefix
_ensure_paths_within_directory(str(asset_cache_hash_path), [cas_prefix])

vfs_cache_dir.mkdir(mode=0o700, exist_ok=True)
Contributor

If the directory already exists, does this ensure its mode is 0o700?

Contributor Author

No. My thinking was that nothing else should be creating this directory first, but it makes sense to create the directory and then call os.chmod, which works even if the directory already exists.

Contributor

I think you would do both - create the directory requesting the mode, and then also do an os.chmod.
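
A minimal sketch of the "do both" approach discussed in this thread, assuming a POSIX file system. The helper name `ensure_private_dir` is hypothetical and is not taken from the PR.

```python
import os
from pathlib import Path


def ensure_private_dir(path: Path, mode: int = 0o700) -> Path:
    """Create `path` if needed and force its permission bits to `mode`.

    Hypothetical helper for illustration only.
    """
    # mkdir applies `mode` (filtered by the process umask) only when it
    # actually creates the directory; an existing directory is left as-is.
    path.mkdir(mode=mode, exist_ok=True)
    # chmod sets the bits explicitly, so a pre-existing directory with
    # looser permissions is tightened as well.
    os.chmod(path, mode)
    return path
```

Requesting the mode in `mkdir` means a newly created directory is never more permissive than `mode`, even briefly, while the `chmod` handles the already-exists case raised in the review.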

@npmacl npmacl force-pushed the npmac_enable_vfs_cache branch 2 times, most recently from 50b6620 to 53e6dce Compare March 21, 2024 21:54
@npmacl npmacl requested a review from mwiebe March 21, 2024 22:05

# Validate hashes are alphanumeric
for path in decoded_manifest.paths:
    if re.fullmatch("[a-zA-Z0-9]+", path.hash) is None:
Contributor

Did you happen to profile this bit as well on large manifests?

Contributor Author

I put this validation in a for loop and timed it over 1 million alphanumeric strings, each 25 characters long; it took ~13.5 seconds.

Contributor

The way to optimize this is to create a compiled regex at file global scope, and then use that to do the match here. This code is compiling the regex and doing the match every time.
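
A minimal sketch of that suggestion, with the pattern compiled once at module scope. The names `_ALPHANUMERIC` and `find_invalid_hashes` are hypothetical and may differ from what was merged.

```python
import re
from typing import Iterable, List

# Compiled once when the module is imported (hypothetical name).
_ALPHANUMERIC = re.compile(r"[a-zA-Z0-9]+")


def find_invalid_hashes(hashes: Iterable[str]) -> List[str]:
    """Return every hash that is not a non-empty alphanumeric string."""
    # fullmatch requires the whole string to match, so path separators,
    # dots, and empty strings are all rejected.
    return [h for h in hashes if _ALPHANUMERIC.fullmatch(h) is None]
```

Calling the method on the compiled pattern skips the per-call pattern handling that the module-level `re.fullmatch` performs, which is where the saving in a tight loop comes from.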

Contributor

Ok - not great, but not absolutely terrible. I think maybe we can revisit and possibly remove this down the line as we make other security improvements if this was just another offshoot of the symlink/running as deadline-worker issues.

Contributor Author

Gave that a try, and it brought the time down to ~4.5 seconds. I'll update that, thanks Mark!
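
For anyone wanting to repeat the measurement, here is a rough timing sketch under assumed inputs (random 25-character alphanumeric strings). It is not the script used for the numbers in this thread, all function names are hypothetical, and absolute timings will vary by machine and Python version.

```python
import random
import re
import string
import timeit
from typing import List

# Precompiled variant of the pattern under test (hypothetical name).
_ALPHANUMERIC = re.compile(r"[a-zA-Z0-9]+")


def make_hashes(count: int, length: int = 25, seed: int = 0) -> List[str]:
    """Generate `count` random alphanumeric strings of `length` characters."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(count)]


def validate_inline(hashes: List[str]) -> bool:
    # Passes the pattern string to the module-level function on every call.
    return all(re.fullmatch("[a-zA-Z0-9]+", h) is not None for h in hashes)


def validate_precompiled(hashes: List[str]) -> bool:
    # Reuses the pattern object compiled at module scope.
    return all(_ALPHANUMERIC.fullmatch(h) is not None for h in hashes)


if __name__ == "__main__":
    sample = make_hashes(100_000)
    for fn in (validate_inline, validate_precompiled):
        seconds = timeit.timeit(lambda: fn(sample), number=1)
        print(f"{fn.__name__}: {seconds:.3f}s")
```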

Signed-off-by: Nathan MacLeod <142927985+npmacl@users.noreply.github.com>
mwiebe previously approved these changes Mar 22, 2024
@npmacl npmacl enabled auto-merge (squash) March 22, 2024 22:56
@npmacl npmacl merged commit 91dfa83 into mainline Mar 22, 2024
18 checks passed
@npmacl npmacl deleted the npmac_enable_vfs_cache branch March 22, 2024 23:02
baxeaz pushed a commit that referenced this pull request Mar 23, 2024
* feat: enable cache for VFS

Signed-off-by: Nathan MacLeod <142927985+npmacl@users.noreply.github.com>

* switched to precompiling alphanumerical regex

Signed-off-by: Nathan MacLeod <142927985+npmacl@users.noreply.github.com>

* ran fmt

Signed-off-by: Nathan MacLeod <142927985+npmacl@users.noreply.github.com>

---------

Signed-off-by: Nathan MacLeod <142927985+npmacl@users.noreply.github.com>
3 participants