[Feature] Enable users to switch and install conda env in jupyter task #10337
The front end part LGTM.
Discussed with @SbloodyS, it seems using |
"tar -xzf %s -C jupyter_env && " + | ||
"source jupyter_env/bin/activate"; |
Would it be better to let users define this path, in case their path differs from this one?
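For context, the unpack-and-activate step in the diff can be tried locally with a sketch like the one below. It is only an illustration: the tarball is a hand-made stand-in rather than real `conda pack` output, the `mkdir` is an assumption (`tar -C` needs the target directory to exist, and the part of the command that creates it is not shown in the diff), and `DEMO_ENV_ACTIVE` is a made-up variable used to observe the activation.

```shell
#!/usr/bin/env bash
set -euo pipefail

work_dir=$(mktemp -d)
cd "$work_dir"

# Stand-in for a packed env (assumption: like a conda-pack tarball,
# it has bin/activate at the archive root, with no top-level folder).
mkdir -p fake_env/bin
cat > fake_env/bin/activate <<'EOF'
export DEMO_ENV_ACTIVE=1
EOF
tar -czf env.tar.gz -C fake_env .

# The step from the diff, preceded by the mkdir that tar -C requires.
mkdir jupyter_env
tar -xzf env.tar.gz -C jupyter_env
source jupyter_env/bin/activate

echo "activated: ${DEMO_ENV_ACTIVE}"
```

Because the archive has no top-level folder, the hard-coded `jupyter_env/bin/activate` path depends only on the `-C` target, not on what the user named the env.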
Currently, the env tar will be downloaded from the resource center before task execution and removed when task execution completes.
Lines 141 to 145 in f90f0f8
// copy hdfs/minio file to local
List<Pair<String, String>> fileDownloads = downloadCheck(taskExecutionContext.getExecutePath(), taskExecutionContext.getResources());
if (!fileDownloads.isEmpty()) {
    downloadResource(taskExecutionContext.getExecutePath(), logger, fileDownloads);
}
Lines 199 to 203 in f90f0f8
} finally {
    TaskExecutionContextCacheManager.removeByTaskInstanceId(taskExecutionContext.getTaskInstanceId());
    taskCallbackService.sendTaskExecuteResponseCommand(taskExecutionContext);
    clearTaskExecPath();
}
However, I think it is a good idea to offer an option to define the path and install the env on workers without it being removed after task execution. That way, the same env would not need to be downloaded every time.
WDYT @SbloodyS @zhongjiajie
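The download-then-clean-up lifecycle in the two Java snippets above can be mimicked in shell, as a rough sketch only. The `resource_center` directory and `run_task` function are hypothetical stand-ins; the `cp` plays the role of `downloadResource(...)` and the `EXIT` trap plays the role of `clearTaskExecPath()` in the `finally` block.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the resource center: a local directory with one uploaded file.
resource_center=$(mktemp -d)
echo "demo resource" > "$resource_center/env.tar.gz"

# Runs one "task" in a subshell, so the EXIT trap fires when the task ends.
run_task() (
  exec_dir=$(mktemp -d)                           # hypothetical per-task execute path
  trap 'rm -rf "$exec_dir"' EXIT                  # cleanup, like clearTaskExecPath()
  cp "$resource_center/env.tar.gz" "$exec_dir/"   # download, like downloadResource(...)
  test -f "$exec_dir/env.tar.gz"                  # the task body would run here
  echo "$exec_dir"                                # report the path for inspection
)

first=$(run_task)
second=$(run_task)
echo "first run used $first, second run used $second"
```

Each run gets a fresh directory and copies the resource again, which is the repeated download this comment is asking about.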
It's better. But what I mean is: what if the path of the user's compressed tar package after decompression is inconsistent with this hard-coded path? If we hard-code this, we should put it in the docs.
I'm not quite sure about it. Could you take a look? @zhongjiajie
It's better. But what i mean is if the user's the compressed tar package path after decompression is inconsistent with this hard code path.
It seems that if users use conda pack to pack their environment, they will get the same directory structure when they run tar.
I tested locally, and it is 👍. But should we add some docs telling users to run conda pack to create the conda tarball?
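As a starting point for such docs, the packing step could look roughly like this. The env name and output file name are hypothetical; `-n` and `-o` are the options documented by conda-pack. The script only prints the command unless `RUN_PACK=1` is set, so it can be read and run on a machine without conda.

```shell
#!/usr/bin/env bash
set -euo pipefail

env_name="jupyter_demo"             # hypothetical conda environment name
out_tarball="jupyter_demo.tar.gz"   # hypothetical file to upload to the resource center

# conda-pack CLI: -n selects the env by name, -o sets the output file.
pack_cmd=(conda pack -n "$env_name" -o "$out_tarball")

if [ "${RUN_PACK:-0}" = "1" ]; then
  "${pack_cmd[@]}"                  # really pack; needs conda and conda-pack installed
else
  echo "would run: ${pack_cmd[*]}"
fi
```

The resulting tarball is what a user would upload to the resource center and select in the jupyter task.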
@zhongjiajie Sure, that's a good idea. I will add some docs to instruct users on conda pack. BTW, is there any substitute for source jupyter_env/bin/activate? @SbloodyS and I discussed this, and it is possible that some companies ban developers from using the source command.
However, I think it is a good idea to give a choice to define the path and install the env on workers without getting removed after task execution. In this way, the same env will not need to be downloaded every time.
But I do not think so; I think they can use a shell task to do that. It would be odd not to remove resources after the task finishes. Consider two cases:
- We only support not deleting resources after the task is done: although the worker has already installed the env, the resource is still configured in the task definition, so the worker will re-install the conda env. Otherwise, users have to change the task definition and remove the resource after the first run (which is odd).
- We give users an option for installing and deleting resources: they then have to change the option after the first run, too.
Alright. This makes sense to me. Since the tasks are usually triggered multiple times, considering persisting the env could be complicated and might cause unexpected issues. If users want to persist the env, it is better to use a shell task and only trigger it once at the very beginning.
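A one-time shell task along these lines might look like the sketch below. All paths are hypothetical (a temp directory stands in for a persistent location on the worker), the tarball is a hand-made stand-in, and `install_env_once` is an illustrative helper, not something from this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical locations; a real setup would use a persistent path on each worker.
persist_root=$(mktemp -d)
env_dir="$persist_root/envs/jupyter_env"
tarball="$persist_root/env.tar.gz"

# Stand-in tarball with bin/activate at the archive root.
mkdir -p "$persist_root/src/bin"
echo 'export DEMO_ENV_ACTIVE=1' > "$persist_root/src/bin/activate"
tar -czf "$tarball" -C "$persist_root/src" .

# Unpack only when the env is not already present on the worker.
install_env_once() {
  if [ -f "$env_dir/bin/activate" ]; then
    echo "skipped"
    return 0
  fi
  mkdir -p "$env_dir"
  tar -xzf "$tarball" -C "$env_dir"
  echo "installed"
}

first=$(install_env_once)
second=$(install_env_once)
echo "$first, then $second"
```

Keeping this in a separate shell task leaves the jupyter task's download-and-remove behavior unchanged.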
Yup, that's what I mean.
Codecov Report
@@ Coverage Diff @@
## dev #10337 +/- ##
=========================================
Coverage 40.87% 40.87%
- Complexity 4851 4853 +2
=========================================
Files 886 886
Lines 36032 36039 +7
Branches 3998 3999 +1
=========================================
+ Hits 14727 14731 +4
- Misses 19848 19854 +6
+ Partials 1457 1454 -3
Related docs added in the latest commit : ) |
and we still have discussion in #10337 (comment), I wonder whether conda have some command like |
I haven't found such things so far. However, using |
Yeah, I can not find either, I think maybe we should use |
Yes, IMHO, there is no perfect solution here to avoid using |
SGTM, So please add some docs to hint users about the |
Kudos, SonarCloud Quality Gate passed! |
Caveats to |
The backend part LGTM
LGTM; the frontend simply adds resource usage. Merging.
Purpose of the pull request
… jupyter task. If the conda environment a user needs for a jupyter task is not too large (e.g., not bigger than 100M), the user can upload the packed conda environment to the resource center and select it when creating the jupyter task. The jupyter task plugin will then automatically install the conda env on the target worker before executing the jupyter notes.
… resource center and select the one needed for a specific task. … jupyter task plugin
Brief change log
Verify this pull request