fix docker in aml (microsoft#2633)
@@ -122,7 +122,7 @@ export class OpenPaiEnvironmentService extends EnvironmentService {
break;
case 'SUCCEEDED':
case 'FAILED':
Add a break above the 'FAILED' case, so that a succeeded environment doesn't fall through and fail the experiment.
updated.
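For reference, the fall-through that the comment above warns about can be sketched as follows. This is a minimal illustration, not the actual NNI code: the `RemoteStatus` type and both helper functions are hypothetical names made up for the example.

```typescript
// Hypothetical status names and helpers for illustration; not NNI's actual code.
type RemoteStatus = 'RUNNING' | 'SUCCEEDED' | 'FAILED';

// Without a break, the 'SUCCEEDED' case falls through into 'FAILED'.
function finalStatusWithoutBreak(status: RemoteStatus): string {
    let finalStatus = 'NONE';
    switch (status) {
        case 'RUNNING':
            break;
        case 'SUCCEEDED':
            finalStatus = 'SUCCEEDED';
        // no break here, so execution continues into the next case
        case 'FAILED':
            finalStatus = 'FAILED';
            break;
    }
    return finalStatus;
}

// With the break in place, each status keeps its own result.
function finalStatusWithBreak(status: RemoteStatus): string {
    let finalStatus = 'NONE';
    switch (status) {
        case 'RUNNING':
            break;
        case 'SUCCEEDED':
            finalStatus = 'SUCCEEDED';
            break;
        case 'FAILED':
            finalStatus = 'FAILED';
            break;
    }
    return finalStatus;
}
```

In this sketch a succeeded job is reported as failed by the first function and as succeeded by the second.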
- if (!amlClient) {
- throw new Error('AML client not initialized!');
+ if (!amlClient) {
+ return Promise.reject('AML client not initialized!');
BTW, why doesn't throwing an error work?
There is no try/catch outside the function, so an error thrown directly will not be shown in the web UI.
Did you try it? There are a lot of places that throw an error.
Yes, I tried it. If we only use throw Error without a catch, the error information is only shown in the nnictl log stderr instead of the web UI.
The throw new Error here should work without a try/catch, because there is a catch in nnimanager and this promise is called from nnimanager. It works in the other training services. Would you please check why it is not working?
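One general way in which throw and Promise.reject can behave differently is sketched below. Whether this is what happens in the AML service depends on whether the method is declared async and on how nnimanager calls it, which this thread does not show. All names here (`client`, `startAsync`, `startSyncThrow`, `startReject`, `probe`) are hypothetical stand-ins, not NNI's real API.

```typescript
// Hypothetical stand-in for an uninitialized client; not NNI's real classes.
let client: object | undefined = undefined;

// In an async function, a throw is converted into a rejected promise.
async function startAsync(): Promise<void> {
    if (!client) {
        throw new Error('client not initialized');
    }
}

// In a non-async function, a throw escapes synchronously,
// before any promise exists for the caller to attach a handler to.
function startSyncThrow(): Promise<void> {
    if (!client) {
        throw new Error('client not initialized');
    }
    return Promise.resolve();
}

// In a non-async function, Promise.reject hands the error to the caller's handler.
function startReject(): Promise<void> {
    if (!client) {
        return Promise.reject(new Error('client not initialized'));
    }
    return Promise.resolve();
}

// Reports whether the error arrived through the promise or as a synchronous throw.
async function probe(fn: () => Promise<void>): Promise<string> {
    try {
        return await fn().then(
            () => 'ok',
            (e: Error) => `rejected: ${e.message}`
        );
    } catch (e) {
        return `thrown synchronously: ${(e as Error).message}`;
    }
}
```

In this sketch, `probe(startAsync)` and `probe(startReject)` both reach the rejection handler, while `probe(startSyncThrow)` only reaches the surrounding try/catch. A caller that relies solely on a chained rejection handler would therefore miss the third case.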
@@ -98,8 +98,7 @@ export class AMLEnvironmentService extends EnvironmentService {
  environment.setFinalStatus('SUCCEEDED');
  break;
  case 'FAILED':
- environment.setFinalStatus('FAILED');
- break;
+ return Promise.reject(`AML: job ${environment.jobId} is failed!`);
Besides returning a rejection, do we need to set the final status here?
We don't need to set the status for the environment here. When we return Promise.reject, the experiment goes into error status and nniManager stops, so this final status is not useful anymore.
got it
Well, the status is not visible so far. But if we expose it to the web UI in the future, leaving it unset would be a hole.
agree
updated.
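The thread does not show the updated code. One plausible shape for what was agreed above, recording the final status and then rejecting, is sketched here. The `Environment` class and `handleRemoteStatus` function are simplified, hypothetical stand-ins for the real NNI types, and the error message wording is illustrative.

```typescript
// Hypothetical minimal environment; the real NNI class has more fields.
class Environment {
    public finalStatus: string | undefined;

    constructor(public readonly jobId: string) {}

    public setFinalStatus(status: string): void {
        this.finalStatus = status;
    }
}

// Records the final status first, then rejects on failure, so the status
// stays accurate if it is ever surfaced elsewhere.
function handleRemoteStatus(environment: Environment, remoteStatus: string): Promise<void> {
    switch (remoteStatus) {
        case 'SUCCEEDED':
            environment.setFinalStatus('SUCCEEDED');
            break;
        case 'FAILED':
            environment.setFinalStatus('FAILED');
            return Promise.reject(new Error(`AML: job ${environment.jobId} failed`));
    }
    return Promise.resolve();
}
```

Setting the status before rejecting costs one line and avoids leaving a failed environment without a final status.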
No description provided.