-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Conversation
SparkSnail
commented
Jul 23, 2020
•
edited
Loading
edited
- upgrade restserver v1 to v2
- remove password configuration in PAI mode, since in v2 the password related API is not supported.
- show error information when get unvalid token or unvalid http host.
Merge master
merge master
merge master
merge master
merge master
merge master
merge master
Chinese translation (microsoft#2458)
merge master
merge master
would you add this change to fix mac pipeline? https://github.com/microsoft/nni/pull/2715/files |
@@ -103,7 +103,6 @@ export namespace ValidationSchemas { | |||
}), | |||
pai_config: joi.object({ // eslint-disable-line @typescript-eslint/camelcase | |||
userName: joi.string().min(1).required(), | |||
passWord: joi.string().min(1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also update doc accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PAIMode doc has already removed the password configuration, while it keeps in our code. Only need to remove code this time.
// Status code 200 for success | ||
if ((error !== undefined && error !== null) || response.statusCode !== 200) { | ||
const errorMessage: string = (error !== undefined && error !== null) ? error.message : | ||
`PAI Training service: get job info for trial ${paiTrialJob.id} from PAI Cluster failed!, http code:${response.statusCode}, http body: ${JSON.stringify(_body)}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why _body
needs JSON.stringify
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested the response data of PAI, seems this list API in v2 returns a object, and need to use JSON.stringify to decode the content, while the submit API returns a string directly, does not need to do this.
@@ -248,19 +244,6 @@ export class OpenPaiEnvironmentService extends EnvironmentService { | |||
return deferred.promise; | |||
} | |||
|
|||
private async refreshPlatform(): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove the refresh logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to remove, as it only refresh pai token. And it's a pattern for future platform, which need regular refresh credential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@squirrelsc you mean we will add dedicated token refresh logic in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, if there are similar requirements, here is a reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then, why remove pai token refresh? no need to refresh token any more on pai?
updated. |
@@ -178,7 +177,8 @@ abstract class PAITrainingService implements TrainingService { | |||
const deferred: Deferred<void> = new Deferred<void>(); | |||
|
|||
request(stopJobRequest, (error: Error, response: request.Response, _body: any) => { | |||
if ((error !== undefined && error !== null) || response.statusCode >= 400) { | |||
// Status code 202 for success. | |||
if ((error !== undefined && error !== null) || response.statusCode !== 202) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better keep original status code. The HTTP status code is standard. < 400 is normal state. 4xx is permission issue, and 5xx are server errros. Don't hard code to 202 only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a case that when submit job to http request, pai returns a 3xx response, but the job is not submitted correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3xx means redirection, and it should be handled in most HTTP sdk internally. If it happens here, you can check if the sdk is not used correctly. For example, disabled auto redirect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, add followAllRedirects:true
to handle redirect.
this.log.error(`PAI Training service: get job info for trial ${paiTrialJob.id} from PAI Cluster failed!`); | ||
// Status code 200 for success | ||
if ((error !== undefined && error !== null) || response.statusCode >= 400) { | ||
// The job refresh time could be ealier than job submission, so it might return 404 error code, need refactor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why there is no log here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 404 error is always shown, it is caused by NNI's job refresh logic, not expected to shown. The original logic use statusCode>=500
to avoid log 404 error, it is not reasonable. We should refactor the logic to avoid refresh job while it is not submitted. If add a error log here, will cause confuse for users currently, will add a log here after fixing the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, thanks