Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Add native support for v2 config #3466

Merged
merged 28 commits into from
Apr 9, 2021
Merged

Conversation

liuzhe-lz
Copy link
Contributor

No description provided.

@@ -66,6 +68,7 @@ class ExperimentConfig(ConfigBase):
assessor: Optional[_AlgorithmConfig] = None
advisor: Optional[_AlgorithmConfig] = None
training_service: Union[TrainingServiceConfig, List[TrainingServiceConfig]]
_deprecated: Optional[str, Any] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what this contains?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecated fields as well as fields not supported by v2 (yet)...
It appears in at least 3 places and I placed the comments somewhere else. Because this file is "more visible" to end users.
This is a workaround for backward compatibility.

remove_algo_meta_data(meta['builtinName'])
save_algo_meta_data(meta)
else:
print_error(f'Cannot overwrite builtin algorithm')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need return?

Copy link
Contributor Author

@liuzhe-lz liuzhe-lz Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the old logic but seems not.
We have no restriction on meta_list's order. So as you can see the old code calls verify_algo_import on a part of the list. I believe the correct logic is either verify all or verify none.

@@ -160,7 +160,7 @@ class NNIRestHandler {
}

private startExperiment(router: Router): void {
router.post('/experiment', expressJoi(ValidationSchemas.STARTEXPERIMENT), (req: Request, res: Response) => {
router.post('/experiment', (req: Request, res: Response) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not need to validate now?

Copy link
Contributor Author

@liuzhe-lz liuzhe-lz Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think validate in launcher is enough. It's not DRY to validate twice.
If a user calls this API from a custom client, it's at their own risks.

@@ -10,7 +10,7 @@ import { AzureStorage, KeyVaultConfig, KubernetesClusterConfig, KubernetesCluste
} from '../kubernetesConfig';

// operator types that kubeflow supported
export type KubeflowOperator = 'tf-operator' | 'pytorch-operator' ;
export type KubeflowOperator = string; // 'tf-operator' | 'pytorch-operator'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unpack logic before this yields string, and tsc complained a lot type mismatch, so...
Too lazy.

@SparkSnail SparkSnail marked this pull request as draft March 26, 2021 08:57
@liuzhe-lz liuzhe-lz marked this pull request as ready for review April 7, 2021 02:19
data['logDir'] = data.pop('experimentWorkingDirectory')
def to_v2(v1) -> ExperimentConfig:
platform = v1.pop('trainingServicePlatform')
assert platform in ['local', 'remote', 'openpai']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if aml kubeflow ...

Copy link
Contributor Author

@liuzhe-lz liuzhe-lz Apr 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K8S based will stick with v1.
AML is a mistake and has been fixed.

chai.assert.equal(environmentService.testGetEnvironments().size, 2, "as env not reused, so only 2 envs should be here.");
const trials = await trialDispatcher.listTrialJobs();
chai.assert.equal(trials.length, 2, "there should be 2 trials");
//trialDispatcher.setClusterMetadata(
Copy link
Contributor

@SparkSnail SparkSnail Apr 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the UT case fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot...
please let me fix it during bug bash

classArgs?: object;
}

export interface ExperimentConfig {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miss versionCheck?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently version check is bound to debug. I think it's enough for this release.

maxExperimentDuration?: string;
maxTrialNumber?: number;
nniManagerIp?: string;
//useAnnotation: boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does not support annotation anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotation is handled inside nnictl. All other modules does not aware whether it's user written code & search space, or it's de-annotated code & search space.
nni.Experiment does not support annotation for now.

@SparkSnail SparkSnail merged commit 817ec68 into microsoft:master Apr 9, 2021
@liuzhe-lz liuzhe-lz deleted the dev-config2 branch June 17, 2021 03:26
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants