Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

sharedstorage support remote umount and fix bug #3456

Merged
merged 4 commits into from
Apr 7, 2021

Conversation

J-shang
Copy link
Contributor

@J-shang J-shang commented Mar 18, 2021

No description provided.

@J-shang J-shang closed this Mar 22, 2021
@J-shang J-shang reopened this Mar 22, 2021
@@ -248,17 +257,20 @@ export class RemoteEnvironmentService extends EnvironmentService {
}
this.environmentExecutorManagerMap.set(environment.id, executorManager);
const executor = await this.getExecutor(environment.id);
let remoteWorkingRoot: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remoteWorkingRoot -> remoteWorkingRootDir

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify it

executor.joinPath(executor.getRemoteExperimentRootDir(getExperimentId()),
'envs', environment.id)
remoteWorkingRoot = executor.getRemoteExperimentRootDir(getExperimentId());
environment.runnerWorkingFolder = executor.joinPath(remoteWorkingRoot, 'envs', environment.id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicated with line 263

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix it

}
environment.command = `cd ${environment.runnerWorkingFolder} && \
environment.command = `cd ${remoteWorkingRoot} && \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change to remoteWorkingRoot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because ${environment.runnerWorkingFolder} is ./exp_id/env/runner_id and we have ${environment.command}=`mkdir -p envs/${envId} && cd envs/${envId} && ${environment.command}` in trialDispatcher.py, so change dir to ./exp_id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

command also contains sh ../install_nni.sh, if you only change root folder of the script, it can not find install_nni.sh file. I have another PR to fix this: #3472, perhaps we could merge the logic.

@@ -348,6 +348,11 @@ class TrialDispatcher implements TrainingService {
for (const commandChannel of this.commandChannelSet) {
await commandChannel.stop();
}
if (this.useSharedStorage) {
this.log.info(`stopping shared storage...`)
component.get<SharedStorageService>(SharedStorageService).cleanUp();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

await?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, add it

throw new Error(result.stderr);
}
} catch (error) {
const errorMessage: string = `${this.storageType} Shared Storage: Mount ${this.nfsServer}:${this.exportedDirectory} to ${this.localMountPoint} failed, error is ${error}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error message is not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix it

throw new Error(result.stderr);
}
} catch (error) {
const errorMessage: string = `${this.storageType} Shared Storage: get account key failed, error is ${error}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the message contains get account key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is a mistake, thank you, fix it

@@ -124,6 +124,12 @@ class LinuxCommands extends OsCommands {
command = `bash '${script}'`;
} else {
script = script.replace(/"/g, '\\"');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like resolving escape characters but can't understand what exactly it is doing.
Please describe what kind of script are you trying to escape or unescape. And if it is possible, we should avoid handling escape manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wants to solve the situation that bash -c nested echo "some command \\"something\\"". This happens in sharedstorage mount command.

@J-shang J-shang requested review from SparkSnail and liuzhe-lz April 6, 2021 05:03
@SparkSnail SparkSnail merged commit 76f3990 into microsoft:master Apr 7, 2021
acured pushed a commit to acured/nni that referenced this pull request Apr 7, 2021
@J-shang J-shang deleted the update-sharedstorage branch April 12, 2021 09:31
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants