Support monitor mode when creating or resuming a new experiment #1933
merge master
merge master
Update evolution doc (microsoft#1493)
merge master
merge master
merge master
augment pylintrc (microsoft#1643)
fix console.log (microsoft#1636)
merge master
merge master
merge master
merge master
Filter prune algo implementation (microsoft#1655)
merge master
merge master
merge master
merge master
merge master
merge master
merge master
merge master
merge master
merge master
…nictl-foreground
docs/en_US/Tutorial/Nnictl.md
Outdated
@@ -49,6 +49,7 @@ nnictl support commands:
|--config, -c| True| |YAML configure file of the experiment|
|--port, -p|False| |the port of restful server|
|--debug, -d|False||set debug mode|
|--monitor, -m|False|set monitor mode|
Missing vertical separator
fixed.
time.sleep(args.time)
if auto_exit:
    status = get_experiment_status(port)
    if status in ['DONE', 'ERROR', 'STOPPED']:
Maybe we can print the dispatcher and nnimanager logs here (if the status is ERROR). If the user is running it in a container, the container is destroyed when the program exits, so there is no way to retrieve the error info. Another option is to disable auto_exit when `--debug` is set.
The nniManager.log content may be too long, so it may not be suitable to show it on screen. Users can mount NNI's logDir in the container to a local path; the logDir contains the log files.
I don't think this case is about screen. Screen users will never look for foreground mode, and container users don't care. Furthermore, I don't think `os.system("clear")` has any effect if they are using a container. Printing a lot is expected.
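A minimal sketch of the suggestion discussed above: poll the status, and when it becomes ERROR, print only the last lines of each log before exiting, which also bounds the output length. This is not the code from the PR. `monitor_until_finished`, `tail_lines`, the `get_status` callable (standing in for `get_experiment_status(port)`), and the `log_paths` argument are hypothetical names introduced for illustration.

```python
import time

# Statuses after which the monitor loop stops (taken from the diff above).
TERMINAL_STATUSES = ('DONE', 'ERROR', 'STOPPED')


def tail_lines(path, max_lines=50):
    """Return the last `max_lines` lines of a text file, or [] if it cannot be read."""
    try:
        with open(path, 'r', errors='replace') as f:
            return f.read().splitlines()[-max_lines:]
    except OSError:
        return []


def monitor_until_finished(get_status, log_paths, interval=1.0,
                           max_lines=50, sleep=time.sleep, out=print):
    """Poll get_status() until it is terminal and return the final status.

    Hypothetical helper: on ERROR, the tail of each file in `log_paths`
    is printed first, so the error is visible even if the container
    disappears when the process exits.
    """
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            if status == 'ERROR':
                for path in log_paths:
                    out('==== last lines of %s ====' % path)
                    for line in tail_lines(path, max_lines):
                        out(line)
            return status
        sleep(interval)
```

Printing a bounded tail is one way to reconcile the two positions in the thread: the error is still retrievable from the container's output, while a very long nniManager.log does not flood the terminal.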
tools/nni_cmd/nnictl_utils.py
Outdated
        print_error('please input a positive integer as time interval, the unit is second.')
        exit(1)
def set_monitor(auto_exit, time_interval, port=None, pid=None):
    '''set the experiment monitor engine'''
    while True:
        try:
            os.system('clear')
Does this `os.system('clear')` work on Windows?
fixed.
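The thread does not show how this was fixed. One common approach, sketched here as an assumption rather than as the change actually made in the PR, is to choose the command from `os.name`. `clear_command` and `clear_screen` are hypothetical helper names.

```python
import os


def clear_command(os_name=None):
    """Return the shell command that clears the terminal for the given os.name."""
    if os_name is None:
        os_name = os.name
    # os.name is 'nt' on Windows, where the shell built-in is `cls`;
    # POSIX systems provide `clear`.
    return 'cls' if os_name == 'nt' else 'clear'


def clear_screen():
    """Clear the terminal on both Windows and POSIX systems."""
    os.system(clear_command())
```

Keeping the command selection in its own function makes the platform branch testable without actually clearing the terminal.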
Support monitor mode when creating or resuming a new experiment (microsoft#1933)
No description provided.