This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

feature: export experiment results #2706

Merged 20 commits on Aug 12, 2020


tabVersion
Contributor

No description provided.

@tabVersion
Contributor Author

@QuanluZhang @SparkSnail PR is ready for review.

@scarlett2018 scarlett2018 mentioned this pull request Jul 22, 2020
66 tasks
@@ -24,6 +24,7 @@ nnictl support commands:
* [nnictl package](#package)
* [nnictl ss_gen](#ss_gen)
* [nnictl --version](#version)
* [nnictl export_results](#export-results)
Contributor

What is the difference between this command and nnictl experiment export in https://nni.readthedocs.io/en/latest/Tutorial/Nnictl.html#manage-experiment-information?

Contributor Author

It dumps not only the experiment settings and final trial results, but also all intermediate results.

Contributor

We should combine the commands; otherwise the command set gets messy. Please try to merge your command into nnictl experiment export.

Contributor Author

OK, I will add a CLI option to specify whether intermediate results are required or not.

Contributor

Remove the unused command from the doc.

@tabVersion tabVersion requested a review from QuanluZhang July 22, 2020 10:17
if response is not None and check_response(response):
    content = json.loads(response.text)
Contributor

What is the content of response.text, and what is the content of intermediate_results.text?

Contributor Author

response.text contains basic experiment settings, such as hyperparameters. intermediate_results.text only includes the trial ID, intermediate result, and timestamp.

trial_records = []
for record in content:
    record_value = json.loads(record['value'])
    if not isinstance(record_value, (float, int)):
        formated_record = {**record['parameter'], **record_value, **{'id': record['id']}}
        if args.intermediate:
Contributor

There are many if args.intermediate checks. Could you refactor the logic flow a little to make it simpler?

Contributor Author

I managed to reduce this to only two if args.intermediate checks.
One requests the intermediate data (at line 711).
The other controls the output of the CSV file (at line 726).
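
As an illustration of the simplified flow described above, the sketch below builds one output row per trial and attaches intermediate results only when they were requested. It is not the code from this PR: `format_record`, the `intermediate_by_trial` argument, and the `reward` column used for scalar metrics are hypothetical names chosen for the example; only the `parameter`, `value`, and `id` fields come from the diff.

```python
import json


def format_record(record, intermediate_by_trial=None):
    """Build one output row for a trial (hypothetical helper, not NNI code)."""
    value = json.loads(record['value'])
    if isinstance(value, (float, int)):
        # Assumption: a scalar final metric is stored under a 'reward' column.
        value = {'reward': value}
    row = {**record['parameter'], **value, 'id': record['id']}
    # Single decision point: the caller passes a mapping only when
    # intermediate results were requested.
    if intermediate_by_trial is not None:
        history = intermediate_by_trial.get(record['id'], [])
        row['intermediate'] = json.dumps(history)
    return row
```

With this shape, the caller checks the flag once when deciding whether to fetch the intermediate data, and the row builder only needs to know whether a mapping was supplied.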

return groupby

def trans_intermediate_dict(record):
    return {'intermediate': '[' + str(reduce(lambda x, y: x + ',' + y, record)) + ']'}
Contributor

I don't know the content of record, but could we use ','.join(record) instead of str(reduce(lambda x, y: x + ',' + y, record))?

Contributor

What is the content of record?

Contributor Author

record is a list of float numbers.
','.join() is a good solution.

Contributor Author

I have not tested intermediate results with dict metric. I will do it this afternoon.
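
One detail worth noting about the ','.join() suggestion in this thread: str.join only accepts strings, so if record really is a list of floats, each element has to be converted first. A minimal sketch of trans_intermediate_dict under that assumption (the function name is taken from the diff; the body is an illustration, not the merged code):

```python
def trans_intermediate_dict(record):
    """Render a list of metric values as a bracketed, comma-separated string."""
    # map(str, ...) converts floats (or any other values) before joining.
    return {'intermediate': '[' + ','.join(map(str, record)) + ']'}
```

Dict-valued metrics would be rendered with Python's repr under this sketch, which may or may not be the desired CSV output.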

@QuanluZhang QuanluZhang requested a review from leckie-chn July 31, 2020 00:44
@@ -465,13 +465,14 @@ Debug mode will disable version check function in Trialkeeper.
|id| False| |ID of the experiment |
|--filename, -f| True| |File path of the output file |
|--type| True| |Type of output file, only support "csv" and "json"|
|--intermediate, -i|False||Is intermediate results required|
Contributor

Are intermediate results included

Contributor Author

fixed
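
For readers unfamiliar with how such options are typically declared, the table rows above could be expressed with argparse roughly as follows. This is a standalone sketch, not nnictl's actual parser: `build_export_parser` is a hypothetical name, and the help strings are copied or adapted from the table.

```python
import argparse


def build_export_parser():
    """Standalone parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog='nnictl experiment export')
    # 'id' is documented as not required, so it is an optional positional here.
    parser.add_argument('id', nargs='?', help='ID of the experiment')
    parser.add_argument('--filename', '-f', required=True,
                        help='File path of the output file')
    parser.add_argument('--type', required=True, choices=['csv', 'json'],
                        help='Type of output file, only support "csv" and "json"')
    # A boolean flag: absent means False, present means True.
    parser.add_argument('--intermediate', '-i', action='store_true',
                        help='Are intermediate results included')
    return parser
```

Using action='store_true' keeps the flag optional and defaults it to False, which matches the "Required: False" entry in the table.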

grammar fix
print_error('Unknown type: %s' % args.type)
exit(1)
if not running:
    print_error('Restful server is not Running')
Contributor

Running -> running

Contributor Author

fixed

if response is not None and check_response(response):
    content = json.loads(response.text)
    if args.intermediate:
        intermediate_results = rest_get(metric_data_url(rest_port), REST_TIME_OUT)
Contributor

In my opinion, the name intermediate_results_response is more meaningful.

sorted(intermediate_results, key=lambda x: x['timestamp'])
groupby = dict()
for content in intermediate_results:
    groupby.setdefault(content['trialJobId'], []).append(eval(content['data']))
Contributor

Why use eval()?

Contributor Author

The returned data is serialized. For example, "\"{\\\"default\\\": 93.64, \\\"other_metric\\\": 2.0}\"".
eval() easily handles the \ escapes and reconstructs the data structure from the string. It would be troublesome for users to handle the serialized data themselves.
The data comes from the server and can be trusted, so I think using eval() here is safe.

Contributor

Why not simply use json.loads?

Contributor Author

It looks like json.loads() is a good solution.

Contributor

Please double-check that it works properly.

Contributor Author

Sure I will. Thanks for reminding me.
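
To make the json.loads suggestion from this thread concrete, here is a small sketch of decoding the doubly serialized payload quoted above without eval(), and of grouping the decoded values per trial in timestamp order. `decode_metric` and `group_by_trial` are hypothetical helper names, not functions from NNI; the `trialJobId`, `timestamp`, and `data` keys are the ones that appear in the diff. Whether every payload from the server has this shape is an assumption that would need checking, as the reviewer asks.

```python
import json
from collections import defaultdict


def decode_metric(data):
    """Decode a payload that may have been JSON-encoded more than once."""
    value = data
    while isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            break  # plain text that is not JSON: return it unchanged
    return value


def group_by_trial(metrics):
    """Group decoded metric values by trial ID, ordered by timestamp."""
    grouped = defaultdict(list)
    # sorted() returns a new list, so its result is iterated directly;
    # calling it without using the return value leaves the input unsorted.
    for item in sorted(metrics, key=lambda m: m['timestamp']):
        grouped[item['trialJobId']].append(decode_metric(item['data']))
    return dict(grouped)
```

Unlike eval(), json.loads only parses data and never executes code, so the result does not depend on trusting the server's response.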

@ultmaster ultmaster merged commit d654eff into microsoft:master Aug 12, 2020
LovPe pushed a commit to LovPe/nni that referenced this pull request Aug 17, 2020