This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

NAS visualization #2085

Merged
merged 15 commits into from
Mar 3, 2020

ultmaster commented Feb 20, 2020

This PR is open for code comparison against master. Not ready for review yet.

Working items:

  • Makefile
  • Fix UI fail bug when key is missing.

Might do if I have time:

  • Use proto graph and log in protobuf (instead of NodePy)
  • Configurable node elimination on UI
  • Configurable node collapse on UI
  • Display attributes of ops on UI

I will post some preview here when I think it's ready.

Ready for review

Deferred items:

  • Add a summary of candidates along with weights in expansion panel.
  • Layer choice candidate inference is still buggy and needs investigation.
  • Add panzoom.
  • Add focus mode showing mutable nodes only.
  • Support for build and entrypoint in non-source build. (need help)
  • Hide primitive nodes (like ListConstruct)
  • Merge multiple edges between two nodes/clusters.
  • Design might need improving (theme, graph style). I know it's ugly, but I can't bend the curves. I tried...
  • Responsive width & height.
  • Clean up unused dependencies.
  • Hide nodes with weight 0.

These items might be done in separate PRs.

[4 images]

How to test:

Change DARTS search into something like this:

import json  # needed for json.dumps / json.dump below


class MyTrainer(DartsTrainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.status_writer = open("log", "w")

    def _logits_and_loss(self, X, y):
        self.mutator.reset()
        logits = self.model(X)
        loss = self.loss(logits, y)
        print(json.dumps(self.mutator.status()), file=self.status_writer)
        self.status_writer.flush()
        return logits, loss


model = CNN(32, 3, 16, 10, 8)
model.cuda()
mutator = DartsMutator(model)
vis_graph = mutator.graph(torch.randn((1, 3, 32, 32)).cuda())
with open("graph.json", "w") as f:
    json.dump(vis_graph, f)

dataset_train, dataset_valid = datasets.get_dataset("cifar10")
criterion = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), 0.025, momentum=0.9, weight_decay=3.0E-4)
trainer = MyTrainer(model=model,
                    mutator=mutator,
                    loss=criterion,
                    metrics=lambda output, target: accuracy(output, target, topk=(1,)),
                    optimizer=optim,
                    num_epochs=1,
                    dataset_train=dataset_train,
                    dataset_valid=dataset_valid,
                    batch_size=64,
                    log_frequency=10,
                    arc_learning_rate=0.1,
                    unrolled=False)
trainer.train()

This writes the graph to graph.json before training starts, then writes the mutator status of each step as one JSON line into a file called log.

Launch nnictl webui nas --logdir /path/to/the/directory/containing/two/files. Find your UI at port 6060.
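For reference, a minimal sketch of how the two files in the log directory can be read back, assuming (as in the snippet above) that graph.json holds a single JSON document and log holds one JSON object per line. The helper name load_nas_logdir and the sample contents are made up for illustration and are not part of NNI.

```python
import json
import os
import tempfile


def load_nas_logdir(logdir):
    """Read graph.json and the line-delimited status log from logdir."""
    with open(os.path.join(logdir, "graph.json")) as f:
        graph = json.load(f)
    steps = []
    with open(os.path.join(logdir, "log")) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                steps.append(json.loads(line))
    return graph, steps


# Build a throwaway directory with placeholder contents to exercise the reader.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "graph.json"), "w") as f:
    json.dump({"mutable": {}}, f)  # placeholder, not the real graph schema
with open(os.path.join(demo_dir, "log"), "w") as f:
    for step in range(3):
        print(json.dumps({"step": step}), file=f)  # placeholder status

graph, steps = load_nas_logdir(demo_dir)
print(len(steps))
```

A quick check like this can confirm that both files are well-formed before pointing the UI at the directory.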

ultmaster marked this pull request as ready for review February 24, 2020 15:35

if (graph === undefined || activation === undefined)
return;
const weights = graph.weightFromMutables(activation);
console.log(weights.size);

Contributor

I think this line could be deleted.

Contributor Author

I will delete all console.log before merge.

if (graphChanged)
this.expandSet = lodash.cloneDeep(graph.defaultExpandSet);
const graphEl = this.graphElements();
console.log(graphEl);

Contributor

You could add no-console: true to your lint configuration.

Comment on lines 6 to 20
"@bardit/cytoscape-expand-collapse": "^2.0.3",
"@material-ui/core": "^4.9.3",
"@material-ui/icons": "^4.9.1",
"@testing-library/jest-dom": "^4.2.4",
"@testing-library/react": "^9.3.2",
"@testing-library/user-event": "^7.1.2",
"@types/cytoscape": "^3.14.0",
"@types/d3-graphviz": "^2.6.3",
"@types/dagre-d3": "^0.4.39",
"@types/graphlib-dot": "^0.6.1",
"@types/jest": "^24.0.0",
"@types/lodash": "^4.14.149",
"@types/node": "^12.0.0",
"@types/react": "^16.9.0",
"@types/react-dom": "^16.9.0",

Contributor

Some of these dependencies could probably be moved into devDependencies.

Contributor Author

That is covered by the deferred item: Clean up unused dependencies.

<ListItem className={classes.listSubtitle}>Inputs ({info.inputs.length})</ListItem>
{
info.inputs.map((item, i) => <ListItem className={classes.listItem} key={`input${i}`}>{item}</ListItem>)
}</React.Fragment>

Contributor

Please put </React.Fragment> on its own line.

Comment on lines +226 to +241
const handleSliderChange = (event: ChangeEvent<{}>, value: number | number[]) => {
this.setState({ sliderValue: value as number });
};
const handleSettingsDialogToggle = (value: boolean) => () => {
this.setState({ settingsOpen: value });
};
const handleSettingsChange = (name: string) => (event: React.ChangeEvent<HTMLInputElement>) => {
this.setState({
...this.state,
[name]: event.target.checked
}, () => {
this.setState({
graph: new Graph(this.state.graphData, this.state.hideSidechainNodes),
})
});
};

Contributor

Would you consider moving these methods out of the render() cycle in the next version?

Contributor Author

Good suggestion. Will do.

'text-wrap': 'wrap',
'text-valign': 'center',
'text-halign': 'center',
'font-family': '"Roboto", "Helvetica", "Arial", sans-serif',

Contributor

To stay consistent with the WebUI style, shouldn't Segoe UI also be the first font choice?

Contributor Author

I'm aligning with TensorBoard, so I used Material-UI + Roboto.

Comment on lines +40 to +43
'padding-left': '8px',
'padding-right': '8px',
'padding-top': '8px',
'padding-bottom': '8px',

Contributor

padding: 8px;

Contributor Author

Same

Comment on lines +78 to +81
'padding-top': '30px',
'padding-bottom': '10px',
'padding-right': '10px',
'padding-left': '10px',

Contributor

These could also be merged into a single declaration.

Contributor Author

I don't think so. Cytoscape uses its own style format. I tried to merge them and it didn't work.

const { classes } = this.props;
const { selectedNode, graph } = this.state;
if (graph === undefined)
return null;

Contributor

what is the difference between null and undefined?

Contributor Author

React recommends returning null when there is nothing to render. Fixed. Thanks.

}
this.defaultExpandSet = this.getDefaultExpandSet(graphData.mutable);
this.mutableEdges = this.inferMutableEdges(graphData.mutable);
console.log(this);

Contributor

remove console.log()

Contributor Author

done

@@ -0,0 +1 @@
/// <reference types="react-scripts" />

Contributor

Is this file useful?

Contributor Author

Yes.

Contributor

@ultmaster better to explain why

Contributor Author

It's explained in facebook/create-react-app#6560. I guess we should respect the practice of create-react-app.

};
})
.catch(error => {
console.error('Error during service worker registration:', error);

Contributor

use Alert(error)?

Contributor Author

This file is auto-generated. I don't want to modify it.

@@ -0,0 +1,5 @@
// jest-dom adds custom jest matchers for asserting on DOM nodes.

Contributor

Is this useful? It seems that there is no main logic code.

Contributor Author

It's auto-generated. Removing it for now.

@@ -158,6 +158,10 @@ def parse_args():
parser_webui_url = parser_webui_subparsers.add_parser('url', help='show the url of web ui')
parser_webui_url.add_argument('id', nargs='?', help='the id of experiment')
parser_webui_url.set_defaults(func=webui_url)
parser_webui_nas = parser_webui_subparsers.add_parser('nas', help='show nas ui')
parser_webui_nas.add_argument('--port', default=6060, type=int)

Contributor

Add help= to explain the argument.

Contributor Author

added
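As an illustration of the suggestion, here is a minimal, self-contained argparse sketch of a nas subcommand whose arguments carry help text. Only the --port default (6060) and the 'show nas ui' help string come from the diff above; the parser layout, the help wording, and making --logdir required are assumptions for this sketch, not the actual nnictl code.

```python
import argparse


def build_parser():
    """Build a toy 'nnictl webui nas' parser with documented arguments."""
    parser = argparse.ArgumentParser(prog="nnictl")
    subparsers = parser.add_subparsers(dest="command")

    webui = subparsers.add_parser("webui", help="manage web ui")
    webui_subparsers = webui.add_subparsers(dest="webui_command")

    nas = webui_subparsers.add_parser("nas", help="show nas ui")
    # Help strings below are illustrative wording, not taken from NNI.
    nas.add_argument("--logdir", required=True,
                     help="directory containing graph.json and log")
    nas.add_argument("--port", default=6060, type=int,
                     help="port on which the NAS UI is served")
    return parser


args = build_parser().parse_args(["webui", "nas", "--logdir", "/tmp/nas"])
print(args.port)
```

With help= set, the output of nnictl webui nas --help would describe each option instead of listing bare flag names.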

Comment on lines 112 to 124
for mutable in self.mutables.traverse(deduplicate=False):
modules = mutable.name.split(".")
path = [
{"type": self.model.__class__.__name__, "name": ""}
]
m = self.model
for module in modules:
m = getattr(m, module)
path.append({
"type": m.__class__.__name__,
"name": module
})
result["mutable"][mutable.key].append(path)

Contributor

better to add comments to explain this part

Contributor Author

Sure
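To make the reviewed loop easier to follow, here is a standalone sketch of what it appears to compute: a dotted mutable name is resolved attribute by attribute from the model root, and the class name of every module along the way is recorded. The classes Net, Cell and Op and the function name module_path are hypothetical stand-ins so the example runs without PyTorch or NNI.

```python
class Op:
    """Stand-in for a leaf module (e.g. a layer choice)."""


class Cell:
    """Stand-in for an intermediate module that owns an Op."""

    def __init__(self):
        self.op = Op()


class Net:
    """Stand-in for the root model."""

    def __init__(self):
        self.cell = Cell()


def module_path(model, dotted_name):
    """Return [{type, name}, ...] from the root model down to dotted_name."""
    # The root entry has an empty name, mirroring the snippet under review.
    path = [{"type": model.__class__.__name__, "name": ""}]
    current = model
    for part in dotted_name.split("."):
        current = getattr(current, part)  # descend one level
        path.append({"type": current.__class__.__name__, "name": part})
    return path


print(module_path(Net(), "cell.op"))
```

Each entry pairs a module type with the attribute name used to reach it, which is presumably what the UI uses to group nodes by module.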

@@ -203,7 +207,7 @@ def parse_args():
parser_tensorboard_start = parser_tensorboard_subparsers.add_parser('start', help='start tensorboard')
parser_tensorboard_start.add_argument('id', nargs='?', help='the id of experiment')
parser_tensorboard_start.add_argument('--trial_id', '-T', dest='trial_id', help='the id of trial')
- parser_tensorboard_start.add_argument('--port', dest='port', default=6006, help='the port to start tensorboard')
+ parser_tensorboard_start.add_argument('--port', dest='port', default=6060, help='the port to start tensorboard')

Contributor

why change the default value?

Contributor Author

It's an accident. Sorry.

QuanluZhang commented Feb 28, 2020

@ultmaster could you write docs for it, including how to change user code and how to start the NAS UI? From the code, I can see how you get the super graph, but I don't see where the log files come from or how to generate them for the NAS UI.

ultmaster commented Feb 28, 2020

> @ultmaster could you write doc for it, including how to change user code, how to start the nas ui.

I introduced briefly how to use it in the description of this PR, but it's for test only.

The docs will come in the next release. Currently, the only way to launch it is to build from source and run it with nnictl directly under nni, because release integration is not done yet.

BTW, as discussed in the last meeting, I need help integrating it into the release (including changing all the Makefiles and making it work with the PyPI version).

Ideally, graph logging should be registered as a hook (callback) in the trainer. But since on_batch_end is not implemented at all, we will only be able to dump the status after each epoch.
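A rough sketch of what such a per-epoch hook could look like, written without importing NNI. The class StatusDumpCallback, the hook name on_epoch_end, and the FakeMutator stand-in are all assumptions for illustration; only mutator.status() and the one-JSON-line-per-record format come from the test snippet in the PR description.

```python
import io
import json


class StatusDumpCallback:
    """Hypothetical callback: writes mutator.status() as one JSON line per epoch."""

    def __init__(self, mutator, writer):
        self.mutator = mutator
        self.writer = writer

    def on_epoch_end(self, epoch):
        # Same record format as the manual logging in the PR description.
        print(json.dumps(self.mutator.status()), file=self.writer)
        self.writer.flush()


class FakeMutator:
    """Stand-in for a real mutator; returns made-up status values."""

    def __init__(self):
        self.calls = 0

    def status(self):
        self.calls += 1
        return {"calls": self.calls}


buffer = io.StringIO()  # a real run would use an open file instead
callback = StatusDumpCallback(FakeMutator(), buffer)
for epoch in range(2):
    # ... one epoch of training would run here ...
    callback.on_epoch_end(epoch)

lines = buffer.getvalue().splitlines()
print(lines)
```

Compared with overriding _logits_and_loss, this produces one record per epoch rather than one per step, which matches the limitation described above.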

ultmaster merged commit 9987014 into microsoft:master Mar 3, 2020
5 participants