[TorchDISC] compile disc nodes with a fake cluster algorithm #173

Merged

Collaborator

@Yancey1989 Yancey1989 commented Mar 17, 2022

This PR implements the compilation stage with a fake cluster algorithm.

@Yancey1989 Yancey1989 changed the title [WIP] compile disc nodes with a fake cluster algorithm [WIP] [TorchDISC] compile disc nodes with a fake cluster algorithm Mar 18, 2022
@Yancey1989 Yancey1989 changed the title [WIP] [TorchDISC] compile disc nodes with a fake cluster algorithm [TorchDISC] compile disc nodes with a fake cluster algorithm Mar 18, 2022
Collaborator

@tanyokwok tanyokwok left a comment

Good job!

const std::set<int8_t> TSBackendDeviceType::supported_device_types_ = {
(int8_t)at::kCPU, (int8_t)at::kCUDA};

class DISCBackendImpl : public torch::lazy::BackendImplInterface {
Collaborator

I think we should discuss this with the PyTorch LTC team. Could we propose our roadmap and our requirements for LTC?

Collaborator Author

Good point. Maybe we should write a detailed design document before communicating with the LTC team.

@Yancey1989 Yancey1989 merged commit ab2bca8 into alibaba:features/torch_disc_devel Mar 18, 2022
@Yancey1989 Yancey1989 deleted the compile_disc_nodes branch March 18, 2022 10:37
// 2. conversion from TorchScript Graph to mhlo Dialect on DISC nodes
// 3. register DISC engine
InferShapes(graph, arguments);
FusionDiscNodes(graph);
Contributor

We'd better avoid using "fusion" here; it can easily be misunderstood.

Collaborator Author

Good point. Actually, this function only clusters nodes rather than fusing them; maybe ClusterDiscNodes would be a better name?
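
To make the cluster-versus-fusion distinction concrete, here is a minimal sketch of what a "fake" clustering pass might look like on a toy node list. It is not the code from this PR: `ToyNode` and `ClusterSupportedNodes` are hypothetical names, and the rule used (group each run of consecutive supported nodes) is an assumption chosen for illustration. The point is that nodes are only labeled with a cluster id; nothing is merged or rewritten, which is why "cluster" describes it better than "fusion".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; not a TorchScript type.
struct ToyNode {
  std::string op;
  bool supported;  // whether the backend could compile this op
  int cluster;     // cluster id, or -1 when the node is in no cluster
};

// Labels each maximal run of consecutive supported nodes with one cluster id
// and returns the number of clusters created. Nodes are only labeled; no node
// is merged, removed, or rewritten.
int ClusterSupportedNodes(std::vector<ToyNode>& nodes) {
  int num_clusters = 0;
  bool in_run = false;
  for (ToyNode& node : nodes) {
    if (!node.supported) {
      in_run = false;
      node.cluster = -1;
      continue;
    }
    if (!in_run) {
      in_run = true;
      ++num_clusters;  // a new run of supported nodes starts here
    }
    node.cluster = num_clusters - 1;
    assert(node.cluster >= 0);
  }
  return num_clusters;
}
```

A real pass would also have to check data dependencies before grouping nodes; this sketch ignores them on purpose.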

// 1. cluster and group DISC nodes
// 2. conversion from TorchScript Graph to mhlo Dialect on DISC nodes
// 3. register DISC engine
InferShapes(graph, arguments);
Contributor

@linearhit linearhit Mar 21, 2022

I guess we are using the lazy arguments for dtype and rank analysis; am I understanding that correctly?
Without an explicit "static_rank_backward_analysis" process, this is probably not enough to guarantee that all the compile-time constants needed for static rank are known at compile time. To be discussed offline.
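
One possible reading of this concern, as a toy sketch only: a backward analysis starts from operands that determine an output's rank (for example, the target shape of a reshape) and propagates a "must be a compile-time constant" requirement back toward the graph inputs. `ToyOp` and `MustBeConstant` are hypothetical names, the IR is a made-up list of integer value ids, and the propagation rule is a deliberately conservative assumption; this is not the actual "static_rank_backward_analysis" process mentioned above.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy IR: values are plain integer ids; not a TorchScript type.
struct ToyOp {
  std::vector<int> inputs;         // value ids consumed by this op
  int output;                      // value id produced by this op
  std::vector<int> rank_operands;  // positions in `inputs` that determine the output rank
};

// Expects `ops` in topological order (producers before consumers). Walks the
// ops backward and returns every value id that must be a compile-time
// constant for all output ranks to be static.
std::set<int> MustBeConstant(const std::vector<ToyOp>& ops) {
  std::set<int> required;
  for (auto it = ops.rbegin(); it != ops.rend(); ++it) {
    // Operands that directly determine a rank are always required.
    for (int pos : it->rank_operands) {
      assert(pos >= 0 && pos < static_cast<int>(it->inputs.size()));
      required.insert(it->inputs[pos]);
    }
    // Conservative rule: if this op's output must be constant,
    // then every one of its inputs must be constant too.
    if (required.count(it->output) > 0) {
      required.insert(it->inputs.begin(), it->inputs.end());
    }
  }
  return required;
}
```

Any graph input that ends up in the returned set would have to be known at compile time, which forward shape inference alone does not reveal.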

Contributor

@linearhit linearhit left a comment

Marked for offline discussion.

Yancey1989 added a commit that referenced this pull request Apr 13, 2022
compile Disc nodes with a fake cluster algorithm.
Yancey1989 added a commit to Yancey1989/BladeDISC that referenced this pull request Apr 18, 2022
…#173)

compile Disc nodes with a fake cluster algorithm.