Enable eager spmd #7341
Force-pushed from 3ec9f92 to 4244dcb.
How could SPMD possibly work for eager mode?
consider eager mode as calling |
Then how do sharding propagation and auto partitioning work? I assume they don't carry state over from the last graph?
The
we will compile a graph for |
Okay, that's fair.
@@ -2742,7 +2742,9 @@ void XLANativeFunctions::_propagate_xla_data(const at::Tensor& input,

   // 2) Aid SPMD.
   XLATensor::ShardingSpecPtr sharding = input_tensor->sharding_spec();
-  if (sharding && sharding->sharding.type() != xla::OpSharding::UNKNOWN) {
+  // don't propagate sharding in eager mode.
+  if (!XLAGraphExecutor::Get()->UseEagerMode() && sharding &&
May I ask why?
It complained that the output tensor already has a sharding and we can't propagate to it. This happens in the backward pass. I didn't spend enough time debugging it, but I don't expect users to actually run eager mode with a step fn (forward and backward). I only expect them to run it for some data preprocessing on device, so I just quickly unblocked myself.
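For readers following the thread, here is a minimal, self-contained sketch of the guard being discussed. It is not the torch_xla implementation: `ShardingType`, `ShardingSpec`, `Tensor`, `ShouldPropagateSharding`, and `PropagateSharding` are hypothetical stand-ins, and the second half of the condition is assumed to keep the "sharding is set and not UNKNOWN" check from the removed line, since the added line is cut off in the hunk shown above.

```cpp
#include <memory>

// Hypothetical stand-ins for the real types; names are illustrative only.
enum class ShardingType { kUnknown, kReplicated, kTiled };

struct ShardingSpec {
  ShardingType type;
};

struct Tensor {
  // nullptr means the tensor carries no sharding annotation.
  std::shared_ptr<ShardingSpec> sharding;
};

// Assumed shape of the guard: never propagate in eager mode; otherwise
// propagate only when the input has a sharding whose type is known.
bool ShouldPropagateSharding(bool use_eager_mode, const Tensor& input) {
  return !use_eager_mode && input.sharding &&
         input.sharding->type != ShardingType::kUnknown;
}

// Shares the input's sharding spec with the output when the guard allows it.
// Returns true if propagation happened, false if it was skipped.
bool PropagateSharding(bool use_eager_mode, const Tensor& input,
                       Tensor& output) {
  if (!ShouldPropagateSharding(use_eager_mode, input)) {
    return false;
  }
  output.sharding = input.sharding;
  return true;
}
```

With this shape, an eager-mode call leaves the output's existing sharding untouched, which sidesteps the "output already has a sharding" complaint described above without explaining its root cause.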
LGTM. Thanks, Jack!