Add IsVirtualDevice() and refactor #6726
@@ -36,12 +36,10 @@ class DeviceData : public XlaNode {
   // backend data with a partitioned one in the node operands. Note that
   // this is permitted only if the node holds a placeholder.
   void Assign(std::shared_ptr<torch::lazy::BackendData> data) {
-    // TODO(yeounoh) check if the existing data is a placeholder after we
-    // address the issue where some of the sync tensors spill with device node.
     XLA_CHECK(data->shape() == data_->shape())
         << "Shape mismatch: expected (" << data_->shape().to_string()
         << "), actual (" << data->shape().to_string() << ")";
-    data_ = data;
+    data_.reset(data.get());
   }

   static DeviceData* Cast(const torch::lazy::Node* node);

Comment on lines -44 to +42:
Reviewer: What's the difference between these two?
Author: Better memory management; this is refactoring to reuse the existing data_ to hold the new data.
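For context on the question above, here is a small standalone C++ sketch, not taken from the PR, of the general difference between assigning one std::shared_ptr to another and calling reset() with a raw pointer; the Payload type and variable names are made up for illustration.

```cpp
#include <iostream>
#include <memory>

// Hypothetical stand-in for torch::lazy::BackendData, used only to
// illustrate the two shared_ptr operations discussed above.
struct Payload {
  int value;
};

int main() {
  std::shared_ptr<Payload> data_ = std::make_shared<Payload>(Payload{1});
  std::shared_ptr<Payload> data = std::make_shared<Payload>(Payload{2});

  // Option 1: assignment. data_ drops its old object and shares ownership
  // of data's object through the same control block.
  data_ = data;
  std::cout << data.use_count() << "\n";  // prints 2: data_ and data

  // Option 2: reset with a raw pointer. data_ would start a brand-new
  // control block for the pointer returned by data.get(). When another
  // shared_ptr (data) still owns the same object, this creates two
  // independent owners, so it only makes sense if that overlap is avoided.
  // data_.reset(data.get());
  return 0;
}
```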
@@ -346,7 +346,9 @@ class XLAGraphExecutor : public torch::lazy::LazyGraphExecutor {
   std::vector<size_t> SetBufferDonors(LoweringContext* lowering_ctx);

   // We don't use upstream Compile to have BuildInputOutputAliases.
-  CompilationResult Compile(const std::vector<XLATensorPtr>& tensors,
+  // TODO(yeounoh) auto-sharding can change tensors shardings, which needs to be
+  // accounted for in Dynamo integration.
+  CompilationResult Compile(std::vector<XLATensorPtr>& tensors,
                             absl::Span<const std::string> devices,
                             const SyncTensorCollection& coll,
                             PostOrderData* po_data,

Comment on the Compile signature change:
Reviewer: I can see how this will become problematic for Dynamo, since in Dynamo we first dry-run the compilation and don't execute the compiled program. Please leave a TODO somewhere for the Dynamo integration with auto-sharding.
Author: Makes sense. I will also add a section in the design doc.
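As a rough illustration of the signature change above (hypothetical types; not the actual XLAGraphExecutor::Compile), the sketch below shows why dropping const can matter here: an auto-sharding pass may want to write the sharding it chooses back onto the input tensors, which is exactly the mutation a Dynamo dry-run compile would need to account for.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for XLATensorPtr and CompilationResult; the real
// types in xla_graph_executor.h are considerably richer.
struct FakeTensor {
  std::optional<std::string> sharding_spec;
};

struct FakeCompilationResult {
  bool compiled = false;
};

// Sketch of why `tensors` is no longer a const reference: with auto-sharding,
// the compilation step may decide a sharding for each input and needs to
// record that decision back on the tensors themselves.
FakeCompilationResult Compile(std::vector<FakeTensor>& tensors,
                              bool auto_sharding_enabled) {
  if (auto_sharding_enabled) {
    for (FakeTensor& t : tensors) {
      if (!t.sharding_spec.has_value()) {
        // Placeholder for whatever sharding the auto-sharding pass picks.
        t.sharding_spec = "replicated";
      }
    }
  }
  FakeCompilationResult result;
  result.compiled = true;
  return result;
}
```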
Comment on GetXLAShardingSpec:
Reviewer: What will GetXLAShardingSpec return when sharding_spec is null?
Author: "" — an empty string, as expected.
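To pin down the answer above, here is a minimal sketch of the described contract (the types and the exact signature are assumptions, not the real torch_xla implementation): the helper returns the serialized sharding when a spec is attached and an empty string when sharding_spec is null.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the sharding annotation attached to a tensor;
// the real torch_xla type carries an xla::OpSharding, not a string.
struct ShardingSpec {
  std::string ToString() const { return "{devices=[2]0,1}"; }
};

// Sketch of the behavior confirmed above: serialize the sharding when one is
// present, and fall back to an empty string when sharding_spec is null.
std::string GetXLAShardingSpec(const std::shared_ptr<ShardingSpec>& sharding_spec) {
  if (sharding_spec == nullptr) {
    return "";
  }
  return sharding_spec->ToString();
}
```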