[Profiler] Unify the device(CUDA, XPU, PrivateUse1) in torch profiler post processing #123247
Conversation
…processing Use use_device to identify the required device Signed-off-by: Chen, Zejun <zejun.chen@intel.com>
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/123247
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (2 unrelated failures) As of commit 2509aa8 with merge base 58e403c: UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @ezyang @aaronenyeshi @valentinandrei, could you help review this PR? Thank you :)
Signed-off-by: Chen, Zejun <zejun.chen@intel.com>
Overall, this looks great! Thank you for the change to consolidate the devices; it improves the code's readability and maintainability going forward.
A few comments:
- Could we keep `use_cuda` as an argument for a while? We need to mark it for deprecation in case anyone is currently using it (see the sketch after this list).
- I wonder why `self.use_device` is None for privateuseone. Why not support all valid types: cuda, xpu, privateuseone? cc @fwenguang, @NmomoN
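A minimal sketch of what such a deprecation shim could look like; the class shape and parameter handling here are illustrative assumptions, not the actual profiler signature:

```python
import warnings
from typing import Optional


class profile:
    """Sketch only: accept the legacy `use_cuda` flag while steering callers
    to the unified `use_device` string."""

    def __init__(
        self,
        use_cuda: Optional[bool] = None,
        use_device: Optional[str] = None,
    ) -> None:
        if use_cuda is not None:
            warnings.warn(
                "`use_cuda` will be deprecated; use `use_device=\"cuda\"` instead",
                FutureWarning,
            )
            # Only honor the legacy flag when the new argument was not given.
            if use_device is None and use_cuda:
                use_device = "cuda"
        self.use_device = use_device
```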
Thank you! This significantly improves the readability of this code, and I appreciate the effort that went into refactoring it. Overall it looks good to me. A few minor comments and we can approve.
Signed-off-by: Chen, Zejun <zejun.chen@intel.com>
This looks great! Just a few small changes and it should be good to ship. Please check the lint failures too.
Signed-off-by: Chen, Zejun <zejun.chen@intel.com>
Hi @aaronenyeshi @DanilBaibak, thank you for the help. It is indeed a bug from my code change here. Suggested code change: … Thank you.
… post processing (#123247) This PR unifies the CUDA, XPU and PrivateUse1 in the torch profiler. Now CUDA, XPU and PrivateUse1 can together use string object `use_device` to distinguish each other and share one device path for calculating kineto time durations and memory statistics for post processing. #suppress-api-compatibility-check Co-authored-by: Aaron Enye Shi <enye.shi@gmail.com> Pull Request resolved: #123247 Approved by: https://github.com/aaronenyeshi, https://github.com/gujinghui
…profiler post processing (pytorch#123247)" This reverts commit 768ce2c. Reverted pytorch#123247 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](pytorch#123247 (comment)))
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Re: pytorch/pytorch#123247 This PR changed `self_cuda_time_total` to `self_device_time_total` in API calls.
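For downstream code that reads these attributes, a defensive access pattern such as the following can bridge both names during the transition; the `getattr` fallback is only a suggestion and not part of the PR:

```python
import torch
from torch.profiler import ProfilerActivity, profile

with profile(activities=[ProfilerActivity.CPU]) as prof:
    torch.randn(512, 512) @ torch.randn(512, 512)

for evt in prof.key_averages():
    # Prefer the new unified attribute and fall back to the old
    # CUDA-specific name on releases that predate this change.
    device_time_us = getattr(
        evt, "self_device_time_total", getattr(evt, "self_cuda_time_total", 0)
    )
    print(evt.key, device_time_us)
```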
… post processing (#123247) This PR unifies the CUDA, XPU and PrivateUse1 in the torch profiler. Now CUDA, XPU and PrivateUse1 can together use string object `use_device` to distinguish each other and share one device path for calculating kineto time durations and memory statistics for post processing. #suppress-api-compatibility-check Co-authored-by: Aaron Enye Shi <enye.shi@gmail.com> Pull Request resolved: #123247 Approved by: https://github.com/aaronenyeshi
if self.use_device and self.use_device != _get_privateuse1_backend_name():
    warn(f"{self.use_device} doesn't support profile.")
VALID_DEVICE_OPTIONS = ["cuda", "xpu", "privateuseone"]
if self.use_device not in VALID_DEVICE_OPTIONS:
This results in an annoying `UserWarning: The None is not a valid device option.` whenever the CPU profiler is used :( A PR is coming.
Thank you @malfet
This fixes a logic regression introduced by #123247, where
```python
if self.use_device and self.use_device != _get_privateuse1_backend_name():
```
was replaced with
```python
VALID_DEVICE_OPTIONS = ["cuda", "xpu", "privateuseone"]
if self.use_device not in VALID_DEVICE_OPTIONS:
```
That triggers a warning every time the code is invoked with `self.use_device` set to None.
Hmm, why did this PR add …
In this PR, we removed the
This fixes a logic regression introduced by #123247, where
```python
if self.use_device and self.use_device != _get_privateuse1_backend_name():
```
was replaced with
```python
VALID_DEVICE_OPTIONS = ["cuda", "xpu", "privateuseone"]
if self.use_device not in VALID_DEVICE_OPTIONS:
```
That triggers a warning every time the code is invoked with `self.use_device` set to None. This change also skips all the checks, which are pointless if `use_device` is None to begin with. Pull Request resolved: #125654 Approved by: https://github.com/aaronenyeshi
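A condensed sketch of the corrected guard described in that fix; the surrounding class is stubbed out here and this is not the verbatim patch:

```python
from warnings import warn

VALID_DEVICE_OPTIONS = ["cuda", "xpu", "privateuseone"]


class _ProfilerSketch:
    def __init__(self, use_device=None):
        self.use_device = use_device
        # Plain CPU profiling (use_device is None) skips every device check,
        # so no spurious UserWarning is emitted.
        if self.use_device is not None:
            if self.use_device not in VALID_DEVICE_OPTIONS:
                warn(f"The {self.use_device} is not a valid device option.")
                self.use_device = None
```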
This PR unifies CUDA, XPU and PrivateUse1 in the torch profiler. Now CUDA, XPU and PrivateUse1 can all use the string `use_device` to distinguish one another and share one device path for calculating Kineto time durations and memory statistics in post processing.

#suppress-api-compatibility-check
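As a rough usage sketch of the unified model (whether `use_device` is accepted as a keyword and which backends are available depends on the PyTorch build, so treat this as an assumption-laden example):

```python
import torch
from torch.autograd import profiler

# One profiling path for any accelerator: the backend is selected by a
# string instead of per-backend boolean flags such as use_cuda.
device = "cuda" if torch.cuda.is_available() else None

with profiler.profile(use_device=device) as prof:
    x = torch.randn(256, 256, device=device or "cpu")
    (x @ x).sum()

print(prof.key_averages().table(row_limit=5))
```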
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k @ezyang @gchanan @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @rohan-varma @aakhundov