[refactor] Optimize vector and matrix ndarray fill #3921
✔️ Deploy Preview for jovial-fermat-aa59dc ready! 🔨 Explore the source changes: 8ee89cb 🔍 Inspect the deploy log: https://app.netlify.com/sites/jovial-fermat-aa59dc/deploys/61cecde6f926ab000763e5ae 😎 Browse the preview: https://deploy-preview-3921--jovial-fermat-aa59dc.netlify.app
/format
LGTM!
/format
Speed up ndarray `fill` by using `cuMemsetD32` (CUDA) and `std::fill` (CPU) instead of the kernel-based approach.
Measured speedups (using the test script from PR 3907) for `ti.Matrix.ndarray(4, 4, ti.f32, N)`, reported in iterations for `ti.init(arch=ti.cuda)` and for `ti.init(arch=ti.cpu)`.
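For context, here is a minimal sketch of the idea behind the change, not the code in this PR. It assumes the fill value is a single 32-bit scalar repeated across the whole buffer, which is what makes a memset-style fill applicable. The helper names `float_bits` and `fill_host_u32` are hypothetical and do not come from the Taichi codebase.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a float as its raw 32-bit pattern.
// memcpy is used instead of a pointer cast to avoid aliasing problems.
inline std::uint32_t float_bits(float value) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Hypothetical helper: fill a host buffer of n 32-bit elements with one
// repeated pattern using std::fill, with no kernel launch involved.
inline void fill_host_u32(std::uint32_t *data, std::size_t n,
                          std::uint32_t pattern) {
  std::fill(data, data + n, pattern);
}

// On the CUDA backend the same pattern could be handed to the driver API:
//   cuMemsetD32(dstDevice, pattern, n);  // n counts 32-bit elements
// That call is left as a comment so the sketch builds without CUDA.
```

Calling `fill_host_u32(buf, n, float_bits(1.0f))` writes the bit pattern of `1.0f` into every element. Because a matrix or vector ndarray filled with a scalar is a flat run of identical 32-bit values, the element shape does not matter for this kind of fill.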