[refactor] Optimize vector and matrix ndarray fill #3921

Merged (8 commits) on Jan 3, 2022

qiao-bo (Collaborator) commented Dec 31, 2021

Speed up ndarray fill by using cuMemsetD32 on CUDA and std::fill on CPU instead of the kernel-based approach. Measured speedups, using the test script from PR 3907:

ti.Matrix.ndarray(4, 4, ti.f32, N)

1. ti.init(arch=ti.cuda)

| Iterations | Master | Head |
|---|---|---|
| 10 | 0.457 s | 0.050 ms |
| 100 | 4.705 s | 0.475 ms |
| 1000 | 65.971 s | 4.374 ms |

2. ti.init(arch=ti.cpu)

| Iterations | Master | Head |
|---|---|---|
| 10 | 0.407 s | 0.150 s |
| 100 | 4.144 s | 1.488 s |
| 1000 | 42.127 s | 14.693 s |
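
The difference between the two approaches can be illustrated outside Taichi. The following is a minimal sketch in plain Python, not the code from this PR: `fill_per_element` stands in for the kernel-based fill, `fill_bulk` stands in for a memset-style fill such as std::fill, and `float_to_u32_bits` shows how an f32 value would be turned into the unsigned 32-bit pattern that a 32-bit memset takes. All three function names are hypothetical and chosen for illustration only.

```python
import struct
from array import array


def float_to_u32_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of value as an unsigned int.

    A 32-bit memset takes an integer pattern, so an f32 fill value has to be
    reinterpreted bit-for-bit rather than converted numerically.
    """
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fill_per_element(buf: array, value: float) -> None:
    """Write the value one element at a time (stand-in for a kernel-based fill)."""
    for i in range(len(buf)):
        buf[i] = value


def fill_bulk(buf: array, value: float) -> None:
    """Overwrite the whole buffer in one bulk operation (stand-in for a memset-style fill)."""
    buf[:] = array(buf.typecode, [value]) * len(buf)


if __name__ == "__main__":
    # 4 x 4 f32 matrix elements for N = 1000 entries, mirroring the shape used above.
    n = 4 * 4 * 1000
    a = array("f", [0.0]) * n
    b = array("f", [0.0]) * n
    fill_per_element(a, 2.5)
    fill_bulk(b, 2.5)
    print(a == b)
    print(hex(float_to_u32_bits(1.0)))
```

Both functions leave the buffer in the same state; the bulk version only avoids the per-element dispatch, which is the kind of overhead the change above is meant to remove.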

netlify bot commented Dec 31, 2021

✔️ Deploy Preview for jovial-fermat-aa59dc ready!

🔨 Explore the source changes: 8ee89cb

🔍 Inspect the deploy log: https://app.netlify.com/sites/jovial-fermat-aa59dc/deploys/61cecde6f926ab000763e5ae

😎 Browse the preview: https://deploy-preview-3921--jovial-fermat-aa59dc.netlify.app

Review thread on python/taichi/lang/_ndarray.py (outdated, resolved)
Review thread on python/taichi/lang/matrix.py (outdated, resolved)
qiao-bo commented Dec 31, 2021

/format

strongoier (Contributor) left a comment:

LGTM!

qiao-bo requested a review from k-ye on December 31, 2021, 09:17
qiao-bo commented Dec 31, 2021

/format

qiao-bo merged commit 21916b5 into taichi-dev:master on Jan 3, 2022
qiao-bo deleted the fill_matrix branch on January 3, 2022, 15:48