Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[opengl] Optimize range_for for ndarrays #3884

Merged
merged 4 commits into from
Dec 29, 2021

Conversation

ailzhang
Copy link
Contributor

@ailzhang ailzhang commented Dec 27, 2021

Stack from ghstack:

Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

[ghstack-poisoned]
ailzhang pushed a commit that referenced this pull request Dec 27, 2021
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

ghstack-source-id: ae80e9c8d92ee131a37f54bff10a710dd21e79de
Pull Request resolved: #3884
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

[ghstack-poisoned]
ailzhang pushed a commit that referenced this pull request Dec 27, 2021
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

ghstack-source-id: bdb9feba35fa8cd6cc99b826679dde353d706e07
Pull Request resolved: #3884
@ailzhang ailzhang changed the title [refactor] Optimize range_for for ndarrays [codegen] Optimize range_for for ndarrays Dec 28, 2021
@@ -756,7 +762,8 @@ class RangeForStmt : public Stmt {
bit_vectorize,
num_cpu_threads,
block_dim,
strictly_serialized);
strictly_serialized,
range_of_array);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter name seems slightly confusing?

@@ -85,19 +85,35 @@ class Offloader {
} else {
offloaded->block_dim = s->block_dim;
}
if (auto val = s->begin->cast<ConstStmt>()) {

// TODO: We need to update codegen for each backend gradually so let's
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we merge this in let's also notify people working on the vulkan ndarray this change

@@ -202,7 +202,7 @@ def init(d: ti.i32, density1: ti.any_arr(), density2: ti.any_arr(),

@ti.test(arch=ti.opengl)
def test_opengl_exceed_max_ssbo():
# 7 ndarrays + gtmp + args > 8 (maximum allowed)
# 8 ndarrays + args > 8 (maximum allowed)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want this test? This seems a bit arch specific & a lot of devices support more

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea this test was mainly a self reminder to double check how many ssbos we create in normal cases - we can remove it later :D

Copy link
Collaborator

@bobcao3 bobcao3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than a few minor nitpicks

Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

[ghstack-poisoned]
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

[ghstack-poisoned]
@ailzhang ailzhang changed the title [codegen] Optimize range_for for ndarrays [opengl] Optimize range_for for ndarrays Dec 29, 2021
@ailzhang ailzhang merged commit 94d21c9 into gh/ailzhang/40/base Dec 29, 2021
ailzhang pushed a commit that referenced this pull request Dec 29, 2021
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

ghstack-source-id: 3e2c2827bf752d6397cfc7780f2a79f9f2853a3d
Pull Request resolved: #3884
quadpixels pushed a commit to quadpixels/taichi that referenced this pull request Jan 5, 2022
Note this PR reduces number of generated shaders for mpm88 from 10 to 6.
There's still one gtmp related shader remaining for temporary values
used across multiple shaders. Whether to further remove that shader need
more benchmark so let's just get rid of addtional shaders introduced by
ndarray range_for for now.

ghstack-source-id: 3e2c2827bf752d6397cfc7780f2a79f9f2853a3d
Pull Request resolved: taichi-dev#3884
@ailzhang ailzhang deleted the gh/ailzhang/40/head branch January 5, 2022 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants