Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

element_shape should be treated as compile time information for Ndarrays #3903

Open
Tracked by #3988
ailzhang opened this issue Dec 29, 2021 · 4 comments
Open
Tracked by #3988
Labels
potential bug Something that looks like a bug but not yet confirmed

Comments

@ailzhang
Copy link
Contributor

ailzhang commented Dec 29, 2021

Now they're treated at runtime information.
Repro script:

import taichi as ti
import numpy as np
ti.init(arch=ti.opengl, log_level=ti.TRACE)

def _test_compiled_functions():
    @ti.kernel
    def func(a: ti.any_arr(element_dim=1)):
        for i in range(5):
            for j in range(4):
                a[i][j * j] = j * j

    v = ti.Vector.ndarray(10, ti.i32, 5)
    func(v)
    assert ti.get_runtime().get_num_compiled_functions() == 1
    v = ti.Vector.ndarray(11, ti.i32, 5)
    func(v)
    print(ti.get_runtime().get_num_compiled_functions())

_test_compiled_functions()

Note that although we compiled twice for the above program, the generated opengl kernels are indeed identical (shown below). In other words, we ought to have used element_shape as compile time information but we didn't.
This can also be verified in the code snippet below that _args_i32_[16 + 0 * 8 + 1] (element_shape=10 or 11), which should have been a constant, but now it's loaded as a runtime variable.
Side note: the reason we recompile for different element_shape now is that we force that to happen in python. https://github.com/taichi-dev/taichi/blob/master/python/taichi/lang/kernel_impl.py#L271-L289 Unfortunately we only enforce recompile but didn't make use of the compile time information at all.

#version 430 core
layout(local_size_x = 128, local_size_y = 1, local_size_z = 1) in;
precision highp float;
layout(std430, binding = 2) buffer args_i32 { int _args_i32_[];};
layout(std430, binding = 4) buffer arr0_i32 { int _arr0_i32_[];};

const float inf = 1.0f / 0.0f;
const float nan = 0.0f / 0.0f;
void func_c44_00()
{ // range for
  // range known at compile time
  int _sid0 = int(gl_GlobalInvocationID.x);
  for (int _sid = _sid0; _sid < (5); _sid += int(gl_WorkGroupSize.x * gl_NumWorkGroups.x)) {
    int _itv = 0 + _sid;
      int B = _itv;
      int C = int(0);
      int D = int(4);
      for (int E_ = C; E_ < D; E_ += 1) {
        int E = E_;
        int F = E;
        int G = F * F;
        int _li_I = 0;
        int _s0_arr0 = _args_i32_[16 + 0 * 8 + 0];
        int _s1_arr0 = _args_i32_[16 + 0 * 8 + 1];
        { // linear seek
          _li_I *= _s0_arr0;
          _li_I += B;
          _li_I *= _s1_arr0;
          _li_I += G;
        }
        int I = _li_I << 2;
        _arr0_i32_[I >> 2] = G;
      }
  }
}

void main()
{
  func_c44_00();
}

A potential fix is https://github.com/taichi-dev/taichi/blob/master/taichi/program/ndarray.h#L30,

class Ndarray {
  // this should be separate field_shape (runtime info, copied to args) 
  // and static element_shape (compile time info, hard-coded in shader)    
  std::vector<int> shape;  
}
@ailzhang ailzhang added the potential bug Something that looks like a bug but not yet confirmed label Dec 29, 2021
@k-ye k-ye changed the title element_shape should be treated at compile time information for ndarrays element_shape should be treated as compile time information for Ndarrays Dec 29, 2021
@k-ye
Copy link
Member

k-ye commented Dec 29, 2021

+1 for using the element_shape as a compile-time info.

For the potential fix, I belive DataType can represent either a PrimititveType or a TensorType, the latter can encode the element shape information.

@strongoier
Copy link
Contributor

element_shape is not used in compiled results simply because we use ExternalPtrStmt for quick implementation at its birth. We may design a new statement in CHI IR if we want to change the way to access external pointer elements.

@ailzhang
Copy link
Contributor Author

@strongoier Agreed, we'll need to build a few additional passes for ndarrays as well. Currently I'm hacking locally in opengl codegen #3900 :P

@turbo0628
Copy link
Member

The above PR didn't finish the TensorType thing. Shall polish in a follow-up work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
potential bug Something that looks like a bug but not yet confirmed
Projects
None yet
Development

No branches or pull requests

4 participants