Different value on a matrix when print is present and absent #6663

lin-hitonami · 2022-11-18T08:49:37Z

Describe the bug
A matrix element holds a different value depending on whether a `print` of the returned matrix is present in the kernel.
Originally posted in https://forum.taichi-lang.cn/t/topic/3547/4

To Reproduce

import taichi as ti

ti.init(ti.cpu, dynamic_index=True)


@ti.func
def jacob_eigen_test(a: ti.template()):
    p = ti.math.eye(a.n)
    tol = 1.0e-7
    sig = ti.Vector.zero(ti.f32, a.n)
    aMax = 1.0
    print('p1', p[0, 0])
    while aMax > tol:
        print('p2', p[0, 0])  # p is still the identity here on the first pass
        aMax = 0

        for i in range(a.n):  # Update transformation matrix
            p[i, 0] = -1

    for ii in range(a.n):
        sig[ii] = a[ii, ii]
    return sig, p


@ti.kernel
def test():
    test_S = ti.math.mat4(0)
    Sig, P = jacob_eigen_test(test_S)
    # print(Sig, P)


@ti.kernel
def test2():
    test_S = ti.math.mat4(0)
    Sig, P = jacob_eigen_test(test_S)
    print(Sig, P)


test()
print('test2')
test2()

Log/Screenshots

p1 1.000000
p2 1.000000
test2
p1 1.000000
p2 0.000000
[0.000000, 0.000000, 0.000000, 0.000000] [[-1.000000, 0.000000, 0.000000, 0.000000], [-1.000000, 1.000000, 0.000000, 0.000000], [-1.000000, 0.000000, 1.000000, 0.000000], [-1.000000, 0.000000, 0.000000, 1.000000]]

Additional comments
The IR of test:

kernel {
$0 = offloaded  
body {
  <f32> $1 = const -1.0
  <i32> $2 = const 4
  <i32> $3 = const 0
  <i32> $4 = const 1
  <i32> $5 = const 2
  <f32> $6 = const 0.0
  <f32> $7 = const 1.0
  <[Tensor (4, 4) f32]> $8 = alloca
  <*f32> $9 = shift ptr [$8 + $3]
  <f32> $10 : local store [$9 <- $7]
  <f32> $11 = const 1e-07
  <f32> $12 = alloca
  <f32> $13 : local store [$12 <- $7]
  print "p1 ", $7, "\n"
  $15 : while true {
    <f32> $16 = local load [$12]
    <i32> $17 = cmp_gt $16 $11
    <i32> $18 = bit_and $17 $4
    $19 : if $18 {
    } else {
      $20 : while control nullptr, $3
    }
    <f32> $21 = local load [$9]
    print "p2 ", $21, "\n"
    <f32> $23 : local store [$12 <- $6]
    $24 : for in range($3, $2) block_dim=adaptive {
      <i32> $25 = loop $24 index 0
      <i32> $26 = bit_shl $25 $5
      <*f32> $27 = shift ptr [$8 + $26]
      <f32> $28 : local store [$27 <- $1]
    }
  }
}
}

The IR of test2:

kernel {
$0 = offloaded  
body {
  <f32> $1 = const -1.0
  <i32> $2 = const 4
  <i32> $3 = const 0
  <i32> $4 = const 1
  <i32> $5 = const 2
  <i32> $6 = const 3
  <i32> $7 = const 5
  <i32> $8 = const 6
  <i32> $9 = const 7
  <i32> $10 = const 8
  <i32> $11 = const 9
  <i32> $12 = const 10
  <i32> $13 = const 11
  <i32> $14 = const 12
  <i32> $15 = const 13
  <i32> $16 = const 14
  <i32> $17 = const 15
  <f32> $18 = const 0.0
  <[Tensor (4, 4) f32]> $19 = global tmp var (offset = 0 B)
  <*f32> $20 = shift ptr [$19 + $3]
  $21 : global store [$20 <- $18]
  <*f32> $22 = shift ptr [$19 + $2]
  $23 : global store [$22 <- $18]
  <*f32> $24 = shift ptr [$19 + $10]
  $25 : global store [$24 <- $18]
  <*f32> $26 = shift ptr [$19 + $14]
  $27 : global store [$26 <- $18]
  <i32> $28 = const 16
  <*f32> $29 = shift ptr [$19 + $28]
  $30 : global store [$29 <- $18]
  <i32> $31 = const 20
  <*f32> $32 = shift ptr [$19 + $31]
  $33 : global store [$32 <- $18]
  <i32> $34 = const 24
  <*f32> $35 = shift ptr [$19 + $34]
  $36 : global store [$35 <- $18]
  <i32> $37 = const 28
  <*f32> $38 = shift ptr [$19 + $37]
  $39 : global store [$38 <- $18]
  <i32> $40 = const 32
  <*f32> $41 = shift ptr [$19 + $40]
  $42 : global store [$41 <- $18]
  <i32> $43 = const 36
  <*f32> $44 = shift ptr [$19 + $43]
  $45 : global store [$44 <- $18]
  <i32> $46 = const 40
  <*f32> $47 = shift ptr [$19 + $46]
  $48 : global store [$47 <- $18]
  <i32> $49 = const 44
  <*f32> $50 = shift ptr [$19 + $49]
  $51 : global store [$50 <- $18]
  <i32> $52 = const 48
  <*f32> $53 = shift ptr [$19 + $52]
  $54 : global store [$53 <- $18]
  <i32> $55 = const 52
  <*f32> $56 = shift ptr [$19 + $55]
  $57 : global store [$56 <- $18]
  <i32> $58 = const 56
  <*f32> $59 = shift ptr [$19 + $58]
  $60 : global store [$59 <- $18]
  <i32> $61 = const 60
  <*f32> $62 = shift ptr [$19 + $61]
  $63 : global store [$62 <- $18]
  <*f32> $64 = shift ptr [$19 + $4]
  $65 : global store [$64 <- $18]
  <*f32> $66 = shift ptr [$19 + $5]
  $67 : global store [$66 <- $18]
  <*f32> $68 = shift ptr [$19 + $6]
  $69 : global store [$68 <- $18]
  <*f32> $70 = shift ptr [$19 + $7]
  $71 : global store [$70 <- $18]
  <*f32> $72 = shift ptr [$19 + $8]
  $73 : global store [$72 <- $18]
  <*f32> $74 = shift ptr [$19 + $9]
  $75 : global store [$74 <- $18]
  <*f32> $76 = shift ptr [$19 + $11]
  $77 : global store [$76 <- $18]
  <*f32> $78 = shift ptr [$19 + $12]
  $79 : global store [$78 <- $18]
  <*f32> $80 = shift ptr [$19 + $13]
  $81 : global store [$80 <- $18]
  <*f32> $82 = shift ptr [$19 + $15]
  $83 : global store [$82 <- $18]
  <*f32> $84 = shift ptr [$19 + $16]
  $85 : global store [$84 <- $18]
  <*f32> $86 = shift ptr [$19 + $17]
  $87 : global store [$86 <- $18]
  <f32> $88 = const 1.0
  <[Tensor (4, 4) f32]> $89 = global tmp var (offset = 80 B)
  <*f32> $90 = shift ptr [$89 + $3]
  <*f32> $91 = shift ptr [$89 + $2]
  $92 : global store [$91 <- $18]
  <*f32> $93 = shift ptr [$89 + $10]
  $94 : global store [$93 <- $18]
  <*f32> $95 = shift ptr [$89 + $14]
  $96 : global store [$95 <- $18]
  <*f32> $97 = shift ptr [$89 + $28]
  $98 : global store [$97 <- $18]
  <*f32> $99 = shift ptr [$89 + $31]
  $100 : global store [$99 <- $18]
  <*f32> $101 = shift ptr [$89 + $34]
  $102 : global store [$101 <- $18]
  <*f32> $103 = shift ptr [$89 + $37]
  $104 : global store [$103 <- $18]
  <*f32> $105 = shift ptr [$89 + $40]
  $106 : global store [$105 <- $18]
  <*f32> $107 = shift ptr [$89 + $43]
  $108 : global store [$107 <- $18]
  <*f32> $109 = shift ptr [$89 + $46]
  $110 : global store [$109 <- $18]
  <*f32> $111 = shift ptr [$89 + $49]
  $112 : global store [$111 <- $18]
  <*f32> $113 = shift ptr [$89 + $52]
  $114 : global store [$113 <- $18]
  <*f32> $115 = shift ptr [$89 + $55]
  $116 : global store [$115 <- $18]
  <*f32> $117 = shift ptr [$89 + $58]
  $118 : global store [$117 <- $18]
  <*f32> $119 = shift ptr [$89 + $61]
  $120 : global store [$119 <- $18]
  $121 : global store [$90 <- $88]
  <*f32> $122 = shift ptr [$89 + $4]
  $123 : global store [$122 <- $18]
  <*f32> $124 = shift ptr [$89 + $5]
  $125 : global store [$124 <- $18]
  <*f32> $126 = shift ptr [$89 + $6]
  $127 : global store [$126 <- $18]
  <*f32> $128 = shift ptr [$89 + $7]
  $129 : global store [$128 <- $88]
  <*f32> $130 = shift ptr [$89 + $8]
  $131 : global store [$130 <- $18]
  <*f32> $132 = shift ptr [$89 + $9]
  $133 : global store [$132 <- $18]
  <*f32> $134 = shift ptr [$89 + $11]
  $135 : global store [$134 <- $18]
  <*f32> $136 = shift ptr [$89 + $12]
  $137 : global store [$136 <- $88]
  <*f32> $138 = shift ptr [$89 + $13]
  $139 : global store [$138 <- $18]
  <*f32> $140 = shift ptr [$89 + $15]
  $141 : global store [$140 <- $18]
  <*f32> $142 = shift ptr [$89 + $16]
  $143 : global store [$142 <- $18]
  <*f32> $144 = shift ptr [$89 + $17]
  $145 : global store [$144 <- $88]
  <f32> $146 = const 1e-07
  <[Tensor (4) f32]> $147 = global tmp var (offset = 64 B)
  <*f32> $148 = shift ptr [$147 + $3]
  $149 : global store [$148 <- $18]
  <*f32> $150 = shift ptr [$147 + $2]
  $151 : global store [$150 <- $18]
  <*f32> $152 = shift ptr [$147 + $10]
  $153 : global store [$152 <- $18]
  <*f32> $154 = shift ptr [$147 + $14]
  $155 : global store [$154 <- $18]
  <*f32> $156 = shift ptr [$147 + $4]
  $157 : global store [$156 <- $18]
  <*f32> $158 = shift ptr [$147 + $5]
  $159 : global store [$158 <- $18]
  <*f32> $160 = shift ptr [$147 + $6]
  $161 : global store [$160 <- $18]
  <f32> $162 = alloca
  <f32> $163 : local store [$162 <- $88]
  print "p1 ", $88, "\n"
  $165 : while true {
    <f32> $166 = local load [$162]
    <i32> $167 = cmp_gt $166 $146
    <i32> $168 = bit_and $167 $4
    $169 : if $168 {
    } else {
      $170 : while control nullptr, $3
    }
    <f32> $171 = global load $90
    print "p2 ", $171, "\n"
    <f32> $173 : local store [$162 <- $18]
    $174 : for in range($3, $2) block_dim=adaptive {
      <i32> $175 = loop $174 index 0
      <i32> $176 = bit_shl $175 $5
      <*f32> $177 = shift ptr [$89 + $176]
      $178 : global store [$177 <- $1]
    }
  }
}
}


lin-hitonami · 2022-11-18T08:51:46Z

Is this related to dynamic indexing (`dynamic_index=True`)? @strongoier

lin-hitonami · 2022-11-21T06:57:59Z

This problem also exists in Taichi 0.9.2 (the oldest version available on PyPI).

strongoier · 2022-12-06T13:45:55Z

A simpler code snippet:

import taichi as ti
ti.init(ti.cpu, dynamic_index=True)

@ti.func
def jacob_eigen_test():
    p = ti.Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
    loop = 1
    sig = ti.Vector([0, 0, 0, 0])
    print('p1', p[0, 0])
    while loop == 1:
        print('p2', p[0, 0])
        loop = 0
        p[0, 0] = -1
    for i in range(1):
        sig[i] = 2
    return sig, p

@ti.kernel
def test():
    Sig, P = jacob_eigen_test()

@ti.kernel
def test2():
    Sig, P = jacob_eigen_test()
    print(Sig,P)


test()
print('test2')
test2()
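
For comparison, here is a plain-Python rendition of the same control flow, written only to make the expected values explicit. It does not use Taichi at all; the function name `jacob_eigen_reference` and the `seen` list are hypothetical helpers introduced for this sketch.

```python
def jacob_eigen_reference():
    # 4x4 identity matrix as nested lists (stand-in for ti.Matrix).
    p = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
    sig = [0, 0, 0, 0]
    seen = [p[0][0]]          # corresponds to the 'p1' print
    loop = 1
    while loop == 1:
        seen.append(p[0][0])  # corresponds to the 'p2' print
        loop = 0
        p[0][0] = -1
    for i in range(1):
        sig[i] = 2
    return sig, p, seen


sig, p, seen = jacob_eigen_reference()
print(seen)  # [1, 1]
```

Both reads of `p[0][0]` happen before the write of `-1`, so 'p1' and 'p2' should print the same value in `test` and in `test2`.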

Issue: fix #6663

### Brief Summary

In `MatrixPtrStmt`, when `origin` is `GlobalTemporaryStmt`, the semantics of `offset` have changed from the number of bytes to the number of elements. This PR fixes the outdated usage, which may overwrite the global tmp buffer.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
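
One plausible reading of how this produces the `p2 0.000000` output can be sketched in plain Python. The byte offsets of the three global temporaries (0 B, 64 B, 80 B) and the store offsets 0, 4, 8, 12 are taken from the IR of test2 above. The `simulate` and `store` functions and the `offsets_are_elements` flag are hypothetical names for this model, not Taichi APIs, and the model is an assumption about the mechanism rather than a trace of the actual compiler.

```python
F32_BYTES = 4
SIG_BASE = 64  # Tensor (4) f32, "global tmp var (offset = 64 B)"
P_BASE = 80    # Tensor (4, 4) f32, "global tmp var (offset = 80 B)"


def simulate(offsets_are_elements):
    # Flat f32 buffer covering both temporaries, indexed by element.
    buf = [None] * ((P_BASE + 16 * F32_BYTES) // F32_BYTES)

    def store(base_bytes, offset, value):
        # Hypothetical model of `shift ptr [base + offset]` plus a store.
        scale = F32_BYTES if offsets_are_elements else 1
        addr = base_bytes + offset * scale
        buf[addr // F32_BYTES] = value

    # p[0, 0] <- 1.0, mirroring `global store [$90 <- $88]`.
    store(P_BASE, 0, 1.0)
    # sig <- 0, emitted afterwards with byte-style offsets as in the IR.
    for byte_style_offset in (0, 4, 8, 12):
        store(SIG_BASE, byte_style_offset, 0.0)
    return buf[P_BASE // F32_BYTES]  # the value later loaded as p[0, 0]


print(simulate(offsets_are_elements=True))   # 0.0
print(simulate(offsets_are_elements=False))  # 1.0
```

Under element semantics, the store at `sig + 4` resolves to byte 64 + 16 = 80, which is exactly where `p` starts, so the zero-initialization of `sig` clobbers `p[0, 0]`. In `test`, `p` lives in a local `alloca` instead of the global tmp buffer, which would explain why the value survives there.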


lin-hitonami added the potential bug Something that looks like a bug but not yet confirmed label Nov 18, 2022

ailzhang assigned strongoier and lin-hitonami Nov 25, 2022

strongoier mentioned this issue Dec 6, 2022

[Bug] Avoid overwriting global tmp with dynamic_index=True #6820

Merged

strongoier closed this as completed in #6820 Dec 7, 2022
