[DISCUSS] Revisit Buffer Boundary Check & Region Cover Check #138

spectrometerHBH · 2020-10-02T13:44:36Z

I've been revisiting compute_at and working on reverse_compute_at these days, and I've been thinking over the issue of buffer boundary check apache/tvm#6596.

I put some of my thoughts here for discussion.

A buffer boundary check problem in current compute_at

The basic logic of current compute_at(input_block, input_loop) is

do checks
calculate the consume requirement of consumers under input_loop (may be relaxed)
generate new loops outside input_block to cover the requirement (may produce more)

Hence, after compute_at, input_block may produce more than before, which consumes more input and can lead to access out of input buffers' boundary.

A clear solution to this problem is to add a predicate in block.where to do buffer boundary check.

Region Cover Check

I introduce region cover check in compute_at to restrict the data flow. It is still important in reverse_compute_at.
For example (see also #102 )

# Before
for i in [0,4):
    for k:
       opaque_produce(C[i*4, i*4 + 4])
for j in [3, 30):
      D[j] = C[j]+1

# After(wrong)
for i in [0, 4):
    for k:
       opaque_produce(C[i*4, i*4 + 4])
    for j in 4:
        D[j] = C[j]+1

# After(correct)
for i in [0, 4):
    for k:
       opaque_produce(C[i*4, i*4 + 4])
    for j in 4:
      if i*4+j in [3, 30):
        D[j] = C[j]+1

Before the transformation, D actually doesn't consume all the output of opaque_produce, and may consume elements out of the output of opaque_produce, we have to add a predicate to fix the data flow.

However, it is not trivial to judge whether the region cover is satisfied.

Region Cover Check is non-trivial

See also #45
To check if A's output covers B's input under some loop i, we can not relax A's output region, otherwise, the data flow will be problematic and the check is meaningless.

for ii in [0, 1)
    # A access {ii, ii+2, ii+4, ii+6}, which will be relaxed to [ii, ii+7)
    for io in [0, 4)
        A[io*2+ii] = io
# A's output actually doesn't cover B's input
# But with relaxation we will miss this fact
for j in [0, 7):
    B[j] = A[j]

compute_at(A, j) # which should have been banned by region cover check, but it will slip through

The text was updated successfully, but these errors were encountered:

spectrometerHBH · 2020-10-02T13:45:11Z

cc @tqchen @Hzfengsy @junrushao1994

spectrometerHBH · 2020-10-02T14:14:52Z

An insight from above examples is that block itself is not enough, the detailed data flow and dependency between blocks matter. But to calculate the exact IO regions is intractable.

I'm trying to figure out ways to get around it.

Hzfengsy · 2020-10-02T14:42:57Z

As far as I know, if a block is continuous, (which means it can produce any part of the buffer A as we want). It always can be compute_at, but may bring repeated calculation. However, if a block has limited elements that can be produced (e.g., a block can only write A[1], A[3], but cannot write A[2], A[4]), the compute_at for this block must be baned.

tqchen · 2020-10-02T14:48:47Z

In the particular case of reverse_compute_at. Let us focus on a more restricted condition than compute at that does not allow relaxation, which is most of our use cases. (e.g. conv2d follows bias add and bn

tqchen · 2020-10-02T15:33:07Z

To also specific discuss the problem below, there are possibly two options to check A's covering, both are related:

K0: detect and analyze split/fuse patterns(aka quasi affine mappings), which is mainly something of interest
- fuse pattern, two iterators => one iterator: x = xo * stride + xi, xi in [0, stride]
- split pattern, one iterators => two iterators: xo = floordiv(x, stride), xi = floormod(x, stride)
K1: Introduce another interval analysis that is conservative(only rounds down)

NOTE that the iterator coupling matters if we want to do conservative analysis. For example xo * 3 + xi is only can only be rewritten to another iterator, if xo and xi are "independent iterators". So in order to do such analysis, we need to maintain a set of currently active iterators, and "consume" an iterator(e.g. it can no longer be used to followup transformations unless for a explicit subexpr look up) once it is used by a split or possibly fuse pattern.

The analysis goes as follows, for each subexpr, we want to map it to an iterator plus the conservative bound of that iterator.

The first thing we do is lookup e.g. if (xo * stride + xi, is already mapped to x) we return x.
The a bit more tricky part is how to deal with iterator splitting aka x % b0 and x / b0 case. Notably such splitting can happen in multiple ways(multiple splits). It seems to related to the canonical form(SplitExpr) in canonical simplify, where multiple splits(that are compatible) can be canonicalized to ((x[i] / b0[i]) % b1[i])

Related, canonical simplify: https://github.com/apache/incubator-tvm/blob/master/src/arith/canonical_simplify.cc#L88

tqchen · 2020-10-11T22:56:28Z

apache/tvm#6667 Should provide the necessary utility. We might still make some enhancement based on the usecases we find(e.g. add predication => constraint, and better symbolic support) but the overal infra should solve the problem to some extent

tqchen closed this as completed Dec 31, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DISCUSS] Revisit Buffer Boundary Check & Region Cover Check #138

[DISCUSS] Revisit Buffer Boundary Check & Region Cover Check #138

spectrometerHBH commented Oct 2, 2020 •

edited

Loading

spectrometerHBH commented Oct 2, 2020

spectrometerHBH commented Oct 2, 2020 •

edited

Loading

Hzfengsy commented Oct 2, 2020

tqchen commented Oct 2, 2020

tqchen commented Oct 2, 2020 •

edited

Loading

tqchen commented Oct 11, 2020

[DISCUSS] Revisit Buffer Boundary Check & Region Cover Check #138

[DISCUSS] Revisit Buffer Boundary Check & Region Cover Check #138

Comments

spectrometerHBH commented Oct 2, 2020 • edited Loading

A buffer boundary check problem in current compute_at

Region Cover Check

Region Cover Check is non-trivial

spectrometerHBH commented Oct 2, 2020

spectrometerHBH commented Oct 2, 2020 • edited Loading

Hzfengsy commented Oct 2, 2020

tqchen commented Oct 2, 2020

tqchen commented Oct 2, 2020 • edited Loading

tqchen commented Oct 11, 2020

spectrometerHBH commented Oct 2, 2020 •

edited

Loading

spectrometerHBH commented Oct 2, 2020 •

edited

Loading

tqchen commented Oct 2, 2020 •

edited

Loading