Squashed commit of the following:
commit f499e60
Author: Masahiro Masuda <masahi129@gmail.com>
Date:   Fri Apr 15 04:11:02 2022 +0900

    Squashed commit of the following:

    commit dcb628d
    Author: Masahiro Masuda <masahi129@gmail.com>
    Date:   Thu Apr 14 17:10:27 2022 +0900

        Squashed commit of the following:

        commit dd956ec
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 16:53:34 2022 +0900

            add conv2d relay test

        commit 7291e47
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 16:46:05 2022 +0900

            add dense and bmm test

        commit a957dde
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 16:32:43 2022 +0900

            conv2d topi test working

        commit 6d53c50
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 11:33:38 2022 +0900

            add mattr kind

        commit 3761bd7
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 11:12:14 2022 +0900

            update dot prod intrin

        commit e781ee1
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 11:02:43 2022 +0900

            black

        commit b2208a7
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 10:58:10 2022 +0900

            cleanup

        commit f8bc306
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Thu Apr 14 10:35:02 2022 +0900

            [ROCM] Support dp4a on AMDGPU by sdot4 intrinsic
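
            For reference, dp4a/sdot4 computes a 4-way dot product of signed 8-bit values accumulated into a 32-bit integer — the operation the TIR intrinsic below describes to the tensorizer. A minimal Python sketch of the semantics (names are illustrative, not the TVM or LLVM API):

```python
def sdot4(a_bytes, b_bytes, c):
    """Reference semantics of a 4-way int8 dot product with int32 accumulate.

    a_bytes, b_bytes: sequences of four signed 8-bit integers.
    c: the int32 accumulator. Returns c + dot(a_bytes, b_bytes).
    """
    assert len(a_bytes) == len(b_bytes) == 4
    return c + sum(a * b for a, b in zip(a_bytes, b_bytes))

# e.g. sdot4([1, 2, 3, 4], [5, 6, 7, 8], 10) == 10 + (5 + 12 + 21 + 32) == 80
```

            Hardware exposes this as one instruction, which is why reduction loops of extent 4 over int8 operands are worth pattern-matching and tensorizing.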

            commit 0225f2b
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 08:56:10 2022 +0900

                share op strategy between cuda and rocm

            commit 762c7e8
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 08:28:34 2022 +0900

                fixed rocm batch_matmul strategy for mixed i8i8i32

            commit ce53e8d
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 06:17:30 2022 +0900

                add rocm sdot4 TIR intrin

            commit f4562b9
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 06:03:44 2022 +0900

                rocm sdot4 works

            commit 6cc6280
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 05:32:07 2022 +0900

                more wip

            commit 0602f4a
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 14 03:47:37 2022 +0900

                Squashed commit of the following:

                commit 65b8bcf
                Author: Masahiro Masuda <masahi129@gmail.com>
                Date:   Wed Apr 13 20:36:49 2022 +0900

                    [WIP] adding DP4A support to rocm

                commit 4f8f308
                Author: Masahiro Masuda <masahi129@gmail.com>
                Date:   Wed Apr 13 14:03:25 2022 +0900

                    Squashed commit of the following:

                    commit 1711be3
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 13 13:11:40 2022 +0900

                        fixed condition for real

                    commit 8a48fb5
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 13 09:57:42 2022 +0900

                        Revert "Skip applying sch_rule when both ann and sch_rule are defined"

                        This reverts commit 4915c6a.

                    commit daea033
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Mon Apr 11 09:31:05 2022 +0900

                        [Metaschedule] Support rocm and spirv

                    commit eb0cae2
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 13 07:25:04 2022 +0900

                        dp4a works

                    commit 4915c6a
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 13 06:13:45 2022 +0900

                        Skip applying sch_rule when both ann and sch_rule are defined

                    commit 7b3d71c
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 13 04:40:31 2022 +0900

                        fixed intrin description

                    commit 7666cd7
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Tue Apr 12 19:59:47 2022 +0900

                        add DP4A intrin

                    commit 7086bdb
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Tue Apr 12 19:03:44 2022 +0900

                        works

                    commit db34397
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Tue Apr 12 12:49:52 2022 +0900

                        more hack to tensorize loop mapping to make resnet50 e2e work

                    commit 2409674
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Mon Apr 11 13:40:59 2022 +0900

                        wip support pad + qnn.conv2d folding

                    commit 613cb7e
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Sun Apr 10 12:04:08 2022 +0900

                        hack to tensorize loop mapping to make conv2d work

                    commit 9e4f9df
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Sun Apr 10 11:34:13 2022 +0900

                        wrap tensorize with try/catch

                    commit d4b496d
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Sun Apr 10 11:33:39 2022 +0900

                        revert change in task_scheduler.cc

                    commit 476129b
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Sat Apr 9 05:54:10 2022 +0900

                        try / catch in ThreadedApply

                    commit d8226ff
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Fri Apr 8 17:17:59 2022 +0900

                        filter out invalid candidate

                    commit 2632899
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Fri Apr 8 10:09:48 2022 +0900

                        try graceful exit in parallel_for_dynamic

                    commit 9d6741c
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Fri Apr 8 09:35:51 2022 +0900

                        [QNN] Fix broadcast for invalid axis

                    commit 6ccde09
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 20:51:15 2022 +0900

                        refactor rewrite_tensorize

                    commit 2ce2066
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 20:48:17 2022 +0900

                        allow missing schedule_rule in post order apply

                    commit 3a69353
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 19:42:48 2022 +0900

                        refactor rewrite_tensorize

                    commit 43e0b2f
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 18:25:14 2022 +0900

                        rewrite_vnni -> rewrite_tensorize

                    commit 823797e
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 18:12:12 2022 +0900

                        VNNI -> WithIntrin

                    commit 4284a47
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 17:45:41 2022 +0900

                        introduce TileForIntrin

                    commit b87ef32
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 17:34:04 2022 +0900

                        move TilingwithTensorIntrin to auto_tensorize.cc

                    commit 2fc118b
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 17:28:45 2022 +0900

                        clean up headers

                    commit d8b2aa3
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 17:09:32 2022 +0900

                        clean up using namespace

                    commit eb05d25
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 17:03:05 2022 +0900

                        refactored init

                    commit 5e6b0a0
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 16:57:14 2022 +0900

                        compiled

                    commit 2b8c430
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 12:51:55 2022 +0900

                        wip MultiLevelTiling refactor

                    commit 7c21a9f
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:58:33 2022 +0900

                        function doc string not supported by tvmscript

                    commit 40f9742
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:56:45 2022 +0900

                        update vnni intrin name

                    commit 4814f82
                    Merge: e0c5eb8 07bbb38
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:44:47 2022 +0900

                        Merge branch 'tir-tensor-intrin' into auto-tensorize-vnni

                    commit 07bbb38
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:24:56 2022 +0900

                        more lint fix

                    commit 15e60b4
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:16:08 2022 +0900

                        black

                    commit 7a757fe
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:12:54 2022 +0900

                        pylint

                    commit 9a3e508
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:58:52 2022 +0900

                        simplify import

                    commit d8e43ec
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:52:50 2022 +0900

                        use vectorlow/high in arm intrin

                    commit 625cd27
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:34:57 2022 +0900

                        fixed offset factor

                    commit 69e72b6
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:12:02 2022 +0900

                        Add ARM intrin

                    commit 1351fde
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 08:27:27 2022 +0900

                        use buffer syntax sugar

                    commit 0ced85f
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 08:17:43 2022 +0900

                        rename vnni.py to x86.py

                    commit 38a5aca
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:24:44 2022 +0900

                        add VNNI unittest

                    commit 88b763e
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:10:06 2022 +0900

                        refactored existing test using VNNI intrin

                    commit 711a007
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:04:58 2022 +0900

                        [TIR] Add VNNI dot product intrinsic for TIR

                    commit e0c5eb8
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:42:26 2022 +0900

                        merge fix

                    commit b171748
                    Merge: 71fe3bd 82e152a
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:33:59 2022 +0900

                        Merge branch 'tir-tensor-intrin' into auto-tensorize-vnni

                    commit 71fe3bd
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 06:57:38 2022 +0900

                        move tensor intrin under tir

                    commit 0c51bad
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 06:12:39 2022 +0900

                        remove log

                    commit fed910e
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 06:11:22 2022 +0900

                        more revert

                    commit 7150aff
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 06:10:44 2022 +0900

                        revert stmt_functor change

                    commit 155107b
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 06:10:09 2022 +0900

                        refactored RewriteVNNI a bit

                    commit ca15255
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 05:41:13 2022 +0900

                        add RewriteVNNI

                    commit dc9f71d
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 05:38:56 2022 +0900

                        vectorized init loop

                    commit fcc31ee
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 04:55:36 2022 +0900

                        tensorize worked

                    commit 2b53437
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 6 19:11:05 2022 +0900

                        TilingwithTensorIntrin works

                    commit 86baa31
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Wed Apr 6 08:58:27 2022 +0900

                        Ported auto-tensorization code

                    commit 82e152a
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:24:56 2022 +0900

                        more lint fix

                    commit 88d9bdd
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:16:08 2022 +0900

                        black

                    commit 31fe7eb
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 11:12:54 2022 +0900

                        pylint

                    commit 7876754
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:58:52 2022 +0900

                        simplify import

                    commit 56f2e9a
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:52:50 2022 +0900

                        use vectorlow/high in arm intrin

                    commit 995cc8d
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:34:57 2022 +0900

                        fixed offset factor

                    commit 86bbd49
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 10:12:02 2022 +0900

                        Add ARM intrin

                    commit 120fd96
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 08:27:27 2022 +0900

                        use buffer syntax sugar

                    commit 0f0682d
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 08:17:43 2022 +0900

                        rename vnni.py to x86.py

                    commit f88c31e
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:24:44 2022 +0900

                        add VNNI unittest

                    commit 6cc8009
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:10:06 2022 +0900

                        refactored existing test using VNNI intrin

                    commit 11a29c7
                    Author: Masahiro Masuda <masahi129@gmail.com>
                    Date:   Thu Apr 7 07:04:58 2022 +0900

                        [TIR] Add VNNI dot product intrinsic for TIR

        commit e370ed4
        Author: Chris Sullivan <csullivan@octoml.ai>
        Date:   Wed Apr 13 15:19:41 2022 -0700

            [Hexagon] Less aggressive adb state clean up (apache#10909)

            * Only remove port forwarding applied in a session
            to avoid affecting global adb state.

            * Send SIGINT to attempt to allow the remote
            server to clean up and unbind its port during
            destruction

            * Only attempt to forward ports not in use by
            adb or the system.

        commit ce8f83e
        Author: Christian Convey <cconvey@octoml.ai>
        Date:   Wed Apr 13 16:25:39 2022 -0400

            [hexagon] 'add_hvx' test to explore HVX usage. (apache#10604)

            Add a unit test named 'add_hvx' to explore how various
            scheduling choices, tensor sizes, etc. impact efficient usage of Hexagon
            HVX units.

    commit 0602f4a
    Author: Masahiro Masuda <masahi129@gmail.com>
    Date:   Thu Apr 14 03:47:37 2022 +0900

        Squashed commit of the following:

        commit 65b8bcf
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Wed Apr 13 20:36:49 2022 +0900

            [WIP] adding DP4A support to rocm

        commit 4f8f308
        Author: Masahiro Masuda <masahi129@gmail.com>
        Date:   Wed Apr 13 14:03:25 2022 +0900

            Squashed commit of the following:

            commit 1711be3
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 13 13:11:40 2022 +0900

                fixed condition for real

            commit 8a48fb5
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 13 09:57:42 2022 +0900

                Revert "Skip applying sch_rule when both ann and sch_rule are defined"

                This reverts commit 4915c6a.

            commit daea033
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Mon Apr 11 09:31:05 2022 +0900

                [Metaschedule] Support rocm and spirv

            commit eb0cae2
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 13 07:25:04 2022 +0900

                dp4a works

            commit 4915c6a
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 13 06:13:45 2022 +0900

                Skip applying sch_rule when both ann and sch_rule are defined

            commit 7b3d71c
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 13 04:40:31 2022 +0900

                fixed intrin description

            commit 7666cd7
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Tue Apr 12 19:59:47 2022 +0900

                add DP4A intrin

            commit 7086bdb
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Tue Apr 12 19:03:44 2022 +0900

                works

            commit db34397
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Tue Apr 12 12:49:52 2022 +0900

                more hack to tensorize loop mapping to make resnet50 e2e work

            commit 2409674
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Mon Apr 11 13:40:59 2022 +0900

                wip support pad + qnn.conv2d folding

            commit 613cb7e
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Sun Apr 10 12:04:08 2022 +0900

                hack to tensorize loop mapping to make conv2d work

            commit 9e4f9df
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Sun Apr 10 11:34:13 2022 +0900

                wrap tensorize with try/catch

            commit d4b496d
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Sun Apr 10 11:33:39 2022 +0900

                revert change in task_scheduler.cc

            commit 476129b
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Sat Apr 9 05:54:10 2022 +0900

                try / catch in ThreadedApply

            commit d8226ff
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Fri Apr 8 17:17:59 2022 +0900

                filter out invalid candidate

            commit 2632899
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Fri Apr 8 10:09:48 2022 +0900

                try graceful exit in parallel_for_dynamic

            commit 9d6741c
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Fri Apr 8 09:35:51 2022 +0900

                [QNN] Fix broadcast for invalid axis

            commit 6ccde09
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 20:51:15 2022 +0900

                refactor rewrite_tensorize

            commit 2ce2066
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 20:48:17 2022 +0900

                allow missing schedule_rule in post order apply

            commit 3a69353
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 19:42:48 2022 +0900

                refactor rewrite_tensorize

            commit 43e0b2f
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 18:25:14 2022 +0900

                rewrite_vnni -> rewrite_tensorize

            commit 823797e
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 18:12:12 2022 +0900

                VNNI -> WithIntrin

            commit 4284a47
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 17:45:41 2022 +0900

                introduce TileForIntrin

            commit b87ef32
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 17:34:04 2022 +0900

                move TilingwithTensorIntrin to auto_tensorize.cc

            commit 2fc118b
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 17:28:45 2022 +0900

                clean up headers

            commit d8b2aa3
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 17:09:32 2022 +0900

                clean up using namespace

            commit eb05d25
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 17:03:05 2022 +0900

                refactored init

            commit 5e6b0a0
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 16:57:14 2022 +0900

                compiled

            commit 2b8c430
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 12:51:55 2022 +0900

                wip MultiLevelTiling refactor

            commit 7c21a9f
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:58:33 2022 +0900

                function doc string not supported by tvmscript

            commit 40f9742
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:56:45 2022 +0900

                update vnni intrin name

            commit 4814f82
            Merge: e0c5eb8 07bbb38
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:44:47 2022 +0900

                Merge branch 'tir-tensor-intrin' into auto-tensorize-vnni

            commit 07bbb38
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:24:56 2022 +0900

                more lint fix

            commit 15e60b4
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:16:08 2022 +0900

                black

            commit 7a757fe
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:12:54 2022 +0900

                pylint

            commit 9a3e508
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:58:52 2022 +0900

                simplify import

            commit d8e43ec
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:52:50 2022 +0900

                use vectorlow/high in arm intrin

            commit 625cd27
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:34:57 2022 +0900

                fixed offset factor

            commit 69e72b6
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:12:02 2022 +0900

                Add ARM intrin

            commit 1351fde
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 08:27:27 2022 +0900

                use buffer syntax sugar

            commit 0ced85f
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 08:17:43 2022 +0900

                rename vnni.py to x86.py

            commit 38a5aca
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:24:44 2022 +0900

                add VNNI unittest

            commit 88b763e
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:10:06 2022 +0900

                refactored existing test using VNNI intrin

            commit 711a007
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:04:58 2022 +0900

                [TIR] Add VNNI dot product intrinsic for TIR

            commit e0c5eb8
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:42:26 2022 +0900

                merge fix

            commit b171748
            Merge: 71fe3bd 82e152a
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:33:59 2022 +0900

                Merge branch 'tir-tensor-intrin' into auto-tensorize-vnni

            commit 71fe3bd
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 06:57:38 2022 +0900

                move tensor intrin under tir

            commit 0c51bad
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 06:12:39 2022 +0900

                remove log

            commit fed910e
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 06:11:22 2022 +0900

                more revert

            commit 7150aff
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 06:10:44 2022 +0900

                revert stmt_functor change

            commit 155107b
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 06:10:09 2022 +0900

                refactored RewriteVNNI a bit

            commit ca15255
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 05:41:13 2022 +0900

                add RewriteVNNI

            commit dc9f71d
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 05:38:56 2022 +0900

                vectorized init loop

            commit fcc31ee
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 04:55:36 2022 +0900

                tensorize worked

            commit 2b53437
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 6 19:11:05 2022 +0900

                TilingwithTensorIntrin works

            commit 86baa31
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Wed Apr 6 08:58:27 2022 +0900

                Ported auto-tensorization code

            commit 82e152a
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:24:56 2022 +0900

                more lint fix

            commit 88d9bdd
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:16:08 2022 +0900

                black

            commit 31fe7eb
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 11:12:54 2022 +0900

                pylint

            commit 7876754
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:58:52 2022 +0900

                simplify import

            commit 56f2e9a
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:52:50 2022 +0900

                use vectorlow/high in arm intrin

            commit 995cc8d
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:34:57 2022 +0900

                fixed offset factor

            commit 86bbd49
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 10:12:02 2022 +0900

                Add ARM intrin

            commit 120fd96
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 08:27:27 2022 +0900

                use buffer syntax sugar

            commit 0f0682d
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 08:17:43 2022 +0900

                rename vnni.py to x86.py

            commit f88c31e
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:24:44 2022 +0900

                add VNNI unittest

            commit 6cc8009
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:10:06 2022 +0900

                refactored existing test using VNNI intrin

            commit 11a29c7
            Author: Masahiro Masuda <masahi129@gmail.com>
            Date:   Thu Apr 7 07:04:58 2022 +0900

                [TIR] Add VNNI dot product intrinsic for TIR
masahi committed Apr 15, 2022
1 parent b99a096 commit 90e7d89
Showing 15 changed files with 611 additions and 23 deletions.
10 changes: 10 additions & 0 deletions include/tvm/meta_schedule/schedule_rule.h
@@ -150,6 +150,16 @@ class ScheduleRule : public runtime::ObjectRef {
Optional<Array<Integer>> vector_load_lens, //
Optional<Map<String, ObjectRef>> reuse_read, //
Optional<Map<String, ObjectRef>> reuse_write);

/*!
 * \brief Extension of MultiLevelTiling that tensorizes the innermost computation with the given
 * tensor intrinsic
 * \param intrin_name The name of a tensor intrinsic, must be registered beforehand
 * \param structure The tiling structure
 * \param tile_binds For each level of tiles, which thread axis it is bound to
 * \param max_innermost_factor The maximum size of the innermost factor. NullOpt means no limit
 * \param vector_load_lens The lengths of vector lanes in vectorized cooperative fetching.
 * NullOpt means disable vectorization
 * \param reuse_read Data reuse configuration for reading. NullOpt means no reuse
 * \param reuse_write Data reuse configuration for writing. NullOpt means no reuse
 * \return The schedule rule created
 */
TVM_DLL static ScheduleRule MultiLevelTilingWithIntrin(
String intrin_name, //
String structure, //
Optional<Array<String>> tile_binds, //
Optional<Integer> max_innermost_factor, //
Optional<Array<Integer>> vector_load_lens, //
Optional<Map<String, ObjectRef>> reuse_read, //
Optional<Map<String, ObjectRef>> reuse_write);

/*!
* \brief Create a rule: add-rfactor to some blocks if needed
* \param max_jobs_per_core The maximum number of jobs to be launched per CPU core. It sets the
5 changes: 5 additions & 0 deletions include/tvm/tir/stmt.h
@@ -1509,6 +1509,11 @@ constexpr const char* meta_schedule_unroll_explicit = "meta_schedule.unroll_expl
/*! \brief Mark auto-unroll setting on the block. */
constexpr const char* meta_schedule_unroll_implicit = "meta_schedule.unroll_implicit";

/*!
* \brief Mark that the block should be further rewritten using tensorization.
*/
constexpr const char* meta_schedule_auto_tensorize = "meta_schedule.auto_tensorize";

/*!
* \brief Check if attr_key is a pragma key extension
* \param attr_key The attr key to be compared
1 change: 1 addition & 0 deletions python/tvm/meta_schedule/postproc/__init__.py
@@ -22,3 +22,4 @@
from .rewrite_reduction_block import RewriteReductionBlock
from .rewrite_unbound_block import RewriteUnboundBlock
from .verify_gpu_code import VerifyGPUCode
from .rewrite_tensorize import RewriteTensorize
33 changes: 33 additions & 0 deletions python/tvm/meta_schedule/postproc/rewrite_tensorize.py
@@ -0,0 +1,33 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""A postprocessor that tensorize related components."""

from tvm._ffi.registry import register_object
from .. import _ffi_api
from .postproc import Postproc
import tvm.tir.tensor_intrin


@register_object("meta_schedule.RewriteTensorize")
class RewriteTensorize(Postproc):
"""A postprocessor that tensorize related components."""

def __init__(self, vectorize_init_loop=False) -> None:
self.__init_handle_by_constructor__(
_ffi_api.PostprocRewriteTensorize, # type: ignore # pylint: disable=no-member
vectorize_init_loop
)
2 changes: 1 addition & 1 deletion python/tvm/meta_schedule/schedule_rule/__init__.py
@@ -22,7 +22,7 @@
from .add_rfactor import AddRFactor
from .auto_inline import AutoInline
from .cross_thread_reduction import CrossThreadReduction
from .multi_level_tiling import MultiLevelTiling, ReuseType
from .multi_level_tiling import MultiLevelTiling, MultiLevelTilingWithIntrin, ReuseType
from .parallel_vectorize_unroll import ParallelizeVectorizeUnroll
from .random_compute_location import RandomComputeLocation
from .schedule_rule import PyScheduleRule, ScheduleRule
47 changes: 47 additions & 0 deletions python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py
@@ -82,3 +82,50 @@ def __init__(
reuse_read.as_dict() if reuse_read is not None else None,
reuse_write.as_dict() if reuse_write is not None else None,
)


@register_object("meta_schedule.MultiLevelTilingWithIntrin")
class MultiLevelTilingWithIntrin(ScheduleRule):
"""Multi-level tiling with reuse.
Parameters
----------
structure : str
The tiling structure. Recommended:
- 'SSRSRS' on CPU
- 'SSSRRSRS' on GPU
tile_bind : Optional[List[str]]
For each level of tiles, which thread axis it is bound to. Recommended:
- None on CPU
- [blockIdx.x, vthread.x, threadIdx.x] on GPU
max_innermost_factor : Optional[int]
The maximum size of the innermost factor. None means no limit
vector_load_lens : Optional[List[int]]
The length of vector lane in vectorized cooperative fetching.
None means disable vectorization
reuse_read : Optional[ReuseType]
Data reuse configuration for reading. None means no reuse.
reuse_write : Optional[ReuseType]
Data reuse configuration for writing. None means no reuse.
"""

def __init__(
self,
intrin_name: str,
structure: str,
tile_binds: Optional[List[str]] = None,
max_innermost_factor: Optional[int] = None,
vector_load_lens: Optional[List[int]] = None,
reuse_read: Optional[ReuseType] = None,
reuse_write: Optional[ReuseType] = None,
) -> None:
self.__init_handle_by_constructor__(
_ffi_api.ScheduleRuleMultiLevelTilingWithIntrin, # type: ignore # pylint: disable=no-member
intrin_name,
structure,
tile_binds,
max_innermost_factor,
vector_load_lens,
reuse_read.as_dict() if reuse_read is not None else None,
reuse_write.as_dict() if reuse_write is not None else None,
)
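The tiling `structure` strings documented above ('SSRSRS' on CPU, 'SSSRRSRS' on GPU) encode one character per tile level: 'S' marks a level that splits the spatial loops, 'R' a level that splits the reduction loops. A minimal pure-Python sketch of that encoding (a hypothetical helper for illustration, not part of the TVM API):

```python
def parse_tiling_structure(structure: str) -> tuple:
    """Count the spatial ('S') and reduction ('R') tile levels encoded in a
    meta-schedule tiling structure string."""
    assert set(structure) <= {"S", "R"}, "structure may only contain 'S' and 'R'"
    return structure.count("S"), structure.count("R")


# 'SSRSRS' (CPU): 4 spatial tile levels interleaved with 2 reduction levels.
print(parse_tiling_structure("SSRSRS"))  # → (4, 2)
```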
104 changes: 104 additions & 0 deletions src/meta_schedule/postproc/rewrite_tensorize.cc
@@ -0,0 +1,104 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
#include <algorithm>

#include "../utils.h"
#include "tvm/runtime/container/base.h"

namespace tvm {
namespace meta_schedule {

using tir::BlockRV;
using tir::LoopRV;

void ApplyTensorization(const tir::Schedule& sch, const String& func_name,
const tir::PrimFuncNode* func, bool vectorize_init_loop) {
std::vector<std::pair<std::string, std::function<void(tir::BlockRV)>>> jobs;

tir::PostOrderVisit(func->body, [=, &jobs](const ObjectRef& obj) -> bool {
if (const auto* block = obj.as<tir::BlockNode>()) {
tir::StmtSRef block_sref = sch->GetSRef(block);
if (Optional<String> intrin_name =
tir::GetAnn<String>(block_sref, tir::attr::meta_schedule_auto_tensorize)) {
std::string block_name = block_sref->StmtAs<tir::BlockNode>()->name_hint;
if (block_name.find("init") == std::string::npos) {
jobs.emplace_back(block_name, [sch, intrin_name](tir::BlockRV block) {
try {
sch->Tensorize(block, intrin_name.value());
} catch (const std::exception& e) {
LOG(WARNING) << "Tensorize failed with error " << e.what();
}
});
} else if (vectorize_init_loop) {
jobs.emplace_back(block_name, [sch](tir::BlockRV block) {
Array<BlockRV> child_blocks = sch->GetChildBlocks(block);
ICHECK(child_blocks.size() == 1);
Array<LoopRV> init_loops = sch->GetLoops(child_blocks[0]);
ICHECK(init_loops.size() == 1);
sch->Vectorize(init_loops[0]);
});
}
}
}
return true;
});

for (const auto& kv : jobs) {
tir::BlockRV block = sch->GetBlock(kv.first, func_name);
sch->Unannotate(block, tir::attr::meta_schedule_auto_tensorize);
kv.second(block);
}
}

class RewriteTensorizeNode : public PostprocNode {
public:
void InitializeWithTuneContext(const TuneContext& context) final {}

bool Apply(const tir::Schedule& sch) final;

void VisitAttrs(tvm::AttrVisitor* v) {}

bool vectorize_init_loop = false;

static constexpr const char* _type_key = "meta_schedule.RewriteTensorize";
TVM_DECLARE_FINAL_OBJECT_INFO(RewriteTensorizeNode, PostprocNode);
};

bool RewriteTensorizeNode::Apply(const tir::Schedule& sch) {
for (const auto& kv : sch->mod()->functions) {
GlobalVar g_var = kv.first;
BaseFunc base_func = kv.second;
if (const tir::PrimFuncNode* prim_func = base_func.as<tir::PrimFuncNode>()) {
ApplyTensorization(sch, g_var->name_hint, prim_func, vectorize_init_loop);
}
}
return true;
}

Postproc RewriteTensorize(bool vectorize_init_loop) {
ObjectPtr<RewriteTensorizeNode> n = make_object<RewriteTensorizeNode>();
n->vectorize_init_loop = vectorize_init_loop;
return Postproc(n);
}

TVM_REGISTER_NODE_TYPE(RewriteTensorizeNode);
TVM_REGISTER_GLOBAL("meta_schedule.PostprocRewriteTensorize").set_body_typed(RewriteTensorize);

} // namespace meta_schedule
} // namespace tvm
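The name-based dispatch in `ApplyTensorization` above can be summarized as: annotated blocks whose name does not contain "init" are tensorized (with failures downgraded to warnings), while init blocks optionally get their loop vectorized instead. A pure-Python sketch of just that dispatch decision (hypothetical helper names, not TVM API):

```python
def plan_rewrites(annotated_blocks, vectorize_init_loop=False):
    """Decide, per annotated block, which rewrite ApplyTensorization would
    schedule: 'tensorize' for compute blocks, 'vectorize_init' for init
    blocks when vectorize_init_loop is set, nothing otherwise."""
    jobs = []
    for name in annotated_blocks:
        if "init" not in name:
            jobs.append((name, "tensorize"))
        elif vectorize_init_loop:
            jobs.append((name, "vectorize_init"))
    return jobs
```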
99 changes: 99 additions & 0 deletions src/meta_schedule/schedule_rule/auto_tensorize.cc
@@ -0,0 +1,99 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#include "auto_tensorize.h"

#include "../../tir/schedule/analysis.h"

namespace tvm {
namespace meta_schedule {

using tir::LoopRV;

Optional<LoopRV> TilingwithTensorIntrin(const tir::Schedule& sch, const tir::BlockRV& block_rv,
const String& intrin_name) {
Optional<tir::TensorizeInfo> opt_tensorize_info = GetTensorizeLoopMapping(
sch->state(), sch->GetSRef(block_rv), tir::TensorIntrin::Get(intrin_name)->desc);
if (!opt_tensorize_info) return NullOpt;
const tir::TensorizeInfoNode* info = opt_tensorize_info.value().get();
// Construct a mapping from tir loops back to LoopRVs
Map<tir::StmtSRef, LoopRV> loop2rv;
{
Array<LoopRV> loop_rvs = sch->GetLoops(block_rv);
for (const LoopRV& loop_rv : loop_rvs) {
loop2rv.Set(sch->GetSRef(loop_rv), loop_rv);
}
}
// Split the loops
arith::Analyzer analyzer;
std::unordered_set<const tir::StmtSRefNode*> inner_loops;
std::vector<LoopRV> reorder_suffix;
reorder_suffix.resize(info->loop_map.size());
for (const auto& kv : info->loop_map) {
// Extract mapping (block_loop => desc_loop)
const tir::StmtSRef& block_loop_sref = kv.first;
const tir::ForNode* block_loop = block_loop_sref->StmtAs<tir::ForNode>();
const tir::ForNode* desc_loop = kv.second.get();
ICHECK(block_loop != nullptr && desc_loop != nullptr);
// Extract the loop extent
PrimExpr block_extent = analyzer.Simplify(block_loop->extent);
PrimExpr desc_extent = analyzer.Simplify(desc_loop->extent);
const auto* int_block_extent = block_extent.as<IntImmNode>();
const auto* int_desc_extent = desc_extent.as<IntImmNode>();
ICHECK(int_block_extent != nullptr && int_desc_extent != nullptr);
// Check divisibility
int64_t total = int_block_extent->value;
int64_t inner = int_desc_extent->value;
ICHECK_EQ(total % inner, 0);
int64_t outer = int_block_extent->value / int_desc_extent->value;
// Do the split
Array<LoopRV> split = sch->Split(loop2rv.at(block_loop_sref), {Integer(outer), Integer(inner)});
ICHECK_EQ(split.size(), 2);
inner_loops.insert(sch->GetSRef(split[1]).operator->());
// The inner split will be reordered to the loop domain that is tensorized
int desc_loop_index = info->desc_loop_indexer.at(GetRef<tir::For>(desc_loop));
reorder_suffix[desc_loop_index] = split[1];
}
// Reorder the loops
std::vector<LoopRV> reorder_list;
bool meet = false;
Array<LoopRV> all_loops = sch->GetLoops(block_rv);
for (const LoopRV& loop : all_loops) {
if (inner_loops.count(sch->GetSRef(loop).operator->())) {
meet = true;
} else if (meet) {
reorder_list.push_back(loop);
}
}
reorder_list.insert(reorder_list.end(), reorder_suffix.begin(), reorder_suffix.end());
sch->Reorder(reorder_list);
ICHECK(!reorder_suffix.empty());
return reorder_suffix[0];
}

tir::BlockRV TileForIntrin(tir::Schedule sch, tir::BlockRV block, const std::string& intrin_name) {
Optional<tir::LoopRV> tiled_loop_rv = TilingwithTensorIntrin(sch, block, intrin_name);
ICHECK(tiled_loop_rv.defined());
tir::BlockRV outer_block = sch->Blockize(tiled_loop_rv.value());
sch->Annotate(outer_block, tir::attr::meta_schedule_auto_tensorize, String(intrin_name));
return outer_block;
}

} // namespace meta_schedule
} // namespace tvm
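The per-loop arithmetic in `TilingwithTensorIntrin` above reduces to: simplify both extents, require the block-loop extent to be divisible by the descriptor-loop extent, and split the loop into an `(outer, inner)` pair whose inner part matches the intrinsic. A standalone sketch of that check (hypothetical helper, not TVM API):

```python
def split_for_intrin(block_extents, desc_extents):
    """Mirror the divisibility check and split-factor computation that
    TilingwithTensorIntrin performs for each mapped loop pair."""
    factors = []
    for total, inner in zip(block_extents, desc_extents):
        if total % inner != 0:
            raise ValueError(f"extent {total} not divisible by intrinsic extent {inner}")
        factors.append((total // inner, inner))
    return factors


# E.g. a 16x16x4 dot-product intrinsic applied to 64x64x16 loops:
print(split_for_intrin([64, 64, 16], [16, 16, 4]))  # → [(4, 16), (4, 16), (4, 4)]
```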
35 changes: 35 additions & 0 deletions src/meta_schedule/schedule_rule/auto_tensorize.h
@@ -0,0 +1,35 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
#ifndef TVM_META_SCHEDULE_SCHEDULE_RULE_AUTO_TENSORIZE_H_
#define TVM_META_SCHEDULE_SCHEDULE_RULE_AUTO_TENSORIZE_H_

#include <tvm/tir/schedule/schedule.h>

namespace tvm {
namespace meta_schedule {

Optional<tir::LoopRV> TilingwithTensorIntrin(const tir::Schedule& sch, const tir::BlockRV& block_rv,
const String& intrin_name);

tir::BlockRV TileForIntrin(tir::Schedule sch, tir::BlockRV block, const std::string& intrin_name);

} // namespace meta_schedule
} // namespace tvm

#endif // TVM_META_SCHEDULE_SCHEDULE_RULE_AUTO_TENSORIZE_H_