Minor fix #18

icemelon · 2017-01-18T03:25:49Z

No description provided.

tqchen · 2017-01-18T03:46:11Z

src/pass/ssa.cc

      } else {
        defined_.insert(v.get());
      }
+      // assign a unique name to variable
+      std::stringstream new_name;
+      if (name_index_.count(old_name) != 0) {


This will cause error when both i1 and i are var names

It might be worth write a testcase for such case

tqchen · 2017-01-18T03:47:24Z

src/pass/ssa.cc

        defined_.insert(v.get());
      }
+      // assign a unique name to variable
+      std::stringstream new_name;
+      if (name_index_.count(old_name) != 0) {


Use a function to merge the logic in Stmt and Expr together, e.g. GetUniqueName

* Rename file to avoid absolute import name clash * Remove old Relay Makefiles

* support tuning a whole network * pass unit test * update tune resnet * update all

@tqchen

* uTVM interfaces (#14) * some minor interface changes * implemented HostLowLevelDevice * added MicroDeviceAPI * implemented micro_common and added Python interfaces * current status, semi implemented micro session * added micro_common implementation and python interfaces (#18) * added micro_common implementation and python interfaces (#18) * current status, semi implemented * host test working * updated interfaces for MicroSession arguments allocation * make somewhat lint compatible * fix based on comments * added rounding macro * fix minor bug * improvements based on comments * Clean up `binutil.py` and make Python-3-compatible * Change argument allocation design * Address feedback and lint errors * Improve binutil tests * Simplify allocator (per @tqchen's suggestions) * Doc/style fixes * farts * mcgee * rodata section werks (and so does `test_runtime_micro_workspace.py`) * simple graph runtime werk * TEMP * ResNet works, yo * First round of cleanup * More cleanup * runs a dyson over the code * Another pass * Fix `make lint` issues * ready to pr... probably * final * Undo change * Fix rebase resolution * Minor fixes * Undo changes to C codegen tests * Add `obj_path` in `create_micro_lib` * TEMP * Address feedback * Add missing TODO * Partially address feedback * Fix headers * Switch to enum class for `SectionKind` * Add missing ASF header * Fix lint * Fix lint again * Fix lint * Kill lint warnings * Address feedback * Change Python interface to MicroTVM All interaction with the device is now through `Session` objects, which are used through Python's `with` blocks. * Reorder LowLevelDevice interface * Store shared ptr to session in all alloced objects * Move helper functions out of `tvm.micro` * Switch static char arr to vector * Improve general infra and code quality Does not yet address all of tqchen's feedback * Forgot a rename * Fix lint * Add ASF header * Fix lint * Partially address MarisaKirisame's feedback * Lint * Expose `MicroSession` as a node to Python * Revert to using `Session` constructor * Fix compiler error * (Maybe) fix CI error * Debugging * Remove * Quell lint * Switch to stack-based session contexts * Make uTVM less intrusive to host codegen And use SSA for operands of generated ternary operators * Inline UTVMArgs into UTVMTask struct * Remove `HostLowLevelDevice` header * Remove `BaseAddr` class * Address feedback * Add "utvm" prefix to global vars in runtime * Fix lint * Fix CI * Fix `test_binutil.py` * Fix submodules * Remove ResNet tests * Make `test_binutil.py` work with nose * Fix CI * I swear this actually fixes the binutil tests * lint * lint * Add fcompile-compatible cross-compile func * Add docs for uTVM runtime files * Move pointer patching into `MicroSession` * Fix lint * First attempt at unifying cross-compile APIs * Fix lint * Rename `cross_compile` back to `cc` * Address feedback * Remove commented code * Lint * Figure out failing function * Remove debugging code * Change "micro_dev" target to "micro" * Add checks in tests for whether uTVM is enabled * Add TODO for 32-bit support * Rename more "micro_dev" to "micro" * Undo rename We already have `tvm.micro` as a namespace. Can't have it as a method as well. * Fix failing CI Thanks to @tqchen for finding this bug. Emitting ternary operators for `min` and `max` causes concurrency bugs in CUDA, so we're moving the ternary op emissions from `CodeGenC` to `CodeGenCHost`. * Address feedback * Fix lint

@tqchen

* uTVM interfaces (apache#14) * some minor interface changes * implemented HostLowLevelDevice * added MicroDeviceAPI * implemented micro_common and added Python interfaces * current status, semi implemented micro session * added micro_common implementation and python interfaces (apache#18) * added micro_common implementation and python interfaces (apache#18) * current status, semi implemented * host test working * updated interfaces for MicroSession arguments allocation * make somewhat lint compatible * fix based on comments * added rounding macro * fix minor bug * improvements based on comments * Clean up `binutil.py` and make Python-3-compatible * Change argument allocation design * Address feedback and lint errors * Improve binutil tests * Simplify allocator (per @tqchen's suggestions) * Doc/style fixes * farts * mcgee * rodata section werks (and so does `test_runtime_micro_workspace.py`) * simple graph runtime werk * TEMP * ResNet works, yo * First round of cleanup * More cleanup * runs a dyson over the code * Another pass * Fix `make lint` issues * ready to pr... probably * final * Undo change * Fix rebase resolution * Minor fixes * Undo changes to C codegen tests * Add `obj_path` in `create_micro_lib` * TEMP * Address feedback * Add missing TODO * Partially address feedback * Fix headers * Switch to enum class for `SectionKind` * Add missing ASF header * Fix lint * Fix lint again * Fix lint * Kill lint warnings * Address feedback * Change Python interface to MicroTVM All interaction with the device is now through `Session` objects, which are used through Python's `with` blocks. * Reorder LowLevelDevice interface * Store shared ptr to session in all alloced objects * Move helper functions out of `tvm.micro` * Switch static char arr to vector * Improve general infra and code quality Does not yet address all of tqchen's feedback * Forgot a rename * Fix lint * Add ASF header * Fix lint * Partially address MarisaKirisame's feedback * Lint * Expose `MicroSession` as a node to Python * Revert to using `Session` constructor * Fix compiler error * (Maybe) fix CI error * Debugging * Remove * Quell lint * Switch to stack-based session contexts * Make uTVM less intrusive to host codegen And use SSA for operands of generated ternary operators * Inline UTVMArgs into UTVMTask struct * Remove `HostLowLevelDevice` header * Remove `BaseAddr` class * Address feedback * Add "utvm" prefix to global vars in runtime * Fix lint * Fix CI * Fix `test_binutil.py` * Fix submodules * Remove ResNet tests * Make `test_binutil.py` work with nose * Fix CI * I swear this actually fixes the binutil tests * lint * lint * Add fcompile-compatible cross-compile func * Add docs for uTVM runtime files * Move pointer patching into `MicroSession` * Fix lint * First attempt at unifying cross-compile APIs * Fix lint * Rename `cross_compile` back to `cc` * Address feedback * Remove commented code * Lint * Figure out failing function * Remove debugging code * Change "micro_dev" target to "micro" * Add checks in tests for whether uTVM is enabled * Add TODO for 32-bit support * Rename more "micro_dev" to "micro" * Undo rename We already have `tvm.micro` as a namespace. Can't have it as a method as well. * Fix failing CI Thanks to @tqchen for finding this bug. Emitting ternary operators for `min` and `max` causes concurrency bugs in CUDA, so we're moving the ternary op emissions from `CodeGenC` to `CodeGenCHost`. * Address feedback * Fix lint

Paddle frontend add unary function

* Don't use checked_type for annotating types in the matcher (breaks tests and relies on type checking) * Add callback feature to exact matcher, expand tests, simplify recent additions

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai> fix some fixes fix test

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

* VM compiler. * Update. * Compile IRmodule; expose Python api * Add dtype contant serialization and type hint. * Address comments. * Add todos and fix lint. * Update * Update.

[SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> Fix AxisTree (apache#3) * fix axis tree * upd [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr [BugFix] Fix binary search & SpIterVar (apache#7) [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting [SparseTIR] Index Lowering (apache#8) * Add StmtFunctor/ExprFunctor for SparseBufferStore/Load * Add basic index lowering * Finish index lowering (maybe) * Address comments * Convert CRLF to LF Frontend update, demo scripts. (apache#10) * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr * [BugFix] Fix binary search & SpIterVar (apache#7) * [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting * upd * upd Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> [SparseTIR] SparseBlock on C++/Python side (apache#11) * Fix a bug in the last commit * SparseBlock on C++ & Python side [BugFix][SparseTIR] TVMScript Parser for Axis & SpIterVar (apache#12) * Update `cord` and `pos` * Fix `idtype` * Formatting.. * Bug fix 1 * Move new special stmts * Parser for Axis and SpIterVar * Fix context_maintainer.py [SparseTIR] Enhance SparseBlock to contain enough PrimFunc information (apache#13) * Enhance SparseBlock to have enough PrimFunc info * Remove `func_sparse_buffer_map_` * Don't print the map uh-huh [SparseTIR] Parser, Printer, Roundtrip (apache#14) * SparseBlock scope handler (part 1) * SparseBlock scope handler (part 2) * SparseBlock scope handler (part 3) * SparseBlock scope handler (fix 1) * Add SparseBufferLoad/Store on Python side * Parser for SparseBufferLoad/Store * Add SparseBlock to Python __init__ * StmtFunctor for SparseBlock * Ensure at least one dimension for SparseBuffer * Make `axis` field of SpIterVar mandatory * SparseBlock scope handler (fix 2) * Update Axis syntax by removing `name` parameter * Move to intrin.py * Add filed `from_sparse` to DenseFixedAxis * SparseTIR script printer * Roundtrip test * `update_symbol` bug fix * Fix attr visit in SparseBuffer * Define then compare in SparseBlock * Fix printer bug for SparseBuffer * Enable graph match for Axis and SparseBuffer * Complete HashReduce and EqualReduce for AxisTree and SparseBuffer * Fix typo * Rename test * Bug fix 1 * Bug fix 2 * Add more tests Move tests (apache#15) [SparseTIR] ReprPrinter for Axis and SpIterVar (apache#16) upd (apache#17) flatten (apache#18) ELL and BSR correctness test scripts (apache#19) [SparseTIR] SparseTIR Lowering (apache#20) * Fix a previous bug of sparse-fixed SpIterVar creation * Fix a previous bug in `GetDenseValue` * Refactor Collector and IndexTransformer * Construct block and loops * Fix a previous bug which rejects DV iters in collector * Update buffer map * Create root block * Fix bug of sparse-fixed SpIterVar creation * Fix bug on SpIterVar conversion (with refactor) * Fix bug when getting dependent SpIterVars * Fix bug on dependency map and index lowering * Full block read/write region * Test version 1 * Fix bug of loop order * Fix bug of batch-mm iterator ordering * Update PrimFunc args to use symbolic params * Fix bug of test "csr_element_wise" * Fix bug of index accumulation for sparse-fixed axis * Update correctness test * Test structural equality * Refactor and use Array fix nnz cols Add docstring for sparse tir lowering (apache#21) * add docstring * upd Add more examples part 1 (sddmm) (apache#22) * upd * upd * upd [SparseTIR][Schedule] SparseBlockRV, GetSparseBlock, SparseReorder (apache#23) * Test initialization * Fix a stupid bug of ReprPrinter * Add SparseBlockRV * Schedule: GetSparseBlock * Schedule: Reorder [SparseTIR][Schedule] GetSpIters (apache#24) remove hybrid script for successful compilation Add atomic intrinsic for output nonzero inference. (apache#25) * upd * upd Add "sparse" block attribute. (apache#26) Revert "remove hybrid script for successful compilation" This reverts commit eebd7c1. [SparseTIR] Hack `IsAffineBinding` check (apache#27) * [TensorIR][Schedule] Inherit block anotation upon creating new blocks * Fix SDDMM test * Hack IsAffineBinding for sparse blocks Axis Dependency Tree aware code-gen and bmm example (apache#28) * upd * upd * upd * upd * upd * upd * upd * upd * remove redundancy * fix * upd * upd Re-design Indices lowering (apache#29) * upd * upd * upd * upd * upd * init * format * fix * revise coding-style * format Complete indices lowering (apache#30) * upd * upd * upd * done * upd * passed test * upd Add more docstrings and depress warnings for new lowering algorithm. (apache#31) Refactor derived axis, frontend support of fusion. (apache#32) * upd * upd * fix Fatal bugfix and change the signature of DenseVariableAxis. (apache#33) Syntax simplification (apache#34) Change the order of generated blocks for block isolation. (apache#35) * upd * upd * upd Syntax of AttachAxis for BMM (apache#36) * upd * upd * upd [SparseTIR] Add "square sum" lowering test (apache#37) * Add square sum test * Remove pylint comment [BugFix] Fix offset caching in lowering (apache#38) * Hack compact dataflow check in a dirty way * Add two-K square sum test * Mark skipped tests * Fix offset saving in lowering Fusion syntax fix + SDDMM example. (apache#39) Some structure change on update offsets. (apache#40) [Refactor] SparseTIR Lowering (apache#41) * Take out methods in Scope * Refactor * Refactor "match" * Tweak scope contents * Refactor ViewIndexInAxis * Refactor Scope * SDDMM tests under implementation * Refactor block stack * Use Map for var_map * Extract NeedCreateNewBlock * Simplify SpIterVarToIterVar via GetIterExtent * Refactor NeedCreateNewBlock * Add docstring * Use "auto" correctly * Minor refactor and use some move Remove redundant analyzers (apache#42) Support indices lowering for attach and fuse. (apache#43) * upd * upd * upd Fix irregular BMM example. (apache#44) * upd * upd * upd * upd RGCN forward and butterfly pattern example. (apache#45) Fused SDDMM example. (apache#46) * upd * wip * fix Fix sparse reorder after refactor (apache#47) [Refactor] Refactor Unittest (apache#48) * upd * remove redundancy [Unittest] Correctness test for benchmarking scripts (apache#49) Bugfix and more test for axis fusion, new workload (apache#50) * upd * upd upd

* VM compiler. * Update. * Compile IRmodule; expose Python api * Add dtype contant serialization and type hint. * Address comments. * Add todos and fix lint. * Update * Update.

* relax.cumsum * Legalizer for expand_dims * relax.trilu * relax.cast * Legalizer for batch_norm and flatten * relax.take * relax.full * relax.split * relax.broadcast_to * relax.strided_slice * relax.image.resize2d * relax.nn.max_pool2d * relax.nn.adaptive_avg_pool2d

* VM compiler. * Update. * Compile IRmodule; expose Python api * Add dtype contant serialization and type hint. * Address comments. * Add todos and fix lint. * Update * Update.

Fix CI * Fix CI Approved-by: Spenser Gilliland Approved-by: Lam Nguyen

[EXP] TVM AMP + NHWC + autoscheduler Closes apache#18 See merge request RTST_AI/tvm!4

* Update dependabot.yml * remove dependency. * UPDATE INT8xINT2 * refactor the docs' link * update int range * Update installation and integration links in README.md

[PASS] Assign unique names to variables in ConvertSSA pass

4d704a4

tqchen reviewed Jan 18, 2017

View reviewed changes

revert change to ConverSSA pass

aa8edec

tqchen approved these changes Jan 18, 2017

View reviewed changes

tqchen merged commit 110c9be into master Jan 18, 2017

tqchen changed the title ~~[PASS] Assign unique names to variables in ConvertSSA pass~~ Minor fix Jan 20, 2017

tqchen deleted the haichen branch May 5, 2017 02:02

tqchen added a commit to tqchen/tvm that referenced this pull request May 26, 2018

[PASS] Add place device (apache#18)

a33e9ce

tqchen added a commit that referenced this pull request May 29, 2018

[PASS] Add place device (#18)

0413c40

tqchen added a commit to tqchen/tvm that referenced this pull request Jul 6, 2018

[PASS] Add place device (apache#18)

204c444

sergei-mironov pushed a commit to sergei-mironov/tvm that referenced this pull request Aug 8, 2018

[PASS] Add place device (apache#18)

01a3a03

jroesch added a commit to jroesch/tvm that referenced this pull request Aug 29, 2018

Remove Makefiles (apache#18)

af8600a

* Rename file to avoid absolute import name clash * Remove old Relay Makefiles

jroesch pushed a commit to jroesch/tvm that referenced this pull request Dec 3, 2018

[AUTOTVM] End2End autotvm support for vta (apache#18)

7055803

* support tuning a whole network * pass unit test * update tune resnet * update all

jroesch pushed a commit to jroesch/tvm that referenced this pull request Dec 3, 2018

[AUTOTVM] End2End autotvm support for vta (apache#18)

e47976c

* support tuning a whole network * pass unit test * update tune resnet * update all

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Jan 2, 2019

[AUTOTVM] End2End autotvm support for vta (apache#18)

1cd86af

* support tuning a whole network * pass unit test * update tune resnet * update all

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Mar 22, 2019

[AUTOTVM] End2End autotvm support for vta (apache#18)

7369164

* support tuning a whole network * pass unit test * update tune resnet * update all

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Mar 22, 2019

[AUTOTVM] End2End autotvm support for vta (apache#18)

e3e60ea

* support tuning a whole network * pass unit test * update tune resnet * update all

weberlo pushed a commit to weberlo/tvm that referenced this pull request May 22, 2019

added micro_common implementation and python interfaces (apache#18)

d9b996b

weberlo pushed a commit to weberlo/tvm that referenced this pull request May 22, 2019

added micro_common implementation and python interfaces (apache#18)

e4de437

u99127 mentioned this pull request Jun 1, 2019

Implementation of uTVM #3227

Merged

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 11, 2019

added micro_common implementation and python interfaces (apache#18)

8d62426

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 11, 2019

added micro_common implementation and python interfaces (apache#18)

3502475

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 12, 2019

added micro_common implementation and python interfaces (apache#18)

38566e3

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 12, 2019

added micro_common implementation and python interfaces (apache#18)

fcfd0f9

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 23, 2019

added micro_common implementation and python interfaces (apache#18)

22f5ebd

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 23, 2019

added micro_common implementation and python interfaces (apache#18)

ab11dac

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 24, 2019

added micro_common implementation and python interfaces (apache#18)

e6c4448

weberlo pushed a commit to weberlo/tvm that referenced this pull request Jul 24, 2019

added micro_common implementation and python interfaces (apache#18)

059fdc2

wjj19950828 pushed a commit to wjj19950828/tvm that referenced this pull request Sep 6, 2021

Merge pull request apache#18 from heliqi/paddle_frontend

5ae3b5b

Paddle frontend add unary function

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Nov 13, 2021

flatten (apache#18)

3ccf09d

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Nov 20, 2021

flatten (apache#18)

b21e9e7

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Nov 24, 2021

flatten (apache#18)

208dfee

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Dec 24, 2021

flatten (apache#18)

97e36b3

yelite pushed a commit to yelite/tvm that referenced this pull request Feb 17, 2023

VM compiler. (apache#18)

d3e432d

* VM compiler. * Update. * Compile IRmodule; expose Python api * Add dtype contant serialization and type hint. * Address comments. * Add todos and fix lint. * Update * Update.

mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023

Merged in fix_ci (pull request apache#18)

fe2757e

Fix CI * Fix CI Approved-by: Spenser Gilliland Approved-by: Lam Nguyen

neigh80 added a commit to Cloud-AutoVe-SWPlatform/tvm that referenced this pull request Jan 22, 2024

Merge branch 'exp/amp-nhwc-autoscheduler' into 'develop'

d91b2db

[EXP] TVM AMP + NHWC + autoscheduler Closes apache#18 See merge request RTST_AI/tvm!4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minor fix #18

Minor fix #18

icemelon commented Jan 18, 2017

tqchen Jan 18, 2017

tqchen Jan 18, 2017

tqchen Jan 18, 2017

Minor fix #18

Minor fix #18

Conversation

icemelon commented Jan 18, 2017

tqchen Jan 18, 2017

Choose a reason for hiding this comment

tqchen Jan 18, 2017

Choose a reason for hiding this comment

tqchen Jan 18, 2017

Choose a reason for hiding this comment