Misc Improvement of Evolutionary Search #549
CC @zxybazh: I finally managed to get rid of CachedTrace :-)
526b92d to d753e87
@@ -146,8 +146,8 @@ bool RewriteCooperativeFetchNode::Apply(const tir::Schedule& sch) {
sch->Vectorize(split[3]);
sch->Bind(split[2], "threadIdx.x");
sch->Bind(split[1], "threadIdx.y");
sch->StorageAlign(block, 0, -2, 32, 8);
@spectrometerHBH Let me know if my fix makes sense to you
@@ -129,7 +131,7 @@ def main(var_A: T.handle, var_B: T.handle, var_C: T.handle) -> None:
v1 = T.axis.spatial(512, i2_0_0 * 128 + (ax0_ax1_fused_0 * 256 + ax0_ax1_fused_1 * 32 + ax0_ax1_fused_2) % 128)
T.reads([A[v0, v1]])
T.writes([A_shared[v0, v1]])
T.block_attr({"meta_schedule.cooperative_fetch":1})
T.block_attr({"meta_schedule.cooperative_fetch":1, "buffer_dim_align": [[0, 0, 32, 8]]})
@spectrometerHBH Let me know if the fix makes sense to you
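For readers following the two hunks above: `StorageAlign(block, 0, -2, 32, 8)` in the postprocessor appears in the expected test output as `"buffer_dim_align": [[0, 0, 32, 8]]`. The sketch below is a plain-Python illustration with no TVM dependency. It assumes each annotation entry is `[buffer_index, axis, factor, offset]`, that a negative axis is normalized against the buffer rank, and that the padding rule makes `stride % factor == offset` on the aligned axis. The helper names `normalize_axis` and `aligned_strides` and the `(64, 128)` shape are hypothetical and do not come from this PR.

```python
def normalize_axis(axis, ndim):
    """Map a possibly negative axis (e.g. -2) to a non-negative index."""
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} out of range for rank {ndim}")
    return axis % ndim


def aligned_strides(shape, dim_align):
    """Compute row-major strides, padding the aligned axes.

    dim_align is a list of (axis, factor, offset) tuples. For each aligned
    axis the stride is bumped to the next value with
    stride % factor == offset (assumed padding rule).
    """
    ndim = len(shape)
    align = {normalize_axis(a, ndim): (f, o) for a, f, o in dim_align}
    strides = [0] * ndim
    stride = 1
    # Walk from the innermost dimension outwards.
    for dim in reversed(range(ndim)):
        if dim in align:
            factor, offset = align[dim]
            stride += (factor + offset - stride % factor) % factor
        strides[dim] = stride
        stride *= shape[dim]
    return strides


# Hypothetical 2-D shared buffer of shape (64, 128).
print(normalize_axis(-2, 2))                        # 0
print(aligned_strides((64, 128), [(-2, 32, 8)]))    # [136, 1]
```

Under these assumptions, axis `-2` of a 2-D buffer is axis `0`, which matches the second field of the annotation, and the row stride is padded from 128 to 136 so that `136 % 32 == 8`.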
@@ -283,6 +285,7 @@ def main(var_A: T.handle, var_B: T.handle, var_C: T.handle) -> None:
jo = T.axis.spatial(32, i0_0_1_i1_0_1_fused % 2 * 16 + i0_0_2_i1_0_2_fused * 2 + i1_0_4_init)
T.reads([])
T.writes([C_local_wmma_accumulator[io * 16 : io * 16 + 16, jo * 16 : jo * 16 + 16]])
T.block_attr({"meta_schedule.auto_tensorize":"wmma_fill"})
@spectrometerHBH Let me know if the fix makes sense to you
@@ -58,8 +59,15 @@ class ThreadBindingUnifier : public StmtExprMutator {
if (op->kind != ForKind::kThreadBinding) {
return StmtExprMutator::VisitStmt_(op);
}
return UnifyThreadBindingImpl(op, op->loop_var, op->thread_binding.value(),
Range::FromMinExtent(op->min, op->extent));
Map<String, ObjectRef> annotations = op->annotations;
@vinx13 @jinhongyii Let me know if the quick hack makes sense to you
1fef7f2 to 8a99e3f
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (#485)
[Meta Schedule][M3c] PostOrderApply (#486)
Fix Post Order Apply (#490)
[MetaSchedule] Relay Integration (#489)
[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (#492)
Fix replay trace. (#493)
[M3c][Meta Schedule] Implement the Replay Func class. (#495)
[PR] Test script for meta-schedule task extraction. Interface to load… (#494)
[Meta Schedule Refactor] Get child blocks (#500)
Read-at && Write-at (#497)
[M3c][Meta Schedule] Measure Callbacks (#498)
[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (#496)
[MetaSchedule] Sample-Perfect-Tile (#501)
[MetaSchedule] TE Workloads (#502)
[TensorIR] GetProducer, GetConsumer (#506)
[MetaScheduleRefactor] Annotate&Unannotate (#505)
[MetaSchedule] Multi-Level-Tiling & Auto-Inline (#503)
[Tests] Add unittests for auto-inline and multi-level-tiling (#508)
[Meta Schedule] Minor Fixes (#507)
[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (#509)
[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (#499)
[Meta Schedule] Add Helper Function & Minor Modification (#512)
[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (#513)
[Meta Schedule] Feature Extractor & Cost Model (#510)
Blockize & Tensorize (#514)
Layout Rewriting: Suggest-Index-Map (#520)
[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (#516)
[Meta Schedule] Per-Store-Feature (#521)
Add traced schedule for blockize & tensorize (#526)
[Meta Schedule] Add XGBoost Model & Random Model (#519)
User-Interface: Tune-TIR (#525)
User-Interface: Tune-TE (#527)
[Minor] More logging on python (#528)
Get CUDA tuning working (#529)
[MetaSchedule] TensorRT BYOC (#518)
[BugFix] LocalBuilder API (#531)
[Meta Schedule] Add Cost Model Update Measure Callback (#530)
[Bugfix] BuilderInput with default params (#532)
[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (#534)
[Meta Schedule] Evolutionary Search (#522)
[BugFix] Remove duplicated definition of MakeMultinomialSampler (#535)
[Meta Schedule] Fix some bugs (#537)
Initiate Experiments for CPU Performance Alignment with Ansor (#538)
[Meta Schedule] Tweak experiment scripts (#539)
[Meta Schedule] Initiate experiments on CUDA (#540)
[TIR][Schedule] Buffer transform (#523)
Auto Tensor Core (#524)
Working on Evo Search (#542)
[Meta Schedule] Add Replay Tuning Interface (#543)
Evolutionary Search on CPU (#544)
Misc improvement over the error message (#545)
[TIR][Schedule] Software pipelining (#533)
[Meta Schedule Refactor] fixing unit tests (#547)
[MetaSchedule] Mutator-Compute-Location (#548)
Misc Improvement of Evolutionary Search (#549)
Hotfix for software pipeline (#552)
Misc Improvement (#550)
[Cherry-Pick][TensorIR] Primitive "SetScope" (#9738) (#555)
Rule RFactor (#551)
[MemHammer] Rewrite Rules (#554)
[MetaSchedule] Schedule Rule: Cross-Thread Reduction (#556)
[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (#559)
[MetaSchedule] Perf Alignment - NRM on CUDA (#560)
[TIR] Reorder the block iters of the blocks generated by RFactor (#561)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
No description provided.