Releases: taichi-dev/taichi
v0.6.2
Highlights:
- Miscellaneous
- Add `ti.core.toggle_advanced_optimization(True/False)` (#927) (by Yuanming Hu)
Full changelog:
- [doc] Non-Ubuntu Linux users are suggested to build clang-8 from scratch (#867) (by 彭于斌)
- [Misc] Add `ti.core.toggle_advanced_optimization(True/False)` (#927) (by Yuanming Hu)
- [ir][refactor] Move legacy frontend constructs to frontend.h/cpp (#924) (by Ye Kuang)
- [misc] Enforce format (#922) (by Taichi Gardener)
- [ir][refactor] Move all Expression subclasses to `frontend_ir.h` (#919) (by Ye Kuang)
v0.6.1
Highlights:
- Automatic differentiation
- Fix CUDA data layout and stack alignment (#918) (by Yuanming Hu)
- CUDA backend
- Fix CUDA data layout and stack alignment (#918) (by Yuanming Hu)
- Examples
- Add `example/bitmasked.py` (#905) (by 彭于斌)
- GUI
- Better event filtering system (#801) (by 彭于斌)
- Language and syntax
- Scalar math functions can now be applied element-wise to vectors/matrices (#891) (by 彭于斌)
Full changelog:
- [ir][refactor] Move all frontend stmts to `frontend_ir.h` (#916) (by Ye Kuang)
- [CUDA] [AutoDiff] Fix CUDA data layout and stack alignment (#918) (by Yuanming Hu)
- [Example] Add example/bitmasked.py (#905) (by 彭于斌)
- [opengl] [bug] Move GLSLRuntime into a separate buffer, fix ti.random() bug on NVIDIA (#912) (by 彭于斌)
- [ir][refactor] First step to move Frontend IR to its own file (#914) (by Ye Kuang)
- [async] Avoid unnecessary list generations and activations (#913) (by Yuanming Hu)
- [Lang] Scalar math functions can now be applied element-wise to vectors/matrices (#891) (by 彭于斌)
- [ir] Deprecate `FrontendAtomicStmt` (#907) (by Ye Kuang)
- [ir] [refactor] Remove the global `current_block` (#908) (by Ye Kuang)
- [ir][refactor] Pass a context object to `Expression::flatten()` (#901) (by Ye Kuang)
- [test] Add an argument -r to rerun failed test (#904) (by 彭于斌)
- [GUI] Better event filtering system (#801) (by 彭于斌)
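The element-wise behavior added in #891 can be pictured with a plain-Python sketch. This is not Taichi's implementation: `elementwise` is a hypothetical helper, and vectors and matrices are modeled as nested lists rather than Taichi matrix objects.

```python
import math


def elementwise(fn, value):
    """Apply a scalar function to a scalar, or to every entry of a nested list."""
    if isinstance(value, (list, tuple)):
        return [elementwise(fn, v) for v in value]
    return fn(value)


vec = [1.0, 4.0, 9.0]
mat = [[0.0, 1.0], [4.0, 16.0]]

# The same scalar function works on a vector and on a matrix.
print(elementwise(math.sqrt, vec))
print(elementwise(math.sqrt, mat))
```

The point of the sketch is only the calling convention: one scalar function, applied entry by entry, with the result keeping the shape of the input.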
v0.6.0
Highlights:
- The OpenGL Compute Shader backend (by 彭于斌) is officially released!
- Examples
- Add Multiple Importance Sampling to the cornell box scene (#890) (by Ye Kuang)
- IR optimization passes
Full changelog:
- [ir] [refactor] Separate `UnaryOpType::cast` into `cast_value` and `cast_bits` (#892) (by 彭于斌)
- [opt] Set `has_global_side_effect` to `false` for some statements (#898) (by xumingkuan)
- [Example] Add Multiple Importance Sampling to the cornell box scene (#890) (by Ye Kuang)
- [async] Overlap compilation and execution (#885) (by Yuanming Hu)
- [Opt] Extract constants to top-level (#897) (by xumingkuan)
- [ir] Remove `Ident` alias (#895) (by Ye Kuang)
- [opt] Use advanced_optimization for #857 (#894) (by xumingkuan)
- [Opt] Global variable optimizations (#857) (by xumingkuan)
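For readers unfamiliar with Multiple Importance Sampling (#890), the sketch below shows the balance heuristic for combining two sampling strategies. The changelog does not say which heuristic the Cornell box example uses, so this is a generic illustration only; `balance_weight` is a hypothetical name and the densities are made-up numbers.

```python
def balance_weight(pdf_this, pdf_other):
    """Balance-heuristic weight for a sample drawn from the strategy with pdf_this."""
    total = pdf_this + pdf_other
    # If neither strategy could have produced the sample, it contributes nothing.
    return pdf_this / total if total > 0.0 else 0.0


# Assumed densities: light sampling generates this direction with density 0.8,
# BSDF sampling generates it with density 0.2.
w_light = balance_weight(0.8, 0.2)
w_bsdf = balance_weight(0.2, 0.8)
print(w_light, w_bsdf)
```

For any direction that at least one strategy can generate, the two weights sum to one, which is what keeps the combined estimator unbiased.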
v0.5.15
Highlights:
- Bug fixes
- Examples
- Miscellaneous
- Enhance `ti.init` and support `arch=ti.gpu` (#843) (by 彭于斌)
- OpenGL backend
- IR optimization passes
- Refactoring
- Use state machines for optimization on variables (#859) (by xumingkuan)
Full changelog:
- [release] v0.5.15 (#888) (by Yuanming Hu)
- [Bug] Fix CPU SNode reader in debug mode (#887) (by Yuanming Hu)
- [opengl] Tolerate all errors in with_opengl() (#884) (by 彭于斌)
- [Opt] [Refactor] [Bug] Use state machines for optimization on variables (#859) (by xumingkuan)
- [async] No sync after kernel launch and postpone lower access (#883) (by Yuanming Hu)
- [async] Fix SNode reading in async mode (#881) (by Yuanming Hu)
- [opt] Eliminate assertions with non-zero const conditions (#877) (by xumingkuan)
- [misc] Reorganize misc files (#876) (by Yuanming Hu)
- revert #769 ti.cfg.gdb_trigger error (#879) (by 彭于斌)
- [cuda] [test] Add TI_DEVICE_MEMORY_GB and TI_DEVICE_MEMORY_FRACTION environment variable (#769) (by 彭于斌)
- [OpenGL] Detect driver existence in `with_opengl()` (#864) (by 彭于斌)
- [misc] Avoid unnecessary linking when git commit hash not changed (#871) (by Yuanming Hu)
- [async] [refactor] Fix async engine statement not found error (#874) (by Yuanming Hu)
- [example] Fix `pbf2d.py`'s initialization to use np.float32 (#873) (by Ye Kuang)
- [misc] Avoid mutable defaults in `PyTaichi` (#870) (by Yuxin Wu)
- [misc] Fix imread not reading properly when image is not equal in height and width (#855) (by 彭于斌)
- [async] Parallel compilation (#863) (by Yuanming Hu)
- [Opt] [Bug] Fix the state of the last store after a loop (#862) (by xumingkuan)
- [test] `ti test -na opengl` to exclude tests on some archs (#830) (by 彭于斌)
- [opt] Eliminate useless local stores and atomics (#858) (by xumingkuan)
- [doc] Update documentation for #843 (#844) (by 彭于斌)
- [Example] add sdf2d.py (#835) (by 彭于斌)
- [opt] Lower `linearize` even without advanced optimization (#854) (by xumingkuan)
- [misc] Split accessor and other kernels in statistics (#853) (by xumingkuan)
- [misc] Profiler now supports multiple threads (#852) (by Yuanming Hu)
- [opt] Improve optimization for OffsetAndExtractBitsStmt (#851) (by xumingkuan)
- [Bug] [Opt] Fix a bug in local variable optimization (#849) (by xumingkuan)
- [test] Fix duplicate test name in `test_tensor_reflection.py` (#850) (by xumingkuan)
- [Example] Add `quadtree.py` (#824) (by 彭于斌)
- [misc] Avoid printing empty logs (#848) (by xumingkuan)
- [async] Each JIT thread now has its own LLVM context (#845) (by Yuanming Hu)
- [misc] Clear stats in the constructor of Program (by xumingkuan)
- [Misc] Enhance `ti.init` and support `arch=ti.gpu` (#843) (by 彭于斌)
- [misc] Add an option to print statistics into files (#841) (by xumingkuan)
- [test] Fix test when `advanced_optimization == false` (#842) (by xumingkuan)
- [Example] Add a Cornell box renderer (#836) (by Ye Kuang)
- [metal] Increment by thread grid size for the grid-strip loops (#837) (by Ye Kuang)
- [OpenGL] Use GLAD as API loader (replace GLEW) (#819) (by 彭于斌)
- [opt] Eliminate `WhileControlStmt` with non-zero const conditions (by xumingkuan)
- [refactor] Move backend implementations to the `backends` folder (#818) (by Yuanming Hu)
- [test] Add a test for statement offloading (#826) (by xumingkuan)
- [Opt] Algebraic simplification for bitwise operators (#827) (by xumingkuan)
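As an illustration of what algebraic simplification for bitwise operators (#827) can look like, here is a small sketch over a toy expression tree. The changelog does not list the rules the pass applies; the identities below are standard integer identities, and the tuple-based expression format and the `simplify` function are assumptions made for this example, not Taichi IR.

```python
def simplify(expr):
    """Simplify a toy expression: an int constant, a variable name, or (op, lhs, rhs)."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs), simplify(rhs)
    # Fold the operation when both operands are constants.
    if isinstance(lhs, int) and isinstance(rhs, int):
        return {"&": lhs & rhs, "|": lhs | rhs, "^": lhs ^ rhs}[op]
    # Move a constant operand to the right; &, | and ^ are all commutative.
    if isinstance(lhs, int):
        lhs, rhs = rhs, lhs
    if op == "&":
        if rhs == 0:
            return 0        # x & 0 == 0
        if lhs == rhs:
            return lhs      # x & x == x
    elif op == "|":
        if rhs == 0:
            return lhs      # x | 0 == x
        if lhs == rhs:
            return lhs      # x | x == x
    elif op == "^":
        if rhs == 0:
            return lhs      # x ^ 0 == x
        if lhs == rhs:
            return 0        # x ^ x == 0
    return (op, lhs, rhs)


print(simplify(("^", ("&", "a", "a"), "a")))
```

The `x op x` rules compare subexpressions structurally, which is only valid because the toy expressions have no side effects.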
v0.5.14
Highlights:
- CUDA backend
- Fix `slim_libdevice.bc` shipping on Windows (#820) (by Yuanming Hu)
- OpenGL backend
- Fix NVIDIA GLSL compile freeze by sorting SNodes (#808) (by 彭于斌)
Full changelog:
- [refactor] Miscellaneous & mechanical cleanups (#822) (by Taichi Gardener)
- [CUDA] Fix `slim_libdevice.bc` shipping on Windows (#820) (by Yuanming Hu)
- [OpenGL] Fix NVIDIA GLSL compile freeze by sorting SNodes (#808) (by 彭于斌)
- [refactor] Remove `SNodeAttr` and decouple `SNode` from LLVM (#817) (by Yuanming Hu)
- [async] Parallel compilation infrastructure (#816) (by Yuanming Hu)
- [example] Improve `mpm128` gravity control (#809) (by 彭于斌)
v0.5.13
Highlights:
- Bug fixes
- Insert gc tasks only for `pointer` and `dynamic` SNodes (#781) (by Ye Kuang)
- Command line interface
- Fix `ti` commands other than `ti test` (#783) (by Yuanming Hu)
- Documentation
- Improved documentation (#747) (by 彭于斌)
- Examples
- Add an Euler equation solver (#796) (by Kenneth Lozes)
- Intermediate representation
- Language and syntax
- Support matrix initialization with a list of vectors (#811) (by Kenneth Lozes)
- LLVM backend (CPU and CUDA)
- Upgrade to LLVM 10 and keep backward compatibility with LLVM 8 (#787) (by Tao He)
- OpenGL backend
- Support range for-loops with non-constant bounds (#785) (by 彭于斌)
- IR optimization passes
- Improve local variable optimizations (#788) (by xumingkuan)
Full changelog:
- [Lang] Support matrix initialization with a list of vectors (#811) (by Kenneth Lozes)
- [IR] Fix compilation crash when offloading local variables stores/atomic adds (#813) (by xumingkuan)
- [opt] Avoid storing constants across offloaded tasks (#812) (by xumingkuan)
- [misc] Update `README.md` (#776) (by Yuanming Hu)
- [OpenGL] Support range for-loops with non-constant bounds (#785) (by 彭于斌)
- [Example] Add an Euler equation solver (#796) (by Kenneth Lozes)
- [doc] Add Google code review article PDFs (#800) (by Yuanming Hu)
- [IR] Fix malformed IRs (#806) (by xumingkuan)
- [async] Add parallel task executor (#805) (by Yuanming Hu)
- [ir] Add checks for `loop_var`s and print names for loops (#803) (by xumingkuan)
- [Opt] Improve local variable optimizations (#788) (by xumingkuan)
- [misc] Improve type checking error messages (#797) (by Yuanming Hu)
- [example] Improve mpm_lagrangian_forces.py (#799) (by Yuanming Hu)
- [doc] Document the Statistics class (#795) (by Yuanming Hu)
- [metal] Implement range_for using grid stride loop (#780) (by Ye Kuang)
- [doc] Include a few links to Google Code Health articles in contributor guidelines (#789) (by Yuanming Hu)
- [Doc] Improved documentation (#747) (by 彭于斌)
- [LLVM] Upgrade to LLVM 10 and keep backward compatibility with LLVM 8 (#787) (by Tao He)
- [doc] Fix typo (#786) (by xumingkuan)
- [CLI] Fix `ti` commands other than `ti test` (#783) (by Yuanming Hu)
- [Bug] [sparse] Insert gc tasks only for `pointer` and `dynamic` SNodes (#781) (by Ye Kuang)
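The grid-stride loop mentioned in #780 is a general GPU pattern: a fixed number of threads covers a range of any size by having each thread start at its own index and advance by the total thread count. The sketch below simulates the pattern sequentially in plain Python. It is not the Metal kernel code, and `grid_stride_indices` is a hypothetical helper.

```python
def grid_stride_indices(thread_id, num_threads, begin, end):
    """Indices in [begin, end) that one thread visits in a grid-stride loop."""
    return list(range(begin + thread_id, end, num_threads))


num_threads = 4
begin, end = 3, 13

visited = []
for tid in range(num_threads):
    visited.extend(grid_stride_indices(tid, num_threads, begin, end))

# Every index in the range is visited exactly once across all threads.
assert sorted(visited) == list(range(begin, end))
```

Because the stride is the thread count rather than the range length, the same launch size works whatever values `begin` and `end` take.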
v0.5.12
Notable changes:
- Language
- Support `4x4` matrix inverse and determinant (#763) (by KLozes)
- Command-line interface
- Fix `ti test --arch ...` (#764) (by 彭于斌)
- Examples
- Add 2D stable fluids example (#748) (by Ye Kuang)
- PyPI package
- Package name changed from `taichi-nightly` to `taichi` (#762) (by Yuanming Hu)
Full changelog:
- [cli] Add `TI_GDB_TRIGGER` environment variable (#766) (by 彭于斌)
- [misc] Update Jenkinsfile (by Yuanming Hu)
- [cli] Fix `ti test` (#774) (by Yuanming Hu)
- [misc] Update `README.md` (#767) (by Yuanming Hu)
- [refactor] [opt] Move LocalLoadSearcher and 2 other classes to irpass::analysis (#771) (by xumingkuan)
- [ir] [refactor] Slim ir.h and reduce build time by ~4% (#761) (by xumingkuan)
- [CLI] Fix `ti test --arch ...` (#764) (by 彭于斌)
- [opengl] GLFW 3.3.2 is now included as a submodule (#760) (by Yuanming Hu)
- [lang] Support 4x4 matrix inverse and determinant (#763) (by KLozes)
- [PyPI] Package name changed from `taichi-nightly` to `taichi` (#762) (by Yuanming Hu)
- [misc] Unify #include format (#759) (by Taichi Gardener)
- [Example] Add 2D stable fluids example (#748) (by Ye Kuang)
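To make concrete what the 4x4 inverse and determinant support in #763 computes, here is a reference sketch in plain Python using cofactor expansion and the adjugate. It is not how Taichi implements these operations, which the changelog does not describe; `minor`, `det`, `inverse` and `matmul` are hypothetical helpers, and the matrix entries are assumed to be integers so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse(m):
    """Inverse as adjugate / determinant; entry (i, j) uses the cofactor of (j, i)."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(m)
    return [[Fraction((-1) ** (i + j) * det(minor(m, j, i)), d) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


m = [[2, 0, 0, 0],
     [0, 3, 0, 0],
     [0, 0, 4, 0],
     [1, 0, 0, 5]]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Multiplying a matrix by its inverse gives the identity.
assert matmul(m, inverse(m)) == identity
```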
v0.5.11
Notable changes:
- Automatic differentiation
- Fix floating-point type-cast gradients (#687) (by Yuanming Hu)
- CUDA backend
- PyPI package `taichi-nightly` now covers CUDA 10.X on Windows and Linux (#756) (by Yuanming Hu)
- Examples
- GUI
- Language and syntax
- Support `continue` on all backends (#716) (by Ye Kuang)
- LLVM backend (CPU and CUDA)
- Metal backend
- Support `ti.random()` on Metal (#710) (by Ye Kuang)
- OpenGL backend
- IR and Optimization
- More Taichi IR standardization and optimization (#656) (by xumingkuan)
Full changelog:
- [CUDA] Limit memory allocation chunk size to 128 MB (#758) (by Yuanming Hu)
- [CUDA] PyPI package `taichi-nightly` now covers CUDA 10.X on Windows and Linux (#756) (by Yuanming Hu)
- [Example] Fix `examples/regression.py` (#757) (by Quan Wang)
- [ir] [refactor] Move all passes that do not change IR into `irpass::analysis` (#754) (by xumingkuan)
- [GUI] Fix blinking particles and random segmentation faults in `ti.GUI.circles` (#755) (by Yuanming Hu)
- [misc] Fix dynamic node out-of-bound checker (#752) (by Yuanming Hu)
- [doc] Update `syntax.rst` to include `ti.sqrt`, `ti.asin`, `ti.acos` and `x ** y` (#753) (by Quan Wang)
- [refactor] Remove vprintf in runtime/llvm/runtime.cpp to avoid name conflicts (#750) (by Yuanming Hu)
- [misc] Update Jenkinsfile (by Yuanming Hu)
- [GUI][Metal] Support SPACE key (#749) (by Ye Kuang)
- [OpenGL] 64-bit data type support (#717) (by 彭于斌)
- [cli] Better `ti test` to test single cpp file and no cpp test when testing python(s) (#724) (by 彭于斌)
- [ir] [refactor] Simplify statement visitors (#744) (by xumingkuan)
- [async] Benchmark infrastructure (#743) (by Yuanming Hu)
- [opt] Move common statements in true/false branches outside if's (#727) (by xumingkuan)
- [opengl] Add `check_opengl_error` to prevent potential segfaults (#728) (by 彭于斌)
- [Example] Add `game_of_life.py` (#741) (by 彭于斌)
- [ir] [refactor] Add a `verify` pass to find out illegal IRs, and remove OffloadedStmt::begin_stmt/end_stmt (#731) (by xumingkuan)
- [cli] Improve `ti` header message (#715) (by Ye Kuang)
- [metal][refactor] Use `compile_to_offloads()` in codegen (#738) (by Ye Kuang)
- [ir] Fix adjoint alloca location in `make_adjoint` (#734) (by Yuanming Hu)
- [ir] Fix out-of-scope operands during offloading (#730) (by Yuanming Hu)
- [metal] Refactor the codegen to have multiple code sections (#733) (by Ye Kuang)
- [ir] [refactor] BasicStmtVisitor includes Frontend statements (#732) (by 彭于斌)
- [infra] AppVeyor triggers format server when `[format]` included as substrings in commit messages (#725) (by 彭于斌)
- [opengl] [refactor] Remove the global `no_gc` in `opengl_codegen.cpp` (#723) (by Ye Kuang)
- [Lang] Support `continue` on all backends (#716) (by Ye Kuang)
- [ir] Add assertions of `Alloca`s in the constructors of LocalAddress and LocalStoreStmt (by xumingkuan)
- [CUDA] Improve CUDA build portability with run-time driver loading (#714) (by Yuanming Hu)
- [infra] Windows stack backtrace (#720) (by xumingkuan)
- [ir][refactor] Use const & in function arguments to avoid copying (#718) (by xumingkuan)
- [opengl] [refactor] Fix memory leakages using modern C++ memory management features (#696) (by 彭于斌)
- [ir] Fix compilation crash when passing global pointer to if statements (#713) (by Yuanming Hu)
- [test] Make `test_struct_for_branching` run on archs with `pointer` (#712) (by Ye Kuang)
- [Metal] Support `ti.random()` on Metal (#710) (by Ye Kuang)
- [refactor] SNode now uses unique_ptr instead of shared_ptr for clearer children ownership (#705) (by Yuanming Hu)
- [LLVM] Fix LLVM struct-for codegen crashing due to extra return #704 (#707) (by Yuanming Hu)
- [metal] Skip `listgen` for leaf Snode (#699) (by Ye Kuang)
- [refactor] CUDA-related infrastructure (#706) (by Yuanming Hu)
- [OpenGL] Support more than one external array arguments (#694) (by 彭于斌)
- [ir] Add a function to test if two IRNodes are equivalent (#683) (by xumingkuan)
- [refactor] Create `taichi/codegen/codegen_llvm.cpp` and outline class member definitions (#702) (by Taichi Gardener)
- [refactor] Extract Taichi IR compilation from KernelCodeGen (#700) (by Yuanming Hu)
- [async] AsyncEngine infrastructure (#698) (by Yuanming Hu)
- [metal] Fix tests that require 64-bit data (#697) (by Ye Kuang)
- [opengl] Improved randomness of PRNGs across each launch (#692) (by 彭于斌)
- [OpenGL] Support NVIDIA GLSL compiler (#666) (by 彭于斌)
- [metal] Fix bug in Metal listgen where it goes beyond the capacity (#691) (by Ye Kuang)
- [ir] Add statement field manager (#690) (by Yuanming Hu)
- [misc] Enforce the use of #include "taichi/..." (#688) (by Taichi Gardener)
- [AutoDiff] Fix floating-point type-cast gradients (#687) (by Yuanming Hu)
- [metal] Use grid-stride loop to implement `listgen` kernels (#682) (by Ye Kuang)
- [refactor] Remove `llvm::Value *Stmt::value` (#686) (by Yuanming Hu)
- [refactor] Removed `Stmt::adjoint` (#685) (by Yuanming Hu)
- [lang] Fix ti.static(ti.grouped(...)) syntax checker (#681) (by xumingkuan)
v0.5.10
Notable changes:
- (Mar 29, 2020) v0.5.10 released
Full changelog:
- [Infra] `ti test` now supports `-t/--threads` for specifying number of testing threads (#674) (by Yuanming Hu)
- [Lang] Fix `ti.static(ti.grouped(ti.ndrange(...)))` syntax checker false positive (#680) (by Yuanming Hu)
- [misc] Removed useless files (by Taichi Gardener)
- [misc] Update README.md (by Yuanming Hu)
v0.5.9
Notable changes:
- (Mar 28, 2020) v0.5.9 released
- CPU & CUDA backends
- Support `bitmasked` as the leaf block structure for `1x1x1` masks (#676) (by Yuanming Hu)
- Documentation
- Updated contributor guideline (#658) (by Yuanming Hu)
- Infrastructure
- 6x faster compilation on CPU/CUDA backends (#673) (by Yuanming Hu)
- Language and syntax
- Metal backend
- Optimization
Full changelog:
- [misc] `misc/make_changelog.py` for automatically generating changelogs (#679) (by Yuanming Hu)
- [metal] Simplify Metal backend's namings (#675) (by Ye Kuang)
- [CPU][CUDA] Support `bitmasked` as the leaf block structure for `1x1x1` masks (#676) (by Yuanming Hu)
- [Infra] 6x faster compilation on CPU backends (#673) (by Yuanming Hu)
- [misc] Improve format server stability (#672) (by Yuanming Hu)
- [ir] Basic function definition/call instructions (#612) (by 彭于斌)
- [Lang] Simplify dense.bitmasked to bitmasked (#670) (by Ye Kuang)
- [misc] Fixed format server file coverage (#669) (by Yuanming Hu)
- [Opt] Merge adjacent if's with identical conditions (#668) (by xumingkuan)
- [metal] Move platform/metal to backends/metal (#667) (by Ye Kuang)
- [ir] Added irpass::gather_statements (#665) (by Yuanming Hu)
- [Opt] Dive into container statements to find local loads/stores for optimization, and optimize loads of new allocas to 0 (#662) (by xumingkuan)
- [Metal] Changes to enable `bitmasked` on Metal! (#661) (by Ye Kuang)
- [Doc] Updated contributor guideline (#658) (by Yuanming Hu)
- [misc] Introduced a temporary boolean constant for benchmarking advanced optimizations (#657) (by xumingkuan)
- [misc] v0.5.8 README (#654) (by Yuanming Hu)
- Fixed MGPCG (#652) (by Yuanming Hu)
- Refactor ASTTransformer.visit_For and fix a bug on grouped ndrange loops (#648) (by xumingkuan)
- [Metal] Silence compile warning with [[maybe_unused]] (#650) (by Ye Kuang)
- add LineAppender for OpenGL too #643 (#651) (by 彭于斌)
- Support break in non-parallel for statements by translating range-for into while #578 (#583) (by 彭于斌)
- [Metal] Add bitmasked support in MetalRuntime (#638) (by Ye Kuang)
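The rewrite described in #583 rests on the fact that a range-for can be expressed as a while loop with an explicit counter, at which point `break` becomes an ordinary loop exit. The plain-Python sketch below shows the two equivalent forms. It illustrates the idea only and does not reproduce Taichi's IR transformation; both function names are hypothetical.

```python
def first_index_for(values, target):
    """Range-for with break."""
    found = -1
    for i in range(len(values)):
        if values[i] == target:
            found = i
            break
    return found


def first_index_while(values, target):
    """The same loop written as a while loop with an explicit counter."""
    found = -1
    i = 0
    while i < len(values):
        if values[i] == target:
            found = i
            break
        i += 1
    return found


data = [5, 7, 9, 7]
# Both forms stop at the first match and agree when there is no match.
assert first_index_for(data, 7) == first_index_while(data, 7)
assert first_index_for(data, 4) == first_index_while(data, 4)
```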