This repository has been archived by the owner on Nov 25, 2024. It is now read-only.
Releases · rapidsai/wholegraph
v24.10.00
🐛 Bug Fixes
- Ensure pylibwholegraph conda packages have the license (#215) @raydouglass
- Fix MNNVL with UUID (#207) @chuangz0
🛠️ Improvements
- bump NCCL floor to 2.19 (#223) @jameslamb
- Update update-version.sh to use packaging lib (#219) @AyodeAwe
- bump NCCL floor to 2.18.1.1, relax PyTorch pin (#218) @jameslamb
- Use CI workflow branch 'branch-24.10' again (#216) @jameslamb
- Add support for Python 3.12 (#214) @jameslamb
- Update rapidsai/pre-commit-hooks (#213) @KyleFromNVIDIA
- Drop Python 3.9 support (#209) @jameslamb
- Remove NumPy <2 pin (#208) @seberg
- Update pre-commit hooks (#206) @KyleFromNVIDIA
- Improve update-version.sh (#204) @bdice
- Use tool.scikit-build.cmake.version, set scikit-build-core minimum-version (#203) @jameslamb
- Add horovodrun launch agent for WholeGraph (#200) @Tomcli
- support different entry sizes for different ranks (#194) @zhuofan1123
- Remove Dockerfile (#184) @bdice
[NIGHTLY] v24.12.00
🚨 Breaking Changes
- stop publishing packages (#233) @jameslamb
🛠️ Improvements
- [Bugfix] Sync stream for scatter_op (#235) @chang-l
- stop publishing packages (#233) @jameslamb
- print sccache stats in builds (#231) @jameslamb
- manage more dependencies in dependencies.yaml, declare 'numpy' runtime dependency (#230) @jameslamb
- Add gather/scatter support for 1D tensors (#229) @chang-l
- make conda installations in CI stricter (#228) @jameslamb
- Add a new memory type: Hierarchy (#227) @zhuofan1123
- Prune workflows based on changed files (#226) @KyleFromNVIDIA
- [Optimization] Accelerate dgl create block by bypassing sanity check (#221) @zhou9402
v24.08.00
🐛 Bug Fixes
- Add cuda-nvml-dev to dependencies.yaml. (#197) @bdice
- Revert "Add CUDA_STATIC_MATH_LIBRARIES" (#192) @KyleFromNVIDIA
- fixed bugs (#180) @zhuofan1123
🛠️ Improvements
- clarify which dependencies in dependencies.yaml are conda-only (#195) @jameslamb
- Use workflow branch 24.08 again (#193) @KyleFromNVIDIA
- Build and test with CUDA 12.5.1 (#191) @KyleFromNVIDIA
- Add CUDA_STATIC_MATH_LIBRARIES (#190) @KyleFromNVIDIA
- skip CMake 3.30.0 (#189) @jameslamb
- Use verify-alpha-spec hook (#188) @KyleFromNVIDIA
- allow users to choose shm allocation method for chunked/continuous host memory (#187) @linhu-nv
- decouple embedding creation from optimizer (#186) @zhuofan1123
- MNNVL with split comm (#185) @chuangz0
- Adopt CI/packaging codeowners (#183) @bdice
- use rapids-build-backend (#181) @jameslamb
v24.06.00
🐛 Bug Fixes
- a quick fix to wholememory tensor gather default data type (#173) @linhu-nv
- quick fix to a map_indice bug and add a comment for parameter round_robin_size (#172) @linhu-nv
🛠️ Improvements
- Sort indices before gathering (#174) @zhuofan1123
- Always use a static gtest (#167) @vyasr
- Fix host view for MNNVL (#166) @chuangz0
- subwarp version gather op for small embedding size (#165) @chuangz0
- Migrate to `{{ stdlib("c") }}` (#164) @hcho3
- support reading files with multiple threads and add test_wholememory_io for round-robin read (#163) @chuangz0
- fix CI issue due to pytorch and mkl version conflict (#162) @linhu-nv
- allow temp_memory_handler to allocate memory multiple times (#161) @linhu-nv
- remove unnecessary sync between thrust ops and host threads (#160) @linhu-nv
- Remove scripts/checks/copyright.py (#149) @KyleFromNVIDIA
v24.04.00
🐛 Bug Fixes
- Update pre-commit-hooks to v0.0.3 (#152) @KyleFromNVIDIA
- Fixed README links to point to cuGraph API (#145) @acostadon
- [Bugfix] Fix to compile when NVSHMEM is ON (#142) @chang-l
- handle more RAPIDS version formats in update-version.sh (#122) @jameslamb
🚀 New Features
- Support CUDA 12.2 (#116) @jameslamb
🛠️ Improvements
- Use `conda env create --yes` instead of `--force` (#155) @bdice
- add round-robin shard strategy (#154) @linhu-nv
- Switch to scikit-build-core (#150) @vyasr
- Update script input name (#147) @AyodeAwe
- Add upper bound to prevent usage of NumPy 2 (#146) @bdice
- Replace local copyright check with pre-commit-hooks verify-copyright (#144) @KyleFromNVIDIA
- remove an unnecessary sync in exchange_embeddings_nccl_func (#143) @linhu-nv
- Use default `rapids-cmake` CUDA_ARCHITECTURES (#140) @trxcllnt
- Add support for Python 3.11, require NumPy 1.23+ (#139) @jameslamb
- [Bugfix] Host full-neighbor sampling returns wrong results in unit test (#138) @chang-l
- use enum to implement log_level in wholememory (#136) @linhu-nv
- target branch-24.04 for GitHub Actions workflows (#135) @jameslamb
- Add environment-agnostic scripts for running ctests and pytests (#128) @trxcllnt
v24.02.00
🐛 Bug Fixes
- Revert "Exclude tests from builds (#127)" (#130) @raydouglass
- Exclude tests from builds (#127) @vyasr
- fix a bug for embedding optimizer, which leads to undefined behavior (#108) @linhu-nv
- fix inferencesample option (#107) @chuangz0
🛠️ Improvements
- Logging level (#123) @linhu-nv
- Fix pip dependencies (#118) @trxcllnt
- Remove usages of rapids-env-update (#117) @KyleFromNVIDIA
- refactor CUDA versions in dependencies.yaml (#115) @jameslamb
- Don't overwrite wholegraph_ROOT if provided (#114) @vyasr
- added Direct IO support for WholeMemory loading (#113) @dongxuy04
- Align versions for cudnn, clang-tools, cython, and doxygen with the rest of RAPIDS. (#112) @bdice
- Reset WholeGraph communicators during the finalize call (#111) @chang-l
- Forward-merge branch-23.12 to branch-24.02 (#102) @bdice
v23.12.00
🐛 Bug Fixes
- Move vector clear outside loop (#103) @chuangz0
- change pytorch cu121 to stable to fix ci (#97) @dongxuy04
🚀 New Features
- Integrate NVSHMEM into WholeGraph (#91) @chuangz0
- Grace Hopper support and add benchmark (#87) @dongxuy04
🛠️ Improvements
- Fix dependencies on librmm and libraft. (#96) @bdice
- gather/scatter optimizations (#90) @linhu-nv
- Use branch-23.12 workflows. (#84) @AyodeAwe
- Setup Consistent Nightly Versions for Pip and Conda (#82) @divyegala
- Add separate init, expose gather/scatter for WholeMemoryTensor and update example (#81) @dongxuy04
- Use RNG (random number generator) provided by RAFT (#79) @linhu-nv
- Build CUDA 12.0 ARM conda packages. (#74) @bdice
- upload xml docs (#73) @AyodeAwe
- replace optparse with argparse (#61) @chuangz0
v23.10.00
🐛 Bug Fixes
- Update all versions to 23.10 (#71) @raydouglass
- Use `conda mambabuild` not `mamba mambabuild` (#67) @bdice
🛠️ Improvements
- Update image names (#70) @AyodeAwe
- Update to clang 16.0.6. (#68) @bdice
- Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#66) @divyegala
- Fix docs build and slightly optimize (#63) @dongxuy04
- Use `copy-pr-bot` (#60) @ajschmidt8
- PR: Use top-k from RAFT (#53) @chuangz0
v23.08.01
🚨 Breaking Changes
- Refactoring into 23.08 (#24) @BradReesWork
🐛 Bug Fixes
- Add LICENSE to wheels (#55) @raydouglass
🛠️ Improvements
- Correct syntax in GHA workflow (#46) @tingyu66
- Refactoring into 23.08 (#24) @BradReesWork
v23.08.00
🚨 Breaking Changes
- Refactoring into 23.08 (#24) @BradReesWork
🛠️ Improvements
- Correct syntax in GHA workflow (#46) @tingyu66
- Refactoring into 23.08 (#24) @BradReesWork