fix: set cuda rng state on gpu tests for test_random.py #1014

Merged
ClaudiaComito merged 19 commits into helmholtz-analytics:main
Aug 23, 2022

@JuanPedroGHM JuanPedroGHM commented Aug 22, 2022

Description

In test_random.py (test_permutation and test_randperm), use either torch.cuda or torch.random to get and set the RNG state (get_rng_state/set_rng_state), depending on self.device.torch_device.
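
A minimal sketch of the idea, not the actual diff from this PR: pick the RNG state functions that match the device, save the state, draw a permutation, restore the state, and draw again. The helper name `rng_state_functions` and the standalone `torch_device` variable are hypothetical stand-ins for whatever the test suite does with self.device.torch_device; the four `get_rng_state`/`set_rng_state` calls are the PyTorch functions named in the description.

```python
import torch


def rng_state_functions(torch_device: str):
    """Return (get_rng_state, set_rng_state) matching the device string.

    Hypothetical helper: CUDA devices use the torch.cuda functions,
    everything else falls back to the CPU generator in torch.random.
    """
    if torch_device.startswith("cuda"):
        return torch.cuda.get_rng_state, torch.cuda.set_rng_state
    return torch.random.get_rng_state, torch.random.set_rng_state


# Assumption: stands in for self.device.torch_device in the test class.
torch_device = "cuda" if torch.cuda.is_available() else "cpu"
get_state, set_state = rng_state_functions(torch_device)

state = get_state()                               # save the generator state
first = torch.randperm(10, device=torch_device)   # advances the generator
set_state(state)                                  # rewind to the saved state
second = torch.randperm(10, device=torch_device)  # should repeat `first`

print(torch.equal(first, second))
```

Restoring only the CPU state while sampling on a GPU leaves the CUDA generator untouched, so the two draws can differ, which appears to be the failure mode this PR addresses.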

Issue/s resolved: #1010

Changes proposed:

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Does this change modify the behavior of other functions? If so, which?

no

Tested with:
python 3.8
torch 1.12
cuda 11.6
openmpi 4.0

skip ci

@JuanPedroGHM JuanPedroGHM linked an issue Aug 22, 2022 that may be closed by this pull request
@ClaudiaComito ClaudiaComito changed the base branch from main to release/1.2.x August 22, 2022 12:09
ClaudiaComito previously approved these changes Aug 23, 2022
@ClaudiaComito ClaudiaComito left a comment

Verified that test_random runs on HDFML with:

  • OpenMPI/4.1.2
  • Python/3.9.6
  • mpi4py/3.1.3
  • PyTorch/1.11-CUDA-11.5
  • on 1 Node, 1 process & 2 processes
  • on 2 Nodes, 1 process each

It works! Thanks a lot @JuanPedroGHM !

@ClaudiaComito ClaudiaComito changed the base branch from release/1.2.x to main August 23, 2022 08:48
@ClaudiaComito ClaudiaComito dismissed their stale review August 23, 2022 08:48

The base branch was changed.

@ClaudiaComito ClaudiaComito left a comment

Added python 3.9 and pytorch 1.12 tests to the CI matrix

@JuanPedroGHM JuanPedroGHM added the testing (Implementation of tests, or test-related issues) and PR talk labels Aug 23, 2022
@ClaudiaComito ClaudiaComito merged commit 52d88c0 into helmholtz-analytics:main Aug 23, 2022
@JuanPedroGHM JuanPedroGHM deleted the fix/1010-bug-test_random-fails-on-gpu branch August 23, 2022 11:17
ClaudiaComito added a commit that referenced this pull request Sep 13, 2022
* Replace bug report MD template with form in view of further automation

* Fix bug report file name

* Update bug_report.yml

* Update bug_report.yml

* Update bug_report.yml

* Update bug_report.yml

* Auto generated release notes and changelog (#974)

* wip: Initial release draft and changelog updater actions configuration

* doc: pr title style guide in contibuting.md

* ci: improved release draft templates

* ci: extra release draft categories

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Tutorial note about local and global printing (#972)

* doc: parallel tutorial note metioning local and global printing

* doc: extenden local print note with ``ht.local_printing()``

* Fix typo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Updated the tutorial document. (#977)

* Updated the tutorial document.

1. Corrected the spelling mistake -> (sigular to single)
2. Corrected the statement -> the number of dimensions is the rank of the array.
3. Made 2 more small changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Set write permissions for workflow

* Update schedule

* Update schedule

* Update schedule

* Move pytorch version file out of workflows dir

* Update paths

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 22.3.0 → 22.6.0](psf/black@22.3.0...22.6.0)

* Push pytorch release update to release/1.2.x branch, not main

* Update schedule

* Bypass  `on push` trigger

* Update schedule

* Fix condition syntax

* Fix syntax

* On push trigger workaround

* Update schedule

* Update schedule

* Enable non-negative sample size

* Read `min` value directly from torch return object

* Enable non-negative number of samples for `logspace`

* Add test for `logspace`

* Add MPI version field to bug report template

* fix: set cuda rng state on gpu tests for test_random.py (#1014)

* Test latest pyorch on both main and release branch

* Move pytorch release record out of workflows directory

* Update paths

* New PyTorch release

* Temporarily remove trigger

* Update pytorch-latest.txt

* Reinstate trigger

* New PyTorch release

* Remove matrix strategy

* Update pytorch-latest.txt

* New PyTorch release

* New PyTorch release

* fix: set cuda rng state on gpu tests for test_random.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added tests for python 3.9 and pytorch 1.12

Co-authored-by: Claudia Comito <c.comito@fz-juelich.de>
Co-authored-by: Daniel Coquelin <daniel.coquelin@gmail.com>
Co-authored-by: ClaudiaComito <c.comito@fz-juelich.de@users.noreply.github.com>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Create mirrorci.yml

* Create .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Update .gitlab-ci.yml

* Delete Jenkinsfile

* trigger on main and release branches + PRs only

* revert

* add parallel hdf5 & netcdf

* optional dependencies activated

* exclude python 3.9 + pytorch 1.7

* Update torch 1.9 etc. versions

Co-authored-by: Claudia Comito <c.comito@fz-juelich.de>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: JuanPedroGHM <juanpedroghm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: SaiSuraj27 <87087741+SaiSuraj27@users.noreply.github.com>
Co-authored-by: neosunhan <neosunhan@gmail.com>
Co-authored-by: Markus Goetz <markus.goetz@kit.edu>
Co-authored-by: Daniel Coquelin <daniel.coquelin@gmail.com>
Co-authored-by: ClaudiaComito <c.comito@fz-juelich.de@users.noreply.github.com>
ClaudiaComito added a commit that referenced this pull request Nov 3, 2022
…ompatible shapes of local arrays (#1034)

* Replace bug report MD template with form in view of further automation

* Fix bug report file name

* Update bug_report.yml

* Update bug_report.yml

* Update bug_report.yml

* Update bug_report.yml

* Auto generated release notes and changelog (#974)

* wip: Initial release draft and changelog updater actions configuration

* doc: pr title style guide in contibuting.md

* ci: improved release draft templates

* ci: extra release draft categories

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Tutorial note about local and global printing (#972)

* doc: parallel tutorial note metioning local and global printing

* doc: extenden local print note with ``ht.local_printing()``

* Fix typo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Updated the tutorial document. (#977)

* Updated the tutorial document.

1. Corrected the spelling mistake -> (sigular to single)
2. Corrected the statement -> the number of dimensions is the rank of the array.
3. Made 2 more small changes.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typo

Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Set write permissions for workflow

* Update schedule

* Update schedule

* Update schedule

* Move pytorch version file out of workflows dir

* Update paths

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 22.3.0 → 22.6.0](psf/black@22.3.0...22.6.0)

* Push pytorch release update to release/1.2.x branch, not main

* Update schedule

* Bypass  `on push` trigger

* Update schedule

* Fix condition syntax

* Fix syntax

* On push trigger workaround

* Update schedule

* Update schedule

* Enable non-negative sample size

* Read `min` value directly from torch return object

* Enable non-negative number of samples for `logspace`

* Add test for `logspace`

* Add MPI version field to bug report template

* fix: set cuda rng state on gpu tests for test_random.py (#1014)

* Test latest pyorch on both main and release branch

* Move pytorch release record out of workflows directory

* Update paths

* New PyTorch release

* Temporarily remove trigger

* Update pytorch-latest.txt

* Reinstate trigger

* New PyTorch release

* Remove matrix strategy

* Update pytorch-latest.txt

* New PyTorch release

* New PyTorch release

* fix: set cuda rng state on gpu tests for test_random.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added tests for python 3.9 and pytorch 1.12

Co-authored-by: Claudia Comito <c.comito@fz-juelich.de>
Co-authored-by: Daniel Coquelin <daniel.coquelin@gmail.com>
Co-authored-by: ClaudiaComito <c.comito@fz-juelich.de@users.noreply.github.com>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate (#1024)

updates:
- [github.com/psf/black: 22.6.0 → 22.8.0](psf/black@22.6.0...22.8.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>

* Refactored code for readability

* rename file and activate force push

* Update bug_report.yml

fixes formatting issues

* Update bug_report.yml

fixes an issue where the bug label is not set.

* Update README.md

Use status badge from a different workflow action

* Update codecov.yml

* Update codecov.yml

* Fixed code checking for non-matching local shapes while using is_split + Added test

* Add section `Google Summer of Code 2022`

* Bug/1017 `prod` / `sum` with empty arrays (#1018)

* Check for split in `__reduce_op`

* Check whether x is distributed

Co-authored-by: mtar <m.tarnawa@fz-juelich.de>

Co-authored-by: mtar <m.tarnawa@fz-juelich.de>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>

* Add section "Array API"

* Mirror Repository and run GitHub CI at HZDR (#1032)

* Update ci worflow action

* Update codecov.yml

* Bug/999 Fix `keepdim` in `any`/`all` (#1000)

* Fix `all`

* Fix `any`

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add distributed tests

* Expanded tests for combination of axis/split axis

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: mtar <m.tarnawa@fz-juelich.de>

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/psf/black: 22.8.0 → 22.10.0](psf/black@22.8.0...22.10.0)

* Updated error message

Co-authored-by: Claudia Comito <c.comito@fz-juelich.de>
Co-authored-by: Claudia Comito <39374113+ClaudiaComito@users.noreply.github.com>
Co-authored-by: JuanPedroGHM <juanpedroghm@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: SaiSuraj27 <87087741+SaiSuraj27@users.noreply.github.com>
Co-authored-by: neosunhan <neosunhan@gmail.com>
Co-authored-by: Markus Goetz <markus.goetz@kit.edu>
Co-authored-by: mtar <m.tarnawa@fz-juelich.de>
Co-authored-by: Daniel Coquelin <daniel.coquelin@gmail.com>
Co-authored-by: ClaudiaComito <c.comito@fz-juelich.de@users.noreply.github.com>
Co-authored-by: neosunhan <97215518+neosunhan@users.noreply.github.com>
@mtar mtar removed the PR talk label Sep 11, 2023
Labels
testing (Implementation of tests, or test-related issues)
Development

Successfully merging this pull request may close these issues.

[Bug]: test_random fails on GPU
4 participants