[tune](deps): Bump pytorch-lightning from 1.0.3 to 1.1.8 in /python/requirements #10

Conversation

@dependabot dependabot bot commented on behalf of github Feb 14, 2021

Bumps pytorch-lightning from 1.0.3 to 1.1.8.

Release notes

Sourced from pytorch-lightning's releases.

Standard weekly patch release

[1.1.8] - 2021-02-08

Fixed

  • Separate epoch validation from step validation (#5208)
  • Fixed toggle_optimizers not handling all optimizer parameters (#5775)

Contributors

@ananthsub, @rohitgr7

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.1.7] - 2021-02-03

Fixed

  • Fixed TensorBoardLogger not closing SummaryWriter on finalize (#5696)
  • Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
  • Fixed num_classes argument in F1 metric (#5663)
  • Fixed log_dir property (#5537)
  • Fixed a race condition in ModelCheckpoint when checking if a checkpoint file exists (#5144)
  • Remove unnecessary intermediate layers in Dockerfiles (#5697)
  • Fixed auto learning rate ordering (#5638)

Contributors

@awaelchli @guillochon @noamzilo @rohitgr7 @SkafteNicki @sumanthratna

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.1.6] - 2021-01-26

Changed

  • Increased TPU check timeout from 20s to 100s (#5598)
  • Ignored step param in Neptune logger's log_metric method (#5510)
  • Pass batch outputs to on_train_batch_end instead of epoch_end outputs (#4369)

Fixed

  • Fixed toggle_optimizer to reset requires_grad state (#5574)
  • Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
  • Fixed an error when logging a progress bar metric with a reserved name (#5620)
  • Fixed Metric's state_dict not included when child modules (#5614)
  • Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
  • Fixed duplicate logs appearing in console when using the python logging module (#5509)
  • Fixed tensor printing in trainer.test() (#5138)
  • Fixed not using dataloader when hparams present (#4559)

... (truncated)

Changelog

Sourced from pytorch-lightning's changelog.

[1.1.8] - 2021-02-08

Fixed

  • Separate epoch validation from step validation (#5208)
  • Fixed toggle_optimizers not handling all optimizer parameters (#5775)

[1.1.7] - 2021-02-03

Fixed

  • Fixed TensorBoardLogger not closing SummaryWriter on finalize (#5696)
  • Fixed filtering of pytorch "unsqueeze" warning when using DP (#5622)
  • Fixed num_classes argument in F1 metric (#5663)
  • Fixed log_dir property (#5537)
  • Fixed a race condition in ModelCheckpoint when checking if a checkpoint file exists (#5144)
  • Remove unnecessary intermediate layers in Dockerfiles (#5697)
  • Fixed auto learning rate ordering (#5638)

[1.1.6] - 2021-01-26

Changed

  • Increased TPU check timeout from 20s to 100s (#5598)
  • Ignored step param in Neptune logger's log_metric method (#5510)
  • Pass batch outputs to on_train_batch_end instead of epoch_end outputs (#4369)

Fixed

  • Fixed toggle_optimizer to reset requires_grad state (#5574)
  • Fixed FileNotFoundError for best checkpoint when using DDP with Hydra (#5629)
  • Fixed an error when logging a progress bar metric with a reserved name (#5620)
  • Fixed Metric's state_dict not included when child modules (#5614)
  • Fixed Neptune logger creating multiple experiments when GPUs > 1 (#3256)
  • Fixed duplicate logs appearing in console when using the python logging module (#5509)
  • Fixed tensor printing in trainer.test() (#5138)
  • Fixed not using dataloader when hparams present (#4559)

[1.1.5] - 2021-01-19

Fixed

  • Fixed a visual bug in the progress bar display initialization (#4579)
  • Fixed logging on_train_batch_end in a callback with multiple optimizers (#5521)
  • Fixed reinit_scheduler_properties with correct optimizer (#5519)
  • Fixed val_check_interval with fast_dev_run (#5540)

... (truncated)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Feb 14, 2021
@dependabot dependabot bot force-pushed the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch from 63128ec to 1c83739 Compare February 16, 2021 03:38
@dependabot dependabot bot force-pushed the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch from 1c83739 to ad0e4b5 Compare February 18, 2021 07:15

dependabot bot commented on behalf of github Feb 20, 2021

Superseded by #13.

@dependabot dependabot bot closed this Feb 20, 2021
@dependabot dependabot bot deleted the dependabot/pip/python/requirements/pytorch-lightning-1.1.8 branch February 20, 2021 08:02
rkooo567 pushed a commit that referenced this pull request Jul 27, 2022
We encountered SIGSEGV when running Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0  0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) ()
   from /lib64/libstdc++.so.6
#1  0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2  0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3  0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4  0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] ()
   from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5  0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6  0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>)
    at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when loading `_raylet.so`, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`).

It's better not to rely on the initialization order of static variables in different compilation units because it's not guaranteed. I propose to change all `RAY_LOG`s to `std::cerr` in `DelayManager::Init()`.

The crash happens in Ant's internal codebase. Not sure why this test case passes in the community version though.

BTW, I've tried different approaches:

1. Using a static local variable in `get_delay_us` and removing the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as type `std::unique_ptr<DelayManager>` and initializing it in `get_delay_us`. This works, but it requires a lock to be thread-safe.
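
For illustration only, here is a minimal sketch of the initialization-order problem described above and of one general way to sidestep it. It is not the actual Ray code: `LoggerName`, `EarlyUser`, and `early_user` are hypothetical stand-ins. The string lives in a function-local static, which is constructed on first call and is therefore ready even when another static initializer asks for it. The message is written to `std::cerr`, which is usable here because `<iostream>` is included before the static object is defined.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-in for a logger name that would otherwise be a
// namespace-scope static in another compilation unit. A function-local
// static is constructed the first time the function is called, so callers
// never observe it uninitialized.
const std::string& LoggerName() {
  static const std::string name = "ray_log_sink";
  return name;
}

// Hypothetical stand-in for an object with static storage duration whose
// constructor runs while the shared library is being loaded.
struct EarlyUser {
  std::string seen_name;

  EarlyUser() : seen_name(LoggerName()) {
    // The name is already constructed, regardless of which compilation
    // unit's static initializers ran first.
    assert(!seen_name.empty());
    // std::cerr is safe to use here because <iostream> is included above
    // this definition.
    std::cerr << "initialized with logger name " << seen_name << std::endl;
  }
};

// Constructed during static initialization, before main() runs.
EarlyUser early_user;
```

Had `LoggerName()` instead returned a reference to a namespace-scope `std::string` defined in a different compilation unit, the constructor above could have read it before it was constructed, which matches the crash in the stack trace.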
rkooo567 pushed a commit that referenced this pull request Jul 22, 2024
…e script and matching RLModule example class (tiny CNN).. (ray-project#45774)