Simplify progress bar args #1108

Merged: 16 commits, Apr 2, 2020
22 changes: 10 additions & 12 deletions CHANGELOG.md
@@ -21,16 +21,14 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Changed

- Changed `progress_bar_refresh_rate` trainer flag to disable progress bar when set to 0. ([#1108](https://github.com/PyTorchLightning/pytorch-lightning/pull/1108))
- Changed default behaviour of `configure_optimizers` to use no optimizer rather than Adam. ([#1279](https://github.com/PyTorchLightning/pytorch-lightning/pull/1279))
- Added support for optimizer frequencies through `LightningModule.configure_optimizers()` ([#1269](https://github.com/PyTorchLightning/pytorch-lightning/pull/1269))
- Added support for `IterableDataset` when `val_check_interval=1.0` (default), this will trigger validation at the end of each epoch. ([#1283](https://github.com/PyTorchLightning/pytorch-lightning/pull/1283))
- Added `summary` method to Profilers. ([#1259](https://github.com/PyTorchLightning/pytorch-lightning/pull/1259))
- Added informative errors if user defined dataloader has zero length ([#1280](https://github.com/PyTorchLightning/pytorch-lightning/pull/1280))
- Allowed uploading models to W&B ([#1339](https://github.com/PyTorchLightning/pytorch-lightning/pull/1339))
- Added model configuration checking ([#1199](https://github.com/PyTorchLightning/pytorch-lightning/pull/1199))

### Changed

- On DP and DDP2, unsqueeze is now automated ([#1319](https://github.com/PyTorchLightning/pytorch-lightning/pull/1319))
- No longer interferes with a default sampler ([#1318](https://github.com/PyTorchLightning/pytorch-lightning/pull/1318))
- Enhanced `load_from_checkpoint` to also forward params to the model ([#1307](https://github.com/PyTorchLightning/pytorch-lightning/pull/1307))
@@ -39,6 +37,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
### Deprecated

- Deprecated Trainer argument `print_nan_grads` ([#1097](https://github.com/PyTorchLightning/pytorch-lightning/pull/1097))
- Deprecated Trainer argument `show_progress_bar` ([#1108](https://github.com/PyTorchLightning/pytorch-lightning/pull/1108))

### Removed

@@ -67,9 +66,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Added

- Added automatic sampler setup. Depending on DDP or TPU, lightning configures the sampler correctly (user needs to do nothing) ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
- Added `reload_dataloaders_every_epoch=False` flag for trainer. Some users require reloading data every epoch ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
- Added `progress_bar_refresh_rate=50` flag for trainer. Throttle refresh rate on notebooks ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
- Updated governance docs
- Added a check to ensure that the metric used for early stopping exists before training commences ([#542](https://github.com/PyTorchLightning/pytorch-lightning/pull/542))
- Added `optimizer_idx` argument to `backward` hook ([#733](https://github.com/PyTorchLightning/pytorch-lightning/pull/733))
@@ -92,7 +91,6 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Added TPU gradient clipping ([#963](https://github.com/PyTorchLightning/pytorch-lightning/pull/963))
- Added max/min number of steps in `Trainer` ([#728](https://github.com/PyTorchLightning/pytorch-lightning/pull/728))


### Changed

- Improved `NeptuneLogger` by adding `close_after_fit` argument to allow logging after training ([#908](https://github.com/PyTorchLightning/pytorch-lightning/pull/1084))
@@ -104,17 +102,17 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Froze models' `hparams` as a `Namespace` property ([#1029](https://github.com/PyTorchLightning/pytorch-lightning/pull/1029))
- Dropped `logging` config in package init ([#1015](https://github.com/PyTorchLightning/pytorch-lightning/pull/1015))
- Renamed model steps ([#1051](https://github.com/PyTorchLightning/pytorch-lightning/pull/1051))
* `training_end` >> `training_epoch_end`
* `validation_end` >> `validation_epoch_end`
* `test_end` >> `test_epoch_end`
- `training_end` >> `training_epoch_end`
- `validation_end` >> `validation_epoch_end`
- `test_end` >> `test_epoch_end`
- Refactored dataloading to support infinite dataloaders ([#955](https://github.com/PyTorchLightning/pytorch-lightning/pull/955))
- Create single file in `TensorBoardLogger` ([#777](https://github.com/PyTorchLightning/pytorch-lightning/pull/777))

### Deprecated

- Deprecated `pytorch_lightning.logging` ([#767](https://github.com/PyTorchLightning/pytorch-lightning/pull/767))
- Deprecated `LightningModule.load_from_metrics` in favour of `LightningModule.load_from_checkpoint` ([#995](https://github.com/PyTorchLightning/pytorch-lightning/pull/995), [#1079](https://github.com/PyTorchLightning/pytorch-lightning/pull/1079))
- Deprecated `@data_loader` decorator ([#926](https://github.com/PyTorchLightning/pytorch-lightning/pull/926))
- Deprecated model steps `training_end`, `validation_end` and `test_end` ([#1051](https://github.com/PyTorchLightning/pytorch-lightning/pull/1051), [#1056](https://github.com/PyTorchLightning/pytorch-lightning/pull/1056))

### Removed
@@ -305,7 +303,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
### Added

- Added the flag `log_gpu_memory` to `Trainer` to deactivate logging of GPU memory utilization
- Added SLURM resubmit functionality (port from test-tube)
- Added optional `weight_save_path` to trainer to remove the need for a `checkpoint_callback` when using cluster training
- Added option to use single gpu per node with `DistributedDataParallel`
10 changes: 4 additions & 6 deletions pytorch_lightning/trainer/__init__.py
@@ -646,6 +646,8 @@ def on_train_end(self):
# default used by the Trainer
trainer = Trainer(progress_bar_refresh_rate=1)

# disable progress bar
trainer = Trainer(progress_bar_refresh_rate=0)

reload_dataloaders_every_epoch
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -702,12 +704,8 @@ def on_train_end(self):
show_progress_bar
^^^^^^^^^^^^^^^^^

If true shows tqdm progress bar

Example::

# default used by the Trainer
trainer = Trainer(show_progress_bar=True)
.. warning:: .. deprecated:: 0.7.2
    Use `progress_bar_refresh_rate` instead; set it to 0 to disable the progress bar. Will be removed in 0.9.0.
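For reference, a minimal migration sketch for this deprecation (not part of the diff; the values shown are illustrative):

```python
from pytorch_lightning import Trainer

# before (deprecated in 0.7.2): boolean flag
# trainer = Trainer(show_progress_bar=False)

# after: the refresh rate both enables and throttles the bar
trainer = Trainer(progress_bar_refresh_rate=0)   # 0 disables the progress bar
trainer = Trainer(progress_bar_refresh_rate=20)  # update every 20 steps, e.g. for notebooks
```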

test_percent_check
^^^^^^^^^^^^^^^^^^
19 changes: 19 additions & 0 deletions pytorch_lightning/trainer/deprecated_api.py
@@ -87,3 +87,22 @@ def nb_sanity_val_steps(self, nb):
"`num_sanity_val_steps` since v0.5.0"
" and this method will be removed in v0.8.0", DeprecationWarning)
self.num_sanity_val_steps = nb


class TrainerDeprecatedAPITillVer0_9(ABC):

    def __init__(self):
        super().__init__()  # mixin calls super too

    @property
    def show_progress_bar(self):
        """Back compatibility, will be removed in v0.9.0"""
        warnings.warn("Argument `show_progress_bar` is now set by `progress_bar_refresh_rate` since v0.7.2"
                      " and this method will be removed in v0.9.0", DeprecationWarning)
        return self.progress_bar_refresh_rate >= 1

    @show_progress_bar.setter
    def show_progress_bar(self, tf):
        """Back compatibility, will be removed in v0.9.0"""
        warnings.warn("Argument `show_progress_bar` is now set by `progress_bar_refresh_rate` since v0.7.2"
                      " and this method will be removed in v0.9.0", DeprecationWarning)
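A rough sketch of how this shim behaves once mixed into `Trainer` (see the `trainer.py` changes below); reading the old attribute is derived from the new flag, while assigning it only emits a `DeprecationWarning`:

```python
import warnings

from pytorch_lightning import Trainer

trainer = Trainer(progress_bar_refresh_rate=0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    enabled = trainer.show_progress_bar   # maps to progress_bar_refresh_rate >= 1
    trainer.show_progress_bar = True      # warns; the refresh rate is left unchanged

assert enabled is False
assert all(issubclass(w.category, DeprecationWarning) for w in caught)
```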
2 changes: 1 addition & 1 deletion pytorch_lightning/trainer/distrib_data_parallel.py
@@ -281,7 +281,7 @@ def ddp_train(self, gpu_idx, model):
self.node_rank = 0

# show progressbar only on progress_rank 0
self.show_progress_bar = self.show_progress_bar and self.node_rank == 0 and gpu_idx == 0
self.progress_bar_refresh_rate = self.progress_bar_refresh_rate if self.node_rank == 0 and gpu_idx == 0 else 0

# determine which process we are and world size
if self.use_ddp:
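The changed line above keeps the configured refresh rate only on the global rank-0 process and zeroes it everywhere else, so a multi-process DDP run draws a single progress bar. A stand-alone sketch of the same gating pattern (function and argument names here are illustrative, not part of the codebase):

```python
def gate_refresh_rate(refresh_rate: int, node_rank: int, local_rank: int) -> int:
    """Keep the refresh rate on the global rank-0 process, disable the bar elsewhere."""
    return refresh_rate if (node_rank == 0 and local_rank == 0) else 0

assert gate_refresh_rate(1, node_rank=0, local_rank=0) == 1  # rank 0 keeps its bar
assert gate_refresh_rate(1, node_rank=0, local_rank=3) == 0  # other GPUs on node 0: disabled
assert gate_refresh_rate(1, node_rank=2, local_rank=0) == 0  # other nodes: disabled
```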
2 changes: 1 addition & 1 deletion pytorch_lightning/trainer/distrib_parts.py
@@ -480,7 +480,7 @@ def tpu_train(self, tpu_core_idx, model):
self.tpu_global_core_rank = xm.get_ordinal()

# avoid duplicating progress bar
self.show_progress_bar = self.show_progress_bar and self.tpu_global_core_rank == 0
self.progress_bar_refresh_rate = self.progress_bar_refresh_rate if self.tpu_global_core_rank == 0 else 0

# track current tpu
self.current_tpu_idx = tpu_core_idx
5 changes: 2 additions & 3 deletions pytorch_lightning/trainer/evaluation_loop.py
@@ -163,7 +163,6 @@ class TrainerEvaluationLoopMixin(ABC):
num_val_batches: int
fast_dev_run: ...
process_position: ...
show_progress_bar: ...
process_output: ...
training_tqdm_dict: ...
proc_rank: int
@@ -278,7 +277,7 @@ def _evaluate(self, model: LightningModule, dataloaders, max_batches: int, test_
dl_outputs.append(output)

# batch done
if batch_idx % self.progress_bar_refresh_rate == 0:
if self.progress_bar_refresh_rate >= 1 and batch_idx % self.progress_bar_refresh_rate == 0:
if test_mode:
self.test_progress_bar.update(self.progress_bar_refresh_rate)
else:
@@ -361,7 +360,7 @@ def run_evaluation(self, test_mode: bool = False):
desc = 'Testing' if test_mode else 'Validating'
total = max_batches if max_batches != float('inf') else None
pbar = tqdm(desc=desc, total=total, leave=test_mode, position=position,
disable=not self.show_progress_bar, dynamic_ncols=True, file=sys.stdout)
disable=not self.progress_bar_refresh_rate, dynamic_ncols=True, file=sys.stdout)
setattr(self, f'{"test" if test_mode else "val"}_progress_bar', pbar)

# run evaluation
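Two details in this file work together when the refresh rate is 0: `disable=not self.progress_bar_refresh_rate` switches the tqdm bar off entirely, and the added `>= 1` guard in the update check avoids a `ZeroDivisionError` from the modulo. A small self-contained sketch of that pattern (variable names are illustrative):

```python
from tqdm import tqdm

refresh_rate = 0  # 0 disables the bar; any positive integer enables and throttles it

pbar = tqdm(total=100, disable=not refresh_rate)  # falsy rate -> bar fully disabled
for batch_idx in range(100):
    # without the `refresh_rate >= 1` guard, `batch_idx % 0` would raise ZeroDivisionError
    if refresh_rate >= 1 and batch_idx % refresh_rate == 0:
        pbar.update(refresh_rate)
pbar.close()
```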
20 changes: 13 additions & 7 deletions pytorch_lightning/trainer/trainer.py
@@ -21,7 +21,8 @@
from pytorch_lightning.trainer.callback_config import TrainerCallbackConfigMixin
from pytorch_lightning.trainer.callback_hook import TrainerCallbackHookMixin
from pytorch_lightning.trainer.data_loading import TrainerDataLoadingMixin
from pytorch_lightning.trainer.deprecated_api import TrainerDeprecatedAPITillVer0_8
from pytorch_lightning.trainer.deprecated_api import (TrainerDeprecatedAPITillVer0_8,
TrainerDeprecatedAPITillVer0_9)
from pytorch_lightning.trainer.distrib_data_parallel import TrainerDDPMixin
from pytorch_lightning.trainer.distrib_parts import TrainerDPMixin, parse_gpu_ids, determine_root_gpu_device
from pytorch_lightning.trainer.evaluation_loop import TrainerEvaluationLoopMixin
@@ -66,12 +67,13 @@ class Trainer(
TrainerCallbackConfigMixin,
TrainerCallbackHookMixin,
TrainerDeprecatedAPITillVer0_8,
TrainerDeprecatedAPITillVer0_9,
):
DEPRECATED_IN_0_8 = (
'gradient_clip', 'nb_gpu_nodes', 'max_nb_epochs', 'min_nb_epochs',
'add_row_log_interval', 'nb_sanity_val_steps'
)
DEPRECATED_IN_0_9 = ('use_amp',)
DEPRECATED_IN_0_9 = ('use_amp', 'show_progress_bar')

def __init__(
self,
@@ -86,7 +88,7 @@ def __init__(
gpus: Optional[Union[List[int], str, int]] = None,
num_tpu_cores: Optional[int] = None,
log_gpu_memory: Optional[str] = None,
show_progress_bar: bool = True,
show_progress_bar=None, # backward compatible, todo: remove in v0.9.0
progress_bar_refresh_rate: int = 1,
overfit_pct: float = 0.0,
track_grad_norm: int = -1,
@@ -161,9 +163,11 @@ def __init__(

log_gpu_memory: None, 'min_max', 'all'. Might slow performance

show_progress_bar: If true shows tqdm progress bar
show_progress_bar:
    .. warning:: .. deprecated:: 0.7.2
        Set `progress_bar_refresh_rate` to a positive integer to enable the progress bar (0 disables it). Will be removed in 0.9.0.

progress_bar_refresh_rate: How often to refresh progress bar (in steps)
progress_bar_refresh_rate: How often to refresh progress bar (in steps). Value ``0`` disables progress bar.

overfit_pct: How much of training-, validation-, and test dataset to check.

@@ -414,7 +418,9 @@ def __init__(

# can't init progress bar here because starting a new process
# means the progress_bar won't survive pickling
self.show_progress_bar = show_progress_bar
# backward compatibility
if show_progress_bar is not None:
    self.show_progress_bar = show_progress_bar

# logging
self.log_save_interval = log_save_interval
@@ -820,7 +826,7 @@ def run_pretrain_routine(self, model: LightningModule):
pbar = tqdm(desc='Validation sanity check',
total=self.num_sanity_val_steps * len(self.val_dataloaders),
leave=False, position=2 * self.process_position,
disable=not self.show_progress_bar, dynamic_ncols=True)
disable=not self.progress_bar_refresh_rate, dynamic_ncols=True)
self.main_progress_bar = pbar
# dummy validation progress bar
self.val_progress_bar = tqdm(disable=True)
2 changes: 1 addition & 1 deletion pytorch_lightning/trainer/training_loop.py
@@ -623,7 +623,7 @@ def optimizer_closure():
self.get_model().on_batch_end()

# update progress bar
if batch_idx % self.progress_bar_refresh_rate == 0:
if self.progress_bar_refresh_rate >= 1 and batch_idx % self.progress_bar_refresh_rate == 0:
self.main_progress_bar.update(self.progress_bar_refresh_rate)
self.main_progress_bar.set_postfix(**self.training_tqdm_dict)

8 changes: 2 additions & 6 deletions tests/models/test_amp.py
@@ -21,7 +21,6 @@ def test_amp_single_gpu(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=True,
max_epochs=1,
gpus=1,
distributed_backend='ddp',
@@ -42,7 +41,6 @@ def test_no_amp_single_gpu(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=True,
max_epochs=1,
gpus=1,
distributed_backend='dp',
@@ -66,7 +64,6 @@ def test_amp_gpu_ddp(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=True,
max_epochs=1,
gpus=2,
distributed_backend='ddp',
@@ -90,7 +87,6 @@ def test_amp_gpu_ddp_slurm_managed(tmpdir):
model = LightningTestModel(hparams)

trainer_options = dict(
show_progress_bar=True,
max_epochs=1,
gpus=[0],
distributed_backend='ddp',
@@ -128,8 +124,8 @@ def test_cpu_model_with_amp(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=False,
logger=tutils.get_default_testtube_logger(tmpdir),
progress_bar_refresh_rate=0,
logger=tutils.get_test_tube_logger(tmpdir),
max_epochs=1,
train_percent_check=0.4,
val_percent_check=0.4,
15 changes: 7 additions & 8 deletions tests/models/test_cpu.py
@@ -27,7 +27,6 @@ def test_early_stopping_cpu_model(tmpdir):
gradient_clip_val=1.0,
overfit_pct=0.20,
track_grad_norm=2,
show_progress_bar=True,
logger=tutils.get_default_testtube_logger(tmpdir),
train_percent_check=0.1,
val_percent_check=0.1,
@@ -48,7 +47,7 @@ def test_lbfgs_cpu_model(tmpdir):
trainer_options = dict(
default_save_path=tmpdir,
max_epochs=2,
show_progress_bar=False,
progress_bar_refresh_rate=0,
weights_summary='top',
train_percent_check=1.0,
val_percent_check=0.2,
@@ -67,7 +66,7 @@ def test_default_logger_callbacks_cpu_model(tmpdir):
max_epochs=1,
gradient_clip_val=1.0,
overfit_pct=0.20,
show_progress_bar=False,
progress_bar_refresh_rate=0,
train_percent_check=0.01,
val_percent_check=0.01,
)
@@ -95,7 +94,7 @@ def test_running_test_after_fitting(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=False,
progress_bar_refresh_rate=0,
max_epochs=8,
train_percent_check=0.4,
val_percent_check=0.2,
@@ -133,7 +132,7 @@ class CurrentTestModel(LightTrainDataloader, LightTestMixin, TestModelBase):
checkpoint = tutils.init_checkpoint_callback(logger)

trainer_options = dict(
show_progress_bar=False,
progress_bar_refresh_rate=0,
max_epochs=1,
train_percent_check=0.4,
val_percent_check=0.2,
@@ -226,7 +225,7 @@ def test_cpu_model(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=False,
progress_bar_refresh_rate=0,
logger=tutils.get_default_testtube_logger(tmpdir),
max_epochs=1,
train_percent_check=0.4,
@@ -247,7 +246,7 @@ def test_all_features_cpu_model(tmpdir):
gradient_clip_val=1.0,
overfit_pct=0.20,
track_grad_norm=2,
show_progress_bar=False,
progress_bar_refresh_rate=0,
logger=tutils.get_default_testtube_logger(tmpdir),
accumulate_grad_batches=2,
max_epochs=1,
@@ -344,7 +343,7 @@ def test_single_gpu_model(tmpdir):

trainer_options = dict(
default_save_path=tmpdir,
show_progress_bar=False,
progress_bar_refresh_rate=0,
max_epochs=1,
train_percent_check=0.1,
val_percent_check=0.1,