[rllib] Support batch norm layers #3369

ericl · 2018-11-21T02:22:32Z

What do these changes do?

Pass the correct is_training tensor to build_layers_v2.
Run the required update ops automatically when gradients are applied.
Add an example script in test_batch_norm.py.

I don't think this will work for algorithms such as A3C, which apply gradient updates separately, but it should work fine in the other execution modes.
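
For context, here is a minimal sketch of why both changes matter, written in plain Python rather than TensorFlow. The class name ToyBatchNorm and its parameters are hypothetical and are not RLlib code. In training mode a batch norm layer normalizes with the statistics of the current batch and refreshes its moving averages (in TensorFlow 1.x those refreshes are ops collected under tf.GraphKeys.UPDATE_OPS, which do not run unless something depends on them). In inference mode it normalizes with the stored moving averages.

```python
import math


class ToyBatchNorm:
    """Single-feature batch norm, for illustration only (hypothetical, not RLlib)."""

    def __init__(self, momentum=0.9, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # Moving statistics used whenever is_training is False.
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, is_training):
        if is_training:
            # Training: normalize with the statistics of this batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # The "update op": without this step the moving statistics
            # never change and inference keeps using the initial values.
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference (e.g. computing actions): use the moving statistics.
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]
```

If the flag is never set to True, or the update step is skipped during training, the moving averages stay at their initial values and action computation normalizes with stale statistics.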

Related issue number

Closes: #2023

AmplabJenkins · 2018-11-21T03:54:19Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9498/

AmplabJenkins · 2018-11-22T04:52:32Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9528/

AmplabJenkins · 2018-11-22T12:05:31Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9542/

AmplabJenkins · 2018-11-24T21:36:59Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9574/

ericl · 2018-11-27T09:12:34Z

Ping @richardliaw

AmplabJenkins · 2018-11-27T12:17:57Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9630/

python/ray/rllib/agents/ppo/ppo.py

richardliaw · 2018-11-28T22:29:41Z

python/ray/rllib/evaluation/tf_policy_graph.py

@@ -151,7 +164,7 @@ def build_compute_actions(self,
            builder.add_feed_dict({self._prev_action_input: prev_action_batch})
        if self._prev_reward_input is not None and prev_reward_batch:
            builder.add_feed_dict({self._prev_reward_input: prev_reward_batch})
-        builder.add_feed_dict({self._is_training: is_training})
+        builder.add_feed_dict({self._is_training: False})


Where in the code is _is_training set to True?

ray/python/ray/rllib/evaluation/tf_policy_graph.py

Line 227 in dedd3a8

builder.add_feed_dict({self._is_training: True})
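
To make the exchange above concrete, here is a sketch of the two code paths. It assumes the referenced line sits on a training path, since the excerpt does not show which method contains it. FeedBuilder, build_compute_actions_feed, build_training_feed, and the string keys are hypothetical stand-ins, not the tf_policy_graph.py implementation.

```python
IS_TRAINING = "is_training"  # stand-in for the self._is_training placeholder


class FeedBuilder:
    """Hypothetical stand-in for the builder object seen in the diff."""

    def __init__(self):
        self.feed_dict = {}

    def add_feed_dict(self, feed_dict):
        # Merge new entries into the feed dict accumulated so far.
        self.feed_dict.update(feed_dict)


def build_compute_actions_feed(builder, obs_batch):
    # Action computation is inference, so batch norm must use its
    # moving statistics: the flag is always False here.
    builder.add_feed_dict({"obs": obs_batch})
    builder.add_feed_dict({IS_TRAINING: False})


def build_training_feed(builder, train_batch):
    # Assumed training path: batch statistics are used and the
    # moving averages are updated, so the flag is True.
    builder.add_feed_dict({"train_batch": train_batch})
    builder.add_feed_dict({IS_TRAINING: True})
```

With this split, callers no longer pass the flag themselves; each path fixes it to the value that path needs.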

richardliaw

A few questions


AmplabJenkins · 2018-11-29T11:32:14Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9660/

AmplabJenkins · 2018-11-29T11:37:36Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/9661/

ericl added 2 commits November 20, 2018 18:15

batch norm

49feadc

lint

1fee87e

ericl assigned richardliaw Nov 21, 2018

ericl added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 21, 2018

fix dqn/ddpg update ops

e340734

ericl added 2 commits November 22, 2018 00:13

Merge remote-tracking branch 'upstream/master' into batch-norm

ed85249

bn model

1908a74

ericl removed the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 22, 2018

ericl added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 22, 2018

ericl added 2 commits November 24, 2018 13:12

Update tf_policy_graph.py

bfb0fc0

Update multi_gpu_impl.py

3819373

Merge branch 'master' into batch-norm

dedd3a8

richardliaw reviewed Nov 28, 2018

python/ray/rllib/agents/ppo/ppo.py (outdated, resolved)

richardliaw reviewed Nov 28, 2018

richardliaw and others added 2 commits November 29, 2018 00:30

Apply suggestions from code review

c52d4bd

Co-Authored-By: ericl <ekhliang@gmail.com>

Merge branch 'master' into batch-norm

b4a4a4a

richardliaw approved these changes Nov 29, 2018

ericl merged commit 07d8cbf into ray-project:master Nov 29, 2018
