[RLlib] Reverse learner queue behavior of IMPALA/APPO (consume oldest batches first, instead of newest, BUT drop oldest batches if queue full). #48702

sven1977 · 2024-11-12T16:49:38Z

Reverse learner queue behavior of IMPALA/APPO:

- Consume the oldest batches first (from the left side of the deque) instead of the newest, BUT drop the oldest batches if the queue is full.
- When consuming from the right side (newest batches first) with a queue size > 1, the learner consumes "normal" batches most of the time but occasionally grabs a very old batch from the left side of the deque, which can suddenly destabilize learning.
- We leave the default learner_queue_size at 3 for now.
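
To make the staleness argument above concrete, here is a minimal toy sketch, not the RLlib code. It assumes the learner queue behaves like a `collections.deque` with `maxlen` set (so appending on the right to a full queue drops the oldest entry on the left). The arrival schedule `ARRIVALS` and the helpers `simulate` and `overflow_demo` are hypothetical names made up for illustration; each "batch" is just the tick at which it was produced.

```python
from collections import deque

# Hypothetical toy schedule: number of batches arriving at each tick.
ARRIVALS = [2, 1, 1, 1, 0, 0, 1, 1]


def simulate(consume_newest, queue_size=3, arrivals=ARRIVALS):
    """Return the age (in ticks) of each batch the toy learner consumes."""
    # Assumption: a bounded deque; a full queue drops its oldest (leftmost) item.
    queue = deque(maxlen=queue_size)
    ages = []
    for tick, num_new in enumerate(arrivals):
        for _ in range(num_new):
            queue.append(tick)  # a batch is represented by its creation tick
        if not queue:
            continue  # nothing to learn from on this tick
        # Old behavior: pop() from the right (newest first).
        # New behavior: popleft() from the left (oldest first).
        created = queue.pop() if consume_newest else queue.popleft()
        ages.append(tick - created)
    return ages


def overflow_demo(queue_size=3, num_batches=5):
    """Show which batches survive when more arrive than the queue can hold."""
    queue = deque(maxlen=queue_size)
    for batch_id in range(num_batches):
        queue.append(batch_id)
    return list(queue)


if __name__ == "__main__":
    print("newest first:", simulate(consume_newest=True))
    print("oldest first:", simulate(consume_newest=False))
    print("after overflow:", overflow_demo())
```

In this toy schedule, newest-first consumption sees fresh batches almost every tick and then a single batch that is 4 ticks old once arrivals pause, whereas oldest-first consumption never sees a batch older than 1 tick. Real staleness depends on sampling and learner throughput, so treat this only as an illustration of the failure mode.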

Why are these changes needed?

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>

simonsays1980

LGTM. Excited to see APPO's performance now.

simonsays1980 · 2024-11-12T17:00:24Z

rllib/algorithms/impala/impala_learner.py

@@ -302,7 +302,11 @@ def step(self):
            if not self._in_queue:
                time.sleep(0.001)
                return
-            ma_batch_on_gpu = self._in_queue.pop()
+            # Consume from the left (oldest batches first).


Yeah :) Probably an important change


… batches first, instead of newest, BUT drop oldest batches if queue full). (ray-project#48702) Signed-off-by: hjiang <dentinyhao@gmail.com>

wip

de9ecf8

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested a review from simonsays1980 as a code owner November 12, 2024 16:49

simonsays1980 approved these changes Nov 12, 2024

wip

d8c1c47

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 enabled auto-merge (squash) November 12, 2024 17:15

github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Nov 12, 2024

fix

4cf1f15

Signed-off-by: sven1977 <svenmika1977@gmail.com>

github-actions bot disabled auto-merge November 12, 2024 17:59

sven1977 added 2 commits November 15, 2024 12:34

Merge branch 'master' of https://github.com/ray-project/ray into change_learner_queue_logic_in_impala_from_filo_to_fifo

5d20fb0

wip

57f20af

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 enabled auto-merge (squash) November 15, 2024 11:36

github-actions bot disabled auto-merge November 15, 2024 11:36

sven1977 enabled auto-merge (squash) November 15, 2024 13:14

sven1977 merged commit b33fa04 into ray-project:master Nov 15, 2024
7 checks passed

sven1977 deleted the change_learner_queue_logic_in_impala_from_filo_to_fifo branch November 22, 2024 07:05
