-
Notifications
You must be signed in to change notification settings - Fork 610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[🛠️] Change Command Scheduler to fix iterator invalidation bugs and get rid of "hacks" in C++ and Java #6593
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't the iteration being done via an iterator (and then the removal also being done through the iterator)?
There's already endingCommands
, and this adds another toRemove
; I feel we're hacking this implementation patch on patch and it's not great.
This diverges from Java.
This changes behavior, as with this isScheduled
will return true
until the following iteration and not immediately when isFinished
returns false. As an example of practical change, proxies will now end the iteration after the proxied command (which tbh might be better than the current situation where it depends on the map iteration order).
Add unit tests based on the code used to find this.
Sort of. It used a range for, which has an internal iterator. That iterator is invalidated by SmallSet when the current object is removed, which means the for loop next iteration is incrementing from an invalid iterator (UB). Even manually managing the iterators doesn't work with SmallSet, because it can use a vector implementation, which invalidates all iterators, not just the current one. So the only other possible fix would be to use std::set AND manually manage the iterators instead of using range-for. Java doesn't have this issue--iterator.remove() updates itself in such a way that hasNext() on it is safe for the next loop iteration. |
|
No. Neither does std::set. |
/format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
However, considering that this certainly does have a change in behavior, Java should be changed to match. (We've had enough issues with differences between Java and C++, so it's best to keep them in sync as much as possible.) I'll note that robotpy's command scheduler avoids this issue by iterating over a (shallow) copy of the map/dictionary, but it still has the scheduling and canceling queues (presumably for compatibility?).
Additionally, I'm nervous about partially delaying ending the command- It feels like at least one person will run into an edge case where a "scheduled" command not being linked to any requirements causes strange and difficult-to-debug behavior.
wpilibNewCommands/src/main/native/cpp/frc2/command/CommandScheduler.cpp
Outdated
Show resolved
Hide resolved
The way robot.py did it seems like a better way to mitigate this issue, and What do you guys think? |
I agree that copying
|
Whatever behavior is being chosen, if it's going to being a part of the "spec", there should be tests that fail currently and pass afterwards added. |
/format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! (For what it's worth)
I'll just note down the behavioral differences for posterity, but I think these are acceptable (even positive) changes:
command.isScheduled()
isfalse
whencommand.end(boolean)
and the finish or interrupt callbacks are called.- When scheduling a command from the run loop,
command.initialize()
and the schedule callbacks are called immediately, not after the run loop. It remains the case thatcommand.execute()
will only be called from the next call toscheduler.run()
. - When cancelling a command from the run loop ,
command.end(true)
and the finish or interrupt callbacks are called immediately, not after the run loop (and the queued scheduled commands are processed). Notably, if a command (call it A) cancels another command (call it B) that was scheduled after A (and therefore will be processed after A), B is not be processed (execute()
called andisFinished()
checked) in the same run loop.
wpilibNewCommands/src/main/java/edu/wpi/first/wpilibj2/command/CommandScheduler.java
Outdated
Show resolved
Hide resolved
/format |
exact same changes from wpilibsuite/allwpilib#6593
Set.copyOf is memory allocation heavy; this is what the implementation is: return (Set<E>)Set.of(new HashSet<>(coll).toArray()); So it first creates a HashSet, then creates an array of its elements, then creates a Set from that array. For how we're using it here, toArray() would be more efficient. As a further optimization we could even avoid allocations most of the time by smartly reusing the array. |
Note that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs so many tests.
SchedulingRecursionTest
passing is a good sign (or a very bad one, if the tests there aren't working).
All behavior changes, edge cases, and so on need to be tested. Rigorously.
Note also toArray() will null-fill the end elements, not shrink the array, so you will need to look for the first null and exit the loop early if doing this. |
3dc304c
to
c85c91a
Compare
this is also to retrigger workflows
Co-authored-by: Joseph Eng <91924258+KangarooKoala@users.noreply.github.com>
c85c91a
to
2f5c5c4
Compare
synced with main with rebase. Will write new tests and better optimize currently working on it |
4db4256
to
2d54082
Compare
wpilibNewCommands/src/test/native/cpp/frc2/command/CommandScheduleTest.cpp
Outdated
Show resolved
Hide resolved
Co-authored-by: Joseph Eng <91924258+KangarooKoala@users.noreply.github.com>
2d54082
to
7149e43
Compare
fix formatting with rebase + force-push: (I forget to run |
96b0bb9
to
68d669d
Compare
Update the checking for null and isScheduled as discussed in discord with a amend + force-push: |
68d669d
to
896d784
Compare
oops did |
896d784
to
9154940
Compare
I have added 3 different tests testing weird cases of using the scheduler inside the commands. all 3 fail on the main branch and pass on this branch What other tests should I add? @Starlight220? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open for discussion- Should we move some of these tests to SchedulingRecursionTest
?
wpilibNewCommands/src/test/java/edu/wpi/first/wpilibj2/command/CommandScheduleTest.java
Outdated
Show resolved
Hide resolved
wpilibNewCommands/src/test/java/edu/wpi/first/wpilibj2/command/CommandScheduleTest.java
Outdated
Show resolved
Hide resolved
wpilibNewCommands/src/test/native/cpp/frc2/command/CommandScheduleTest.cpp
Outdated
Show resolved
Hide resolved
wpilibNewCommands/src/test/native/cpp/frc2/command/CommandScheduleTest.cpp
Outdated
Show resolved
Hide resolved
Co-Authored-By: Joseph Eng <91924258+KangarooKoala@users.noreply.github.com>
I say we move test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I say we move test cancelNextCommandTest for sure and also scheduleCommandInCommand since they are both similar to #4259 but keep commandKnowsWhenEndedTest since it feels more like a CommandScheduler test more than a scheduling recursion.
Any objections?
Sounds good!
scheduledCommands.erase()
was being called infor(Command* command : scheduledCommands)
this now gets put into a temporary vector (at the size of the set, I do not think all the commands will end in the same for loop but nevertheless it's the size of the set) and erases them after the for loop is done.
Found together with @PeterJohnson.