Fixes for synchronous mode #1803
@nsubiron I use both world.tick() and world.wait_for_tick() when running in synchronous mode at 10 fps, but CARLA suddenly stops, gets stuck at a particular point, and I have to restart the whole simulation. I want to run 2000 episodes for reinforcement learning, but it stops after 350 episodes at most. Do I have to change some settings or initialize CARLA at a different fps rate?
Hi @nsubiron,
The client fails to get the world class.
… and return the frame id when the changes took effect
Reviewed 14 of 14 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @marcgpuig)
Reviewed 14 of 14 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved
@nsubiron Thanks for making this change - I agree with you that the previous We haven't been able to pinpoint the issue, but we suspect a packet drop, out-of-order packets, or some other concurrency problem. Currently, as far as I understand, the tick_cue call is sent from the client to the server via RPC (which immediately returns the next expected frame number), but the actual new frame info / world snapshot is sent back asynchronously over the streaming connection. The client waits indefinitely until this is received, so if it never arrives, the client never sends the next tick_cue call. A hacky solution we've found is to replay the last world snapshot from the server if no new tick_cue call has been received in a while (see: #2038), but this ends up flooding the network with repeated world snapshots, sometimes when they are not needed. Instead, if the entire tick call could execute completely on the server side and directly return the world snapshot (not just the expected frame number), that would be fantastic (and truly a synchronous tick). Does that make sense?
TLDR: I made `tick` and `apply_settings` synchronize automatically with the server so users don't need to manually wait for tick. Unfortunately with this, old recipes using `wait_for_tick` will fail.
Description
Requires #1802.
Currently, we recommend using synchronous mode by calling "tick" and "wait_for_tick" on each iteration
but there is a race condition here: although unlikely, the tick can arrive before we start waiting for it, and we end up in a deadlock. It's a difficult problem to solve because we need to synchronize two servers sending data asynchronously (RPC and streaming). The "tick" method sends a cue to the simulator via RPC, but "wait_for_tick" listens to the tick event received on each update via streaming. See also #1795.
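The lost wake-up described above can be reproduced without CARLA. The following is a minimal sketch, not the simulator's implementation: `TickSignal` and `FrameCounter` are hypothetical toy classes built on Python's `threading.Condition`. The first forgets a notification that fires before the waiter starts listening; the second remembers the latest frame id, so a waiter that arrives late still sees it.

```python
import threading


class TickSignal:
    """Toy stand-in for a stateless tick event (hypothetical, not CARLA code).

    A notification only reaches threads that are already waiting, so a tick
    that fires before wait_for_tick() is called is lost.
    """

    def __init__(self):
        self._cond = threading.Condition()

    def notify_tick(self):
        with self._cond:
            self._cond.notify_all()

    def wait_for_tick(self, timeout):
        # Returns True if notified, False if the timeout expired.
        with self._cond:
            return self._cond.wait(timeout)


class FrameCounter:
    """Toy stand-in that stores the last frame id (hypothetical, not CARLA code).

    Because the state is kept, a waiter that starts late still observes
    that the expected frame has already arrived.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = 0

    def publish(self, frame):
        with self._cond:
            self._frame = frame
            self._cond.notify_all()

    def wait_for_frame(self, frame, timeout):
        # Returns True once the stored frame id has reached `frame`,
        # False if the timeout expired first.
        with self._cond:
            return self._cond.wait_for(lambda: self._frame >= frame, timeout)
```

With `TickSignal`, calling `notify_tick()` and then `wait_for_tick(0.1)` times out; without the timeout it would block forever, which is the deadlock in question. With `FrameCounter`, calling `publish(1)` and then `wait_for_frame(1, 0.1)` returns immediately.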
However, we can synchronize the `tick` method by returning the id of the newly started frame (this way we make sure the tick was applied upon function return, and we also guarantee the id of the frame we're expecting). For convenience, `apply_settings` can also return the id of the frame when the settings took effect; with this, we know for sure at which frame the synchronous mode started.
EDIT: I also made `tick` and `apply_settings` block until the new tick is received, so there is no need to manually loop waiting for the state to be updated.
I have written a Python example using these two changes to synchronize the output of several sensors. I think we can add something similar to the API, maybe in C++, to make the synchronous mode easier to use.
EDIT: Now this example is in synchronous_mode.py.
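As a rough illustration of how a returned frame id can be used to line up several sensors, here is a minimal sketch that runs without CARLA. It is not the code from synchronous_mode.py: `Measurement`, `retrieve_frame`, and `synchronized_step` are hypothetical names, and `tick` is assumed to be any callable that blocks until the tick is applied and returns the new frame id, as this PR describes.

```python
import queue
from collections import namedtuple

# Hypothetical stand-in for a sensor measurement tagged with its frame id.
Measurement = namedtuple("Measurement", ["frame", "payload"])


def retrieve_frame(sensor_queue, frame, timeout=2.0):
    """Pop measurements until one matches `frame`, discarding the others.

    Raises queue.Empty if nothing arrives within `timeout` seconds,
    so a missing measurement cannot block the caller forever.
    """
    while True:
        data = sensor_queue.get(timeout=timeout)
        if data.frame == frame:
            return data


def synchronized_step(tick, sensor_queues, timeout=2.0):
    """Advance one step and collect one measurement per sensor for that frame.

    `tick` is assumed to return the id of the newly started frame.
    """
    frame = tick()
    data = [retrieve_frame(q, frame, timeout) for q in sensor_queues]
    return frame, data
```

Each sensor callback would put its measurements into its own `queue.Queue`; the step function then returns only data belonging to the frame that `tick` reported, dropping any stale leftovers.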
An example script with this context manager