Frames are returned incorrectly after an absolute action (e.g. teleport) #250

Closed
salaniz opened this issue Aug 2, 2016 · 3 comments

salaniz commented Aug 2, 2016

I am trying to get robust frame-action pairs, as mentioned in issue #231.
Currently, my workaround is to wait for the next new observation and then accept the first frame that arrives after it. See the attached Lua file for reference.

This method seems to work for discrete actions, but not for absolute actions like teleport.
After an absolute action, I receive a frame of the agent in the previous state (e.g. prior to teleportation) instead of the new state.

If I skip the first frame after a new observation and wait for the second observation-frame pair after the action, it works for both discrete and absolute actions. However, it would be better if the behavior were consistent, and preferably if the first frame after a new observation corresponded to the new state.

I can reproduce the problem with the attached files written in Lua. The code uses the Lua modules torch and image as well as qtlua to display the frame after each action. Run it with qlua frame_action_pairs.lua. Alternatively, it can also be run without qtlua by saving images instead of displaying them (see lines 96 and 150).
Use the boolean variable skip_first_frame to either accept the first observation-frame pair or the second.

I am using the latest release: 0.16.0

frame_action_pairs.zip

timhutton modified the milestone: Dolphin Aug 15, 2016
Contributor

timhutton commented Aug 16, 2016

@salaniz Thanks for looking into this. Sorry it's causing problems.

When you send a command to Minecraft, it gets added to a queue. Then, on every world tick (usually every 50ms unless overclocking) the pending commands are acted on. On a separate thread the rendering is happening as quickly as possible (up to a limit of 60fps usually). Together with random delays in the network messages this makes it hard to robustly get a frame containing the outcome of the command.
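To make the timing concrete, here is a minimal sketch of an idealized timeline. It assumes a fixed 50ms tick and a fixed 60fps render rate, and it ignores network delays entirely. The function name and parameters are hypothetical and are not part of Malmo.

```python
import math

TICK_MS = 50.0           # world tick period mentioned above
FRAME_MS = 1000.0 / 60   # render period at the usual 60fps cap


def frames_before_next_tick(command_arrival_ms, tick_ms=TICK_MS, frame_ms=FRAME_MS):
    """Return (next_tick_ms, stale_frame_times).

    stale_frame_times lists the render times that fall after the command
    arrives but before the tick that acts on it. Frames rendered at those
    times cannot show the outcome of the command yet.
    """
    # The command waits in the queue until the next world tick.
    next_tick = math.ceil(command_arrival_ms / tick_ms) * tick_ms
    # Index of the first frame rendered strictly after the command arrives.
    k = math.floor(command_arrival_ms / frame_ms) + 1
    stale = []
    while k * frame_ms < next_tick:
        stale.append(k * frame_ms)
        k += 1
    return next_tick, stale


if __name__ == "__main__":
    tick, stale = frames_before_next_tick(1.0)
    print("command acted on at", tick, "ms;", len(stale), "frame(s) rendered before that")
```

Even in this simplified model, a command that arrives just after a tick can be followed by more than one frame showing the old state, so "the next frame after sending" is not a safe rule.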

I've just been looking at the tabular_q_learning.py sample to see what can be done here. With the approach there it seems robust for discrete movement. I'll have a look at the teleporting movement now.

In the meantime, take a look at ObservationFromRecentCommands. This observation is returned after a command has been acted on, so taking the next frame after it should hopefully be robust. Let me know if this helps. I'll try it too and will make sure we include a sample for this when we have a solution.
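A rough sketch of that idea follows, written against a plain list of events rather than the real Malmo API. The `("obs", dict)` / `("frame", payload)` event tuples are a hypothetical stand-in for whatever the agent loop collects, and the JSON key name `CommandsSinceLastObservation` is an assumption that should be checked against the observation actually received.

```python
def first_frame_after_command(events, command, key="CommandsSinceLastObservation"):
    """Return the first frame that arrives after an observation reporting `command`.

    events  -- iterable of ("obs", dict) or ("frame", payload) tuples in arrival order
               (hypothetical format, not a Malmo type)
    command -- the command string that was sent, e.g. "tp 1 2 3"
    key     -- assumed name of the recent-commands list in the observation
    """
    acted = False
    for kind, payload in events:
        if kind == "obs":
            # The observation lists the commands acted on since the previous one.
            if command in payload.get(key, []):
                acted = True
        elif kind == "frame" and acted:
            return payload
    return None  # no matching observation-frame pair seen yet


if __name__ == "__main__":
    events = [
        ("frame", "old"),
        ("obs", {}),
        ("frame", "still old"),
        ("obs", {"CommandsSinceLastObservation": ["tp 1 2 3"]}),
        ("frame", "new"),
    ]
    print(first_frame_after_command(events, "tp 1 2 3"))
```

Whether the frame selected this way always shows the new state for server-side commands is exactly the open question in this issue, so treat it as something to test rather than a guarantee.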

Author

salaniz commented Aug 16, 2016

@timhutton Thanks for the advice.

My current solution to the problem is based on how it is done in tabular_q_learning.py: I wait for a new observation to arrive and check if my current state has changed (x, y, z, pitch or yaw). If so, I accept the next frame that comes thereafter and if not, I discard the observation and wait for the next.
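In outline, the logic looks like the sketch below. It runs on a plain list of events instead of the Malmo API, so the event tuples and function names are hypothetical, and the observation keys (`XPos`, `YPos`, `ZPos`, `Pitch`, `Yaw`) are assumed names that may differ from what the mission actually returns.

```python
# Assumed observation keys for position and orientation.
STATE_KEYS = ("XPos", "YPos", "ZPos", "Pitch", "Yaw")


def pose(obs):
    """Extract the (x, y, z, pitch, yaw) tuple from an observation dict."""
    return tuple(obs.get(k) for k in STATE_KEYS)


def frame_after_state_change(events, previous_obs):
    """Return the first frame that follows an observation whose pose differs
    from previous_obs, or None if no such frame has arrived.

    events -- iterable of ("obs", dict) or ("frame", payload) tuples in
              arrival order (hypothetical format, not a Malmo type)
    """
    before = pose(previous_obs)
    changed = False
    for kind, payload in events:
        if kind == "obs":
            # Observations with an unchanged pose are discarded.
            if pose(payload) != before:
                changed = True
        elif kind == "frame" and changed:
            return payload
    return None


if __name__ == "__main__":
    start = {"XPos": 0.5, "YPos": 4.0, "ZPos": 0.5, "Pitch": 0.0, "Yaw": 0.0}
    moved = dict(start, XPos=10.5)
    events = [("obs", start), ("frame", "before"), ("obs", moved), ("frame", "after")]
    print(frame_after_state_change(events, start))
```

One consequence of this design is that a command which leaves the pose unchanged (for example, moving into a wall) never triggers a match, so a timeout is needed around the loop.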

So far this method seems to be robust. Both ObservationFromRecentCommands and the positional information in the frame (from #259) should provide enough additional means to associate frames with actions.

However, absolute actions are still handled a little differently from the rest, as I also noticed in #255.

@timhutton
Contributor

@salaniz Yes, there's a difference between how the absolute movement commands (tp, etc.) and the discrete movement commands (movenorth, etc.) work: the latter are applied directly on the Minecraft client and thus act immediately, while the former must be sent to the server to be acted upon. This introduces extra delay, which I think is part of the problem.

The same thing applies to the discrete use and attack commands, which also need to be sent to the server.

We're actively looking at how to robustly get frame-action pairs with these server-side actions.
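Until then, one possible stopgap is to skip an extra observation-frame pair only for commands that go through the server, following the skip-first-frame workaround described at the top of this issue. The sketch below is an illustration only: the helper names are hypothetical, the verb list contains just the commands named in this thread, and matching on the verb alone cannot tell a discrete `use` or `attack` from a continuous one.

```python
# Verbs named in this thread as needing a round trip to the server.
# Not an exhaustive list.
SERVER_SIDE_VERBS = {"tp", "use", "attack"}


def is_server_side(command):
    """Return True if the first word of the command is a known server-side verb."""
    parts = command.split()
    return bool(parts) and parts[0] in SERVER_SIDE_VERBS


def pairs_to_skip(command, skip_for_server=1):
    """How many observation-frame pairs to discard before accepting one.

    Client-side commands (e.g. movenorth) skip none. Server-side commands skip
    skip_for_server pairs, mirroring the skip_first_frame workaround.
    """
    return skip_for_server if is_server_side(command) else 0


if __name__ == "__main__":
    for cmd in ("movenorth 1", "tp 1.5 4 2.5", "attack 1"):
        print(cmd, "->", pairs_to_skip(cmd))
```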
