This repository has been archived by the owner on Apr 6, 2018. It is now read-only.

Solutions to common problems

Trevor Blackwell edited this page Mar 3, 2017 · 6 revisions

gym.make fails with AttributeError: module 'gym' has no attribute 'make'

When Python sees import gym, it searches a list of places (including your current directory) for a module called gym. If you've named your test program gym.py or universe.py, Python imports that file instead of the installed package, and you're going to have a bad time. Name your program something else, and be sure to delete any gym.pyc or universe.pyc files that Python has cached.

You can verify that Python is finding the right module with:

$ python
>>> import gym
>>> gym.__file__
'/Users/tlb/openai/gym/gym/__init__.py'
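The check for stray files can also be scripted. The following is a minimal sketch, assuming you run it from the directory that holds your test program. The helper name find_shadowing_files is hypothetical and is not part of gym or universe; it only lists candidate files and deletes nothing.

```python
import os


def find_shadowing_files(directory=".", names=("gym", "universe")):
    """Return paths in `directory` that could shadow the installed packages."""
    hits = []
    cache_dir = os.path.join(directory, "__pycache__")
    for name in names:
        # A local gym.py / universe.py, or a stale .pyc sitting next to it.
        for candidate in (name + ".py", name + ".pyc"):
            path = os.path.join(directory, candidate)
            if os.path.isfile(path):
                hits.append(path)
        # Python 3 stores cached bytecode as __pycache__/<name>.<tag>.pyc.
        if os.path.isdir(cache_dir):
            for entry in sorted(os.listdir(cache_dir)):
                if entry.startswith(name + ".") and entry.endswith(".pyc"):
                    hits.append(os.path.join(cache_dir, entry))
    return hits


# List anything suspicious in the current directory.
for path in find_shadowing_files():
    print("possible shadowing file:", path)
```

If the script prints nothing and gym.__file__ still points somewhere unexpected, the shadowing file is probably in another directory on sys.path.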

universe env.render() fails with libGL error

If you get something like this:

libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast

then the renderer is not able to create an OpenGL window to show you the running environment.

git-lfs fails inside container (universe.flashgames environments)

The container doesn't contain all the games (which together are huge); instead, it downloads them as needed using git-lfs. So the container needs to connect to github.com on ports 22 and 443. If you see something like:

[Sat Dec 31 19:17:00 UTC 2016] [/usr/local/bin/sudoable-env-setup] Allowing outbound network traffic to non-private IPs for git-lfs. (Going to fetch files via git lfs.)
[unpack-lfs] [2016-12-31 19:17:01,081] Fetching files: git lfs pull -I git-lfs/flashgames.DuskDrive-v0.tar.gz
[unpack-lfs] [2016-12-31 19:19:38,009] Finished running git lfs pull
[unpack-lfs] [2016-12-31 19:19:38,009] git lfs pull failed; detected from output: stdout=b'\rGit LFS: (0 of 1 files) 0 B / 9.52 MB \n' stderr=b'batch request: exit status 255: ssh: connect to host github.com port 22: Connection refused\n'
[unpack-lfs] [2016-12-31 19:19:38,076] unpack failed

that means there's a network problem. Things to check:

  • Are you behind a firewall? Ask your admin how to get external access. The container needs to connect to github.com:22 and github.com:443.

  • Run the container in diagnostics mode, which will try network operations and log everything to the console. To do this, run

$ docker network inspect bridge; docker run --rm --privileged --ipc host --cap-add SYS_ADMIN quay.io/openai/universe.flashgames:latest diagnostics

If you're reporting a problem, please copy and paste the entire output into your GitHub issue.

If it reports a refused connection or a timeout, your container can't get to the public internet.

  • Docker has a wide range of network options. If you're running a standalone Docker system (not part of a cluster like Kubernetes), you want bridge mode. Read all about it in the Docker container networking documentation.
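As a quick first check before running the container diagnostics, you can test from the host whether the two ports are reachable at all. This is a minimal sketch: the helper name can_connect is hypothetical, and it runs on the host rather than inside the container, so a success here does not prove that the container's network is working.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example usage (performs real network connections):
# for port in (22, 443):
#     print("github.com:%d reachable: %s" % (port, can_connect("github.com", port)))
```

If both ports are reachable from the host but git lfs still fails inside the container, the problem is more likely in the Docker network configuration than in your firewall.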

Docker won't pull images

[2017-01-07 20:34:35,146] Image quay.io/openai/universe.flashgames:0.20.21 not present locally; pulling
0.20.21: Pulling from openai/universe.flashgames
aed15891ba52: Pull complete
773ae8583d14: Pull complete
...
universe.remotes.compose.progress_stream.StreamOutputError: failed to register layer: Error processing tar file(gzip: invalid checksum):

This can happen when Docker's network connection is interrupted while downloading the remote image. Try again, or download the image manually with docker pull quay.io/openai/universe.flashgames:0.20.21 (replace the version with the version Universe was trying to download).
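If the connection is flaky, retrying the manual pull a few times can be scripted. The following is a rough sketch, assuming the docker command is on your PATH and Python 3.5 or later. The function pull_with_retries and its parameters are hypothetical names for illustration, not part of Universe.

```python
import subprocess
import time


def pull_with_retries(image, attempts=3, delay=5.0, command=("docker", "pull")):
    """Run `docker pull IMAGE` up to `attempts` times; return True on success."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(list(command) + [image])
        if result.returncode == 0:
            return True
        print("pull attempt %d of %d failed" % (attempt, attempts))
        if attempt < attempts:
            time.sleep(delay)  # give the network a moment before retrying
    return False


# Example usage (replace the tag with the version Universe asked for):
# pull_with_retries("quay.io/openai/universe.flashgames:0.20.21")
```

The command parameter exists only so the retry logic can be exercised without Docker; in normal use, leave it at its default.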

  • If you're in China, the Great Firewall (GFW) may block access to quay.io. DaoCloud has a mirror inside China. Sign up for it at https://www.daocloud.io/mirror#accelerator-doc. It'll give you a mirror ID, and you can configure Docker to use it by running the following (replacing MIRRORID with your assigned ID):

$ curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://MIRRORID.m.daocloud.io

Then, tell Universe to pull from docker.io, which DaoCloud will mirror within the GFW:

$ export OPENAI_DOCKER_REPO=docker.io/openai

Building go-vncdriver fails with undefined reference errors

If you're using pyenv and get errors like:

/home/user/.pyenv/versions/3.5.2/lib/libpython3.5m.a(floatobject.o): In function `float_is_integer':
/tmp/python-build.20161207101855.17159/Python-3.5.2/Objects/floatobject.c:812: undefined reference to `floor'

you need to rebuild the Python version that pyenv installed so that it supports shared libraries. See solution.
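To see whether the interpreter you are using was built with a shared libpython, you can ask sysconfig. This is a minimal sketch; the helper name python_has_shared_library is hypothetical. The static archive libpython3.5m.a in the error above suggests a build without --enable-shared, and rebuilding is commonly done by setting PYTHON_CONFIGURE_OPTS="--enable-shared" before pyenv install, though you should check the pyenv documentation for your setup.

```python
import sysconfig


def python_has_shared_library():
    """Return True if this interpreter was configured with --enable-shared."""
    # Py_ENABLE_SHARED is 1 for shared builds and 0 for static ones; it can be
    # None on platforms whose build does not record it (for example Windows).
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


print("shared libpython:", python_has_shared_library())
# LDLIBRARY names the library the interpreter links against, e.g. a .a or .so file.
print("LDLIBRARY:", sysconfig.get_config_var("LDLIBRARY"))
```

Run it with the same python that you use to build go-vncdriver, since pyenv may have several versions installed.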

Flashgames environments won't start

Universe environments need a lot of CPU power to run. Most environments need 2 cores of a modern Intel CPU to run the Flash engine, browser, renderer, X11 server, VNC server, and the 'vexpect' logic that starts games by detecting visual elements on the screen and clicking buttons. If you see many of these:

universe-98PmJS-0 | [2017-03-02 06:16:44,043] [play_vexpect] Fell behind by 0.4378662109375s from target; losing 26 frames

your computer is too slow or too heavily loaded.
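To judge how often this happens, you can count the warnings in a saved log. The sketch below assumes the message format matches the sample line above exactly. The names FELL_BEHIND and summarize_lag are hypothetical, and values printed in exponent notation would not be matched.

```python
import re

# Pattern taken from the sample "[play_vexpect] Fell behind by ..." log line.
FELL_BEHIND = re.compile(
    r"Fell behind by (?P<seconds>[0-9.]+)s from target; losing (?P<frames>\d+) frames"
)


def summarize_lag(lines):
    """Return (warning_count, total_seconds_behind, total_frames_lost)."""
    count, seconds, frames = 0, 0.0, 0
    for line in lines:
        match = FELL_BEHIND.search(line)
        if match:
            count += 1
            seconds += float(match.group("seconds"))
            frames += int(match.group("frames"))
    return count, seconds, frames


# Example usage, assuming the container output was saved to universe.log:
# with open("universe.log") as log:
#     print(summarize_lag(log))
```

An occasional warning at startup is less worrying than a steady stream of them throughout a run.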

If you're using AWS EC2 instances, a c4.xlarge is a good choice for a single worker and environment, and a c4.4xlarge is a good choice for 4 agent workers + 4 environments. The t2.* instances are not suitable.