
Asynchronous runners with CPU only? #195

Open
definitelyuncertain opened this issue Nov 2, 2020 · 2 comments
@definitelyuncertain
Hi, I'm trying to run DQN with asynchronous sampling using rlpyt's async sampler and runner classes. However, it looks like they don't work with CPU only and require the presence of a GPU. Here's my code, based on the examples and docs, attempting to use the CPU only:

from rlpyt.samplers.async_.cpu_sampler import AsyncCpuSampler
from rlpyt.algos.dqn.dqn import DQN
from rlpyt.agents.dqn.atari.atari_dqn_agent import AtariDqnAgent
from rlpyt.envs.atari.atari_env import AtariEnv, AtariTrajInfo
from rlpyt.runners.async_rl import AsyncRlEval
from rlpyt.utils.logging.context import logger_context
from rlpyt.utils.launching.affinity import make_affinity

def build_and_train(game="pong", run_ID=0):
    config = dict(
        algo=dict(batch_size=32,
                  min_steps_learn=500,
                  double_dqn=True,
                  prioritized_replay=True),
        sampler=dict(batch_T=1, batch_B=32),
    )
    sampler = AsyncCpuSampler(
        EnvCls=AtariEnv,
        TrajInfoCls=AtariTrajInfo,
        env_kwargs=dict(game=game),
        eval_env_kwargs=dict(game=game),
        max_decorrelation_steps=20,
        eval_n_envs=4,
        eval_max_steps=int(2000),
        eval_max_trajectories=10,
        **config["sampler"],
    )
    algo = DQN(**config["algo"])  # Run with defaults.
    agent = AtariDqnAgent()
    affinity = make_affinity(
        run_slot=0,
        n_cpu_core=12,
        n_gpu=0,  # CPU only: this is what triggers the error below.
        hyperthread_offset=6,
        n_socket=1,
        async_sample=True,  # Asynchronous sampling + optimization.
    )
    runner = AsyncRlEval(
        algo=algo,
        agent=agent,
        sampler=sampler,
        n_steps=int(1e5),
        log_interval_steps=500,
        affinity=affinity,
    )
    name = "dqn_" + game
    log_dir = "logs/dqn_test_"
    with logger_context(log_dir, run_ID, name, config):
        runner.train()

build_and_train()

The error I get is:

Traceback (most recent call last):
  File "test_rlpyt_async.py", line 56, in <module>
    build_and_train()
  File "test_rlpyt_async.py", line 35, in build_and_train
    affinity = make_affinity(
  File ".../rlpyt/utils/launching/affinity.py", line 165, in make_affinity
    return affinity_from_code(encode_affinity(run_slot=run_slot, **kwargs))
  File ".../rlpyt/utils/launching/affinity.py", line 160, in affinity_from_code
    return build_cpu_affinity(run_slot, **aff_params)
TypeError: build_cpu_affinity() got an unexpected keyword argument 'ass'

On the other hand, it works when I change n_gpu to 1. Any idea what could have gone wrong here?

@ankeshanand (Contributor)

I believe async_sample=True corresponds to asynchronous sampling and optimization. If you just want asynchronous sampling, you can use one of the parallel CPU sampler classes without setting async_sample=True in the affinity.
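
For example, a minimal sketch of that setup (untested; reusing the DQN/Atari pieces from your script, with import paths and affinity defaults assumed from the rlpyt examples):

from rlpyt.samplers.parallel.cpu.sampler import CpuSampler
from rlpyt.runners.minibatch_rl import MinibatchRlEval
from rlpyt.algos.dqn.dqn import DQN
from rlpyt.agents.dqn.atari.atari_dqn_agent import AtariDqnAgent
from rlpyt.envs.atari.atari_env import AtariEnv, AtariTrajInfo
from rlpyt.utils.launching.affinity import make_affinity

# Parallel sampling across CPU worker processes, but still synchronous
# with optimization: the runner alternates sampling and DQN updates.
sampler = CpuSampler(
    EnvCls=AtariEnv,
    TrajInfoCls=AtariTrajInfo,
    env_kwargs=dict(game="pong"),
    eval_env_kwargs=dict(game="pong"),
    batch_T=1,
    batch_B=32,
    max_decorrelation_steps=20,
    eval_n_envs=4,
    eval_max_steps=int(2000),
    eval_max_trajectories=10,
)
affinity = make_affinity(n_cpu_core=12, n_gpu=0)  # No async_sample=True.
runner = MinibatchRlEval(
    algo=DQN(),
    agent=AtariDqnAgent(),
    sampler=sampler,
    n_steps=int(1e5),
    log_interval_steps=500,
    affinity=affinity,
)
runner.train()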

@definitelyuncertain (Author)

What I'm looking for when I say asynchronous sampling is for the interactions with the environment to happen independently of the DQN learning steps. It looks like I misspoke, and what I would like is in fact asynchronous sampling and optimization.

My understanding, then, is that the parallel CPU sampler doesn't do this; rather, it runs several sampling processes at once, but still synchronously with the optimization steps. Is that not the case?
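
For reference, my reading of the synchronous runner is that MinibatchRl.train() boils down to roughly this loop (paraphrased; names simplified):

# Paraphrased sketch of the synchronous runner loop: sampling and
# optimization alternate in one process, so even a parallel sampler
# stays in lockstep with the updates.
for itr in range(n_itr):
    agent.sample_mode(itr)
    samples, traj_infos = sampler.obtain_samples(itr)  # Workers run here.
    agent.train_mode(itr)
    opt_info = algo.optimize_agent(itr, samples)  # Then DQN updates.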
