GymD2D is a toolkit for building, evaluating and comparing D2D cellular offload resource allocation algorithms. It uses OpenAI Gym to make it easy to experiment with many popular reinforcement learning or AI algorithms. It is highly configurable, allowing users to experiment with UE configuration, path loss and traffic models.
GymD2D models a D2D cellular offload scenario containing a single macro base station surrounded with many cellular (CUE) and D2D (DUE) user equipment.
This project is still under active development and the API hasn't stabilised yet. Breaking changes are likely to occur between releases.
If you have found this project useful in your research, please consider citing our white paper (preprint, forthcoming at IEEE WCNC 2021).
@article{cotton2021gymd2d,
title={GymD2D: A Device-to-Device Underlay Cellular Offload Evaluation Platform},
author={Cotton, David and Chaczko, Zenon},
journal={arXiv preprint arXiv:2101.11188},
year={2021}
}
- Python 3.7 or greater (
dataclasses
&__future__.annotations
) - OpenAI Gym 0.9.6 or greater (env_config)
- NumPy
Use pip to install
pip install gym-d2d
Or, if you need to edit the source code, clone and install dev dependencies:
git clone git@github.com:davidcotton/gym-d2d.git
cd gym-d2d
pip install -e ".[dev]"
Import OpenAI Gym and GymD2D
import gym
import gym_d2d
Build a new D2D environment via the usual Gym factory method
env = gym.make('D2DEnv-v0')
Then run in the standard Gym observation, action, reward loop.
obses = env.reset()
game_over = False
while not game_over:
actions = {}
for agent_id, obs in obses.items():
action = env.action_space['due'].sample() # or: action = agent.act(obs)
actions[agent_id] = action
obses, rewards, game_over, infos = env.step(actions)
env.render()
The main difference between this environment and the usual classic control or ALE environments is that it is designed for multiple agents.
The environment's observation and action spaces use gym.spaces.DictSpace
, with 3 keys: due
, cue
& mbs
.
Observations, actions, rewards and info are passed via Python dicts like:
obs_dict = {
'cue00': np.ndarray(...),
'cue01': np.ndarray(...),
'due00': np.ndarray(...),
'due01': np.ndarray(...),
...
}
actions = {
'cue00': 23,
'cue01': 317,
'due00': 13,
'due01': 95,
...
}
We have some common usage examples in the examples directory.
One of the design principles of this project is that environments should be easily configurable and customisable to meet the variety of research needs present in D2D cellular offload research.
Following the Gym API (gym>=0.9.6), you can configure the environment via an env_config dictionary.
env = gym.make('D2DEnv-v0', env_config={'param': value})
The environment has the following configuration options:
Parameter | Description | Datatype | Default Value |
---|---|---|---|
num_rbs | The number of available resource blocks. | int |
25 |
num_cues | The number of cellular users. | int |
25 |
num_due_pairs | The number of D2D pairs | int |
25 |
cell_radius_m | The macro base station's cell radius in metres. This parameter controls the radius in which all other devices are contained. | float |
500.0 |
d2d_radius_m | The maximum distance between D2D pairs in metres. | float |
20.0 |
due_min_tx_power_dBm | The minimum DUE transmission power in dBm. | int |
0 |
due_max_tx_power_dBm | The maximum DUE transmission power in dBm. | int |
20 |
cue_max_tx_power_dBm | The maximum CUE transmission power in dBm. | int |
23 |
mbs_max_tx_power_dBm | The maximum MBS transmission power in dBm. | int |
46 |
path_loss_model | The type of path loss model to use. | gym_d2d. PathLoss |
gym_d2d. LogDistancePathLoss |
traffic_model | The model to generate automated traffic. | gym_d2d. TrafficModel |
gym_d2d. UplinkTrafficModel |
obs_fn | The function to calculate agent observations. | gym_d2d.envs. ObsFunction |
gym_d2d.envs. LinearObsFunction |
reward_fn | The function to calculate agent rewards. | gym_d2d.envs. RewardFunction |
gym_d2d.envs. SystemCapacityRewardFunction |
carrier_freq_GHz | The carrier frequency used, in GHz. | float |
2.1 |
num_subcarriers | The number of subcarriers. | int |
12 |
subcarrier_spacing_kHz | The spacing between subcarriers. | int |
15 |
channel_bandwidth_MHz | The channel bandwidth in MHz. | float |
20.0 |
device_config_file | A path to a device configuration JSON file. | pathlib.Path |
None (random device positions) |
By default, each time the environment is reset()
, each UE is randomly assigned a new position.
To make experiments repeatable and compare algorithms, you may wish to fix UE positions.
This can be achieved by saving and loading device configurations.
Gym D2D uses Python's Pathlib to make file handling easier.
from pathlib import Path
To save an environment's device configuration to file:
env = gym.make('D2DEnv-v0')
env.reset() # generate random device positions (if not supplied)
env.save_device_config(Path.cwd() / 'device_config.json')
To load from an existing configuration:
env_config = {'device_config_file': Path.cwd() / 'device_config.json'}
env = gym.make('D2DEnv-v0', env_config=env_config)
More info coming soon on how to customise observations and rewards...