This is a dataset accompanying the paper Scaling data-driven robotics with reward sketching and batch reinforcement learning. If you use this dataset in your research please cite
@article{cabi2019,
title={Scaling data-driven robotics with reward sketching and batch reinforcement learning},
author={Serkan Cabi and
Sergio G{\'o}mez Colmenarejo and
Alexander Novikov and
Ksenia Konyushkova and
Scott Reed and
Rae Jeong and
Konrad Zolna and
Yusuf Aytar and
David Budden and
Mel Vecerik and
Oleg Sushkov and
David Barker and
Jonathan Scholz and
Misha Denil and
Nando de Freitas and
Ziyu Wang},
journal={arXiv preprint arXiv:1909.12200},
year={2019}
}
There is a small amount of example data included in this repository. To examine it, run the following commands from the repository root (i.e. one level up from this folder):
python3 -m venv .sketchy_env
source .sketchy_env/bin/activate
pip install --upgrade pip
pip install -r sketchy/requirements.txt
python -m sketchy.dataset_example --show_images
For an example of loading rewards for episodes see reward_example.py
.
Run ./download.sh path/to/download/folder
to download the full dataset. The
full dataset requires ~5.0TB of disk space to download, and extracts to approximately the same size.
You can edit download.sh
to download subsets of the data.
Once the dataset has been downloaded it can be extracted wtih
./extract.sh path/to/download/folder
.
We provide several named subsets of the full dataset, which can be easily
downloaded on their own. See download.sh
for a description of the subsets
that are provided.
The episodes in each of these named subsets are identified by a tag in the
metadata.
If you would like to curate your own subset you can download the metadata
file and inspect the ArchiveFiles
table (see below) to figure out which
archive files contain the episodes you want.
The dataset is distribted as a metadata file (metadata.sqlite
) and a
collection of archive files (with names ending in .tar.bz2
).
The metadata file contains information about the episodes, including annotated rewards for a subset of the episodes.
Each archive file contains several episode files, which have names like
10000313341320364033_b615a417-ce34-41a8-8411-2a1ce3f3bd07
.
Each episode file is a tfrecord file, containing a sequence of timesteps for a single episode.
Each timestep is a tf.train.Example
proto containing features corresponding to
the observations and actions from a particular point in time.
The metadata file, metadata.sqlite
, is a sqlite database containing metadata
describing the contents of the files in the dataset.
The following sections describe the important metadata tables. You can find the full schema by running
sqlite3 metadata.sqlite <<< .schema
EpisodeId
: A string of digits that uniquely identifies the episode.TaskId
: A human readable name for the task corresponding to the behavior that generated the episode.DataPath
: The name of the episode file holding the data for this episode.EpisodeType
: A string describing the type of policy that generated the episode. Possible values are:EPISODE_ROBOT_AGENT
: The behavior policy is a learned or scripted controller.EPISODE_ROBOT_TELEOPERATION
: The behavior policy is a human teleoperating the robot.EPISODE_ROBOT_DAGGER
: The behavior policy is a mix of controller and human generated actions.
Timestamp
: A unix timestamp recording when the episode was generated.
EpisodeId
: Foreign key into theEpisodes
table.Tag
: A human readable identifier for some aspect of the episode (e.g. which object set is used).
EpisodeId
: Foreign key into theEpisodes
table.RewardSequenceId
: Distinguishes multiple rewards for the same episode.RewardTaskId
: A human readable name of the task for this reward signal. Typically the same as the correspondingTaskId
in theEpisodes
table.Type
: A string describing the type of reward signal. Currently the only value isREWARD_SKETCH
.Values
: A sequence of float32 values, packed as a binary blob. There is one float value for each frame of the episode, corresponding to the annotated reward.
EpisodeId
: Foreign key into theEpisodes
table.ArchiveFile
: Name of the archive file containing the corresponding episode.
Each episode file is a
tfrecords file
containing a sequence of timesteps, encoded as
tf.train.Example
protos.
Each episode file contains a single episode, and each timestep within an episode
contains all of the observations and actions associated with a that timestep as
a single tf.train.Example
. Within each episode file the timesteps are
temporally ordered, so reading a file from beginning to end will visit all of
the timesteps from the episode in the order they occurred.
Observations and actions occur at 10Hz.
Each timestep is a collection of observations and actions. Actions stored with a timestep correspond to actions taken in response to the observations they are stored with.
For a description of the shapes and types of the timestep data, see the data
loader in sketchy.py
.
The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.
property | value | ||||||
---|---|---|---|---|---|---|---|
name | Sketchy |
||||||
url | https://github.com/deepmind/deepmind-research/tree/master/sketchy |
||||||
sameAs | https://github.com/deepmind/deepmind-research/tree/master/sketchy |
||||||
description |
Data accompanying
[Scaling data-driven robotics with reward sketching and batch reinforcement learning](https://arxiv.org/abs/1909.12200).
|
||||||
provider |
|
||||||
citation | https://identifiers.org/arxiv:1909.12200 |