nuImages videos #432

Merged
merged 17 commits into from
Jul 13, 2020
1 change: 1 addition & 0 deletions docs/instructions_nuimages.md
@@ -0,0 +1 @@
TODO: Coming soon!
162 changes: 160 additions & 2 deletions docs/schema_nuimages.md
@@ -1,6 +1,164 @@
nuImages schema
==========
This document describes the database schema used in nuImages.
All annotations and metadata (including calibration, maps, vehicle coordinates etc.) are stored in a relational database.
The database tables are listed below.
Every row can be identified by its unique primary key `token`.
Foreign keys such as `sample_token` may be used to link to the `token` of the table `sample`.
Please refer to the [tutorial](https://www.nuscenes.org/nuimages#tutorial) for an introduction to the most important database tables.
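
As a rough illustration of how `token` primary keys and foreign keys fit together, the sketch below builds a lookup table and follows a `sample_token` link. The records, the file path and the `build_index` helper are hypothetical and only mimic the shape of the tables described here; they are not part of the devkit.

```python
from typing import Dict, List

# Hypothetical in-memory tables; real records carry more fields.
sample_table: List[dict] = [
    {"token": "s1", "timestamp": 1532402927647951},
]
sample_data_table: List[dict] = [
    {"token": "sd1", "sample_token": "s1", "filename": "samples/CAM_FRONT/example.jpg"},
]


def build_index(table: List[dict]) -> Dict[str, dict]:
    """Map each record's primary key `token` to the record itself."""
    return {record["token"]: record for record in table}


sample_index = build_index(sample_table)

# Follow the foreign key `sample_token` from a sample_data record to its sample.
sd = sample_data_table[0]
sample = sample_index[sd["sample_token"]]
```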

![](https://www.nuscenes.org/public/images/nuimages-schema.svg)

attribute
---------
An attribute is a property of an instance that can change while the category remains the same.
Example: a vehicle being parked/stopped/moving, and whether or not a bicycle has a rider.
The attributes in nuImages are a superset of those in nuScenes.
```
attribute {
"token": <str> -- Unique record identifier.
"name": <str> -- Attribute name.
"description": <str> -- Attribute description.
}
```

calibrated_sensor
---------
Definition of a particular sensor (lidar/camera, but no radar) as calibrated on a particular vehicle.
All extrinsic parameters are given with respect to the ego vehicle body frame.
All camera images come undistorted and rectified.
```
calibrated_sensor {
"token": <str> -- Unique record identifier.
"sensor_token": <str> -- Foreign key pointing to the sensor type.
"translation": <float> [3] -- Coordinate system origin in meters: x, y, z.
"rotation": <float> [4] -- Coordinate system orientation as quaternion: w, x, y, z.
"camera_intrinsic": <float> [3, 3] -- Intrinsic camera calibration. Empty for sensors that are not cameras.
"camera_distortion": <float> [5 or 6] -- Camera distortion parameters. We use the 5-parameter camera convention of the Caltech camera calibration toolbox, which is also used in OpenCV. The 6th parameter is used only for the fish-eye lenses in CAM_BACK.
}
```
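
The extrinsic and intrinsic fields can be used roughly as sketched below. This assumes that `translation` and `rotation` give the sensor pose in the ego frame (so a sensor-frame point is rotated, then translated) and that the pinhole model applies because the images are undistorted. The function names are hypothetical and not taken from the devkit.

```python
from typing import List, Sequence, Tuple


def quaternion_to_matrix(q: Sequence[float]) -> List[List[float]]:
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def sensor_to_ego(point: Sequence[float], calibrated_sensor: dict) -> Tuple[float, ...]:
    """Rotate, then translate, a 3D point from the sensor frame into the ego frame.

    Assumption: the record stores the sensor pose expressed in the ego frame.
    """
    rot = quaternion_to_matrix(calibrated_sensor["rotation"])
    t = calibrated_sensor["translation"]
    return tuple(sum(rot[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))


def project_to_image(point_cam: Sequence[float], camera_intrinsic: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point (z pointing forward) to pixels (u, v)."""
    if point_cam[2] <= 0:
        raise ValueError("point is behind the camera")
    uvw = [sum(camera_intrinsic[i][j] * point_cam[j] for j in range(3)) for i in range(3)]
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```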

category
---------
Taxonomy of object categories (e.g. vehicle, human).
Subcategories are delineated by a period (e.g. `human.pedestrian.adult`).
The categories in nuImages are the same as in nuScenes (without lidarseg), plus `flat.driveable_surface`.
```
category {
"token": <str> -- Unique record identifier.
"name": <str> -- Category name. Subcategories indicated by period.
"description": <str> -- Category description.
}
```
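
Because subcategories are separated by periods, the ancestors of a category can be recovered from its name alone. A minimal sketch, with a hypothetical helper name:

```python
from typing import List


def category_hierarchy(name: str) -> List[str]:
    """Return the category and all its ancestors, from most general to most specific."""
    parts = name.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


levels = category_hierarchy("human.pedestrian.adult")
```

Here `levels` lists `human`, `human.pedestrian` and `human.pedestrian.adult`, which can be handy for grouping annotations at a coarser level.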

ego_pose
---------
Ego vehicle pose at a particular timestamp, given with respect to the global coordinate system of the log's map.
The ego_pose is the output of a lidar map-based localization algorithm described in our paper.
The localization is 2-dimensional in the x-y plane.
Warning: nuImages is collected from almost 500 logs with different map versions.
Therefore the coordinates **should not be compared across logs** or rendered on the semantic maps of nuScenes.
```
ego_pose {
"token": <str> -- Unique record identifier.
"translation": <float> [3] -- Coordinate system origin in meters: x, y, z. Note that z is always 0.
"rotation": <float> [4] -- Coordinate system orientation as quaternion: w, x, y, z.
"timestamp": <int> -- Unix time stamp.
"rotation_rate": <float> [3] -- The angular velocity vector (x, y, z) of the vehicle in rad/s. This is expressed in the ego vehicle frame.
"acceleration": <float> [3] -- Acceleration vector (x, y, z) in the ego vehicle frame in m/s/s. The z value is close to the gravitational acceleration `g = 9.81 m/s/s`.
"speed": <float> -- The speed of the ego vehicle in the driving direction in m/s.
}
```
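
Since the localization is 2-dimensional, the heading (yaw) is often the only part of `rotation` that is needed. The sketch below extracts it from the `w, x, y, z` quaternion using the standard yaw formula; the helper name is hypothetical, and the result is only meaningful within a single log, in line with the warning above.

```python
import math
from typing import Sequence


def yaw_from_quaternion(rotation: Sequence[float]) -> float:
    """Return the rotation about the z axis, in radians, of a unit quaternion (w, x, y, z)."""
    w, x, y, z = rotation
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
```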

log
---------
Information about the log from which the data was extracted.
```
log {
"token": <str> -- Unique record identifier.
"logfile": <str> -- Log file name.
"vehicle": <str> -- Vehicle name.
"date_captured": <str> -- Date (YYYY-MM-DD).
"location": <str> -- Area where log was captured, e.g. singapore-onenorth.
}
```

object_ann
---------
The annotation of a foreground object (car, bike, pedestrian) in an image.
Each foreground object is annotated with a 2d box, a 2d instance mask and category-specific attributes.
```
object_ann {
"token": <str> -- Unique record identifier.
"sample_data_token": <str> -- Foreign key pointing to the sample data, which must be a keyframe image.
"category_token": <str> -- Foreign key pointing to the object category.
"attribute_tokens": <str> [n] -- Foreign keys. List of attributes for this annotation.
"bbox": <int> [4] -- Annotated amodal bounding box. Given as [xmin, ymin, xmax, ymax].
"mask": <RLE> -- Run length encoding of instance mask using the pycocotools package.
}
```
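
Many tools expect boxes as `[x, y, width, height]` rather than corner coordinates. A minimal conversion sketch follows; it treats the corners as continuous coordinates rather than inclusive pixel indices, which is an assumption, and the helper names are hypothetical.

```python
from typing import List, Sequence


def bbox_to_xywh(bbox: Sequence[int]) -> List[int]:
    """Convert [xmin, ymin, xmax, ymax] to [x, y, width, height]."""
    xmin, ymin, xmax, ymax = bbox
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def bbox_area(bbox: Sequence[int]) -> int:
    """Area of an [xmin, ymin, xmax, ymax] box; 0 if the box is degenerate."""
    xmin, ymin, xmax, ymax = bbox
    return max(0, xmax - xmin) * max(0, ymax - ymin)
```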

sample_data
---------
A sensor data record, e.g. an image or lidar pointcloud. Note that nuImages does not include radar data.
Sample_data covers all sensor data, regardless of whether it is a keyframe or not.
For every keyframe image or lidar pointcloud, we also include up to 6 past and 6 future sweeps at 2 Hz.
We can navigate between consecutive lidar or camera sample_datas using the `prev` and `next` pointers.
Only keyframe (sample) images are annotated.
The sample timestamp is inherited from the keyframe camera sample_data timestamp.
```
sample_data {
"token": <str> -- Unique record identifier.
"sample_token": <str> -- Foreign key. Sample to which this sample_data is associated.
"ego_pose_token": <str> -- Foreign key.
"calibrated_sensor_token": <str> -- Foreign key.
"filename": <str> -- Relative path to data-blob on disk.
"fileformat": <str> -- Data file format.
"width": <int> -- If the sample data is an image, this is the image width in pixels.
"height": <int> -- If the sample data is an image, this is the image height in pixels.
"timestamp": <int> -- Unix time stamp.
"is_key_frame": <bool> -- True if sample_data is part of key_frame, else False.
"next": <str> -- Foreign key. Sample data from the same sensor that follows this in time. Empty if end of scene.
"prev": <str> -- Foreign key. Sample data from the same sensor that precedes this in time. Empty if start of scene.
}
```
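
The `prev` and `next` pointers form a linked list per sensor. The sketch below walks that list in both directions to recover the full sequence around a given record. It assumes that an empty pointer is stored as an empty string and that `index` maps tokens to records; both the helper and the toy records are hypothetical.

```python
from typing import Dict, List


def collect_sequence(index: Dict[str, dict], token: str) -> List[str]:
    """Return all sample_data tokens of the sequence containing `token`, in time order."""
    record = index[token]
    # Rewind to the first record of the sequence.
    while record["prev"]:
        record = index[record["prev"]]
    # Walk forward until the `next` pointer is empty.
    tokens = [record["token"]]
    while record["next"]:
        record = index[record["next"]]
        tokens.append(record["token"])
    return tokens


# Toy sequence: a -> b -> c, with b as the keyframe.
toy_index = {
    "a": {"token": "a", "prev": "", "next": "b", "is_key_frame": False},
    "b": {"token": "b", "prev": "a", "next": "c", "is_key_frame": True},
    "c": {"token": "c", "prev": "b", "next": "", "is_key_frame": False},
}
sequence = collect_sequence(toy_index, "b")
```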

sample
---------
A sample is an annotated keyframe selected from a large pool of images in a log.
Every sample has up to 13 lidar sample_datas and 13 camera sample_datas corresponding to it.
These include the actual lidar and camera keyframe sample_datas, which can be accessed via the `key_*_token` fields.
```
sample {
"token": <str> -- Unique record identifier.
"timestamp": <int> -- Unix time stamp.
"log_token": <str> -- Foreign key pointing to the log.
"key_camera_token": <str> -- Foreign key of the sample_data corresponding to the camera keyframe.
"key_lidar_token": <str> -- Foreign key of the sample_data corresponding to the lidar keyframe.
}
```
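
To gather the sample_datas that belong to each sample, one option is to group the `sample_data` table by `sample_token` and then pick out the keyframe through `key_camera_token`. This is a sketch over hypothetical toy records, not the devkit's own lookup code.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_sample(sample_data_table: List[dict]) -> Dict[str, List[dict]]:
    """Group sample_data records by their `sample_token`, sorted by timestamp."""
    groups = defaultdict(list)
    for record in sample_data_table:
        groups[record["sample_token"]].append(record)
    for records in groups.values():
        records.sort(key=lambda r: r["timestamp"])
    return dict(groups)


# Toy data: one sample with a past sweep, the camera keyframe and a future sweep.
toy_sample = {"token": "s1", "key_camera_token": "sd2"}
toy_sample_data = [
    {"token": "sd3", "sample_token": "s1", "timestamp": 30},
    {"token": "sd1", "sample_token": "s1", "timestamp": 10},
    {"token": "sd2", "sample_token": "s1", "timestamp": 20},
]
grouped = group_by_sample(toy_sample_data)
key_camera = next(r for r in grouped["s1"] if r["token"] == toy_sample["key_camera_token"])
```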

sensor
---------
A specific sensor type.
```
sensor {
"token": <str> -- Unique record identifier.
"channel": <str> -- Sensor channel name.
"modality": <str> {camera, lidar} -- Sensor modality. Supports category(ies) in brackets.
}
```

surface_ann
---------
The annotation of a background object (driveable surface) in an image.
Each background object is annotated with a 2d semantic segmentation mask.
```
surface_ann {
"token": <str> -- Unique record identifier.
"sample_data_token": <str> -- Foreign key pointing to the sample data, which must be a keyframe image.
"category_token": <str> -- Foreign key pointing to the surface category.
"mask": <RLE> -- Run length encoding of segmentation mask using the pycocotools package.
}
```
40 changes: 20 additions & 20 deletions docs/schema_nuscenes.md
@@ -5,25 +5,24 @@ All annotations and meta data (including calibration, maps, vehicle coordinates
The database tables are listed below.
Every row can be identified by its unique primary key `token`.
Foreign keys such as `sample_token` may be used to link to the `token` of the table `sample`.
Please refer to the [tutorial](https://www.nuscenes.org/tutorial) for an introduction to the most important database tables.
Please refer to the [tutorial](https://www.nuscenes.org/nuimages#tutorial) for an introduction to the most important database tables.

![](https://www.nuscenes.org/public/images/nuscenes-schema.svg)

attribute
---------

An attribute is a property of an instance that can change while the category remains the same.
Example: a vehicle being parked/stopped/moving, and whether or not a bicycle has a rider.
```
attribute {
"token": <str> -- Unique record identifier.
"name": <str> -- Attribute name.
"description": <str> -- Attribute description.
}
```

calibrated_sensor
---------

Definition of a particular sensor (lidar/radar/camera) as calibrated on a particular vehicle.
All extrinsic parameters are given with respect to the ego vehicle body frame.
All camera images come undistorted and rectified.
@@ -36,11 +35,11 @@ calibrated_sensor {
"camera_intrinsic": <float> [3, 3] -- Intrinsic camera calibration. Empty for sensors that are not cameras.
}
```

category
---------

Taxonomy of object categories (e.g. vehicle, human).
Subcategories are delineated by a period (e.g. human.pedestrian.adult).
Subcategories are delineated by a period (e.g. `human.pedestrian.adult`).
```
category {
"token": <str> -- Unique record identifier.
@@ -49,9 +48,9 @@ category {
"index": <int> -- The index of the label used for efficiency reasons in the .bin label files of nuScenes-lidarseg. This field did not exist previously.
}
```

ego_pose
---------

Ego vehicle pose at a particular timestamp. Given with respect to global coordinate system of the log's map.
The ego_pose is the output of a lidar map-based localization algorithm described in our paper.
The localization is 2-dimensional in the x-y plane.
@@ -63,24 +62,24 @@ ego_pose {
"timestamp": <int> -- Unix time stamp.
}
```

instance
---------

An object instance, e.g. particular vehicle.
This table is an enumeration of all object instances we observed.
Note that instances are not tracked across scenes.
```
instance {
"token": <str> -- Unique record identifier.
"category_token": <str> -- Foreign key. Object instance category.
"category_token": <str> -- Foreign key pointing to the object category.
"nbr_annotations": <int> -- Number of annotations of this instance.
"first_annotation_token": <str> -- Foreign key. Points to the first annotation of this instance.
"last_annotation_token": <str> -- Foreign key. Points to the last annotation of this instance.
}
```

lidarseg
---------

Mapping between nuScenes-lidarseg annotations and sample_datas corresponding to the lidar pointcloud associated with a keyframe.
```
lidarseg {
@@ -89,9 +88,9 @@ lidarseg {
"sample_data_token": <str> -- Foreign key. Sample_data corresponding to the annotated lidar pointcloud with is_key_frame=True.
}
```

log
---------

Information about the log from which the data was extracted.
```
log {
@@ -102,9 +101,9 @@ log {
"location": <str> -- Area where log was captured, e.g. singapore-onenorth.
}
```

map
---------

Map data that is stored as binary semantic masks from a top-down view.
```
map {
@@ -114,10 +113,11 @@ map {
"filename": <str> -- Relative path to the file with the map mask.
}
```

sample
---------

A sample is data collected at (approximately) the same timestamp as part of a single LIDAR sweep.
A sample is an annotated keyframe at 2 Hz.
The data is collected at (approximately) the same timestamp as part of a single LIDAR sweep.
```
sample {
"token": <str> -- Unique record identifier.
@@ -127,17 +127,17 @@ sample {
"prev": <str> -- Foreign key. Sample that precedes this in time. Empty if start of scene.
}
```

sample_annotation
---------

A bounding box defining the position of an object seen in a sample.
All location data is given with respect to the global coordinate system.
```
sample_annotation {
"token": <str> -- Unique record identifier.
"sample_token": <str> -- Foreign key. NOTE: this points to a sample NOT a sample_data since annotations are done on the sample level taking all relevant sample_data into account.
"instance_token": <str> -- Foreign key. Which object instance is this annotating. An instance can have multiple annotations over time.
"attribute_tokens": <str> [n] -- Foreign keys. List of attributes for this annotation. Attributes can change over time, so they belong here, not in the object table.
"attribute_tokens": <str> [n] -- Foreign keys. List of attributes for this annotation. Attributes can change over time, so they belong here, not in the instance table.
"visibility_token": <str> -- Foreign key. Visibility may also change over time. If no visibility is annotated, the token is an empty string.
"translation": <float> [3] -- Bounding box location in meters as center_x, center_y, center_z.
"size": <float> [3] -- Bounding box size in meters as width, length, height.
@@ -148,9 +148,9 @@ sample_annotation {
"prev": <str> -- Foreign key. Sample annotation from the same object instance that precedes this in time. Empty if this is the first annotation for this object.
}
```

sample_data
---------

A sensor data e.g. image, point cloud or radar return.
For sample_data with is_key_frame=True, the time-stamps should be very close to the sample it points to.
For non key-frames the sample_data points to the sample that follows closest in time.
@@ -170,9 +170,9 @@ sample_data {
"prev": <str> -- Foreign key. Sample data from the same sensor that precedes this in time. Empty if start of scene.
}
```

scene
---------

A scene is a 20s long sequence of consecutive frames extracted from a log.
Multiple scenes can come from the same log.
Note that object identities (instance tokens) are not preserved across scenes.
@@ -187,9 +187,9 @@ scene {
"last_sample_token": <str> -- Foreign key. Points to the last sample in scene.
}
```

sensor
---------

A specific sensor type.
```
sensor {
@@ -198,9 +198,9 @@ sensor {
"modality": <str> {camera, lidar, radar} -- Sensor modality. Supports category(ies) in brackets.
}
```

visibility
---------

The visibility of an instance is the fraction of annotation visible in all 6 images. Binned into 4 bins 0-40%, 40-60%, 60-80% and 80-100%.
```
visibility {