Annotation information
Type of Annotations
We annotated the data with 3D bounding boxes with 9 degrees of freedom (DoF):
- x, y, z: 3D location of the box’s center
- l, w, h: length, width, height of the box
- yaw, pitch, roll: orientation of the box
For each object, we also annotated the level of occlusion for two types of occlusion (“spatial” and “lighting”) and an activity attribute (“stopped,” “moving,” “parked,” “pushed,” “sitting”). Furthermore, each physical object was assigned a unique object ID that stays the same across frames, which makes the dataset suitable for tracking and prediction tasks.
KITTI formatted labels
For now, we provide the annotations in KITTI format, in the camera frame.
Note that there are 3 important differences between VoD and KITTI labels:
- 2D bounding boxes are automatically calculated: Annotation was done in 3D on the LiDAR point cloud. While we provide the 2D bounding boxes in the KITTI formatted labels, these were calculated automatically by projecting the 3D bounding boxes to the camera plane, and assigning a minimum fit rectangle.
- Truncation: We do not provide truncation information, for the same reason (nothing was annotated in the image plane); the field is used for other metadata instead. Important: if you do not use the provided eval module, make sure your evaluation does not use the truncation values; see this issue.
- Rotation: The original KITTI devkit assumes that the camera’s and LiDAR’s vertical axes (Y and Z, respectively) are parallel, just pointing in different directions. In our research vehicle, however, the camera is slightly tilted. For convenience, we therefore define the rotation of objects around the LiDAR’s negative vertical (-Z) axis. This is in fact what many open-source libraries assume anyway: that the LiDAR’s and camera’s vertical axes (Z and Y) are perfectly aligned.
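The first point above (2D boxes derived from 3D boxes) can be illustrated with a minimal sketch. It is not the code used to produce the labels: the function name `bbox_from_corners`, the pinhole intrinsics matrix `K`, the camera axis convention (x right, y down, z forward), the handling of corners behind the camera, and the optional clipping to the image are all assumptions made for illustration.

```python
import numpy as np


def bbox_from_corners(corners_cam, K, image_size=None):
    """Project 3D box corners to the image and fit an axis-aligned rectangle.

    corners_cam: (8, 3) box corners in camera coordinates
                 (assumed convention: x right, y down, z forward).
    K:           (3, 3) pinhole intrinsics matrix (hypothetical input).
    image_size:  optional (width, height); if given, the box is clipped
                 to the image. Clipping is an assumption of this sketch.
    Returns (left, top, right, bottom) or None if no corner is in front
    of the camera.
    """
    corners_cam = np.asarray(corners_cam, dtype=float)
    K = np.asarray(K, dtype=float)

    # Only corners in front of the camera have a valid projection.
    in_front = corners_cam[:, 2] > 0
    if not in_front.any():
        return None

    # Pinhole projection: [u, v, w]^T = K [x, y, z]^T, then divide by w.
    uvw = corners_cam[in_front] @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]

    # Minimum-fit rectangle around the projected corners.
    left, top = uv.min(axis=0)
    right, bottom = uv.max(axis=0)

    if image_size is not None:
        width, height = image_size
        left, right = np.clip([left, right], 0, width - 1)
        top, bottom = np.clip([top, bottom], 0, height - 1)

    return float(left), float(top), float(right), float(bottom)
```

Because such a box is derived from the 3D annotation rather than drawn on the image, it can be looser than a hand-drawn 2D box, especially for rotated objects.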
#Values    Name         Description
----------------------------------------------------------------------------
   1       Class        Describes the type of object: 'Car', 'Pedestrian', 'Cyclist', etc.
   1       truncated    Not used, only there to be compatible with KITTI format.
   1       occluded     Integer (0,1,2) indicating occlusion state:
                        0 = fully visible, 1 = partly occluded,
                        2 = largely occluded.
   1       alpha        Observation angle of object, ranging [-pi..pi]
   4       bbox         2D bounding box of object in the image (0-based index):
                        contains left, top, right, bottom pixel coordinates.
                        This was automatically calculated from the 3D boxes.
   3       dimensions   3D object dimensions: height, width, length (in meters)
   3       location     3D object location x,y,z in camera coordinates (in meters)
   1       rotation     Rotation around -Z axis of the LiDAR sensor [-pi..pi]
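As a rough illustration of the field layout in the table above, here is a minimal parsing sketch. The names `KittiLabel` and `parse_label_line` are hypothetical and not part of any provided devkit. The sketch assumes whitespace-separated fields and class names without spaces, and it keeps any values after the 15 documented ones in a separate `extra` tuple without interpreting them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class KittiLabel:
    cls: str
    truncated: float                         # not a truncation value in VoD
    occluded: int                            # 0, 1 or 2
    alpha: float                             # observation angle, radians
    bbox: Tuple[float, float, float, float]  # left, top, right, bottom (pixels)
    dimensions: Tuple[float, float, float]   # height, width, length (meters)
    location: Tuple[float, float, float]     # x, y, z in camera coordinates
    rotation: float                          # around the LiDAR's -Z axis, radians
    extra: Tuple[float, ...]                 # trailing values, left uninterpreted


def parse_label_line(line: str) -> KittiLabel:
    """Parse one line of a KITTI-formatted label file."""
    parts = line.split()
    if len(parts) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(parts)}")
    values = [float(p) for p in parts[1:]]
    return KittiLabel(
        cls=parts[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation=values[13],
        extra=tuple(values[14:]),
    )
```

Note the order of `dimensions`: height, width, length, which differs from the l, w, h order used when describing the 3D boxes.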
Tracking IDs
The labels in the original release do not include track ids.
If you are interested in the track IDs, download the zip file below and place its contents in:
<your root of view_of_delft>/lidar/training/label_2
overwriting the original labels. There is no other difference between the two sets of labels; all boxes are identical.
Annotations with tracking IDs can be downloaded with this link, using the password you received by email after registration.
We share the tracking IDs by overriding the standard KITTI format’s truncation value (see above), i.e. the first number after the class string.
For example, the following line in the annotations:
bicycle 1757 1 -0.5150583918601345 1692.8588 873.00977 1935.0 1064.7266 0.9959256326426174 0.4582897348611458 1.737482152677817 5.230204792744421 2.477337074657124 8.676091008791296 0.027439126666472635 1
means that this annotated bicycle is the 1757th object in the dataset, and this number is its tracking ID. The number is consistent across frames, i.e., if the bicycle is visible later, it will have the same number printed at this location.
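A minimal sketch of how the tracking ID could be read from such lines and used to collect the frames in which each object appears. The function names and the `frames` input (a mapping from a frame name to that frame's label lines) are hypothetical; the only thing taken from the description above is that the ID is the first number after the class string.

```python
from collections import defaultdict


def track_id_from_line(line):
    """Return the tracking ID: the first number after the class string."""
    parts = line.split()
    return int(float(parts[1]))


def group_frames_by_track(frames):
    """Map each tracking ID to the sorted list of frames it appears in.

    frames: dict mapping a frame name to a list of label lines
            (hypothetical input structure for this sketch).
    """
    tracks = defaultdict(list)
    for frame_name in sorted(frames):
        for line in frames[frame_name]:
            if line.strip():  # skip empty lines
                tracks[track_id_from_line(line)].append(frame_name)
    return dict(tracks)
```

This only works with the labels from the tracking ID download; in the original release the same field does not hold a track ID.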
Annotated area
Any object of interest within 50 meters of the LiDAR sensor and partially or fully within the camera’s field of view (horizontal FoV: ±32°, vertical FoV: ±22°) was annotated.
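The criterion above can be approximated for a single point with the sketch below. Several things here are assumptions rather than the annotators' actual procedure: the range is taken as Euclidean distance from the LiDAR origin, the camera frame is assumed to follow the x right, y down, z forward convention, and an object is reduced to one point (for example its center), so the "partially within" case is not handled. The function and constant names are hypothetical.

```python
import math

MAX_RANGE_M = 50.0  # distance limit from the LiDAR sensor
H_FOV_DEG = 32.0    # horizontal half field of view of the camera
V_FOV_DEG = 22.0    # vertical half field of view of the camera


def in_annotated_area(point_lidar, point_cam):
    """Check one point against the range and field-of-view limits.

    point_lidar: (x, y, z) of the point in the LiDAR frame, in meters.
    point_cam:   (x, y, z) of the same point in the camera frame
                 (assumed convention: x right, y down, z forward).
    """
    distance = math.sqrt(sum(c * c for c in point_lidar))

    x, y, z = point_cam
    if z <= 0:
        return False  # behind the camera

    horizontal_deg = math.degrees(math.atan2(x, z))
    vertical_deg = math.degrees(math.atan2(y, z))

    return (
        distance <= MAX_RANGE_M
        and abs(horizontal_deg) <= H_FOV_DEG
        and abs(vertical_deg) <= V_FOV_DEG
    )
```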
Annotated classes
13 object classes were annotated:
- Car
- Pedestrian
- Cyclist (including both the bicycle and the rider)
- Rider (the human on the bicycle, motor, etc., separately)
- Unused bicycle
- Bicycle rack
- Human depiction (e.g. statues)
- Moped or scooter
- Motor
- Truck
- Other ride
- Other vehicle
- Uncertain ride
Please note that while the sensor data of the test set frames is publicly available, the labels for these frames are not, so you will find fewer files in the label folder than in the sensor folders.
Annotation instructions will be available here soon (TODO).