Keypoint data format #507

yuyu2172 · 2017-12-23T10:13:40Z

There are multiple possible data formats for keypoints.
In particular, there has not been enough discussion on the representation of unobservable points.
There are at least four possibilities. Note that K represents the number of keypoints.

1. keypoint = (K, 2), np.float32 and kp_mask = (K,), np.bool. The name of the second object can be keypoint_mask or valid_keypoint.
2. keypoint = (K, 3), np.float32. The three elements in a row represent y, x, visible (0 or 1).
3. keypoint = (K, 2), np.float32. Represent unobservable points as np.nan.
4. keypoint = (K', 2), np.float32 and labels = (K',), np.int32. The constant K' represents the number of observable keypoints in an image (K != K' is possible). labels represents the id of each keypoint (e.g. 0->head).
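
To make the four options concrete, here is a minimal NumPy sketch that encodes the same annotation in each format. The coordinate values, K = 4, and the variable names keypoint_yxv, keypoint_nan and keypoint_observed are made up for illustration; only the shapes and dtypes come from the list above.

```python
import numpy as np

K = 4  # hypothetical number of keypoints defined by the dataset

# Option 1: (K, 2) coordinates plus a separate (K,) boolean mask.
keypoint = np.array(
    [[10., 20.], [30., 40.], [0., 0.], [50., 60.]], dtype=np.float32)
kp_mask = np.array([True, True, False, True])  # third point is unobservable

# Option 2: (K, 3) array whose rows are y, x, visible (0 or 1).
keypoint_yxv = np.concatenate(
    [keypoint, kp_mask[:, None].astype(np.float32)], axis=1)

# Option 3: (K, 2) array with unobservable points stored as NaN.
keypoint_nan = keypoint.copy()
keypoint_nan[~kp_mask] = np.nan

# Option 4: only the K' observable points, plus the id of each one.
labels = np.flatnonzero(kp_mask).astype(np.int32)
keypoint_observed = keypoint[kp_mask]
```

Options 2 to 4 are all derived from option 1 here, which suggests that the first format carries at least as much information as the others.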

Some comments

The first option can be bad because it is tedious to handle two annotations.
The second option may look redundant because some datasets may not contain any images with unobservable keypoints.
Also, the second option would be problematic when working with pairs of sets of keypoints because there may be no notion of "observable" (e.g. dense correspondence and keypoint matching). In this case, users would use two objects keypoint0 = np.array((kp0_0, kp0_1, ..., kp0_{K'-1})) and keypoint1 = np.array((kp1_0, kp1_1, ..., kp1_{K'-1})). Here, kp0_i and kp1_i correspond to each other. Usually, K' is much smaller than the maximum possible number of valid keypoint pairs, which is the size of the image.
The third option can be bad because some people may misunderstand np.nan as corrupted data.
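
As a rough illustration of the concern about np.nan, the sketch below converts the third format back into the first one and shows how NaN silently propagates through ordinary arithmetic. The helper nan_to_mask is hypothetical and written only for this example; filling invalid rows with 0 is an arbitrary choice.

```python
import numpy as np


def nan_to_mask(keypoint_nan):
    """Hypothetical helper: split a NaN-encoded (K, 2) array into
    a (K, 2) float32 array and a (K,) boolean mask (option 1)."""
    mask = ~np.isnan(keypoint_nan).any(axis=1)
    # Invalid rows are filled with 0 here; the fill value is arbitrary.
    keypoint = np.where(mask[:, None], keypoint_nan, 0).astype(np.float32)
    return keypoint, mask


keypoint_nan = np.array(
    [[10., 20.], [np.nan, np.nan], [50., 60.]], dtype=np.float32)
keypoint, mask = nan_to_mask(keypoint_nan)

# A plain mean over NaN-encoded points is NaN in both coordinates.
naive_mean = keypoint_nan.mean(axis=0)
# With an explicit mask, the intent (average the observable points) is clear.
masked_mean = keypoint[mask].mean(axis=0)
```

With the NaN encoding, every consumer has to remember to call np.isnan (or use functions such as np.nanmean), whereas an explicit mask makes the "unobservable" state visible in the function signature.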

EDIT:
The name of keypoint can be shortened to point.

yuyu2172 · 2017-12-23T10:57:38Z

@mitmul (since #495 is related)

Do you have any comments on the representation of keypoints?
I prefer the first representation. Even though it adds an extra object, it can handle more scenarios than the other representations, as I mentioned in the third comment above. I like the name valid_keypoint the most for the visibility object.

mitmul · 2017-12-27T15:57:08Z

@yuyu2172 I agree with taking the first option. Returning valid_keypoint makes sense to me.

yuyu2172 · 2018-02-27T04:04:54Z

I had a discussion with other developers a couple of weeks ago, and we concluded that point and mask are the right names.
Although mask can be confused with pixel-wise masks, the distinction is almost always clear from the context.
When we need to distinguish between the two, we may use point_mask and img_mask.
BTW, we have been using names that are ambiguous like this. For instance, labels can be image-wise, bbox-wise or pixel-wise, but we do not use different names for different types.
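
For reference, the agreed format with the names point and mask could be checked with something like the sketch below. The function check_point_and_mask is hypothetical and is not taken from the library or from any of the linked pull requests; it only encodes the shapes and dtypes of the first option discussed above.

```python
import numpy as np


def check_point_and_mask(point, mask):
    """Hypothetical validator for the (point, mask) pair; not a library API."""
    assert isinstance(point, np.ndarray), 'point must be a numpy.ndarray'
    assert point.dtype == np.float32, 'point must be np.float32'
    assert point.ndim == 2 and point.shape[1] == 2, 'point must be (K, 2)'
    assert isinstance(mask, np.ndarray), 'mask must be a numpy.ndarray'
    assert mask.dtype == np.bool_, 'mask must be boolean'
    assert mask.shape == (point.shape[0],), 'mask must be (K,)'


point = np.array([[10., 20.], [0., 0.]], dtype=np.float32)  # (y, x) rows
mask = np.array([True, False])  # second point is unobservable
check_point_and_mask(point, mask)

# Boolean indexing selects only the observable points.
visible_point = point[mask]
```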

This was referenced Feb 27, 2018

Add assert_is_point #524

Merged

vis_keypoint --> vis_point #525

Merged

Change naming conventions in transforms #526

Merged

Add assert_is_point_dataset #527

Merged

Change name to CUBPointDataset and add tests #528

Merged

yuyu2172 closed this as completed May 18, 2018
