This repository has been archived by the owner on Jul 2, 2021. It is now read-only.

Keypoint data format #507

Closed
yuyu2172 opened this issue Dec 23, 2017 · 3 comments

Comments

@yuyu2172
Member

yuyu2172 commented Dec 23, 2017

There are multiple possible data formats for keypoints.
In particular, there has not been enough discussion on the representation of unobservable points.
There are at least four possibilities. Note that K represents the number of keypoints.

  1. keypoint = (K, 2), np.float32 and kp_mask = (K,), np.bool. The name of the second object can be keypoint_mask or valid_keypoint.
  2. keypoint = (K, 3), np.float32. The three elements in a row represent y, x, visible (0 or 1).
  3. keypoint = (K, 2), np.float32. Represent unobservable points as np.nan.
  4. keypoint = (K', 2), np.float32 and labels = (K',), np.int32. The constant K' represents the number of observable keypoints in an image (K != K' is possible). labels represents the id of each keypoint (e.g. 0->head).
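
To make the four options concrete, here is a minimal NumPy sketch that encodes the same annotation in each format. The coordinate values, K = 3, and the variable names keypoint_1 ... keypoint_4 are made up for illustration; only keypoint, kp_mask and labels come from the list above.

```python
import numpy as np

# Made-up annotation with K = 3 keypoints in (y, x) order.
# The second keypoint is unobservable in this image.
yx = np.array([[10., 20.], [0., 0.], [30., 40.]], dtype=np.float32)
visible = np.array([True, False, True])

# Option 1: coordinates plus a separate boolean mask.
keypoint_1 = yx.copy()        # shape (K, 2), float32
kp_mask = visible.copy()      # shape (K,), bool

# Option 2: visibility stored as a third column (0 or 1).
keypoint_2 = np.concatenate(
    [yx, visible[:, None].astype(np.float32)], axis=1)  # shape (K, 3)

# Option 3: unobservable points are replaced by NaN.
keypoint_3 = yx.copy()
keypoint_3[~visible] = np.nan  # shape (K, 2)

# Option 4: keep only observable points and record which keypoint each one is.
keypoint_4 = yx[visible]                            # shape (K', 2)
labels = np.flatnonzero(visible).astype(np.int32)   # shape (K',)
```

In this sketch K' = 2, so option 4 is the only format whose array shape changes from image to image.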

Some comments

  • The first option can be bad because it is tedious to handle two annotations.
  • The second option may look redundant, because some datasets contain no images with unobservable keypoints.
  • Also, the second option would be problematic when working with pairs of sets of keypoints because there may be no notion of "observable" (e.g. dense correspondence and keypoint matching). In this case, the users would use two objects keypoint0 = np.array((kp0_0, kp0_1, ..., kp0_{K'-1})) and keypoint1 = np.array((kp1_0, ..., kp1_{K'-1})). Here, kp0_i and kp1_i are corresponding. Usually, K' is much smaller than the maximum possible number of valid keypoint pairs, which is the number of pixels in the image.
  • The third option can be bad because some people may misunderstand np.nan as corrupted data.
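
Two of these points can be illustrated with a short NumPy sketch, under made-up values: NaN in the third format propagates silently through ordinary reductions, and the paired case needs nothing beyond two row-aligned arrays. The names kp_nan, mask and displacement are illustrative only.

```python
import numpy as np

# Third option: the second keypoint is unobservable and stored as NaN.
kp_nan = np.array([[10., 20.], [np.nan, np.nan], [30., 40.]],
                  dtype=np.float32)

# An ordinary reduction propagates NaN, which can look like corrupted data.
naive_mean = kp_nan.mean(axis=0)

# A mask has to be recovered before the points can be used safely.
mask = ~np.isnan(kp_nan).any(axis=1)
masked_mean = kp_nan[mask].mean(axis=0)

# Paired keypoints: row i of keypoint0 corresponds to row i of keypoint1,
# and there is no notion of "observable" to encode.
keypoint0 = np.array([[5., 5.], [12., 40.]], dtype=np.float32)
keypoint1 = np.array([[7., 6.], [15., 38.]], dtype=np.float32)
displacement = keypoint1 - keypoint0  # shape (K', 2)
```

The recovered mask is exactly the second object of the first option, so the NaN format ends up handling two objects anyway whenever the points are consumed.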

EDIT:
The name of keypoint can be shortened to point.

@yuyu2172
Member Author

@mitmul (since #495 is related)

Do you have any comments on the representation of keypoints?
I prefer the first representation. Even though it adds an extra object, it can handle more scenarios than the other representations, as I mentioned in the third comment above. I like the name valid_keypoint the most for the visibility object.

@mitmul
Member

mitmul commented Dec 27, 2017

@yuyu2172 I agree with taking the first option. Returning valid_keypoint makes sense to me.

@yuyu2172
Member Author

I had a discussion with other developers a couple of weeks ago, and we concluded that point and mask are the right names.
Although mask can be confused with pixel-wise masks, the distinction is almost always clear from the context.
When we need to distinguish between the two, we may use point_mask and img_mask.
BTW, we already use names that are ambiguous in the same way. For instance, labels can be image-wise, bbox-wise or pixel-wise, but we do not use different names for the different types.
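
As a rough sketch of how code consuming the point and mask pair might look under this naming, assuming point has shape (K, 2) and mask has shape (K,): the helper mean_point_error below is hypothetical and is not part of any existing library API.

```python
import numpy as np


def mean_point_error(pred_point, gt_point, mask):
    """Mean Euclidean distance over points whose mask entry is True.

    Hypothetical helper for illustration only.
    """
    pred_point = np.asarray(pred_point, dtype=np.float32)
    gt_point = np.asarray(gt_point, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        # No valid point to evaluate.
        return float('nan')
    dist = np.linalg.norm(pred_point[mask] - gt_point[mask], axis=1)
    return float(dist.mean())


# Made-up values: the second point is invalid, so its coordinates are ignored.
gt_point = np.array([[10., 20.], [0., 0.], [30., 40.]], dtype=np.float32)
mask = np.array([True, False, True])
pred_point = np.array([[13., 24.], [99., 99.], [30., 40.]], dtype=np.float32)

error = mean_point_error(pred_point, gt_point, mask)
```

Because the mask is a separate boolean array, it can be used directly as an index, and the placeholder coordinates of the invalid point never enter the computation.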
