A collection of resources on Shape Correspondences and some of my reading notes.

yzy1996/Awesome-Shape-Correspondence

Shape Correspondence

Contributing: Feedback and contributions are welcome! If you think I have missed something or have any suggestions (papers, implementations, and other resources), feel free to open a pull request or leave an issue. I will release a LaTeX/PDF version in the future. ⬇️ Markdown format:

[Paper Name](abs link)  
*[Author 1](homepage), Author 2, and Author 3*
**[`Conference/Journal Year`] (`Institution`)** [[Github](link)] [[Project](link)]

😄 Now you can use this script to automatically generate the above text.
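
As a rough illustration of what such a generator could look like, here is a minimal sketch that fills in the three-line entry format above. It is not the script referred to in this README; the function name `format_entry`, its parameters, and the example values are all hypothetical.

```python
def format_entry(title, abs_link, authors, venue, institution,
                 github=None, project=None):
    """Build one entry in the three-line format of this README.

    authors is a list of (name, homepage_or_None) tuples.
    This helper is hypothetical, not the repository's own script.
    """
    # Link an author only when a homepage is given.
    names = [f"[{name}]({homepage})" if homepage else name
             for name, homepage in authors]
    if len(names) <= 2:
        author_line = " and ".join(names)
    else:
        author_line = ", ".join(names[:-1]) + ", and " + names[-1]

    last_line = f"**[`{venue}`] (`{institution}`)**"
    if github:
        last_line += f" [[Github]({github})]"
    if project:
        last_line += f" [[Project]({project})]"

    # Two trailing spaces are a Markdown hard line break
    # (an assumption about the intended rendering).
    return "\n".join([
        f"[{title}]({abs_link})  ",
        f"*{author_line}*  ",
        last_line,
    ])


# Placeholder values, for illustration only.
entry = format_entry(
    "Paper Name",
    "https://example.org/abs",
    [("Author 1", "https://example.org/a1"), ("Author 2", None), ("Author 3", None)],
    "Conference Year",
    "Institution",
    github="https://example.org/code",
)
print(entry)
```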

Introduction

Both 2D and 3D keypoint detection are long-standing problems in computer vision.

A set of keypoints representing any object (shape/structure) is important for geometric reasoning, due to their simplicity and ease of handling. [^ intro2]

Keypoint-based methods have been crucial to the success of many vision applications. Examples include 3D reconstruction, registration, human body pose estimation, recognition, and generation. [^ intro2]

Conventional works define keypoints manually or learn them from supervised examples; what we need is to discover them automatically from unlabeled data (unsupervised).

The keypoints should be geometrically and semantically consistent across viewing angles and instances of an object category.

The model we learn often covers a collection of objects of a specific category.

The shape correspondence problem is stated as finding a set of corresponding points between given shapes.

Dense semantic correspondence: given two images, the goal is to predict, for each pixel in the former, the corresponding pixel in the latter.

Sparse correspondences focus on only a few keypoints.
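
To make the dense/sparse distinction concrete, the sketch below matches every point of shape A to its nearest neighbor in shape B in a descriptor space, then keeps only a few keypoints for the sparse case. The descriptors, the keypoint indices, and the helper name `match_nearest` are made up for illustration; brute-force nearest-neighbor matching is only a simple baseline, not the method of any paper listed here.

```python
import math


def match_nearest(desc_a, desc_b):
    """For each descriptor in desc_a, return the index of the closest one in desc_b."""
    matches = []
    for da in desc_a:
        best_j = min(range(len(desc_b)), key=lambda j: math.dist(da, desc_b[j]))
        matches.append(best_j)
    return matches


# Hypothetical 2D descriptors, one per point (or pixel) of each shape.
desc_a = [(0.0, 1.0), (1.0, 0.0), (0.2, 0.7)]
desc_b = [(0.9, 0.1), (0.1, 0.9)]

# Dense correspondence: one match for every point of A.
dense = match_nearest(desc_a, desc_b)

# Sparse correspondence: keep only the matches of a few chosen keypoints of A.
keypoint_ids = [0, 1]  # assumed keypoint indices
sparse = {i: dense[i] for i in keypoint_ids}

print(dense, sparse)
```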

When writing, we can use "infer/learn xx" as the predicate, and we can visualize correspondences by connecting matched points with lines or by giving them the same color.

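A general overview first:

  • Keypoints matter: they can be seen as the most compact shape representation of an object, so they can be used for shape editing, reconstruction, recognition, and so on. How to find keypoints is therefore an important research problem. Classification and recognition always come with feature extraction; in geometric vision (for example 3D reconstruction and shape alignment), is there likewise a keypoint detection module as an upstream task, followed by geometric reasoning?

  • Properties of keypoints: they should not change with viewpoint, lighting, shape variation, or pose.

    Equivariance: equivariant to image transformations, including object and camera motions, i.e. 3D pose, size, position, viewing angle, and illumination conditions.

  • An extension of keypoint detection: pose estimation.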
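
The equivariance property in the list above can be checked numerically: detecting and then transforming should give the same result as transforming and then detecting. The sketch below uses a toy detector (the centroid of a 2D point set) and a rigid transform. Both are stand-ins chosen for illustration, not a real keypoint detector.

```python
import math


def rigid_transform(points, angle, shift):
    """Rotate 2D points by `angle` (radians) and translate them by `shift`."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y + shift[0], sin_a * x + cos_a * y + shift[1])
            for x, y in points]


def toy_detector(points):
    """Hypothetical detector: returns the centroid as a single 'keypoint'."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


points = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
angle, shift = math.pi / 3, (1.5, -0.5)

# Detect first, then move the detected keypoint.
detect_then_move = rigid_transform([toy_detector(points)], angle, shift)[0]
# Move the object first, then detect.
move_then_detect = toy_detector(rigid_transform(points, angle, shift))

# For an equivariant detector the two results coincide up to rounding.
equivariance_error = math.dist(detect_then_move, move_then_detect)
print(equivariance_error)
```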

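What can be done today:

  • 2D/3D input data
  • Supervised and unsupervised settings; supervision here refers to keypoint annotations
  • One model covering objects of the same category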

Keywords: landmark, parts, skeletons, category-specific

keypoint heatmap: a map in which locations with larger values are more likely to be keypoints
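
A minimal sketch of reading a keypoint off such a heatmap: take the location of the largest value. The heatmap values and the helper name `decode_heatmap` are invented for illustration; real pipelines often refine this with sub-pixel interpolation or a soft-argmax.

```python
def decode_heatmap(heatmap):
    """Return (row, col, score) of the largest value in a 2D heatmap (list of rows)."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


# Toy heatmap for a single keypoint; larger value = more likely a keypoint.
heatmap = [
    [0.0, 0.1, 0.0],
    [0.1, 0.9, 0.2],
    [0.0, 0.2, 0.1],
]
row, col, score = decode_heatmap(heatmap)
print(row, col, score)
```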

Impact

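Many applications; a generic framework for texture transfer, pose and animation transfer, statistical shape analysis, and multi-view recognition.

Mainly: detection and segmentation. As for correspondence, once the correspondences are known, a one-shot annotation transfers directly to a new object. Traditional methods rely mainly on manual annotation, so the focus here is on methods that do not need it.

The most authoritative human joint localization competition: MS COCO Keypoint track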

robotics applications need 3D keypoints for control

  • 2019 Keypoint affordances for category-level robotic manipulation
  • 2019 kPAM-SC: Generalizable manipulation planning using keypoint affordance and shape completion

Works that directly use or borrow keypoints:

Non-Rigid Structure-from-Motion (NRSfM) methods ref:

  • Multiview aggregation for learning category-specific shape reconstruction
  • Symmetric non-rigid structure from motion for category-specific object structure estimation

The key idea is that a large number of object deformations can be explained by linearly combining a small number K of basis shapes at some pose. For a rigid body there is only one basis shape, and the rank is 3.

The main approaches used there to model shape deformation are:
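
The low-rank idea can be sketched in a few lines: a deformed shape is a weighted sum of K basis shapes, which is then posed (rotated) and projected to the image. The basis shapes, coefficients, rotation about the z-axis, and orthographic camera below are illustrative assumptions, not the setup of a specific NRSfM paper.

```python
import math


def combine_basis(coeffs, basis_shapes):
    """Linearly combine K basis shapes, each a list of (x, y, z) points."""
    n_points = len(basis_shapes[0])
    shape = []
    for p in range(n_points):
        point = tuple(
            sum(c * basis[p][axis] for c, basis in zip(coeffs, basis_shapes))
            for axis in range(3)
        )
        shape.append(point)
    return shape


def rotate_z(shape, angle):
    """Pose the shape by rotating it about the z-axis (radians)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y, sin_a * x + cos_a * y, z) for x, y, z in shape]


def project_orthographic(shape):
    """Drop the depth coordinate to get 2D image points."""
    return [(x, y) for x, y, _ in shape]


# K = 2 toy basis shapes with two points each (made-up numbers).
basis_shapes = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],
]
coeffs = [1.0, 0.5]  # per-frame deformation coefficients

shape_3d = combine_basis(coeffs, basis_shapes)
image_points = project_orthographic(rotate_z(shape_3d, math.pi / 2))
print(shape_3d, image_points)
```

With a single basis shape and coefficient 1 this reduces to the rigid case mentioned above.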

  • low-rank shape prior
    • A simple prior-free method for non-rigid structure-from-motion factorization
    • Recovering non-rigid 3D shape from image streams (the pioneering work)
    • Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors
    • Nonrigid structure from motion in trajectory space
  • isometric prior
    • Non-rigid structure from locally-rigid motion
    • Isometric non-rigid shape-from-motion in linear time

Evaluation

One option is to annotate keypoints manually and then run a regression.

Data

annotated keypoints for:

  • face [^ face]

  • hands [^ hand]

  • human bodies [^ body1] [^ body2]

Literature

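The earliest methods are certainly supervised, followed by unsupervised ones, and the unsupervised ones are our main interest. The literature is therefore organized with all supervised methods grouped together first, and the unsupervised methods then split into finer method categories. Finally, there are some works applied to human bodies, birds, and furniture.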

Survey

Supervised

The categories below are based on whether the input and output data are 2D or 3D.

2D Perspective

(Note that this also includes methods that use a 3D intermediate representation as a bridge.)

local descriptor based

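Find correspondences with feature operators, using neural networks to extract feature-level correspondences; this requires annotated datasets.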

parametric warping

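Match local features: extract per-pixel features and then match them, either by learning a warping function or by learning an encoder that compresses to a shared low-dimensional point.

Same object, same viewpoint; very restrictive.

WarpNet: Weakly Supervised Matching for Single-view Reconstruction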

learn equivariant embeddings/decoder

Besides directly finding correspondences at the 2D feature level, 3D-level features can also serve as an intermediate bridge.

Compared with directly learning correspondence maps from 2D images, learning from 3D structures as an intermediate medium is more powerful.

3D medium Template

Plato famously remarked that while there are many cups in the world, there is only one 'idea' of a cup, which he defined as 'cupness'. Any particular instance of a category can thus be understood via its relationship to this platonic ideal. We humans have an ability to reason about 3D structure from a 2D image.

The methods above assume that such a "template" exists, but does it really? The methods below claim that no template is needed.

3D medium semantic transfer

Uses 2D images with pose.

3D Perspective

Dataset: ShapeNet, PartNet

Other domain

human bodies

bird

  • Deep Deformation Network for Object Landmark Localization

  • Part Localization using Multi-Proposal Consensus for Fine-Grained Categorization

  • Bird part localization using exemplar-based models with enforced pose and subcategory consistency

furniture

Knowledge

UV mapping:

[^ intro2]: Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets
