
Use Linemod dataset #14

Closed
Trulli99 opened this issue Feb 15, 2023 · 9 comments
@Trulli99

Hi,

Do you think it would be possible to run shapo with the Linemod dataset if I follow the tips from some of the issues related to using custom datasets?

Thank you!

@zubair-irshad
Owner

Yes, of course. Unfortunately we don't have a script that prepares the Linemod dataset, but if it is very similar to NOCS and provides all the relevant data, you could follow the step-by-step instructions mentioned in thread #11 and thread #13 to train with Linemod.

@Trulli99
Author

Do you know how I can generate these images?

0000_coord

@zubair-irshad
Owner

These are object NOCS images; you can find more information in hughw19/NOCS_CVPR2019#62, or use BlenderProc.

But please note that you don't need these NOCS maps for training our repo i.e. shapo, if you already have the GT 6D poses and sizes of the rendered images (which I think Linemod already provides). Please see my answer here #11 (comment). You would have to save the relevant pose, image and depth data as datapoints here to train shapo. This information could be retrieved in any form, i.e. 6D pose and size estimated from GT NOCS, or just the GT 6D poses if available!
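
For illustration, here is a minimal sketch (not the repo's exact format) of how one might pack a Linemod frame's GT pose, image and depth into a per-frame datapoint. The dictionary keys and the scale convention (bounding-box diagonal of the canonical model) are assumptions for illustration only, so check the repo's data-generation code and thread #11 for the exact format shapo expects.

```python
# Hypothetical sketch: pack one Linemod frame into a pickle datapoint.
# Keys and scale convention are assumptions, not the repo's real schema.
import pickle
import numpy as np
import cv2

def save_linemod_datapoint(rgb_path, depth_path, R, t, model_points, out_path):
    """R: (3,3) GT rotation, t: (3,) GT translation, model_points: (N,3) object model."""
    image = cv2.imread(rgb_path)                           # HxWx3 BGR image
    depth = cv2.imread(depth_path, cv2.IMREAD_UNCHANGED)   # HxW depth map

    # One common convention (as in NOCS) is to use the 3D extent of the
    # canonical model as the size, and its diagonal length as the scale s.
    extents = model_points.max(axis=0) - model_points.min(axis=0)
    s = float(np.linalg.norm(extents))

    RT = np.eye(4, dtype=np.float32)
    RT[:3, :3] = R
    RT[:3, 3] = t

    datapoint = {
        "image": image,    # RGB observation
        "depth": depth,    # depth observation
        "RT": RT,          # 4x4 GT pose from Linemod
        "scale": s,        # scalar scale factor
        "size": extents,   # 3D extent of the canonical model
    }
    with open(out_path, "wb") as f:
        pickle.dump(datapoint, f)
```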

@Trulli99
Author

Thank you, I will have a look. I thought it was needed because you process all the images in the camera train folder.

@Trulli99
Author

In Linemod I have the GT 6D poses; will I still need to generate the files in the sdf_rgb_pretrained folder?

@zubair-irshad
Owner

Yes, you would still have to train the SDF and RGB MLPs as well as the respective latent codes per object (if your categories are different from the categories we train on, i.e. bottle, bowl, camera, mug and laptop), since our network requires them as a strong prior which we regress and later optimize from single-view observations. Please see thread #13 on how you can train these for your own dataset.
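
To illustrate the idea of per-object latent codes trained jointly with an SDF decoder (a DeepSDF-style prior, which is the flavor described above), here is a minimal PyTorch sketch. Layer sizes, the loss and the data pipeline are placeholders rather than the repo's actual training code; see thread #13 for the real procedure.

```python
# Hypothetical DeepSDF-style sketch: learn one latent code per object
# jointly with an SDF MLP. Not the repo's actual sdf_rgb_pretrained code.
import torch
import torch.nn as nn

class SDFDecoder(nn.Module):
    def __init__(self, latent_dim=64, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                 # predicted signed distance
        )

    def forward(self, latent, xyz):
        return self.net(torch.cat([latent, xyz], dim=-1))

num_objects, latent_dim = 10, 64
decoder = SDFDecoder(latent_dim)
latents = nn.Embedding(num_objects, latent_dim)   # one learnable code per object
opt = torch.optim.Adam(list(decoder.parameters()) + list(latents.parameters()), lr=1e-4)

def train_step(obj_ids, xyz, sdf_gt):
    """obj_ids: (B,) long, xyz: (B,3) sampled points, sdf_gt: (B,1) GT signed distances."""
    z = latents(obj_ids)                          # look up each object's code
    pred = decoder(z, xyz)
    loss = nn.functional.l1_loss(pred, sdf_gt)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

At test time the trained decoder stays fixed and a latent code is regressed (and later optimized) for the observed object, which is the "strong prior" role mentioned above.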

@peng25zhang

@zubair-irshad Hi, thanks for your work. Can you tell me what "the sizes of rendered images" means? I see the datapoint uses *_norm.txt.

@DavidYaonanZhu

Also for the YCB dataset.

@zubair-irshad
Owner

@peng25zhang the transformations are defined by R, T, s, where R is a 3x3 rotation matrix, T is a 3x1 translation vector and s is a scalar scale value. The scale value determines the scaling factor of the observed instance relative to the canonical shape, i.e. point clouds, where the size is the 3-dimensional extent of the canonical point clouds.

At inference time we are only given a single RGB-D observation, so we get the size information from our predicted shape, i.e. the extracted (canonical) point clouds, and we regress the R, T, s values with a neural-network MLP; hence we can transform the canonical point clouds to camera-frame point clouds. Hope it helps!
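
As a quick illustration of that transform, a canonical point cloud maps to the camera frame with the similarity transform defined by R, T, s. This is a small numpy sketch, not code from the repo.

```python
# Apply the similarity transform p_cam = s * R @ p_canonical + T to a point cloud.
import numpy as np

def canonical_to_camera(points_canonical, R, T, s):
    """points_canonical: (N,3), R: (3,3), T: (3,), s: scalar."""
    return (s * points_canonical) @ R.T + T

# Example with an identity rotation, a translation along z, and scale 0.2.
pc = np.random.rand(100, 3) - 0.5        # canonical points in [-0.5, 0.5]^3
R = np.eye(3)
T = np.array([0.0, 0.0, 1.0])
pc_cam = canonical_to_camera(pc, R, T, 0.2)
```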
