Saving segmentation mask to image in bop_writer #155

Closed
luigifaticoso opened this issue Feb 9, 2021 · 10 comments

@luigifaticoso

luigifaticoso commented Feb 9, 2021

Hello again,
Thank you for the help on the previous issues; I hope they are useful to anyone in the same situation.
I am currently using the BopWriter to save my renders.
I am trying to generate a binary mask image (like in the linemod_preprocessed dataset). I tried the annotation writer approach explained in #134, using the other solution:

for vert in bpy.context.scene.objects["Cube"].data.vertices:
    point2d_blender = bpy_extras.object_utils.world_to_camera_view(bpy.context.scene, bpy.context.scene.camera, vert.co)
    annotations.append([point2d_blender[0] * width, (1 - point2d_blender[1]) * height])

At this point, the output contains points whose coordinates fall outside my image (the image is 512x512, but the coordinates are around 950 or 1100).

Here is an example:

These are the annotations saved in Annotations_0000.npy:

[[1135.72924805 -936.28405762]
 [1135.52734375 -935.92370605]
 [1135.51086426 -935.80322266]
 ...
 [1116.79711914 -902.18640137]
 [1116.83508301 -902.25866699]
 [1116.79370117 -902.22290039]]

This is the segmentation output from the hdf5 file:
[image: segmentation]

I don't know much about the bpy module. Is there an easy way to output the masks as images directly with the BopWriter, or to get correct coordinates so that I can draw a mask using OpenCV?

I have also tried extracting the segmentation keypoints from coco_annotations.json, but in that form they don't look like the coordinates I need to generate the mask image.

@MartinSmeyer
Member

First, you don't need to write custom code for that. #134 is for keypoint annotations, not masks.

The BopWriter does not output masks because the bop_toolkit can already compute masks. Please have a look here:
#58 (comment)

Alternatively, if you want to do it within BlenderProc, just add a SegMapRenderer to your config.

@luigifaticoso
Author

luigifaticoso commented Feb 9, 2021

@MartinSmeyer Thank you for the answer.
Ok, I understand.
With the SegMapRenderer in my config, no .csv file is written.
I haven't found any example showing how the csv file is created.
This is my module:

   {
      "module": "renderer.SegMapRenderer",
      "config": {
        "map_by": ["instance", "class", "cp_bop_dataset_name"],
        "default_values": {"class":0, "cp_bop_dataset_name": "None"}
      }
   }

I am checking the bop_toolkit right now.

@MartinSmeyer
Member

I agree that the documentation does not read super well in this case, but it does say

[...] so a csv file is generated, which is attached to the .hdf5 container in the end.

So you will find the mapping in the generated hdf5 file. You can print all keys of the hdf5 file using:

python scripts/printHdf5Keys.py /path/to/0.hdf5

@MartinSmeyer
Member

In fact, you also need the hdf5 writer for that.

@luigifaticoso
Author

Thank you.

Ok, I see it now. I had no previous experience with hdf5 files and didn't know how they work.

I am going to use that output to generate the mask image, thanks a lot. I consider the issue closed!

@MartinSmeyer
Member

Yeah, hdf5 files are useful because they can store data in a compressed way and you can access them like a dictionary.

If you still want just the raw outputs without using the hdf5 writer, you can also set "output_is_temp": False in the global config.

@luigifaticoso
Author

luigifaticoso commented Feb 10, 2021

Hi @MartinSmeyer
I was checking out the hdf5 output

Keys: 'colors': (512, 512, 3),
 'colors_version': 2.0.0,
 'distance': (512, 512, 3),
 'distance_version': 2.0.0,
 'segcolormap': 
b'[{"idx": "1", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"}, 
{"idx": "2", "category_id": "33","bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"}, 
{"idx": "3", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"}, 
{"idx": "8", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"},
 {"idx": "10", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"},
 {"idx": "11", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"}, 
{"idx": "15", "category_id": "33", "bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"},
 {"idx": "30", "category_id": "33",
 [...]
"bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"},
 {"idx": "517", "category_id": "33","bop_dataset_name": "None", "channel_instance": "0", "channel_class": "1"}, 
{"idx": "518", "category_id": "1", "bop_dataset_name": "blm", "channel_instance": "0", "channel_class": "1"}]', 'segcolormap_version': 2.0.0, 'segmap': (512, 512, 2), 'segmap_version': 2.0.0

I can't find any information about the coordinates of the mask. Am I missing something?
Edit:
By setting

   {
      "module": "renderer.SegMapRenderer",
      "config": {
        "map_by": ["instance", "class", "cp_bop_dataset_name"],
        "default_values": {"class":0, "cp_bop_dataset_name": "none"},
        "output_is_temp": False,
      }
   },

the class_inst_col_map.csv output file is generated. I'm going to use that, thank you!

@MartinSmeyer
Member

The csv output is in there under 'segcolormap'.

The actual masks are saved under 'segmap', which has the shape (height, width, 2) and stores the instance segmentation in channel 0 and the semantic segmentation in channel 1. The values in segmap correspond to the idx or the category_id saved in segcolormap, respectively.
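
A minimal sketch of how the two entries fit together, assuming segmap and segcolormap have already been read from the .hdf5 file into a NumPy array and a JSON string (for example with h5py, not shown here). The names class_mask, instance_mask and _channel and the toy data are made up for illustration; they are not part of BlenderProc.

```python
import json

import numpy as np


def _channel(segcolormap, key):
    """Hypothetical helper: read from segcolormap which segmap channel holds `key`."""
    entries = json.loads(segcolormap)
    return int(entries[0][key])


def class_mask(segmap, segcolormap, category_id):
    """Binary 0/255 mask of all pixels whose class channel equals category_id."""
    channel = _channel(segcolormap, "channel_class")
    return np.where(segmap[..., channel] == category_id, 255, 0).astype(np.uint8)


def instance_mask(segmap, segcolormap, idx):
    """Binary 0/255 mask of all pixels whose instance channel equals idx."""
    channel = _channel(segcolormap, "channel_instance")
    return np.where(segmap[..., channel] == idx, 255, 0).astype(np.uint8)


# Toy stand-in for the hdf5 content: a 4x4 image with one 2x2 object
# (instance idx 2, category_id 1) on a background (idx 1, category_id 33).
segcolormap = (
    b'[{"idx": "1", "category_id": "33", "channel_instance": "0", "channel_class": "1"},'
    b' {"idx": "2", "category_id": "1", "channel_instance": "0", "channel_class": "1"}]'
)
segmap = np.zeros((4, 4, 2), dtype=np.int64)
segmap[..., 0] = 1           # instance ids in channel 0
segmap[..., 1] = 33          # class ids in channel 1
segmap[1:3, 1:3, 0] = 2
segmap[1:3, 1:3, 1] = 1

object_mask = class_mask(segmap, segcolormap, category_id=1)
```

The resulting object_mask is a single-channel uint8 array that can be handed to any image writer. The channel indices are taken from segcolormap rather than hard-coded, since they depend on the order given in map_by.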

@luigifaticoso
Author

luigifaticoso commented Feb 10, 2021

Thank you @MartinSmeyer
I have been struggling with this a lot. I still don't know if there is an easy way of doing this.

  • I have used this SegMapRenderer config:
    {
      "module": "renderer.SegMapRenderer",
      "config": {
        "map_by": "class",
        "output_is_temp": False,
        "default_values": {"class": 0}
      }
    },
  • I have reused the script to read the Hdf5Keys and appended the whole segmap instead of just the shape
  • Extracted the coordinates from the 512x512 matrix
  • Wrote a simple script in opencv to draw pixel by pixel in white when the object of interest is found:
        black_img = 255*np.ones((512, 512 , 3), dtype=np.uint8)
        for i in range(len(segmap)):
            for j in range(len(segmap[i])):
                if(segmap[i][j]==100.0): # <- this is my "cp_category_id"
                    black_img[i][j] = (255,255,255)
                else:
                    black_img[i][j] = (0,0,0)

It could have been done differently and maybe more simply, but this solution works for me. I was trying to use the bop toolkit and had a problem with the config and paths.

May I ask whether saving the binary segmentation mask as a png image will be implemented in the future?

@themasterlink
Contributor

May I ask you why the possibility of saving png image of binary segmentation mask is not implemented?

There are several reasons. First, if you only use grey values, you are limited to 255 classes, which might be limiting in an instance segmentation setting. We wanted to make sure that our solution scales to any problem.

Wrote a simple script in opencv to draw pixel by pixel in white when the object of interest is found:

This can be done a bit more simply in Python:

import numpy as np

# segmap is the 2D class map read from the hdf5 file
black_img = np.zeros(segmap.shape, dtype=np.uint8)
black_img[segmap == 100.0] = 255  # 100 is the cp_category_id of the object
# This last line is only needed if the output really needs to have 3 channels;
# matplotlib can also display images without 3 channels.
black_img = np.repeat(black_img[:, :, np.newaxis], 3, axis=-1)
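
To illustrate the point about grey values, here is a small sketch of what happens when instance ids larger than 255 (such as the idx 518 in the segcolormap printed earlier in this thread) are squeezed into an 8-bit image. The helper fits_in_dtype and the id list are made up for this example.

```python
import numpy as np


def fits_in_dtype(ids, dtype):
    """Hypothetical helper: True if every id can be stored in `dtype` without loss."""
    info = np.iinfo(dtype)
    ids = np.asarray(ids)
    return bool(ids.min() >= info.min and ids.max() <= info.max)


# A few instance ids, including one above 255.
instance_ids = np.array([1, 2, 3, 518], dtype=np.int64)

# Casting to 8 bit silently wraps around: 518 becomes 518 % 256 == 6,
# which would collide with a real instance id 6.
as_uint8 = instance_ids.astype(np.uint8)

# 16 bit keeps all ids intact.
as_uint16 = instance_ids.astype(np.uint16)
```

So an 8-bit grey image is fine for a single binary mask, but a wider integer type is needed as soon as the ids themselves are stored.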
