
Does sam2 have any parameters to adjust the inference result? #299

Open
luoshuiyue opened this issue Sep 9, 2024 · 3 comments

@luoshuiyue
The following are the results I predicted; is there any way to improve them? I have adjusted mask_threshold to -1.0, -0.5, and -0.2, and max_hole_area to 1 and 20. None of these changes helped.
[four screenshots of segmentation results attached]

@luoshuiyue luoshuiyue changed the title Do sam2 have any parameters to adjust the inference result? Does sam2 have any parameters to adjust the inference result? Sep 9, 2024

heyoeyo commented Sep 9, 2024

One thing to try if you haven't already is using the different models (i.e. large vs. base), since they behave differently and one might work better than the other in some cases (i.e. large isn't always the best). It's also worth checking the different mask outputs (from multimask), since sometimes there can be one good mask even if the rest aren't great.
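Checking all of the multimask outputs can be sketched with plain numpy. The shapes below are assumptions standing in for what the predictor's multimask call typically returns (3 candidate masks plus a score per mask); substitute the real outputs:

```python
import numpy as np

# Dummy stand-ins for a multimask prediction: 3 candidate masks and a
# confidence score per mask. (Shapes are assumptions for illustration;
# substitute the arrays returned by the real predictor call.)
h, w = 4, 4
masks = np.zeros((3, h, w), dtype=bool)
masks[0, :1] = True
masks[1, :2] = True
masks[2, :3] = True
scores = np.array([0.55, 0.91, 0.62])

# Rank every candidate rather than trusting only the top score, so the
# lower-scoring masks can still be inspected by eye.
order = np.argsort(scores)[::-1]     # best-first indices
best_mask = masks[order[0]]

print(order.tolist())                # [1, 2, 0]
print(int(best_mask.sum()))          # 8 pixels in the best-scoring mask
```

Looking at the ranked candidates by eye is often more reliable than blindly taking the argmax, since the scores aren't always well calibrated.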

I'd also recommend trying to use as few prompts as possible. From what I've seen, the quality of the output really starts to drop once there are lots of prompts. In the worst case, where the masking isn't getting everything needed, you could try masking different pieces separately (using just 1 or 2 prompts) and combining the masks afterwards if that works for your use case (though it is inconvenient...).
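Combining separately prompted masks afterwards is just a logical OR. A minimal sketch with dummy data (the real masks would come from separate 1-2 prompt predictions):

```python
import numpy as np

# Two masks from separate single-prompt predictions (dummy data here;
# in practice each would come from its own predictor call).
h, w = 6, 6
mask_upper = np.zeros((h, w), dtype=bool)
mask_upper[0:3, :] = True            # e.g. shirt + arms
mask_lower = np.zeros((h, w), dtype=bool)
mask_lower[3:6, :] = True            # e.g. shorts + legs

# Union of the pieces gives the whole object.
combined = np.logical_or(mask_upper, mask_lower)
print(int(combined.sum()))           # 36
```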

And lastly, if you haven't already tried it, box prompts sometimes work well for objects that have lots of distinct areas like the person in the picture (i.e. legs + shorts + shirt + arms etc.). For example, one box prompt (using the large model) does fairly well on the last picture at least:
[GIF: box-prompt result on the last picture]
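A box prompt is just a 4-number array in XYXY (left, top, right, bottom) order. The sketch below builds one from two hypothetical corner clicks; the predictor call at the bottom is commented out and assumes a loaded model with the image already set:

```python
import numpy as np

# Hypothetical corner clicks in pixel coordinates (illustrative values).
click_a = (310, 40)
click_b = (120, 450)

# Sort the coordinates so the box is valid regardless of click order.
x0, x1 = sorted((click_a[0], click_b[0]))
y0, y1 = sorted((click_a[1], click_b[1]))
box = np.array([x0, y0, x1, y1])
print(box.tolist())                  # [120, 40, 310, 450]

# Hedged usage sketch (assumes sam2 is installed, a predictor is loaded,
# and predictor.set_image(...) was already called):
# masks, scores, _ = predictor.predict(box=box, multimask_output=True)
```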

@luoshuiyue (Author)

Thanks. I changed to the base plus model and the results didn't get better. I used the bbox setting by copying the code from the Jupyter notebook and putting it in a for loop, and the improvement was very small. So I want to ask:

  1. How can I get the result shown in the GIF in your previous reply?
  2. How can I use the result of automatic_mask_generator_example.ipynb? I want to get the mask of the person in the middle:
    [two screenshots of the automatic mask generator results attached]


heyoeyo commented Sep 11, 2024

How can I get the result shown in the GIF in your previous reply?

That gif is a screen capture of using this script.

How can I use the result of automatic_mask_generator_example.ipynb? I want to get the mask of the person in the middle

I think it would be tricky to do with the auto mask generator alone. The default point grid covers the whole image and is going to pick up loads of stuff in the background that will make it hard to deal with, so you could try using a custom point_grid that is limited to the center of the image. You could also try adjusting the min_mask_region_area setting, to see if that can help to filter out 'small' masks.
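A center-limited point grid can be built with numpy in the normalized [0, 1] coordinates that the automatic mask generator's point_grids argument expects (one array per crop layer). The generator constructor at the bottom is a hedged sketch assuming sam2 is installed and a model is loaded:

```python
import numpy as np

# Build a points-per-side grid restricted to the central region of the
# image, in normalized [0, 1] coordinates (x, y pairs).
def center_point_grid(points_per_side=8, lo=0.3, hi=0.7):
    ticks = np.linspace(lo, hi, points_per_side)
    xs, ys = np.meshgrid(ticks, ticks)
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)  # shape (N, 2)

grid = center_point_grid()
print(grid.shape)                                       # (64, 2)
print(bool((grid >= 0.3).all() and (grid <= 0.7).all()))  # True

# Hedged usage sketch (assumes sam2 is installed and `model` is loaded):
# generator = SAM2AutomaticMaskGenerator(model, point_grids=[grid],
#                                        min_mask_region_area=500)
```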

If you don't mind bringing in other models, you could also try using an object (person) detector to at least get a bounding box around the person and use that to ignore all the masks outside. Or similarly, you could maybe use a depth prediction model to ignore any masks that come from parts of the image that are 'too far away' to be the person. Otherwise I think it's difficult to target specific objects with the auto mask generator, since the SAM models alone don't have a way to classify the segmentation results.
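The detector-box filtering idea can be sketched in a few lines: keep only the auto-generated masks whose area lies mostly inside the detected person box. Dummy masks are used here; the real ones would be the segmentation arrays from the generator output, and the box would come from whatever detector you pick:

```python
import numpy as np

# Fraction of a mask's pixels that fall inside an XYXY pixel box.
def inside_fraction(mask, box):
    x0, y0, x1, y1 = box
    region = np.zeros_like(mask)
    region[y0:y1, x0:x1] = True
    overlap = np.logical_and(mask, region).sum()
    return overlap / max(mask.sum(), 1)

h, w = 10, 10
person_box = (2, 2, 8, 8)               # hypothetical detector output

mask_inside = np.zeros((h, w), dtype=bool)
mask_inside[3:7, 3:7] = True            # fully inside the box
mask_outside = np.zeros((h, w), dtype=bool)
mask_outside[0:2, :] = True             # background strip above the box

# Keep masks that are at least 80% inside the detected box.
masks = [mask_inside, mask_outside]
kept = [m for m in masks if inside_fraction(m, person_box) > 0.8]
print(len(kept))                        # 1
```

The 80% threshold is an arbitrary choice for illustration; looser or stricter values trade off clipping the person against letting background masks through.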
