
gaussians._semantic_feature is 128. #50

Open
Lee-JaeWon opened this issue Nov 12, 2024 · 4 comments

Comments

@Lee-JaeWon

Thank you for your great work. I have some questions.

When using LSeg with the speedup option, the dimension of gaussians._semantic_feature is 128. I want to keep the speedup while ensuring that gaussians._semantic_feature has a dimension of 512, matching the text features (150×512). Is there a way to achieve this?

I understand that reducing the dimension to 128 shortens rendering time, and that the rendered feature map becomes 512-dimensional during segmentation, which allows it to be combined with the text features.

It would be helpful to get some hints on how to achieve this.

@ShijieZhou-UCLA
Owner

Once you enable speedup, the CNN decoder will achieve this for you. Check here:

```python
cnn_decoder = CNN_decoder(feature_in_dim, feature_out_dim)
```
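For reference, a minimal sketch of what such a decoder can look like. This is an illustrative stand-in, not the repo's actual CNN_decoder implementation; it assumes a single 1×1 convolution that lifts the rendered 128-dim feature map back to 512:

```python
import torch
import torch.nn as nn

class CNNDecoderSketch(nn.Module):
    """Sketch: lift a low-dim rendered feature map back to the
    original feature dimension with a 1x1 convolution."""
    def __init__(self, feature_in_dim, feature_out_dim):
        super().__init__()
        self.conv = nn.Conv2d(feature_in_dim, feature_out_dim, kernel_size=1)

    def forward(self, x):  # x: (N, feature_in_dim, H, W)
        return self.conv(x)

decoder = CNNDecoderSketch(128, 512)
feature_map = torch.randn(1, 128, 48, 64)  # rendered semantic feature map
out = decoder(feature_map)                 # (1, 512, 48, 64)
```

Since the convolution is 1×1, each pixel's 128-dim feature is mapped to 512 dimensions independently, which is why the low-dim representation only has to be carried through rasterization.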

@Lee-JaeWon
Author

@ShijieZhou-UCLA
Thank you for your reply. Isn't the CNN decoder intended for image features? Can it also be used on gaussians._semantic_feature for each Gaussian?
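For what it's worth, a 1×1-convolution decoder is shape-agnostic along the spatial axes, so in principle it could also be applied directly to per-Gaussian features by viewing the N points as an N×1 "image". A hypothetical sketch (nn.Conv2d is a stand-in for the repo's decoder; whether the authors intend this use is not confirmed here):

```python
import torch
import torch.nn as nn

# Hypothetical: apply a 1x1-conv decoder to per-Gaussian features by
# reshaping the (N, C) point list into a (1, C, N, 1) tensor.
decoder = nn.Conv2d(128, 512, kernel_size=1)

semantic_feature = torch.randn(1000, 128)            # like gaussians._semantic_feature
x = semantic_feature.t().unsqueeze(0).unsqueeze(-1)  # (1, 128, N, 1)
decoded = decoder(x).squeeze(-1).squeeze(0).t()      # (N, 512)
```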

@huanhuanyuan7

> @ShijieZhou-UCLA Thank you for your reply. Isn't the CNN decoder intended for image features? Can it also be used on gaussians._semantic_feature for each Gaussian?

In train.py (lines 106–107), only the feature_map (the rendered feature) is upsampled for the speedup:

```python
if dataset.speedup:
    feature_map = cnn_decoder(feature_map)
```
In addition, I encountered a problem similar to yours. I use SAM to obtain the semantic_feature (shape (256, 48, 64)). With the following code, the CNN_decoder's input dim is 64:

```python
gt_feature_map = viewpoint_cam.semantic_feature.cuda()
feature_out_dim = gt_feature_map.shape[0]
# speed up
if dataset.speedup:
    feature_in_dim = int(feature_out_dim / 4)
    cnn_decoder = CNN_decoder(feature_in_dim, feature_out_dim)
```
However, from lines 101–107, this leads to a dimension mismatch error because the feature_map channel dim is 128:

```python
gt_feature_map = viewpoint_cam.semantic_feature.cuda()  # (256, 48, 64)
feature_map = F.interpolate(feature_map.unsqueeze(0),
                            size=(gt_feature_map.shape[1], gt_feature_map.shape[2]),
                            mode='bilinear', align_corners=True).squeeze(0)
print("feature_map.shape:", feature_map.shape)
feature_map = feature_map.unsqueeze(0)  # (1, 128, 48, 64)
print("unsqueezed feature_map.shape:", feature_map.shape)
if dataset.speedup:
    feature_map = cnn_decoder(feature_map)
```
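One reading of this mismatch: feature_in_dim has to equal the channel count of the rendered feature_map (128 here, i.e. the per-Gaussian semantic feature size), not feature_out_dim / 4. A sketch of consistent wiring with the shapes from this comment, using a 1×1 convolution as a stand-in for the repo's CNN_decoder:

```python
import torch
import torch.nn as nn

gt_feature_map = torch.randn(256, 48, 64)  # SAM ground-truth features (C_gt, H, W)
feature_map = torch.randn(128, 48, 64)     # rendered features (C_render, H, W)

feature_out_dim = gt_feature_map.shape[0]  # 256: target dimension
feature_in_dim = feature_map.shape[0]      # 128: must match the rendered channels
cnn_decoder = nn.Conv2d(feature_in_dim, feature_out_dim, kernel_size=1)

decoded = cnn_decoder(feature_map.unsqueeze(0))  # (1, 256, 48, 64)
```

Note this only fixes the Python-side shapes; the rendered channel count itself is fixed by the rasterizer build, as the next comment illustrates.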

@liangyuanZhang1


I encountered the same problem. I solved it by modifying the dimensions of the CNN decoder, but then I hit a new error:

```
RuntimeError: Function _RasterizeGaussiansBackward returned an invalid gradient at index 4 - got [138766, 1, 128] but expected shape compatible with [138766, 1, 64]
```
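The gradient-shape error suggests the CUDA rasterizer was compiled with a fixed semantic channel count (64 here), so changing the decoder dimensions in Python alone is not enough; the rasterizer submodule would also have to be rebuilt with a matching count. Assuming the count is a compile-time constant in the rasterizer's config header (the exact file and macro names may differ in this repo):

```shell
# Assumption: the semantic channel count lives in something like
# submodules/diff-gaussian-rasterization/cuda_rasterizer/config.h
# (e.g. a NUM_SEMANTIC_CHANNELS macro). After editing it to match your
# per-Gaussian feature dimension, rebuild and reinstall the extension:
pip install submodules/diff-gaussian-rasterization
```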
