
gaussians._semantic_feature is 128. #50

Open
Lee-JaeWon opened this issue Nov 12, 2024 · 4 comments

Comments

@Lee-JaeWon

Thank you for your great work. I have some questions.

When using LSeg with the speedup option, the dimension of gaussians._semantic_feature is 128. I want to keep the speedup while ensuring that gaussians._semantic_feature has a dimension of 512, matching the text features (150×512). Is there a way to achieve this?

I understand that reducing the dimension to 128 shortens rendering time, and that the rendered feature map becomes 512-dimensional during segmentation, which allows it to be combined with the text features.

It would be helpful to get some hints on how to achieve this.

@ShijieZhou-UCLA
Owner

Once you enable speedup, the CNN decoder will achieve this for you. Check here:

```python
cnn_decoder = CNN_decoder(feature_in_dim, feature_out_dim)
```
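For reference, a minimal sketch of what such a decoder can look like. This is an illustrative stand-in, not the repo's actual CNN_decoder implementation; it assumes a single 1×1 convolution that lifts the rendered 128-dim feature map back to 512:

```python
import torch
import torch.nn as nn

class CNNDecoderSketch(nn.Module):
    """Sketch: lift a low-dim rendered feature map back to the
    original feature dimension with a 1x1 convolution."""
    def __init__(self, feature_in_dim, feature_out_dim):
        super().__init__()
        self.conv = nn.Conv2d(feature_in_dim, feature_out_dim, kernel_size=1)

    def forward(self, x):  # x: (N, feature_in_dim, H, W)
        return self.conv(x)

decoder = CNNDecoderSketch(128, 512)
feature_map = torch.randn(1, 128, 48, 64)  # rendered semantic feature map
out = decoder(feature_map)                 # (1, 512, 48, 64)
```

Since the convolution is 1×1, each pixel's 128-dim feature is mapped to 512 dimensions independently, which is why the low-dim representation only has to be carried through rasterization.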

@Lee-JaeWon
Author

@ShijieZhou-UCLA
Thank you for your reply. Isn't the CNN decoder intended for image features? Can it also be used on gaussians._semantic_feature for each Gaussian?
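For what it's worth, a 1×1-convolution decoder is shape-agnostic along the spatial axes, so in principle it could also be applied directly to per-Gaussian features by viewing the N points as an N×1 "image". A hypothetical sketch (nn.Conv2d is a stand-in for the repo's decoder; whether the authors intend this use is not confirmed here):

```python
import torch
import torch.nn as nn

# Hypothetical: apply a 1x1-conv decoder to per-Gaussian features by
# reshaping the (N, C) point list into a (1, C, N, 1) tensor.
decoder = nn.Conv2d(128, 512, kernel_size=1)

semantic_feature = torch.randn(1000, 128)            # like gaussians._semantic_feature
x = semantic_feature.t().unsqueeze(0).unsqueeze(-1)  # (1, 128, N, 1)
decoded = decoder(x).squeeze(-1).squeeze(0).t()      # (N, 512)
```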

@huanhuanyuan7

> @ShijieZhou-UCLA Thank you for your reply. Isn't the CNN decoder intended for image features? Can it also be used on gaussians._semantic_feature for each Gaussian?

In train.py (lines 106–107), only the feature_map (the rendered feature) is upsampled for the speedup:

```python
if dataset.speedup:
    feature_map = cnn_decoder(feature_map)
```
In addition, I encountered a problem similar to yours. I use SAM to obtain the semantic_feature (shape (256, 48, 64)). With the following code, the CNN_decoder's input dim is 64:

```python
gt_feature_map = viewpoint_cam.semantic_feature.cuda()
feature_out_dim = gt_feature_map.shape[0]
# speed up
if dataset.speedup:
    feature_in_dim = int(feature_out_dim / 4)
    cnn_decoder = CNN_decoder(feature_in_dim, feature_out_dim)
```
However, from lines 101–107, this leads to a dimension mismatch error because the feature_map channel dim is 128:

```python
gt_feature_map = viewpoint_cam.semantic_feature.cuda()  # (256, 48, 64)
feature_map = F.interpolate(feature_map.unsqueeze(0),
                            size=(gt_feature_map.shape[1], gt_feature_map.shape[2]),
                            mode='bilinear', align_corners=True).squeeze(0)
print("feature_map.shape:", feature_map.shape)
feature_map = feature_map.unsqueeze(0)  # (1, 128, 48, 64)
print("unsqueezed feature_map.shape:", feature_map.shape)
if dataset.speedup:
    feature_map = cnn_decoder(feature_map)
```
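One reading of this mismatch: feature_in_dim has to equal the channel count of the rendered feature_map (128 here, i.e. the per-Gaussian semantic feature size), not feature_out_dim / 4. A sketch of consistent wiring with the shapes from this comment, using a 1×1 convolution as a stand-in for the repo's CNN_decoder:

```python
import torch
import torch.nn as nn

gt_feature_map = torch.randn(256, 48, 64)  # SAM ground-truth features (C_gt, H, W)
feature_map = torch.randn(128, 48, 64)     # rendered features (C_render, H, W)

feature_out_dim = gt_feature_map.shape[0]  # 256: target dimension
feature_in_dim = feature_map.shape[0]      # 128: must match the rendered channels
cnn_decoder = nn.Conv2d(feature_in_dim, feature_out_dim, kernel_size=1)

decoded = cnn_decoder(feature_map.unsqueeze(0))  # (1, 256, 48, 64)
```

Note this only fixes the Python-side shapes; the rendered channel count itself is fixed by the rasterizer build, as the next comment illustrates.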

@liangyuanZhang1


I encountered the same problem. I solved it by modifying the dimensions of the CNN decoder, but then I hit a new error:

```
RuntimeError: Function _RasterizeGaussiansBackward returned an invalid gradient at index 4 - got [138766, 1, 128] but expected shape compatible with [138766, 1, 64]
```
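The gradient-shape error suggests the CUDA rasterizer was compiled with a fixed semantic channel count (64 here), so changing the decoder dimensions in Python alone is not enough; the rasterizer submodule would also have to be rebuilt with a matching count. Assuming the count is a compile-time constant in the rasterizer's config header (the exact file and macro names may differ in this repo):

```shell
# Assumption: the semantic channel count lives in something like
# submodules/diff-gaussian-rasterization/cuda_rasterizer/config.h
# (e.g. a NUM_SEMANTIC_CHANNELS macro). After editing it to match your
# per-Gaussian feature dimension, rebuild and reinstall the extension:
pip install submodules/diff-gaussian-rasterization
```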
