get_minimum_axis seems wrong? #19

Open
initialneil opened this issue Apr 17, 2024 · 8 comments

initialneil commented Apr 17, 2024

image

I've got an example here:

  1. The sorting selects 2 (the last column)
  2. gather moves the last column to the first column
  3. BUT R_sorted[:,0,:] selects the first row

So R_sorted[:,0,:] should be changed to R_sorted[:,:,0] ?

Why sort the columns of R instead of the rows?


After playing with the math, I believe that gather should work on rows instead of columns.

R_sorted = torch.gather(R, dim=2, index=sorted_idx[:,None,:].repeat(1, 3, 1)).squeeze()
# changes to 
R_sorted = torch.gather(R, dim=1, index=sorted_idx[:,:,None].repeat(1,1,3)).squeeze()

After the discussion below, we (@nyy618 and I) think that the normal is obtained by selecting the column.

R_sorted = torch.gather(R, dim=2, index=sorted_idx[:,None,:].repeat(1, 3, 1)).squeeze()
return R_sorted[:,:,0]

Pull request updated accordingly.
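To make the row/column distinction concrete, here is a minimal sketch using NumPy's `take_along_axis` as a stand-in for `torch.gather` (both follow the same indexing rule along the chosen axis). The matrix `R` below is a hypothetical toy with distinct entries, not a real rotation, and `np.argsort` is assumed to play the role of the ascending sort in the original function.

```python
import numpy as np

# Hypothetical batch of one "Gaussian": R has shape (N, 3, 3), scales has shape (N, 3).
# R is deliberately NOT a rotation; distinct entries make the reordering visible.
R = np.array([[[1., 2., 3.],
               [4., 5., 6.],
               [7., 8., 9.]]])
scales = np.array([[0.5, 0.3, 0.1]])  # the last axis is the shortest

# Ascending sort of the scales: the smallest scale (index 2) comes first.
sorted_idx = np.argsort(scales, axis=1)                 # [[2, 1, 0]]

# Same index in every row, so the gather reorders COLUMNS of R.
index = np.repeat(sorted_idx[:, None, :], 3, axis=1)    # shape (N, 3, 3)
R_sorted = np.take_along_axis(R, index, axis=2)

first_row = R_sorted[:, 0, :]   # what R_sorted[:,0,:] returns
first_col = R_sorted[:, :, 0]   # what R_sorted[:,:,0] returns

print(first_row)  # first row of the column-reordered matrix
print(first_col)  # the original last column of R
```

In this toy case only `R_sorted[:, :, 0]` returns the column that the sort moved to the front; `R_sorted[:, 0, :]` returns a row whose entries come from three different columns.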

initialneil added a commit to initialneil/GaussianShader that referenced this issue Apr 17, 2024
Fix `get_minimum_axis` in `general_utils.py`
Asparagus15#19

nyy618 commented Apr 21, 2024

I find this part weird too. It sorts the columns yet takes a row as the normal. But I still can't understand your modification. I think you should sort the columns instead of the rows. According to linear algebra, the columns are the eigenvectors, which represent the directions after the rotation. So R_sorted[:,0,:] should be changed to R_sorted[:,:,0], as you first mentioned. GaussianPro also takes the column, which agrees with my view:

        rotations_mat = build_rotation(rotations)
        scales = pc.get_scaling
        min_scales = torch.argmin(scales, dim=1)
        indices = torch.arange(min_scales.shape[0])
        normal = rotations_mat[indices, :, min_scales]

Is there something wrong with my understanding?
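As a rough check that the two approaches agree, the sketch below compares the direct `argmin` selection with the sort-then-gather route followed by taking the first column. It uses NumPy in place of PyTorch, and `rotations_mat` and `scales` are random hypothetical stand-ins for `build_rotation(rotations)` and `pc.get_scaling`.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Hypothetical stand-ins; real code would use rotation matrices and learned scales.
rotations_mat = rng.normal(size=(N, 3, 3))
scales = rng.uniform(0.1, 1.0, size=(N, 3))

# Direct selection: pick, per Gaussian, the column with the smallest scale.
min_scales = np.argmin(scales, axis=1)
indices = np.arange(N)
normal_direct = rotations_mat[indices, :, min_scales]        # shape (N, 3)

# Sort-then-gather: reorder columns by ascending scale, then take column 0.
sorted_idx = np.argsort(scales, axis=1)
index = np.repeat(sorted_idx[:, None, :], 3, axis=1)
normal_sorted = np.take_along_axis(rotations_mat, index, axis=2)[:, :, 0]

print(np.allclose(normal_direct, normal_sorted))
```

When the scales have no ties, both routes return the same column per Gaussian, so the choice between them is mostly a matter of style.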

@initialneil
Author

@nyy618 I had a case where I had to warp these R matrices by motion, and I came to the conclusion that these R matrices are w2c rotations for each Gaussian, transforming from world coordinates to the Gaussian's local coordinates.
For w2c rotations, the rows are the axis vectors viewed in world coordinates.

My case is like this:

  1. I have R for each Gaussian.
  2. Select the sorted row as the normal.
  3. Applying an additional rotation R' to the model gives: R <- R * inv(R').
  4. After applying the additional rotation, the normal still looks fine.

I tried selecting columns, and couldn't make it work.
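The following sketch illustrates the bookkeeping behind that update rule under an explicit assumption: that R really is a world-to-local matrix `W`. In that case its rows are the local axes in world coordinates, and `W <- W * inv(Q)` rotates those rows together with the model. The equivalent update for a local-to-world matrix `L = W^T` is `L <- Q * L`, which rotates the columns. All matrices here (`L`, `W`, `Q`) are hypothetical examples.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1., 0., 0.], [0., c, -s], [0., s, c]])

L = rot_z(0.4) @ rot_x(1.1)    # local-to-world: columns are the local axes in world coordinates
W = L.T                        # world-to-local (w2c-style): rows are the same axes
Q = rot_x(0.7) @ rot_z(-0.3)   # extra rotation R' applied to the whole model

W_new = W @ np.linalg.inv(Q)   # update rule from the list above: R <- R * inv(R')
L_new = Q @ L                  # equivalent update in the local-to-world convention

normal_from_rows = W_new[2, :]   # third row of the updated world-to-local matrix
normal_from_cols = L_new[:, 2]   # third column of the updated local-to-world matrix
expected = Q @ L[:, 2]           # the original third axis, rotated with the model

print(np.allclose(normal_from_rows, expected), np.allclose(normal_from_cols, expected))
```

Both conventions are internally consistent, so which of rows or columns is correct depends on which convention the stored R actually follows, and the update rule has to match it.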


nyy618 commented Apr 24, 2024

@initialneil Thank you for the inspiring clarification. I think the key point is the difference between rotating the world coordinate frame and rotating the Gaussian within world coordinates. Since R is orthogonal, its inverse equals its transpose. R is the rotation in world coordinates: its columns give the axes of the Gaussian ellipsoid expressed in world coordinates. However, to transform from world coordinates to the Gaussian's local coordinates, you have to apply the inverse of R, namely its transpose. In theory, the transition matrix from the world basis to the Gaussian basis is R, meaning it expresses the Gaussian basis in terms of the world basis. Let the basis of the world be e and the basis of the Gaussian be e':
[image: equation relating the two bases through the transition matrix, e' = e R]

If you want to represent the basis of the world with the basis of the Gaussian, you have to apply the inverse of R. BTW, the getWorld2View2 function also takes the transpose of Camera.R as the rotation part of the w2c matrix.

def getWorld2View2(R, t, translate=np.array([.0, .0, .0]), scale=1.0):
    Rt = np.zeros((4, 4))
    Rt[:3, :3] = R.transpose()
    Rt[:3, 3] = t
    Rt[3, 3] = 1.0

    C2W = np.linalg.inv(Rt)
    cam_center = C2W[:3, 3]
    cam_center = (cam_center + translate) * scale
    C2W[:3, 3] = cam_center
    Rt = np.linalg.inv(C2W)
    return np.float32(Rt)

I am still not sure about my conclusion, so I will ask others for help. I hope you can point out any misunderstanding.
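The basis argument in this comment can be written out as a short derivation. It assumes the relation between the bases is e' = eR, which is the reading suggested by the description of R as the transition matrix; x and x' are hypothetical names for the coordinates of the same vector v in the world and Gaussian bases.

```latex
\begin{aligned}
(e'_1, e'_2, e'_3) &= (e_1, e_2, e_3)\,R, \\
v = (e_1, e_2, e_3)\,x &= (e'_1, e'_2, e'_3)\,x' = (e_1, e_2, e_3)\,R\,x', \\
x = R\,x', \qquad x' &= R^{-1}x = R^{\top}x .
\end{aligned}
```

With e taken as the standard basis, the k-th column of R is e'_k written in world coordinates, while R^T (whose rows are those same vectors) maps world coordinates to local coordinates.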

@initialneil
Author

@nyy618 In the definition of camera projection, R is w2c: P_cam = R * P_world + t
So the original Camera.R should be w2c. But because the CUDA code uses glm, the authors of GS deliberately store the camera's R transposed:
https://github.com/graphdeco-inria/gaussian-splatting/blob/472689c0dc70417448fb451bf529ae532d32c095/scene/dataset_readers.py#L196-L197

# get the world-to-camera transform and set R, T
w2c = np.linalg.inv(c2w)
R = np.transpose(w2c[:3,:3])  # R is stored transposed due to 'glm' in CUDA code

For the R of each Gaussian, it seems to be stored directly as w2c, so the axes should be the rows instead of the columns.
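A small round-trip sketch of the camera convention described here: starting from a hypothetical camera-to-world pose `c2w`, the stored `R` is the transpose of the w2c rotation (so it equals the c2w rotation), and transposing it again, as getWorld2View2 does, recovers w2c. Taking `T = w2c[:3, 3]` is an assumption about how the translation is stored; the `translate` and `scale` arguments are left out.

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0., s], [0., 1., 0.], [-s, 0., c]])

# Hypothetical camera-to-world pose.
c2w = np.eye(4)
c2w[:3, :3] = rot_y(0.6)
c2w[:3, 3] = [1.0, 2.0, 3.0]

# Same steps as the dataset reader snippet above.
w2c = np.linalg.inv(c2w)
R = np.transpose(w2c[:3, :3])   # stored transposed
T = w2c[:3, 3]                  # assumption: translation kept in w2c form

# Rebuild the world-to-camera matrix the way getWorld2View2 does.
Rt = np.zeros((4, 4))
Rt[:3, :3] = R.transpose()
Rt[:3, 3] = T
Rt[3, 3] = 1.0

print(np.allclose(R, c2w[:3, :3]), np.allclose(Rt, w2c))
```

So for cameras, the stored `R` has the camera axes as columns, and the transpose used at render time is the w2c rotation. This says nothing by itself about how the per-Gaussian R is stored.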


nyy618 commented Apr 24, 2024

@initialneil Thank you for your correction. My example was wrong. Let's reduce the problem to a 2D Gaussian. According to the paper, the covariance matrix is equal to R S S^T R^T. For a particular problem:
[image: worked 2D example of the covariance R S S^T R^T]
As you can see, the direction of the long axis is equal to the first column of R and the direction of the short axis is equal to the second column. I think you should apply the transpose of R to rotate the coordinates. Is there something I missed?
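The 2D argument can be checked numerically. The sketch below uses hypothetical values (a 30 degree rotation and scales 2.0 and 0.5, not the numbers from the original image), builds the covariance as R S S^T R^T, and compares its eigenvectors with the columns and rows of R.

```python
import numpy as np

theta = np.pi / 6   # hypothetical 30 degree rotation
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag([2.0, 0.5])   # long axis first, short axis second

cov = R @ S @ S.T @ R.T

# eigh returns eigenvalues in ascending order, eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(cov)
short_axis = eigvecs[:, 0]
long_axis = eigvecs[:, 1]

# Eigenvectors are defined up to sign, so compare via absolute dot products.
long_matches_col0 = bool(np.isclose(abs(long_axis @ R[:, 0]), 1.0))
short_matches_col1 = bool(np.isclose(abs(short_axis @ R[:, 1]), 1.0))
long_matches_row0 = bool(np.isclose(abs(long_axis @ R[0, :]), 1.0))

print(long_matches_col0, short_matches_col1, long_matches_row0)
```

The principal axes of the covariance line up with the columns of R and not with its rows, which supports reading the columns as the ellipsoid axes in world coordinates.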

@initialneil
Author

@nyy618 I finally got some time to settle this question.
I did some experiments and I think your math is correct. The normal is a column of R, not a row.

  1. One GS seen by a camera with an identity rotation matrix.
    image

  2. Setting one of the scales to a very small value makes the GS shrink.
    image

  3. If we set the last scale to be small, the shrunken edge is along the last column of R.
    image
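The experiment above can also be mimicked without rendering. In this sketch the last scale is made tiny and the variance of the Gaussian is measured along each column and each row of R. `quat_to_rotmat` is a hypothetical helper using the standard unit-quaternion (w, x, y, z) formula, assumed here as a stand-in for the repository's `build_rotation`, and the quaternion and scales are made-up values.

```python
import numpy as np

def quat_to_rotmat(q):
    """Standard unit-quaternion (w, x, y, z) to rotation-matrix formula."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

R = quat_to_rotmat(np.array([0.9, 0.2, -0.3, 0.1]))   # hypothetical rotation
scales = np.array([1.0, 0.8, 0.01])                    # last scale is tiny: a flat disc
cov = R @ np.diag(scales**2) @ R.T                     # R S S^T R^T

def variance_along(v):
    """Variance of the Gaussian along the unit direction v."""
    return float(v @ cov @ v)

var_cols = [variance_along(R[:, k]) for k in range(3)]
var_rows = [variance_along(R[k, :]) for k in range(3)]

print(var_cols)  # squared scales: the last column is the flat direction
print(var_rows)  # rows mix the axes, so none of them is the flat direction
```

Along the columns the variances reproduce the squared scales exactly, so the last column is the direction in which the Gaussian collapses; the rows do not have that property for a general rotation.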


yinyunie commented May 22, 2024

Hi,

I also agree with @initialneil.

When I read this line, my understanding is that R's columns define the transformation from the Gaussian's local system to the world system, so the shortest axis should be the first column, like below:

x_axis = R_sorted[:,0,:] # normalized by default
should be --->
x_axis = R_sorted[:,:,0] # normalized by default

I also ran an experiment on horse_blender. The PSNR does not change much, so I assume normal and normal_2 have the major effect in regressing the correct normal.

[ITER 30000] Evaluating test: L1 0.016133079305291176 PSNR 26.573974609375 [21/05 19:52:34]

[ITER 30000] Evaluating train: L1 0.010304585099220276 PSNR 29.289199829101562 [21/05 19:52:39]

@initialneil
Author

@yinyunie Agreed. For a static scene like this one, R acts like a black box of parameters anyway. But it becomes important when extending to dynamic scenarios, so it is better to fix it.
