Here's an explanation of the poses_bounds.npy file format. This file stores a numpy array of size Nx17 (where N is the number of input images). You can see how that is loaded in the three lines here. Each row of length 17 gets reshaped into a 3x5 pose matrix and 2 depth values that bound the closest and farthest scene content from that point of view.
The pose matrix is a 3x4 camera-to-world affine transform concatenated with a 3x1 column [image height, image width, focal length] along axis=1.
The rotation (first 3x3 block in the camera-to-world transform) is stored in a somewhat unusual order, which is why there are the transposes. From the point of view of the camera, the three axes are [ down, right, backwards ]
which some people might consider to be [-y,x,z].
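The loading described above ("three lines" reshaping each row of 17 into a 3x5 pose plus two depth bounds) can be sketched as follows. This is a minimal stand-in example: the array contents are synthetic, and the filename matches the format's convention.

```python
import numpy as np

# Synthetic stand-in for a real poses_bounds.npy (4 images).
N = 4
arr = np.arange(N * 17, dtype=np.float64).reshape(N, 17)
np.save('poses_bounds.npy', arr)

# Loading mirrors the three-line reshape described above.
poses_arr = np.load('poses_bounds.npy')          # shape (N, 17)
poses = poses_arr[:, :15].reshape(-1, 3, 5)      # (N, 3, 5) pose matrices
bounds = poses_arr[:, 15:]                       # (N, 2) near/far depth bounds

# Split each 3x5 matrix: 3x4 camera-to-world transform + [h, w, f] column.
c2w = poses[:, :, :4]                            # (N, 3, 4)
hwf = poses[:, :, 4]                             # (N, 3): height, width, focal
```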
So the steps to reproduce this should be (if you have a set of 3x4 poses for your images, plus focal lengths and close/far depth bounds):
Make sure your poses are in camera-to-world format, not world-to-camera.
Make sure your rotation matrices have the columns in the same order I use (downward, right, backwards).
Concatenate each pose with the [height, width, focal] vector to get a 3x5 matrix.
Flatten each of those into 15 elements and concatenate the close/far depths.
Concatenate each 17d vector to get a Nx17 matrix and use np.save to store it as poses_bounds.npy.
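The steps above can be sketched in a few lines of numpy. This is a minimal example: the pose, [height, width, focal], and near/far values below are placeholders, and it assumes your 3x4 poses are already camera-to-world with columns in [down, right, backwards] order.

```python
import numpy as np

# Placeholder inputs for N images (substitute your real values).
N = 3
c2w = np.tile(np.eye(3, 4), (N, 1, 1))                # (N, 3, 4) camera-to-world poses
hwf = np.tile(np.array([756., 1008., 815.]), (N, 1))  # (N, 3) height, width, focal
bounds = np.tile(np.array([1.2, 10.0]), (N, 1))       # (N, 2) near/far depths

# Concatenate each pose with the [h, w, f] column -> (N, 3, 5).
poses = np.concatenate([c2w, hwf[:, :, None]], axis=2)

# Flatten each 3x5 to 15 elements, append the 2 bounds -> (N, 17), then save.
poses_bounds = np.concatenate([poses.reshape(N, 15), bounds], axis=1)
np.save('poses_bounds.npy', poses_bounds)
```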
I still have a question about Rt. I use the OpenCV camera coordinate convention ([right, down, forward]) for my source poses. To meet the required [down, right, backwards] order (which some people might consider [-y, x, z]), I have to swap the x and y columns and negate z, but what about the translation T (3x1)?
[Diagram: OpenCV camera coordinate axes]
Looking forward to your reply~
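One reading of the conversion described above, sketched as code (this is my interpretation, not an official answer): for a camera-to-world pose, the columns of R are the camera's own axes expressed in world coordinates, so going from OpenCV [right, down, forward] to [down, right, backwards] only reorders and negates columns of R. The translation is the camera center in world coordinates, so it is unchanged. The function name is hypothetical.

```python
import numpy as np

def opencv_to_llff(c2w_cv):
    """Convert a (3, 4) camera-to-world pose from the OpenCV axis
    convention [right, down, forward] to [down, right, backwards]."""
    R, t = c2w_cv[:, :3], c2w_cv[:, 3:4]
    # New columns: [down, right, backwards] = [y_cv, x_cv, -z_cv].
    R_llff = np.stack([R[:, 1], R[:, 0], -R[:, 2]], axis=1)
    # Camera center in world coordinates is unaffected by relabeling
    # the camera's own axes, so t passes through untouched.
    return np.concatenate([R_llff, t], axis=1)
```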
I just added the four test scenes from Figure 9 (airplants, pond, fern, t-rex) to the Google Drive supplement; you can find them here now:
https://drive.google.com/open?id=1Xzn-bRYhNE5P9N7wnwLDXmo37x7m3RsO
Hopefully that helps explain my pose processing after colmap. Let me know if you have any more questions.
Originally posted by @bmild in #10 (comment)