Hi @tinghuiz
Thanks for making the code open source. I have a question about how the pose data from the KITTI dataset is encoded. The original pose in KITTI is a 12-D vector, but in your code I found that the pose input has dimensions 1,6,224,224.
Could you please elaborate on your encoding method?
For the paper, I fed the 12-D difference vector through two FC layers (12 -> 128 -> 256) and concatenated the output with the image features (4096-D) to form the input to the flow decoder pathway.
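For readers who want to see the shapes involved, here is a minimal NumPy sketch of the FC variant described above. It is not the original Caffe code: the ReLU activations, the random weights, and every name in it (`fc_relu`, `pose_diff`, `image_feat`) are assumptions made only for illustration. Only the layer sizes (12 -> 128 -> 256) and the 4096-D image feature come from the comment above.

```python
import numpy as np

rng = np.random.default_rng(0)

def fc_relu(x, w, b):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return np.maximum(x @ w + b, 0.0)

# Randomly initialised weights, purely to illustrate the shapes.
w1 = (rng.standard_normal((12, 128)) * 0.01).astype(np.float32)
b1 = np.zeros(128, dtype=np.float32)
w2 = (rng.standard_normal((128, 256)) * 0.01).astype(np.float32)
b2 = np.zeros(256, dtype=np.float32)

# Placeholder inputs: one 12-D pose difference and one 4096-D image feature.
pose_diff = rng.standard_normal((1, 12)).astype(np.float32)
image_feat = rng.standard_normal((1, 4096)).astype(np.float32)

# 12 -> 128 -> 256, then concatenate with the image feature.
pose_feat = fc_relu(fc_relu(pose_diff, w1, b1), w2, b2)
decoder_input = np.concatenate([image_feat, pose_feat], axis=1)

print(pose_feat.shape)      # (1, 256)
print(decoder_input.shape)  # (1, 4352)
```

Under these assumptions the decoder pathway would receive a 4352-D vector per sample (4096 + 256).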
After the submission, I found that it actually performs better to use the Euler angles + 3D translation (6 numbers) as the pose representation and to concatenate them along the color channels of the input image (spatially replicated for each pixel) as the input to the network. This way there is no need for FC layers, and the network can be fully convolutional.
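The spatial replication can be sketched as follows, again in NumPy rather than the original Caffe setup. The function names, the ordering of the six numbers, and the 3-channel placeholder image are assumptions for illustration. The sketch only shows how a 6-D pose becomes a 1,6,224,224 blob that can be stacked with the image along the channel axis.

```python
import numpy as np

def pose_to_channels(pose6, height=224, width=224):
    """Tile a 6-D pose (3 Euler angles + 3 translations) into shape (1, 6, H, W)."""
    pose6 = np.asarray(pose6, dtype=np.float32)
    if pose6.shape != (6,):
        raise ValueError("expected a 6-D pose vector")
    # Each pose component becomes one constant-valued plane.
    planes = np.broadcast_to(pose6[:, None, None], (6, height, width))
    return planes[None].copy()

def concat_image_and_pose(image, pose6):
    """Concatenate an image batch (N, C, H, W) with pose planes along the channel axis."""
    n, _, h, w = image.shape
    pose_planes = np.repeat(pose_to_channels(pose6, h, w), n, axis=0)
    return np.concatenate([image, pose_planes], axis=1)

# Placeholder RGB image and an arbitrary example pose (values are made up).
image = np.zeros((1, 3, 224, 224), dtype=np.float32)
pose6 = [0.01, -0.02, 0.03, 0.5, 0.0, 1.2]

pose_blob = pose_to_channels(pose6)
net_input = concat_image_and_pose(image, pose6)

print(pose_blob.shape)  # (1, 6, 224, 224)
print(net_input.shape)  # (1, 9, 224, 224)
```

Because every pixel carries the same six pose values, a convolutional layer sees the pose at every spatial location, which is what removes the need for the FC layers.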