Hi,
Mainly I want to clarify the intent of how the images are paired for the loss, in case I misunderstand. MapNet ensures global consistency via its clever relative loss. I'm just having a hard time grasping intuitively why skip defaults to 10. It is a hyperparameter and can certainly be tuned, but providing it as the default suggests you had good results with that value.
As I understand it, skip=10 means that if the dataloader picks an index of, say, 36 with steps=3, the loader picks the images indexed [26, 36, 46], with a gap of 10 images between them.
But doesn't that mean you're picking images which are farther apart chronologically, and thus also in translation? (A person collecting the data while moving at 1 m/sec means the 3 images would be 10 meters apart, so we lose the point of the relative loss.)
Actually, I trained my own model with the default hyperparameters and then with skip=1, and I got poorer results, so I wanted to clarify the intent of how the images are paired for the loss in case I misunderstand.
Thank you for your time.
Hi @AntiLibrary5, you are right in your understanding of skip.
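To make the pairing concrete, here is a minimal sketch of how a window of frame indices could be built from skip and steps. The function name `window_indices`, the `num_frames` argument, and the clamping at the sequence boundaries are assumptions for illustration; this is not the repository's actual loader code.

```python
def window_indices(index, steps, skip, num_frames):
    """Return `steps` frame indices centred on `index`, spaced `skip` frames apart.

    Hypothetical helper for illustration only.
    """
    half = steps // 2
    # Offsets relative to the centre frame, e.g. [-10, 0, 10] for steps=3, skip=10.
    offsets = [(k - half) * skip for k in range(steps)]
    # Assumption: indices that fall outside the sequence are clamped to its ends.
    return [min(max(index + o, 0), num_frames - 1) for o in offsets]


print(window_indices(36, steps=3, skip=10, num_frames=1000))  # [26, 36, 46]
print(window_indices(36, steps=3, skip=1, num_frames=1000))   # [35, 36, 37]
```

With skip=1 the window collapses to three consecutive frames, which is the case discussed next.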
The reason skip > 1 is that the camera motion and image change between two consecutive frames at 30 fps are often quite small. This is true both of the camera pose and of the motion of dynamic objects in the scene. So consecutive frames might not provide a strong learning signal, because the network will predict a similar pose anyway (since the input images are almost the same).
On the other hand, if you connect two far-away images with the relative loss, it provides a stronger learning signal. For example, the network's prediction might jump by a large amount between two images 10 frames apart (let's say because of the large image change from the motion of a dynamic object like a car, or the sudden brightness change from the appearance of a new light).
At the same time, if skip is set to a very large separation, then the two images will not have any overlap in their view frustums.
So skip needs to be a compromise between all these considerations, and good skip values are also likely to be different for different datasets.
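As a rough sanity check on the scale involved, the spacing between paired frames can be estimated as speed × skip / fps. The sketch below assumes the 30 fps frame rate mentioned above and the 1 m/sec walking speed from the question; both numbers and the helper name `frame_spacing_m` are illustrative and will differ per dataset.

```python
def frame_spacing_m(speed_m_per_s, fps, skip):
    """Approximate distance travelled between two frames that are `skip` frames apart.

    Hypothetical helper; assumes constant speed and a constant frame rate.
    """
    return speed_m_per_s * skip / fps


# Assumed values: 1 m/sec camera speed, 30 fps capture rate.
for skip in (1, 10, 100):
    spacing = frame_spacing_m(1.0, 30.0, skip)
    print(f"skip={skip:3d} -> about {spacing:.3f} m between paired frames")
```

Under these assumptions, skip=10 corresponds to roughly a third of a meter between paired frames rather than 10 meters, so neighbouring frames in the window would typically still share most of their view.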