a. Implemented attention maps to visualize how the model makes its decisions.
- Used Unreal Engine-generated images with geometric signals to pretrain our model.
- Performed a classification task on the DIVA dataset to determine whether a car door is open.
- Tested whether incorporating geometric signals (surface normals, depth) into the loss function improves performance.
- Hypothesized, based on the attention map projections, that a multilabel classification task (predicting the open or closed state of the front, back, left, and right doors independently) will improve performance on the simple binary classification task (predicting whether any door is open).
- Demonstrated a 5% increase in accuracy when pretrained on the SAVED dataset compared to our control (ImageNet-pretrained).
- Incorporating geometric signals led to a further 5% increase in accuracy compared to the model pretrained without geometric information.
- Need for an improved control experiment: rather than an ImageNet-pretrained model, a V-KITTI-pretrained model would serve as a better control. Showing improved performance over V-KITTI pretraining would strongly support the benefit of this dataset.
- Normalization of background noise for the SAVED dataset: unrealistic rendering of background noise in simulated images can disrupt domain adaptation and hinder performance on real data.
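
As a rough illustration of how geometric signals might enter the loss function, the sketch below adds a depth term and a surface-normal term to a binary cross-entropy classification loss. The function names, the choice of L1 and cosine terms, and the weights `w_depth` and `w_normal` are all assumptions made for illustration, not the configuration used in these experiments.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def depth_l1(pred, target):
    """Mean absolute error over per-pixel depth values (flat lists)."""
    return sum(abs(a - b) for a, b in zip(pred, target)) / len(pred)

def normal_cosine(pred, target):
    """Mean (1 - cosine similarity) over per-pixel normals (non-zero 3-vectors)."""
    total = 0.0
    for a, b in zip(pred, target):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        total += 1.0 - dot / norm
    return total / len(pred)

def combined_loss(p_open, y_open, depth_pred, depth_gt, normal_pred, normal_gt,
                  w_depth=0.5, w_normal=0.5):
    """Classification loss plus weighted geometric terms (weights are hypothetical)."""
    return (bce(p_open, y_open)
            + w_depth * depth_l1(depth_pred, depth_gt)
            + w_normal * normal_cosine(normal_pred, normal_gt))
```

When the predicted depth and normals match the ground truth, the geometric terms vanish and only the classification loss remains, so the weights control how strongly geometric errors are penalized relative to the door-state error.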
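
The relationship between the multilabel and binary tasks in the hypothesis can be sketched as follows: a multilabel head emits one independent score per door, and the binary "any door open" decision is derived from those per-door predictions. The label order in `DOORS`, the 0.5 threshold, and the function names are illustrative assumptions rather than details of the actual model.

```python
import math

DOORS = ("front", "back", "left", "right")  # label order is an assumption

def sigmoid(z):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def door_states(logits, threshold=0.5):
    """Per-door open/closed prediction from four independent logits."""
    probs = [sigmoid(z) for z in logits]
    return {door: p >= threshold for door, p in zip(DOORS, probs)}

def any_door_open(logits, threshold=0.5):
    """Binary decision derived from the multilabel output."""
    return any(door_states(logits, threshold).values())
```

For example, `any_door_open([2.0, -3.0, -1.0, -2.0])` flags the image as having an open door because the first logit alone crosses the threshold, while the per-door dictionary records which door it was.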