Thank you for your code and engineering. I'm working on a classification task and have tried various models, including the ResNet series, the EfficientNet series, NFNet, ViT, and EfficientFormer, with some structural modifications. The trained models all achieve very high top-1 accuracy (the best reaches 98.5), on the validation set, the test set, and a special-scenario test set I built myself. But at inference time, the inputs come from an upstream detection model (the training data was produced the same way). I found that between consecutive frames, the crop of the same object typically shifts by about 10 pixels both horizontally and vertically (sorry, I haven't yet checked whether the lighting also changes). This shift causes the predicted score to fluctuate between 0.5 and 0.9. In extreme cases, the same object even receives different class labels on two consecutive frames, each with a prediction score above 0.9. I'm not sure of the cause; perhaps it's excessive data augmentation? (During training, the validation loss was lower than the training loss.)
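To quantify the shift sensitivity described above, here is a minimal sketch that measures how much the top-1 score varies when the input is translated by up to ±10 px. The tiny CNN is a hypothetical stand-in; in practice you would substitute your trained model and a real detection crop, and `torch.roll` is only an approximation of re-cropping at a shifted box.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-in classifier; substitute your trained model.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 5),  # 5 classes, assumed for illustration
)
model.eval()

def shift_sensitivity(model, img, max_shift=10, step=5):
    """Return top-1 probabilities of the image shifted by up to max_shift px."""
    probs = []
    with torch.no_grad():
        for dx in range(-max_shift, max_shift + 1, step):
            for dy in range(-max_shift, max_shift + 1, step):
                # torch.roll stands in for re-cropping at a shifted box
                shifted = torch.roll(img, shifts=(dy, dx), dims=(-2, -1))
                p = F.softmax(model(shifted.unsqueeze(0)), dim=1)
                probs.append(p.max().item())
    return probs

img = torch.rand(3, 224, 224)  # placeholder for a real detection crop
scores = shift_sensitivity(model, img)
spread = max(scores) - min(scores)
print(f"top-1 score spread under +/-10 px shifts: {spread:.3f}")
```

A large spread on real crops would confirm that translation alone, not lighting, is driving the score jumps.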
I also tried extracting the output of the layer just before the final FC layer. I computed the dot product between the feature tensors of different images, but found no obvious pattern (I wanted to see whether clustering these features could catch the error cases).
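The feature-comparison experiment above can be sketched roughly as follows. The backbone here is a hypothetical placeholder (with timm, `timm.create_model(name, num_classes=0)` returns the pooled pre-FC features directly); note that raw dot products scale with feature magnitude, so cosine similarity on L2-normalized features is usually easier to interpret for clustering.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical backbone; with timm you could instead use
# timm.create_model(..., num_classes=0) to get pre-FC features.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),  # output: pre-FC embedding
)
backbone.eval()

def embed(x):
    """Return the pre-FC feature vector for one image tensor."""
    with torch.no_grad():
        return backbone(x.unsqueeze(0)).squeeze(0)

a = embed(torch.rand(3, 224, 224))  # placeholder images
b = embed(torch.rand(3, 224, 224))

# Cosine similarity normalizes away magnitude, unlike a raw dot product.
cos = F.cosine_similarity(a, b, dim=0).item()
print(f"cosine similarity: {cos:.3f}")
```

Comparing cosine similarities of consecutive-frame crops against crops of different objects would show whether the embedding space separates them at all.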
How can I improve the model's robustness to these input perturbations and minimize the probability of errors? I look forward to your reply. Thank you.