CenterNet MobileNetV2 FPN 512x512 not trainable (or a bug of evaluation) #10065
Comments
To be honest, I am finding it difficult to debug the code in this repository. I believe CenterNet is one of the most important architectures for mobile use (e.g. Next-Generation Pose Detection with MoveNet and TensorFlow.js). |
I decided to use a dataset of size 1 to simplify things (batch_size set to 1 for both train and eval). The change below eliminates the mismatch between train loss and eval loss, and I got mAP=1.0 at step 1000.
So the problem stems from BatchNorm (or maybe Dropout). I will look into how to handle the issue properly later, if possible.
|
I finally found that the eval loss is large simply because the learned moving averages of μ and σ are not yet close to the μ and σ of the batch (there is only one unique batch in this experiment). The difference between SSD and CenterNet is probably just the BatchNorm decay value: models/research/object_detection/configs/tf2/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.config Line 106 in 66264b2
The problem should go away after sufficient steps. During evaluation, BatchNorm probably runs in inference mode (as expected) without any code modifications in this repo. I hope someone in the know can confirm the above conclusion. I am NOT 100% sure at this point.
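For readers who want to see the effect in isolation, here is a minimal pure-Python sketch of the mechanism described above. It is not code from this repository. It assumes the moving-average update rule documented for Keras `BatchNormalization` (`moving = moving * momentum + batch_stat * (1 - momentum)`, with moving mean starting at 0 and moving variance at 1), and that the config's `decay` plays the role of Keras' `momentum`. The momentum values 0.9 and 0.999, the batch statistics, and the helper names are all hypothetical and chosen only for illustration; they are not read from the SSD or CenterNet configs.

```python
import math


def ema_update(moving, batch_stat, momentum):
    """One moving-average update, in the form documented for Keras BatchNormalization."""
    return momentum * moving + (1.0 - momentum) * batch_stat


def normalize(x, mean, var, eps=1e-3):
    """Normalization without the learned scale/offset, to keep the sketch small."""
    return (x - mean) / math.sqrt(var + eps)


def simulate(momentum, steps, batch_mean=5.0, batch_var=4.0):
    """Train on one fixed batch; return the moving (mean, var) after `steps` updates."""
    moving_mean, moving_var = 0.0, 1.0  # assumed initial values (Keras defaults)
    for _ in range(steps):
        moving_mean = ema_update(moving_mean, batch_mean, momentum)
        moving_var = ema_update(moving_var, batch_var, momentum)
    return moving_mean, moving_var


def eval_train_gap(momentum, steps, x=7.0, batch_mean=5.0, batch_var=4.0):
    """Absolute difference between eval-mode and train-mode output for one activation x."""
    moving_mean, moving_var = simulate(momentum, steps, batch_mean, batch_var)
    train_out = normalize(x, batch_mean, batch_var)   # training mode: batch statistics
    eval_out = normalize(x, moving_mean, moving_var)  # inference mode: moving statistics
    return abs(eval_out - train_out)


if __name__ == "__main__":
    for momentum in (0.9, 0.999):  # illustrative values only
        for steps in (100, 1000, 10000):
            print(momentum, steps, round(eval_train_gap(momentum, steps), 4))
```

Under these assumptions, the gap between training-mode and inference-mode output shrinks roughly like `momentum ** steps`, so a decay close to 1 needs many more steps before evaluation matches training on a single repeated batch. That is consistent with the conclusion above, but it does not confirm which decay value each model actually uses.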
|
@lisosia I don't have the solution to this problem. But I was wondering if you know how to encode key points for this model as they are supported for this one. I cannot find any documentation for this. |
First of all, object detection and keypoint detection are two different things and need to be distinguished. I have not used this repository for keypoint detection, but as far as I know, there is no documentation for keypoint detection. For example
If you want to know about centernet itself, check the original paper. |
My conclusion is probably correct, and since there seems to be no response, I will close the issue. I think the proper way is to add an option to |
Is this issue solved now? Is CenterNet MobileNetV2 FPN 512x512 trainable now? |
Prerequisites
Please answer the following questions for yourself before submitting an issue.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/research/object_detection
2. Describe the bug
Training "CenterNet MobileNetV2 FPN 512x512" fails, while other models are trainable.
I conducted an overfit-training test to verify that the model can be trained.
I tested 3 models and only the "CenterNet MobileNetV2" training fails.
3. Steps to reproduce
checkout commit 0c9253b
used configs and create_voc_subset_tfrecord.py
sanitiy_check.zip
please refer to attached configs
modify model_lib_v2.py to run evaluation properly
then run evaluation
only the result of centernet mobilenetv2 is apparently incorrect.
train loss decreases during training, but val-loss is high and mAP@0.75 is 0.388
4. Expected behavior
5. Additional context
although it is not directly related to the subject:
6. System information