Training not working anymore #3

Open
chris-doe opened this issue Sep 3, 2019 · 6 comments

@chris-doe
chris-doe commented Sep 3, 2019

Hi Tom,

first of all, thanks for updating the repo and providing the inference script.
However, there now seems to be an issue with the heatmap-based scores during training. I did a clean clone of the repo and launched training as explained in the readme. Looking at the results in Tensorboard after 600 epochs, the confidence maps do not show any local maxima. With the previous version of the repo, the confidence maps correctly showed that the network resolved depth uncertainty as the number of epochs increased and learned to localize objects. Hyperparameters were left at their defaults, except that I set the batch size to 8.
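
One way to check the missing local maxima numerically, rather than by eye in Tensorboard, is to count strict peaks in an exported confidence map. The following is a minimal sketch in plain Python, not code from the repo: `find_peaks`, its `threshold` parameter, and the nested-list heatmap layout are assumptions made for illustration.

```python
def find_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for every cell that is at least `threshold`
    and strictly greater than all of its (up to 8) neighbours."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the neighbouring cells, clipping at the map border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Hypothetical 4x4 maps: one flat, one with a single clear peak.
flat_map = [[0.5] * 4 for _ in range(4)]
peaked_map = [
    [0.1, 0.2, 0.3, 0.1],
    [0.1, 0.4, 0.9, 0.3],
    [0.1, 0.2, 0.4, 0.2],
    [0.0, 0.1, 0.1, 0.1],
]
print(find_peaks(flat_map, threshold=0.1))
print(find_peaks(peaked_map, threshold=0.5))
```

A completely flat map yields no peaks under this definition, which is the symptom described above. A map with one well-localized object yields exactly one.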

The inference script, using the old model checkpoints, worked for me after adapting the NMS stage; only one method (bbox_corners) in utils.py was missing.
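
For anyone hitting the same missing helper, the sketch below shows one plausible way to compute the eight corners of a yaw-rotated 3D box. It is not the repo's bbox_corners: the function name `box_corners_3d`, the argument order, the (length, height, width) convention, and the rotation about the vertical y axis are all assumptions and may differ from what the inference script expects.

```python
import math


def box_corners_3d(center, dims, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    Assumed conventions (hypothetical, not taken from the repo):
    center = (x, y, z) of the box centre,
    dims   = (length, height, width) along local x, y, z,
    yaw    = rotation in radians about the vertical (y) axis.
    """
    cx, cy, cz = center
    length, height, width = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-0.5, 0.5):
        for dy in (-0.5, 0.5):
            for dz in (-0.5, 0.5):
                # Corner offset in the box's own frame.
                x, y, z = dx * length, dy * height, dz * width
                # Rotate the offset about the y axis, then translate.
                x_rot = cos_y * x + sin_y * z
                z_rot = -sin_y * x + cos_y * z
                corners.append((cx + x_rot, cy + y, cz + z_rot))
    return corners


corners = box_corners_3d(center=(0.0, 0.0, 10.0), dims=(4.0, 2.0, 2.0), yaw=0.0)
print(len(corners), min(c[0] for c in corners), max(c[0] for c in corners))
```

If the real helper uses a different axis or dimension order, only the offset and rotation lines need to change.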

Do you have any idea how to get the training running again? I would appreciate any help on that. Thank you!

Best regards,
Chris

@aloukkal
aloukkal commented Oct 12, 2019

Hi Chris,

If you have a look at the compute_loss function in train.py, the loss function that was used before is the binary cross-entropy, whereas in the latest version it is the Huber loss. Note also that total_loss = score_loss in both versions. It may be more suitable to first learn the score only and then fine-tune on the other tasks.
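
To make the difference between the two losses concrete, here is a minimal per-cell sketch in plain Python. It is not the repo's compute_loss: the helper names `bce`, `huber` and `weighted_total`, the `delta=1.0` default, the placeholder loss values, and the assumption that predictions are already probabilities in [0, 1] are all illustrative.

```python
import math


def bce(pred, target, eps=1e-7):
    """Binary cross-entropy for one cell; pred is a probability in [0, 1]."""
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def huber(pred, target, delta=1.0):
    """Huber loss for one cell: quadratic near zero error, linear beyond delta."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)


def weighted_total(losses, weights):
    """Weighted sum of named loss terms; terms without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in losses.items())


# A cell whose target is 1.0 (object centre) but which is predicted low.
for pred in (0.5, 0.1, 0.01):
    print(f"pred={pred:<5} bce={bce(pred, 1.0):.3f} huber={huber(pred, 1.0):.3f}")

# Staged training as suggested above: score only first, then all terms.
# The loss values are placeholders, not measurements.
example_losses = {"score": 1.0, "position": 2.0, "dimension": 3.0, "angle": 4.0}
stage_1 = {"score": 1.0}
stage_2 = {"score": 1.0, "position": 1.0, "dimension": 1.0, "angle": 1.0}
print(weighted_total(example_losses, stage_1), weighted_total(example_losses, stage_2))
```

With both prediction and target in [0, 1], the error never exceeds 1, so this Huber loss stays in its quadratic branch and is capped at 0.5 per cell, while the cross-entropy keeps growing as a peak cell is predicted closer to zero. Whether that explains the flat confidence maps in this repo is not established here; it is only one property worth checking.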

@yhkim8412
yhkim8412 commented Oct 16, 2019

Hi Tom,

Thanks again for updating the repo and providing the inference script.
Like Chris, I did a clean clone of the repo and launched training as explained in the readme, changing only the batch size to 8.

However, there seems to be an issue.

I got these values during training.
==> Training epoch complete
score : 1.9330e+02
position: 1.6398e+07
dimension: 3.2379e+06
angle : 9.0900e+04
total : 1.9727e+07
=== Beginning epoch 100 of 600 ===
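
A quick arithmetic check on the numbers above shows how the logged total relates to the individual terms. The values are copied from the log; the variable names are just for illustration.

```python
# Loss terms exactly as printed in the training log above.
logged = {
    "score": 1.9330e+02,
    "position": 1.6398e+07,
    "dimension": 3.2379e+06,
    "angle": 9.0900e+04,
}
logged_total = 1.9727e+07

summed = sum(logged.values())
shares = {name: value / summed for name, value in logged.items()}

for name, share in shares.items():
    print(f"{name:9s} share of sum: {share:.6f}")
print(f"sum of terms: {summed:.4e}  logged total: {logged_total:.4e}")
```

The four terms add up to about 1.9727e+07, so the printed total is consistent with an unweighted sum. The position term makes up more than 80% of it and the score term roughly 0.001%, so a gradient step on this total is dominated almost entirely by the position loss.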

The model does not seem to be training correctly.
Could the size of the input image be the problem?

I would appreciate any help on that - thanks again!

Best regards,
Younghyun

@chris-doe
Author

Hi aloukkal,

Yes, I am aware of the changes affecting the loss function and confidence map representation.
The problem I was facing was this: using the new representation and loss computation, my network was not able to become certain about depth at all. Even when I trained on a single example image, and even when I tried to learn only the confidence score map of that single example, the network was not able to learn that specific score map (which would result in a correct detection for this single training example).
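
The single-example overfit test described here can be narrowed further by taking the network out of the loop. The sketch below fits free per-cell logits to one target score map with plain gradient descent on binary cross-entropy. Everything in it is an assumption for illustration (the function name `overfit_single_heatmap`, the 1D target, the learning rate), and it only checks that a target and loss of this shape are learnable, not that the repo's model or data pipeline is correct.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def overfit_single_heatmap(target, steps=500, lr=1.0):
    """Fit one free logit per cell to `target` with gradient descent on BCE.

    The gradient of BCE(sigmoid(z), t) with respect to z is sigmoid(z) - t.
    """
    logits = [0.0] * len(target)
    for _ in range(steps):
        for i, t in enumerate(target):
            logits[i] -= lr * (sigmoid(logits[i]) - t)
    return [sigmoid(z) for z in logits]


# Hypothetical 1D slice of a score map with a single peak at index 3.
target = [0.0, 0.1, 0.6, 1.0, 0.6, 0.1, 0.0]
pred = overfit_single_heatmap(target)
print("peak index:", pred.index(max(pred)))
print("predicted :", [round(p, 3) for p in pred])
```

If a toy fit like this recovers the peak but the real network cannot reproduce a single training map, the problem is more likely in the model, the target encoding, or the loss wiring than in the idea of regressing a score map.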

@jackkwok
Can someone share the last known working version in this repo?

@IAMShashankk
@chris-doe @aloukkal @yhkim8412 @jackkwok Do you have any update on the issues you described here?

@IAMShashankk
Hi Tom,

Thanks again for updating the repo and providing the inference script (only set batch size to 8). Like Chris, I did a clean clone of the repo and launched training as explained in the readme.

However, there seems to be an issue.

I got these values during training.
==> Training epoch complete
score : 1.9330e+02
position: 1.6398e+07
dimension: 3.2379e+06
angle : 9.0900e+04
total : 1.9727e+07
=== Beginning epoch 100 of 600 ===

This does not seem to be trained correctly. Is there any issue on the SIZE of INPUT IMAGE?

I would appreciate any help on that - thanks again!

Best regards, Younghyun

I am also getting the same losses on the current version of the repo. How did you fix it?
