Report Progress as a Fraction of Remaining Epochs #59
Comments
Count the batches (iterations) instead of epochs.
Ah, so that means the number of epochs in the job is just (100K (the default epoch end) * 128 (the batch size)) / number of data samples (10M, the number of images in the dataset), @nttstar?
If the total number of iterations is 100K, then the number of epochs equals
But hang on, it appears that the number of batches per epoch is much less than 100K. What determines the number of batches per epoch? Also, @nttstar, how long did it originally take to train the models for the examples in the README? We're currently training Mobilenet with the ArcFace method on one NVIDIA Tesla GPU. We're about 3 days and 22 hours in, with 8 epochs and 9640 batches passed. The accuracy reported at each step only recently approached 0.010156. The accuracy also appears to be rising much more slowly than when the softmax method is used. Is this normal?
Number of batches per epoch = total_sample_size / total_batch_size. Also, I only ran experiments with batch_size=512 (128*4). I'm not sure it works well with a smaller batch size like in your case.
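For anyone who wants to check the arithmetic discussed above, here is a minimal sketch in plain Python. The function names and the `drop_last` flag are hypothetical and not part of the training script. The figures (100K iterations, batch size 128, roughly 10M images) are the ones quoted in this thread. Whether a final partial batch is counted depends on the data iterator, so that choice is an assumption here.

```python
import math


def batches_per_epoch(total_sample_size: int, total_batch_size: int,
                      drop_last: bool = False) -> int:
    """Batches per epoch = total_sample_size / total_batch_size.

    drop_last is an assumption: if True, a final partial batch is discarded;
    otherwise it counts as one extra batch.
    """
    if drop_last:
        return total_sample_size // total_batch_size
    return math.ceil(total_sample_size / total_batch_size)


def epochs_for_iterations(total_iterations: int, total_sample_size: int,
                          total_batch_size: int) -> float:
    """Number of passes over the data covered by a fixed iteration budget."""
    return total_iterations * total_batch_size / total_sample_size


# Figures quoted in the thread (the 10M image count is approximate).
print(batches_per_epoch(10_000_000, 128))
print(epochs_for_iterations(100_000, 10_000_000, 128))
```

With these figures, a smaller batch size means more batches per epoch, so a fixed iteration budget covers fewer passes over the data.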
Is there any way to get the script to report the number of remaining epochs (or the number of remaining batches within an epoch)? I'm training Mobilenet using the InsightFace method on the MS1M dataset. I'm on the 2nd epoch, but I have no idea how many more epochs remain.
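One way the requested report could look, as a rough sketch only: the helper below is hypothetical and independent of the training framework. It assumes training stops after a fixed number of iterations, that `epoch` is zero-based, and that the number of batches per epoch is already known (total samples divided by batch size).

```python
def progress_report(epoch: int, batch_in_epoch: int,
                    batches_per_epoch: int, total_iterations: int) -> dict:
    """Summarise training progress against a fixed iteration budget.

    epoch is zero-based (assumption); batch_in_epoch is the number of
    batches already finished in the current epoch.
    """
    done = epoch * batches_per_epoch + batch_in_epoch
    done = min(done, total_iterations)  # never report more than the budget
    remaining = total_iterations - done
    return {
        "fraction_done": done / total_iterations,
        "remaining_batches": remaining,
        "remaining_epochs": remaining / batches_per_epoch,
    }


# Hypothetical numbers: 1000 batches per epoch, 4000 iterations in total,
# one full epoch finished.
report = progress_report(epoch=1, batch_in_epoch=0,
                         batches_per_epoch=1000, total_iterations=4000)
print(report)
```

A function like this could be called from whatever per-batch hook the training loop exposes. Wiring it into the actual script is left open here.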