error First step cannot be zero when running train.py #51
Comments
I have the same error. Did you find out how to solve it? |
yes, edit this in your config file in ...\models\research\object_detection\training
|
Thank you. It works here :)
…On Mon, 4 Jun 2561 at 16:52 blockhunts ***@***.***> wrote:
yes, edit this in your config file in
...\models\research\object_detection\training
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
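To make the schedule above easier to read, here is a minimal plain-Python sketch of how a manual step schedule picks a rate. It is only an illustration, not the code in learning_schedules.py: the function name `manual_step_rate` and the `(step, rate)` tuple layout are hypothetical, and the zero-step check simply mirrors the error message from this issue. The rate stays at `initial_learning_rate` until the global step reaches a `schedule` entry, then switches to that entry's `learning_rate`.

```python
def manual_step_rate(global_step, initial_rate, schedule):
    """Return the learning rate in effect at global_step.

    schedule is a list of (step, rate) pairs sorted by step.
    Hypothetical helper for illustration only.
    """
    # Mirror the check behind the error in this issue: the first
    # boundary may not be step 0, because the initial rate already
    # covers step 0.
    if schedule and schedule[0][0] == 0:
        raise ValueError("First step cannot be zero.")

    rate = initial_rate
    for step, new_rate in schedule:
        # A boundary takes effect once global_step reaches it.
        if global_step >= step:
            rate = new_rate
    return rate


# Values taken from the config fragment above.
schedule = [(900000, 0.00002), (1200000, 0.000002)]
early = manual_step_rate(0, 0.0002, schedule)
middle = manual_step_rate(900000, 0.0002, schedule)
late = manual_step_rate(1500000, 0.0002, schedule)
```

With these values, training runs at 0.0002 for the first 900000 steps, drops to 0.00002 until step 1200000, and uses 0.000002 after that. A `schedule` entry with `step: 0` would trip the check.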
|
If you download the models from the GitHub repository, the files are up to date. |
I ran into this same error while using the AWS DL AMI (Deep Learning AMI (Ubuntu) Version 10.0 (ami-23c4fb46)) and following, as far as I can tell, the same steps I used on Windows with obvious substitutions since this AMI is Ubuntu. Both Ubuntu and Windows are using TF 1.8. But when I use the train_config that blockhunts mentioned I get: Any ideas? |
I see that epratheeban has the solution to my problem, mentioned in #11: It's easy. Go to the utils folder and find the learning_schedules.py file. Go to line 167 and replace it with the line below: rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries), |
Hi @jim-meyer WARNING:tensorflow:From C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-pack Future major versions of TensorFlow will allow gradients to flow See @{tf.nn.softmax_cross_entropy_with_logits_v2}. Traceback (most recent call last): |
TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>) |
@tamizharasank Which file? For this kind of error, copy the message into Google and you will find the fix easily. |
@tamizharasank Did you solve this error? I got the same error. Any suggestions? |
After making changes to the config file in the training folder, I got this error:
(tensorflow1) C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
Future major versions of TensorFlow will allow gradients to flow
See @{tf.nn.softmax_cross_entropy_with_logits_v2}.
C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\ops\gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory. |
Looks like you probably did not follow all of the steps in 2a, "Download TensorFlow Object Detection API repository from GitHub" and/or 2b, "Download the Faster-RCNN-Inception-V2-COCO model from TensorFlow's model zoo". Try following those steps again exactly and that should fix your problem. |
File "C:\tensorflow1\models\research\object_detection\utils\learning_schedules.py", line 160, in manual_stepping
I edited the file and saved it, but when I train again it returns to its original value.
I'm getting the error below when I try to run: WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
WARNING:tensorflow:From C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\platform\app.py:125: main (from main) is deprecated and will be removed in a future version. Traceback (most recent call last): |
@ShubhranshuMaurya That error seems to indicate that there is something wrong with C:\Tensorflow\workspace\training_demo\annotations/label_map.pbtxt. Have you opened that file in a text editor to see if it looks right? That file should look something like this: item { IIRC this file could also be a binary protobuf file, in which case viewing it in a text editor won't tell you much. But if it appears to be binary, perhaps you could try creating a text version with your training labels and see if that works. |
# TensorFlow custom training ERROR: raise ValueError('First step cannot be zero.') ValueError: First step cannot be zero. SOLUTION: object_detection\training\ .config train_config: { |
For me it worked with 'step: 1' |
Did you find a solution? |
Can you explain what is happening with the learning rate? What do the two step values signify in the manual step learning rate, and what is the initial learning rate? |
python train.py --logtostderr -train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v2_quantized_300x300_coco.config Current thread 0x00005734 (most recent call first): help |
I tried to use the same images (cards) provided. I just deleted all the processed files (csv, dll) and followed all the steps.
When I tried to run python train.py
I got this error.
Any clues why this happens?