Tensorflow not accepting the checkpoints downloaded from the Dropbox link under BestModel #24

Closed
sushantMoon opened this issue Mar 1, 2020 · 2 comments


sushantMoon commented Mar 1, 2020

@neccam I downloaded sign2text.tar.gz (using the Dropbox link in the bash file under BestModel), extracted its files into BestModel/sign2text/, and tried to run nmt.py with the parameters below.

python -m nmt --out_dir=BestModel/sign2text --inference_input_file=Data/phoenix2014T.test.sign --inference_output_file=model-output --inference_ref_file=Data/phoenix2014T.test.de --base_gpu=0 --vocab_prefix=Data/phoenix2014T.vocab --tgt=de

The error originates from tf.train.latest_checkpoint at line 350 of nmt.py, which returns None.

Following is a copy of the error message I received.

WARNING:tensorflow:From /mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py:378: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

W0301 17:29:48.804346 139654575957568 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

# Job id 0
# Set random seed to 285
WARNING:tensorflow:From /mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py:334: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.

W0301 17:29:49.116646 139654575957568 module_wrapper.py:139] From /mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py:334: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.

# Loading hparams from BestModel/sign2text/hparams
WARNING:tensorflow:From utils/misc_utils.py:83: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

W0301 17:29:49.117575 139654575957568 module_wrapper.py:139] From utils/misc_utils.py:83: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

  saving hparams to BestModel/sign2text/hparams
  saving hparams to BestModel/sign2text/best_bleu/hparams
  attention=
  attention_architecture=standard
  base_gpu=0
  batch_size=1
  beam_width=3
  best_bleu=0
  best_bleu_dir=BestModel/sign2text/best_bleu
  bpe_delimiter=None
  colocate_gradients_with_ops=True
  decay_factor=0.98
  decay_steps=10000
  dev_prefix=None
  dropout=0.2
  encoder_type=uni
  eos=</s>
  epoch_step=0
  eval_on_fly=True
  forget_bias=1.0
  infer_batch_size=32
  init_op=glorot_normal
  init_weight=0.1
  learning_rate=1e-05
  length_penalty_weight=0.0
  log_device_placement=False
  max_gradient_norm=5.0
  max_train=0
  metrics=[u'bleu']
  num_buckets=0
  num_embeddings_partitions=0
  num_gpus=1
  num_layers=2
  num_residual_layers=0
  num_train_steps=10000
  num_units=32
  optimizer=adam
  out_dir=BestModel/sign2text
  pass_hidden_state=True
  random_seed=285
  residual=False
  snapshot_interval=1000
  sos=<s>
  source_reverse=False
  src=None
  src_max_len=300
  src_max_len_infer=300
  start_decay_step=0
  steps_per_external_eval=None
  steps_per_stats=100
  test_prefix=None
  tgt=de
  tgt_max_len=50
  tgt_max_len_infer=None
  tgt_vocab_file=Data/phoenix2014T.vocab.de
  tgt_vocab_size=2891
  time_major=True
  train_prefix=None
  unit_type=lstm
  vocab_prefix=Data/phoenix2014T.vocab
WARNING:tensorflow:From inference.py:55: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

W0301 17:29:49.130767 139654575957568 module_wrapper.py:139] From inference.py:55: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/api/_v1/estimator/__init__.py:12: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

W0301 17:29:49.143486 139654575957568 module_wrapper.py:139] From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_estimator/python/estimator/api/_v1/estimator/__init__.py:12: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

WARNING:tensorflow:From utils/iterator_utils.py:47: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.
    
W0301 17:29:49.180721 139654575957568 deprecation.py:323] From utils/iterator_utils.py:47: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.
Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.
    
WARNING:tensorflow:From utils/iterator_utils.py:57: make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_initializable_iterator(dataset)`.
W0301 17:29:49.321150 139654575957568 deprecation.py:323] From utils/iterator_utils.py:57: make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_initializable_iterator(dataset)`.
WARNING:tensorflow:From alexnet.py:130: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0301 17:29:49.328442 139654575957568 module_wrapper.py:139] From alexnet.py:130: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From alexnet.py:132: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

W0301 17:29:49.329063 139654575957568 module_wrapper.py:139] From alexnet.py:132: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From alexnet.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

W0301 17:29:49.346043 139654575957568 module_wrapper.py:139] From alexnet.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From alexnet.py:171: The name tf.nn.xw_plus_b is deprecated. Please use tf.compat.v1.nn.xw_plus_b instead.

W0301 17:29:49.431602 139654575957568 module_wrapper.py:139] From alexnet.py:171: The name tf.nn.xw_plus_b is deprecated. Please use tf.compat.v1.nn.xw_plus_b instead.

WARNING:tensorflow:From alexnet.py:192: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0301 17:29:49.434792 139654575957568 deprecation.py:506] From alexnet.py:192: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From model.py:77: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.

W0301 17:29:49.450800 139654575957568 module_wrapper.py:139] From model.py:77: The name tf.get_variable_scope is deprecated. Please use tf.compat.v1.get_variable_scope instead.

# creating infer graph ...
  num_layers = 2, num_residual_layers=0
  cell 0  LSTM, forget_bias=1WARNING:tensorflow:From model_helper.py:75: __init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
W0301 17:29:49.459327 139654575957568 deprecation.py:323] From model_helper.py:75: __init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
  DeviceWrapper, device=/gpu:0
  cell 1  LSTM, forget_bias=1  DeviceWrapper, device=/gpu:0
WARNING:tensorflow:From model_helper.py:160: __init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.StackedRNNCells, and will be replaced by that in Tensorflow 2.0.
W0301 17:29:49.461143 139654575957568 deprecation.py:323] From model_helper.py:160: __init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.StackedRNNCells, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:From model.py:446: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
W0301 17:29:49.461896 139654575957568 deprecation.py:323] From model.py:446: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:735: add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
W0301 17:29:49.531052 139654575957568 deprecation.py:323] From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:735: add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
WARNING:tensorflow:From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:739: calling __init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0301 17:29:49.539202 139654575957568 deprecation.py:506] From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:739: calling __init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn.py:244: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0301 17:29:49.575788 139654575957568 deprecation.py:323] From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/ops/rnn.py:244: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From model.py:277: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0301 17:29:49.610135 139654575957568 deprecation.py:323] From model.py:277: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From model.py:277: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0301 17:29:49.613713 139654575957568 deprecation.py:323] From model.py:277: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
  cell 0  LSTM, forget_bias=1  DeviceWrapper, device=/gpu:0
  cell 1  LSTM, forget_bias=1  DeviceWrapper, device=/gpu:0
WARNING:tensorflow:From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/contrib/seq2seq/python/ops/beam_search_decoder.py:971: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0301 17:29:49.953177 139654575957568 deprecation.py:323] From /home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/contrib/seq2seq/python/ops/beam_search_decoder.py:971: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From model.py:113: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

W0301 17:29:50.130557 139654575957568 module_wrapper.py:139] From model.py:113: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

WARNING:tensorflow:From model.py:149: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

W0301 17:29:50.131427 139654575957568 module_wrapper.py:139] From model.py:149: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

WARNING:tensorflow:From model.py:149: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

W0301 17:29:50.131817 139654575957568 module_wrapper.py:139] From model.py:149: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

  start_decay_step=0, learning_rate=1e-05, decay_steps 10000, decay_factor 0.98
# Trainable variables
  conv1/weights:0, (11, 11, 3, 96), /device:GPU:0
  conv1/biases:0, (96,), /device:GPU:0
  conv2/weights:0, (5, 5, 48, 256), /device:GPU:0
  conv2/biases:0, (256,), /device:GPU:0
  conv3/weights:0, (3, 3, 256, 384), /device:GPU:0
  conv3/biases:0, (384,), /device:GPU:0
  conv4/weights:0, (3, 3, 192, 384), /device:GPU:0
  conv4/biases:0, (384,), /device:GPU:0
  conv5/weights:0, (3, 3, 192, 256), /device:GPU:0
  conv5/biases:0, (256,), /device:GPU:0
  fc6/weights:0, (9216, 4096), /device:GPU:0
  fc6/biases:0, (4096,), /device:GPU:0
  fc7/weights:0, (4096, 4096), /device:GPU:0
  fc7/biases:0, (4096,), /device:GPU:0
  embeddings/decoder/embedding_decoder:0, (2891, 32), 
  dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0, (4128, 128), /device:GPU:0
  dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0, (128,), /device:GPU:0
  dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel:0, (64, 128), /device:GPU:0
  dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias:0, (128,), /device:GPU:0
  dynamic_seq2seq/decoder/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0, (64, 128), /device:GPU:0
  dynamic_seq2seq/decoder/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0, (128,), /device:GPU:0
  dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel:0, (64, 128), /device:GPU:0
  dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias:0, (128,), /device:GPU:0
  dynamic_seq2seq/decoder/output_projection/kernel:0, (32, 2891), 
WARNING:tensorflow:From inference.py:137: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0301 17:29:50.161480 139654575957568 module_wrapper.py:139] From inference.py:137: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From utils/misc_utils.py:134: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0301 17:29:50.161895 139654575957568 module_wrapper.py:139] From utils/misc_utils.py:134: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

Traceback (most recent call last):
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py", line 378, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py", line 368, in main
    run_main(FLAGS, default_hparams, train_fn, inference_fn)
  File "/mnt/Alice/ISI/Thesis/nslt/nslt/nmt.py", line 351, in run_main
    inference_fn(ckpt, flags.inference_input_file, trans_file, hparams, num_workers, jobid)
  File "inference.py", line 125, in inference
    single_worker_inference(infer_model, ckpt, inference_input_file, inference_output_file, hparams)
  File "inference.py", line 139, in single_worker_inference
    loaded_infer_model = model_helper.load_model(infer_model.model, ckpt, sess, "infer")
  File "model_helper.py", line 173, in load_model
    model.saver.restore(session, ckpt)
  File "/home/lorenzo/anaconda3/envs/nslt/lib/python2.7/site-packages/tensorflow_core/python/training/saver.py", line 1277, in restore
    raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
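
The ValueError at the end of the traceback above is what Saver.restore raises when it is handed None, which is consistent with tf.train.latest_checkpoint finding no usable checkpoint state in the directory. A minimal, stdlib-only sketch for checking the directory before running nmt.py might look like this. The helper name inspect_model_dir is hypothetical (it is not part of nslt or TensorFlow), and it assumes the default state file name checkpoint and one .index file per saved checkpoint.

```python
import os


def inspect_model_dir(model_dir):
    """Summarize a TF1-style checkpoint directory.

    Hypothetical helper for illustration; not part of nslt or TensorFlow.
    """
    names = sorted(os.listdir(model_dir))
    # Assumption: every saved checkpoint has exactly one "<prefix>.index" file.
    prefixes = [n[:-len(".index")] for n in names if n.endswith(".index")]
    return {
        # The state file that tf.train.latest_checkpoint looks for by default.
        "has_state_file": "checkpoint" in names,
        # Checkpoint prefixes that could be restored if named explicitly.
        "prefixes": prefixes,
    }
```

If has_state_file is False while prefixes is non-empty, the weights are present but the state file that names the latest checkpoint is not.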

Note: I am working with CUDA 10 and tensorflow-gpu==1.15, and I only want to run inference.
Any help would be very much appreciated. I am still picking up TensorFlow, so I might have missed something very simple; please guide me.

Update 2/3/2020:
After some digging, I found that the Checkpoint Guide states that when checkpoints are created, a file called checkpoint is also created alongside the index, meta, and data files. This checkpoint file is a protocol buffer that records the recent checkpoints, and tf.train.get_checkpoint_state reads it to select the latest checkpoint when tf.train.latest_checkpoint is called.
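
To make that lookup concrete, here is a simplified sketch of what it amounts to: read the checkpoint file and return the path recorded under model_checkpoint_path, or None when the file is absent. This is a stdlib stand-in for illustration only, not TensorFlow's implementation (which parses the file as a CheckpointState protocol buffer and also verifies that the checkpoint files exist). The function name latest_checkpoint_prefix is hypothetical.

```python
import os
import re

# Matches the line that names the most recent checkpoint. Because the
# pattern is anchored at the start of the line, it does not match the
# "all_model_checkpoint_paths" entries.
_LATEST = re.compile(r'^model_checkpoint_path:\s*"(.*)"\s*$')


def latest_checkpoint_prefix(model_dir, state_name="checkpoint"):
    """Return the recorded latest checkpoint prefix, or None if unknown.

    Simplified illustration; unlike TensorFlow, it does not check that
    the checkpoint's index and data files actually exist.
    """
    state_path = os.path.join(model_dir, state_name)
    if not os.path.isfile(state_path):
        # No state file: this is the case that leads to a None checkpoint.
        return None
    with open(state_path) as handle:
        for line in handle:
            match = _LATEST.match(line.strip())
            if match:
                recorded = match.group(1)
                if os.path.isabs(recorded):
                    return recorded
                # Relative entries are resolved against the model directory.
                return os.path.join(model_dir, recorded)
    return None
```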

If the author can provide the missing file, we should be able to run the trained model.

Please pardon and guide me if I am missing something very obvious.


sushantMoon commented May 7, 2020

After training the model on my own, I found that the checkpoint file was indeed missing from the BestModel folder (or others knew about it and I didn't).

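For others who face the same issue: create a file called checkpoint in the BestModel directory that holds the extracted checkpoint files (BestModel/sign2text in the command above, the one passed as --out_dir), then add the following lines to that file: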

model_checkpoint_path: "translate.ckpt-102000"
all_model_checkpoint_paths: "translate.ckpt-102000"
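
The same fix can be scripted. The sketch below writes those two lines into a checkpoint file next to the extracted translate.ckpt-* files. The function name write_checkpoint_state is hypothetical, and the check for a matching .index file is an assumption about how the archive is laid out. TensorFlow 1.x also provides tf.train.update_checkpoint_state for writing this file, which may be preferable when TensorFlow is importable.

```python
import os


def write_checkpoint_state(model_dir, prefix):
    """Write a minimal 'checkpoint' state file that points at `prefix`.

    Hypothetical helper for illustration; not part of nslt or TensorFlow.
    """
    # Assumption: the checkpoint was saved with a "<prefix>.index" file.
    index_file = os.path.join(model_dir, prefix + ".index")
    if not os.path.isfile(index_file):
        raise IOError("no {0}.index found in {1}".format(prefix, model_dir))
    state_path = os.path.join(model_dir, "checkpoint")
    with open(state_path, "w") as handle:
        # A relative prefix is resolved against the directory at load time.
        handle.write('model_checkpoint_path: "{0}"\n'.format(prefix))
        handle.write('all_model_checkpoint_paths: "{0}"\n'.format(prefix))
    return state_path
```

For the setup in this issue, the call would be write_checkpoint_state("BestModel/sign2text", "translate.ckpt-102000"), assuming the archive was extracted there.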

Closing this issue.
Thank You

@hshreeshail

@sushantMoon Could you share a link to the sign2text file under BestModel? The Dropbox link for downloading that file no longer works. Thanks in advance.
