On Ubuntu 22, training works fine on the CPU, but with --num_gpus=1 I get the error stack below.
The error appears when following the instructions for the demo.
I first thought it was a TensorFlow issue, so I ran GPU training using an example from the TensorFlow tutorials, but that worked fine.
Detected at node 'SelectV2' defined at (most recent call last):
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/threading.py", line 908, in _bootstrap
self._bootstrap_inner()
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/threading.py", line 950, in _bootstrap_inner
self.run()
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 864, in train_step
return self.compute_metrics(x, y, y_pred, sample_weight)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 957, in compute_metrics
self.compiled_metrics.update_state(y, y_pred, sample_weight)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/compile_utils.py", line 459, in update_state
metric_obj.update_state(y_t, y_p, sample_weight=mask)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/utime/evaluation/utils.py", line 22, in wrapper
mask = tf.where(tf.logical_and(
Node: 'SelectV2'
Detected at node 'SelectV2' defined at (most recent call last):
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/threading.py", line 908, in _bootstrap
self._bootstrap_inner()
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/threading.py", line 950, in _bootstrap_inner
self.run()
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step
outputs = model.train_step(data)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 864, in train_step
return self.compute_metrics(x, y, y_pred, sample_weight)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/training.py", line 957, in compute_metrics
self.compiled_metrics.update_state(y, y_pred, sample_weight)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/keras/engine/compile_utils.py", line 459, in update_state
metric_obj.update_state(y_t, y_p, sample_weight=mask)
File "/home/shubham/anaconda3/envs/u-sleep/lib/python3.9/site-packages/utime/evaluation/utils.py", line 22, in wrapper
mask = tf.where(tf.logical_and(
Node: 'SelectV2'
2 root error(s) found.
(0) UNKNOWN: JIT compilation failed.
[[{{node SelectV2}}]]
[[div_no_nan_1/ReadVariableOp/_12]]
(1) UNKNOWN: JIT compilation failed.
[[{{node SelectV2}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_13068]
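To narrow this down, it may help to run a `tf.where(tf.logical_and(...))` mask on its own, outside the training loop. The snippet below is only a sketch: the trace truncates the actual call in `utime/evaluation/utils.py`, so the function `masked_weights`, its inputs, and the `n_classes` bound are hypothetical stand-ins rather than the library's code. It falls back to the CPU when no GPU is visible.

```python
import tensorflow as tf


@tf.function
def masked_weights(y_true, n_classes):
    """Hypothetical mask: 1.0 where 0 <= y_true < n_classes, else 0.0."""
    valid = tf.logical_and(
        tf.greater_equal(y_true, 0),
        tf.less(y_true, n_classes),
    )
    # The three-argument form of tf.where lowers to the SelectV2 op
    # named in the error stack.
    return tf.where(
        valid,
        tf.ones_like(y_true, dtype=tf.float32),
        tf.zeros_like(y_true, dtype=tf.float32),
    )


# Use the first GPU if TensorFlow can see one, otherwise the CPU.
device = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"

with tf.device(device):
    labels = tf.constant([0, 3, 5, -1], dtype=tf.int32)
    mask = masked_weights(labels, n_classes=5)

print(device, mask.numpy())
```

If this small case also fails on the GPU with a JIT compilation error, the problem is probably in the local TensorFlow/CUDA setup rather than in the training script. If it passes, the failure likely depends on something specific to the training graph.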