This repository has been archived by the owner on Dec 10, 2023. It is now read-only.
Again, this might be a problem with how I'm using the repo - there are no instructions - so I'm attaching a Bash terminal.
Everything works great (ffmpeg, sorting scripts) until I try to train:
(deepfacelab) anaconda@8edf84a647e9:~/scripts$ ./6_train_Quick96_no_preview.sh
Running trainer.
[new] No saved models found. Enter a name of a new model :
new
Model first run.
Choose one or several GPU idxs (separated by comma).
[CPU] : CPU
[0] : Quadro RTX 5000
[0] Which GPU indexes to choose? :
0
Initializing models: 0%| | 0/5 [00:00<?, ?it/s]
Error: No OpKernel was registered to support Op 'DepthToSpace' used by node DepthToSpace (defined at /deepfacelab/core/leras/ops/__init__.py:336) with these attrs: [data_format="NCHW", block_size=2, T=DT_FLOAT]
Registered devices: [CPU]
Registered kernels:
device='GPU'; T in [DT_QINT8]
device='GPU'; T in [DT_HALF]
device='GPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_VARIANT]; data_format in ["NHWC"]
device='CPU'; T in [DT_RESOURCE]; data_format in ["NHWC"]
device='CPU'; T in [DT_STRING]; data_format in ["NHWC"]
device='CPU'; T in [DT_BOOL]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX128]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX64]; data_format in ["NHWC"]
device='CPU'; T in [DT_DOUBLE]; data_format in ["NHWC"]
device='CPU'; T in [DT_FLOAT]; data_format in ["NHWC"]
device='CPU'; T in [DT_BFLOAT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_HALF]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT64]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT64]; data_format in ["NHWC"]
[[DepthToSpace]]
Errors may have originated from an input operation.
Input Source operations connected to node DepthToSpace:
LeakyRelu_4 (defined at /deepfacelab/core/leras/archis/DeepFakeArchi.py:58)
Traceback (most recent call last):
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1375, in _do_call
return fn(*args)
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1358, in _run_fn
self._extend_graph()
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1398, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'DepthToSpace' used by {{node DepthToSpace}} with these attrs: [data_format="NCHW", block_size=2, T=DT_FLOAT]
Registered devices: [CPU]
Registered kernels:
device='GPU'; T in [DT_QINT8]
device='GPU'; T in [DT_HALF]
device='GPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_VARIANT]; data_format in ["NHWC"]
device='CPU'; T in [DT_RESOURCE]; data_format in ["NHWC"]
device='CPU'; T in [DT_STRING]; data_format in ["NHWC"]
device='CPU'; T in [DT_BOOL]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX128]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX64]; data_format in ["NHWC"]
device='CPU'; T in [DT_DOUBLE]; data_format in ["NHWC"]
device='CPU'; T in [DT_FLOAT]; data_format in ["NHWC"]
device='CPU'; T in [DT_BFLOAT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_HALF]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT64]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT64]; data_format in ["NHWC"]
[[DepthToSpace]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/deepfacelab/mainscripts/Trainer.py", line 46, in trainerThread
model = models.import_model(model_class_name)(
File "/usr/local/deepfacelab/models/ModelBase.py", line 189, in __init__
self.on_initialize()
File "/usr/local/deepfacelab/models/Model_Quick96/Model.py", line 222, in on_initialize
model.init_weights()
File "/usr/local/deepfacelab/core/leras/layers/Saveable.py", line 104, in init_weights
nn.init_weights(self.get_weights())
File "/usr/local/deepfacelab/core/leras/ops/__init__.py", line 48, in init_weights
nn.tf_sess.run (ops)
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1190, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1368, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/anaconda3/envs/deepfacelab/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1394, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'DepthToSpace' used by node DepthToSpace (defined at /deepfacelab/core/leras/ops/__init__.py:336) with these attrs: [data_format="NCHW", block_size=2, T=DT_FLOAT]
Registered devices: [CPU]
Registered kernels:
device='GPU'; T in [DT_QINT8]
device='GPU'; T in [DT_HALF]
device='GPU'; T in [DT_FLOAT]
device='CPU'; T in [DT_VARIANT]; data_format in ["NHWC"]
device='CPU'; T in [DT_RESOURCE]; data_format in ["NHWC"]
device='CPU'; T in [DT_STRING]; data_format in ["NHWC"]
device='CPU'; T in [DT_BOOL]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX128]; data_format in ["NHWC"]
device='CPU'; T in [DT_COMPLEX64]; data_format in ["NHWC"]
device='CPU'; T in [DT_DOUBLE]; data_format in ["NHWC"]
device='CPU'; T in [DT_FLOAT]; data_format in ["NHWC"]
device='CPU'; T in [DT_BFLOAT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_HALF]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT8]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT16]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT32]; data_format in ["NHWC"]
device='CPU'; T in [DT_INT64]; data_format in ["NHWC"]
device='CPU'; T in [DT_UINT64]; data_format in ["NHWC"]
[[DepthToSpace]]
Errors may have originated from an input operation.
Input Source operations connected to node DepthToSpace:
LeakyRelu_4 (defined at /deepfacelab/core/leras/archis/DeepFakeArchi.py:58)
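As a reading aid for the log above, here is a minimal sketch that pulls the two key facts out of a "No OpKernel was registered" message: which devices TensorFlow registered, and which `data_format` the failing op requested. The helper name `summarize_opkernel_error` and the shortened `SAMPLE` string are hypothetical, written for illustration only; they are not part of DeepFaceLab or TensorFlow.

```python
import re

def summarize_opkernel_error(message: str) -> dict:
    """Summarize a TensorFlow 'No OpKernel was registered' message (hypothetical helper)."""
    # "Registered devices: [CPU]" lists the devices the session actually has.
    devices_match = re.search(r"Registered devices:\s*\[([^\]]*)\]", message)
    devices = []
    if devices_match:
        devices = [d.strip() for d in devices_match.group(1).split(",") if d.strip()]

    # The attrs list carries the layout the node asked for, e.g. data_format="NCHW".
    format_match = re.search(r'data_format="(\w+)"', message)
    requested_format = format_match.group(1) if format_match else None

    return {
        "devices": devices,
        "requested_format": requested_format,
        "gpu_registered": any(d.upper().startswith("GPU") for d in devices),
    }

# Shortened copy of the message from the log above.
SAMPLE = (
    "No OpKernel was registered to support Op 'DepthToSpace' used by node DepthToSpace "
    'with these attrs: [data_format="NCHW", block_size=2, T=DT_FLOAT]\n'
    "Registered devices: [CPU]\n"
    "Registered kernels:\n"
    "device='GPU'; T in [DT_FLOAT]\n"
    "device='CPU'; T in [DT_FLOAT]; data_format in [\"NHWC\"]\n"
)

summary = summarize_opkernel_error(SAMPLE)
print(summary)
```

Read this way, the log says that the only registered device is the CPU, while the CPU kernels for `DepthToSpace` accept only `NHWC` and the node asks for `NCHW`. So the session that raised the error did not have the GPU registered, even though the GPU was listed at the prompt.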
I have CUDA in Docker:
(deepfacelab) anaconda@8edf84a647e9:~/scripts$ nvidia-smi
Thu Jun 3 08:51:38 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.80 Driver Version: 460.80 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:65:00.0 On | Off |
| 34% 34C P8 18W / 230W | 853MiB / 16124MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
Before running the training command above, I installed cuDNN inside Docker:
(deepfacelab) anaconda@8edf84a647e9:~/scripts$ conda install -c conda-forge cudnn
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.9.2
latest version: 4.10.1
Please update conda by running
$ conda update -n base -c defaults conda
## Package Plan ##
environment location: /usr/local/anaconda3/envs/deepfacelab
added / updated specs:
- cudnn
The following packages will be downloaded:
package | build
---------------------------|-----------------
ca-certificates-2021.5.30 | ha878542_0 136 KB conda-forge
certifi-2021.5.30 | py38h578d9bd_0 141 KB conda-forge
cudatoolkit-11.2.2 | he111cf0_8 877.3 MB conda-forge
cudnn-8.1.0.77 | h90431f1_0 634.8 MB conda-forge
openssl-1.1.1k | h7f98852_0 2.1 MB conda-forge
------------------------------------------------------------
Total: 1.48 GB
The following NEW packages will be INSTALLED:
cudatoolkit conda-forge/linux-64::cudatoolkit-11.2.2-he111cf0_8
cudnn conda-forge/linux-64::cudnn-8.1.0.77-h90431f1_0
The following packages will be UPDATED:
ca-certificates 2020.12.5-ha878542_0 --> 2021.5.30-ha878542_0
certifi 2020.12.5-py38h578d9bd_1 --> 2021.5.30-py38h578d9bd_0
openssl 1.1.1j-h7f98852_0 --> 1.1.1k-h7f98852_0
Proceed ([y]/n)? y
Downloading and Extracting Packages
cudatoolkit-11.2.2 | 877.3 MB | #################################################################################################################### | 100%
openssl-1.1.1k | 2.1 MB | #################################################################################################################### | 100%
certifi-2021.5.30 | 141 KB | #################################################################################################################### | 100%
ca-certificates-2021 | 136 KB | #################################################################################################################### | 100%
cudnn-8.1.0.77 | 634.8 MB | #################################################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: \ By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html
- By downloading and using the cuDNN conda packages, you accept the terms and conditions of the NVIDIA cuDNN EULA -
https://docs.nvidia.com/deeplearning/cudnn/sla/index.html
done
I believe the repo installs CUDA correctly, but doesn't install cuDNN. Both are requirements, so I came here looking for the cause of the error. Any advice/help appreciated!
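One way to test the cuDNN guess is to ask the dynamic loader directly whether the CUDA runtime and cuDNN libraries can be opened from inside the `deepfacelab` environment. The sketch below uses only the Python standard library. The library file names are assumptions chosen to match the cudatoolkit 11.2.2 and cudnn 8.1.0.77 packages in the conda log above; the names your TensorFlow build actually looks for may differ.

```python
import ctypes

# Assumed sonames; adjust to whatever your TensorFlow build expects.
CANDIDATE_LIBS = ("libcudart.so.11.0", "libcudnn.so.8")

def check_loadable(names=CANDIDATE_LIBS) -> dict:
    """Return {library name: True/False} depending on whether the loader can open it."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be found or loaded
            results[name] = True
        except OSError:
            results[name] = False
    return results

for lib_name, ok in check_loadable().items():
    print(f"{lib_name}: {'loadable' if ok else 'NOT loadable'}")
```

If a library shows as not loadable even though conda reports it as installed, one possible explanation is that the environment's `lib` directory is not on the loader's search path. That would be worth ruling out before concluding that cuDNN itself is missing.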