
IndexError: tuple index out of range #38

Open
amir-tagh opened this issue Jun 7, 2022 · 12 comments

Comments

@amir-tagh

Hello,

I am following the example for "Molecule generation pretraining procedure". The first step, "python get_vocab.py --ncpu 16 < data/chembl/all.txt > vocab.txt", completes with no error, but I get "IndexError: tuple index out of range" in the second step:
python preprocess.py --train data/chembl/all.txt --vocab data/chembl/all.txt --ncpu 16 --mode single

Can you please let me know what the problem could be?

Best,
Amir

@orubaba

orubaba commented Jun 14, 2022

#34 should answer your question. I had the same issue. I followed that thread and voilà, it worked.
Run this first:
python preprocess.py --train data/chembl/all.txt --vocab vocab.txt --ncpu 16 --mode single

After it completes, run:
mkdir train_processed

and then:
mv tensor* train_processed/

@amir-tagh
Author

Thanks for your response.

I have a set of SMILES I am working on. Extracting the substructures completes successfully, but the second step gives the following error.

Do you have any idea what the problem could be?

Thanks for your help.

python preprocess.py --train Inforna_correct_for_ML.txt --vocab inforna_vocab.txt --ncpu 16 --mode single

File "preprocess.py", line 109, in
le = (len(all_data) + num_splits - 1) // num_splits
ZeroDivisionError: integer division or modulo by zero

@orubaba

orubaba commented Jun 16, 2022

I suggest you adjust the num_splits formula:
num_splits = len(all_data) // 1000
If len(all_data) < 1000, num_splits becomes 0 because of the floor division, which is what triggers the ZeroDivisionError. Use a denominator that gives num_splits >= 1, e.g. 100, 10, or 5.
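A minimal sketch of that guard (compute_splits is a hypothetical helper for illustration, not a function in preprocess.py):

```python
# Guard against a zero split count when the dataset is small.
# preprocess.py computes the split count as len(all_data) // 1000, which
# floors to 0 for fewer than 1000 molecules and later causes
# "ZeroDivisionError: integer division or modulo by zero".
def compute_splits(n_data, chunk_size=1000):
    """Number of output tensor files, never less than 1."""
    return max(1, n_data // chunk_size)

print(compute_splits(250))   # 1  (small dataset no longer crashes)
print(compute_splits(4500))  # 4
```

Clamping with max(1, ...) keeps the original behavior for large datasets while avoiding the division by zero for small ones.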

@amir-tagh
Author

Thanks a lot for your help.
Now I am at the third step "Train graph generation model" and I am getting the following error. I googled the error but couldnt find a solution.

Thanks,

here is the pytorch version I am using, if it helps:

Name Version Build Channel

pytorch 1.11.0 py3.7_cuda11.1_cudnn8.0.5_0 pytorch

Traceback (most recent call last):
File "train_generator.py", line 96, in
meters = meters + np.array([kl_div, loss.item(), wacc * 100, iacc * 100, tacc * 100, sacc * 100])
File "/home/amir/anaconda3/envs/sampledock/lib/python3.7/site-packages/torch/_tensor.py", line 732, in __array__
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
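The usual fix for this error is to move each tensor to host memory before NumPy sees it, as the message suggests. A sketch of the pattern (to_float is a hypothetical helper; the metric names stand in for those in train_generator.py):

```python
import numpy as np

def to_float(x):
    """Return a Python float from a torch tensor (CPU or CUDA) or a plain number."""
    if hasattr(x, "detach"):  # duck-typed check for torch.Tensor
        return float(x.detach().cpu())
    return float(x)

# With real torch tensors on cuda:0 the failing line would become:
#   meters = meters + np.array([to_float(kl_div), to_float(loss),
#                               to_float(wacc) * 100, to_float(iacc) * 100, ...])
meters = np.array([to_float(0.12), to_float(0.5) * 100])
print(meters)
```

Calling .item() directly on each scalar tensor (wacc.item() * 100, etc.) achieves the same thing, which is what the patch later in this thread does.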

@orubaba

orubaba commented Jun 17, 2022

The error is due to the lack of an NVIDIA GPU enabling CUDA on your machine.

@amir-tagh
Author

But I do have an NVIDIA GPU:

NVIDIA-SMI 470.129.06   Driver Version: 470.129.06   CUDA Version: 11.4
GPU 0: Quadro P2200 | Persistence-M: Off | Bus-Id: 00000000:65:00.0 | Disp.A: On | Uncorr. ECC: N/A
Fan: 44% | Temp: 30C | Perf: P5 | Pwr: 9W / 75W | Memory: 1842MiB / 5050MiB | GPU-Util: 68% | Compute M.: Default | MIG M.: N/A

@orubaba

orubaba commented Jun 18, 2022

Perhaps the driver is not properly installed. Something must be wrong somewhere.

@orubaba

orubaba commented Jun 18, 2022

maybe this can help: 2e56392

@amir-tagh
Author

Thanks, I finally figured out what was wrong and now it is working.

Now I have a problem with finetune_generator.py

I ran chemprop_train on my dataset and got the following in the save_dir:
args.json, fold_0, verbose.log, test_scores.csv, quiet.log

After running finetune_generator.py I get the following error. Can you please let me know how I can trace the problem?

Thanks for your help.


Traceback (most recent call last):
File "/apps/hgraph2graph/20210428/hgraph2graph/finetune_generator.py", line 124, in
score_func = Chemprop(args.chemprop_model)
File "/apps/hgraph2graph/20210428/hgraph2graph/finetune_generator.py", line 37, in __init__
scaler, features_scaler = load_scalers(fname)
ValueError: too many values to unpack (expected 2)

@muammar

muammar commented Jun 27, 2022

Traceback (most recent call last):
File "/apps/hgraph2graph/20210428/hgraph2graph/finetune_generator.py", line 124, in
score_func = Chemprop(args.chemprop_model)
File "/apps/hgraph2graph/20210428/hgraph2graph/finetune_generator.py", line 37, in __init__
scaler, features_scaler = load_scalers(fname)
ValueError: too many values to unpack (expected 2)

Did you solve it? I am trying to figure it out; if I manage to solve it, I will push the changes to my own version of this package: https://github.com/muammar/hgraph2graph

@muammar

muammar commented Jun 27, 2022

OK, I solved it. First, your fine-tune set must not have any headers. It should look like this:

CC
CCO
CNOO
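If your file still carries a header row, it can be stripped with a one-liner (the filenames below are hypothetical, standing in for your fine-tune set):

```shell
# Create an example fine-tune file that wrongly starts with a header row.
printf 'smiles\nCC\nCCO\n' > finetune_with_header.txt
# Drop the first line (the header) and keep only the SMILES.
tail -n +2 finetune_with_header.txt > finetune.txt
cat finetune.txt
```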

Then, you need to apply the following patch:

diff --git a/finetune_generator.py b/finetune_generator.py
index d406d38..995cad3 100755
--- a/finetune_generator.py
+++ b/finetune_generator.py
@@ -35,9 +35,9 @@ class Chemprop(object):
             for fname in files:
                 if fname.endswith(".pt"):
                     fname = os.path.join(root, fname)
-                    scaler, features_scaler = load_scalers(fname)
-                    self.scalers.append(scaler)
-                    self.features_scalers.append(features_scaler)
+                    # scaler, features_scaler = load_scalers(fname)
+                    # self.scalers.append(scaler)
+                    # self.features_scalers.append(features_scaler)
                     model = load_checkpoint(fname)
                     self.checkpoints.append(model)
 
@@ -164,10 +164,10 @@ if __name__ == "__main__":
                     [
                         kl_div,
                         loss.item(),
-                        wacc * 100,
-                        iacc * 100,
-                        tacc * 100,
-                        sacc * 100,
+                        wacc.item() * 100,
+                        iacc.item() * 100,
+                        tacc.item() * 100,
+                        sacc.item() * 100,
                     ]
                 )

See: muammar@a714e29

@amir-tagh
Author

Hi muammar,

Thanks for the solution.
I am using train_translator.py for lead optimization and I am getting the following error. Have you seen this error before? Do you know how to solve it?

Thanks,

Traceback (most recent call last):
File "/apps/hgraph2graph/20210428/hgraph2graph/train_translator.py", line 86, in
loss, kl_div, wacc, iacc, tacc, sacc = model(*batch)
File "/apps/hgraph2graph/20210428/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
TypeError: forward() missing 2 required positional arguments: 'y_orders' and 'beta'
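This error means the training loop unpacks a batch with fewer positional arguments than the model's forward() declares, which usually points to a version mismatch between train_translator.py and the model code. A minimal reproduction of the mismatch (the class and signature below are illustrative, not the real model):

```python
class Translator:
    # Illustrative signature mirroring the two parameters named in the
    # traceback; the real model's forward() takes the graph batch tensors too.
    def forward(self, x_batch, y_batch, y_orders, beta):
        return x_batch, y_batch, y_orders, beta

model = Translator()
batch = ("x_batch", "y_batch")  # the unpacked batch is two items short

try:
    model.forward(*batch)
except TypeError as e:
    print(e)  # forward() missing 2 required positional arguments: 'y_orders' and 'beta'
```

Checking that the preprocessing step and the training script come from the same commit of the repository should make the batch contents and the forward() signature agree again.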
