
Replacing the numpy coo_matrix by torch_coo_tensor #261

Closed. Wants to merge 2 commits.

Conversation

DomInvivo (Collaborator):

Using torch sparse instead of numpy sparse. Didn't check if it affects the memory, but now you can better use Batch.from_data_list since you don't have numpy objects, only torch tensors
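For context, the change described above amounts to building the sparse adjacency with `torch.sparse_coo_tensor` instead of scipy's `coo_matrix`. A minimal sketch of the equivalence (variable names are illustrative, not taken from the PR diff):

```python
import numpy as np
import torch
from scipy.sparse import coo_matrix

# Before: a scipy COO matrix (a numpy-backed object, which Batch.from_data_list
# cannot collate alongside torch tensors).
mat = coo_matrix(np.array([[0.0, 1.0], [2.0, 0.0]]))

# After: the equivalent torch sparse COO tensor, built from the same
# row/col indices and values.
indices = torch.tensor(np.vstack([mat.row, mat.col]), dtype=torch.long)
values = torch.tensor(mat.data, dtype=torch.float32)
sp = torch.sparse_coo_tensor(indices, values, size=mat.shape)

# Both representations densify to the same matrix.
assert torch.equal(sp.to_dense(), torch.tensor(mat.toarray(), dtype=torch.float32))
```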

@zhiyil1230 (Contributor) left a comment:

I see this error when testing this PR, not sure if you've seen the same?

```
2023-03-23 10:28:53.930 | WARNING | goli.data.datamodule:_featurize_molecules:1484 - 1 molecules will be removed since they failed featurization:
idx=0 - smiles=nan - Error_msg[:-200]=
Python argument types in
    rdkit.Chem.rdmolops.RemoveHs(float)
did not match C++ signature:

Traceback (most recent call last):
  File "expts/main_run_multitask.py", line 93, in <module>
    main(cfg)
  File "expts/main_run_multitask.py", line 69, in main
    predictor.set_max_nodes_edges_per_graph(datamodule, stages=["train", "val"])
  File "/nethome/zhiyil/git/goli/goli/trainer/predictor.py", line 590, in set_max_nodes_edges_per_graph
    datamodule.setup()
  File "/nethome/zhiyil/git/goli/goli/data/datamodule.py", line 1350, in setup
    print(self.train_ds)
  File "/nethome/zhiyil/git/goli/goli/data/datamodule.py", line 536, in __repr__
    + f"\tnum_nodes_total = {self.num_nodes_total}\n"
  File "/nethome/zhiyil/git/goli/goli/data/datamodule.py", line 339, in num_nodes_total
    features = features.to_data_list()
  File "/nethome/zhiyil/.venv/goli_ipu/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 169, in to_data_list
    return [self.get_example(i) for i in range(self.num_graphs)]
  File "/nethome/zhiyil/.venv/goli_ipu/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 169, in <listcomp>
    return [self.get_example(i) for i in range(self.num_graphs)]
  File "/nethome/zhiyil/.venv/goli_ipu/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 103, in get_example
    data = separate(
  File "/nethome/zhiyil/.venv/goli_ipu/lib/python3.8/site-packages/torch_geometric/data/separate.py", line 37, in separate
    data_store[attr] = _separate(attr, batch_store[attr], idx, slices,
  File "/nethome/zhiyil/.venv/goli_ipu/lib/python3.8/site-packages/torch_geometric/data/separate.py", line 65, in _separate
    value = value.narrow(cat_dim or 0, start, end - start)
NotImplementedError: Could not run 'aten::as_strided' with arguments from the 'SparseCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::as_strided' is only available for these backends: [CPU, CUDA, IPU, Meta, QuantizedCPU, QuantizedCUDA, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
```

@DomInvivo (Collaborator, Author) commented Mar 23, 2023:

I don't think this will work. For some reason, you cannot do `batch[idx]` if your batch contains sparse tensors. It raises the error reported by @zhiyil-graphcore above:

```
NotImplementedError: Could not run 'aten::as_strided' with arguments from the 'SparseCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::as_strided' is only available for these backends: [CPU, IPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].
```
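The failure can be reproduced outside of PyG. The traceback shows that `separate()` slices batched attributes with `Tensor.narrow`, which dispatches to `aten::as_strided`, an op not implemented for the SparseCPU backend. A minimal sketch (assuming a PyTorch version where this op is still unsupported for sparse COO tensors):

```python
import torch

# A tiny sparse COO tensor, standing in for a batched adjacency attribute.
indices = torch.tensor([[0, 1], [1, 0]])
values = torch.tensor([1.0, 2.0])
sp = torch.sparse_coo_tensor(indices, values, (2, 2))

# PyG's separate()/get_example() calls narrow() on each batched attribute;
# on a sparse tensor this hits aten::as_strided, which SparseCPU lacks.
try:
    sp.narrow(0, 0, 1)
    print("narrow succeeded (a newer PyTorch may have added sparse support)")
except (NotImplementedError, RuntimeError) as e:
    print(f"narrow failed on the sparse tensor: {type(e).__name__}")
```

This is why the batch collates fine with `Batch.from_data_list` but fails only later, when `to_data_list()` tries to slice the individual graphs back out.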

@DomInvivo (Collaborator, Author) commented:

I opened an issue on the pyg repo.

@DomInvivo closed this Mar 23, 2023.