You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Traceback (most recent call last):
File "/home/shared/tabzilla/TabSurvey/tabzilla_experiment.py", line 137, in __call__
result = cross_validation(model, self.dataset, self.time_limit)
File "/home/shared/tabzilla/TabSurvey/tabzilla_utils.py", line 236, in cross_validation
loss_history, val_loss_history = curr_model.fit(
File "/home/shared/tabzilla/TabSurvey/models/tabtransformer.py", line 120, in fit
loss.backward()
File "/opt/conda/envs/torch/lib/python3.10/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/opt/conda/envs/torch/lib/python3.10/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
The text was updated successfully, but these errors were encountered:
update - this is a nasty bug.. there are a handful discussions on stackexchange and other github repos trying to diagnose this "CUDA error: invalid configuration argument" error.
this is also an intermediate bug - e.g. it occurs on the datasets listed in the original post, but doesn't occur on many other datasets (e.g., "openml__credit-approval__29" is fine)
occurs on datasets:
traceback:
The text was updated successfully, but these errors were encountered: