Enable Offline-ER for NestedTensors #336

Merged: 2 commits, Jul 24, 2023

wistuba (Contributor) commented Jul 11, 2023

Offline-ER expected the dataloader to return torch.Tensors, which is often not the case, e.g., for text data with transformers.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
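
To illustrate the problem described above, here is a minimal sketch of a non-tensor batch. The dictionary keys and tensor sizes are assumptions chosen for illustration; they are not taken from the PR.

```python
import torch

# Hypothetical batch, shaped like what a transformers-style collate function
# might yield for text data: a dict of tensors rather than a single tensor.
batch = {
    "input_ids": torch.zeros(8, 128, dtype=torch.long),
    "attention_mask": torch.ones(8, 128, dtype=torch.long),
}

# A plain dict has no `.shape`, so code that assumes a torch.Tensor fails here.
has_shape = hasattr(batch, "shape")

# The batch size is still recoverable from any leaf tensor's first dimension.
batch_size = batch["input_ids"].shape[0]
```

Code that reads `batch.shape` directly would raise an AttributeError on such a batch, which is presumably the kind of failure this PR addresses.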

wistuba requested a review from 610v4nn1 on July 11, 2023, 12:22
github-actions bot commented Jul 11, 2023

Coverage report

The coverage rate went from 85.68% to 85.33% ⬇️

86.36% of new lines are covered.

Diff coverage details

src/renate/updaters/learner_components/losses.py

0% of new lines are covered (54.19% of the complete file).
Missing lines: 368

src/renate/utils/pytorch.py

100% of new lines are covered (92.72% of the complete file).

src/renate/updaters/experimental/offline_er.py

66.66% of new lines are covered (81.08% of the complete file).
Missing lines: 118, 119

    """Given a NestedTensor, return its batch size."""
    if isinstance(batch, torch.Tensor):
        return batch.shape
    if isinstance(batch, tuple):
Reviewer (Contributor):

Are we assuming that all the tensors in the tuple have the same shape, or that only the first one actually contains data? I think it's important to make this clear.

wistuba (Contributor, Author):

You are right, technically they could have different shapes. I've renamed the function to reflect that it now returns only the first dimension, and I've updated the docstring to say that we expect the first dimension to match.
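
The behavior described in this reply could be sketched roughly as follows. This is not the merged implementation: the function name `first_dim_nested_tensors`, the `NestedTensors` alias, and the handling of dicts are assumptions made for illustration.

```python
from typing import Dict, Tuple, Union

import torch

# Assumed type alias for illustration; the PR excerpt does not show its definition.
NestedTensors = Union[torch.Tensor, Tuple["NestedTensors", ...], Dict[str, "NestedTensors"]]


def first_dim_nested_tensors(batch: NestedTensors) -> int:
    """Return the size of the first dimension of a nested batch.

    All leaf tensors are expected to share the same first dimension; only the
    first leaf is inspected, so a mismatch elsewhere is not detected here.
    """
    if isinstance(batch, torch.Tensor):
        return batch.shape[0]
    if isinstance(batch, tuple):
        return first_dim_nested_tensors(batch[0])
    if isinstance(batch, dict):
        return first_dim_nested_tensors(next(iter(batch.values())))
    raise TypeError(f"Unsupported batch type: {type(batch).__name__}")


# Usage: leaves may differ in every dimension except the first.
tuple_batch = (torch.zeros(5, 2), torch.zeros(5, 7))
dict_batch = {"input_ids": torch.zeros(3, 16), "labels": torch.zeros(3)}
tuple_size = first_dim_nested_tensors(tuple_batch)
dict_size = first_dim_nested_tensors(dict_batch)
```

Returning a single integer instead of a full shape sidesteps the ambiguity raised in the review, since the leaves only need to agree on the batch dimension.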

    assert get_shape_nested_tensors(dict_tensor)[0] == expected_batch_size


def test_cat_nested_tensors():
Reviewer (Contributor):

It would be good to also test the behavior in case of failure (e.g., a shape mismatch).

wistuba (Contributor, Author):

I've added a test
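
The test added in the PR is not shown in this conversation. As a rough sketch of what a failure test of the requested kind might look like, the example below uses a stand-in `cat_nested_tensors` written for illustration only; its signature and behavior are assumptions and may differ from the function in the repository.

```python
import torch


def cat_nested_tensors(batches, axis=0):
    """Stand-in for illustration: concatenate nested batches leaf by leaf."""
    first = batches[0]
    if isinstance(first, torch.Tensor):
        return torch.cat(batches, dim=axis)
    if isinstance(first, tuple):
        # Group the i-th element of every batch and concatenate each group.
        return tuple(cat_nested_tensors(parts, axis) for parts in zip(*batches))
    if isinstance(first, dict):
        return {key: cat_nested_tensors([b[key] for b in batches], axis) for key in first}
    raise TypeError(f"Unsupported batch type: {type(first).__name__}")


def test_cat_nested_tensors_shape_mismatch():
    # The leaves disagree in a dimension other than the concatenation axis,
    # which torch.cat rejects with a RuntimeError.
    a = {"x": torch.zeros(2, 3)}
    b = {"x": torch.zeros(2, 4)}
    try:
        cat_nested_tensors([a, b])
    except RuntimeError:
        return
    raise AssertionError("expected a RuntimeError for mismatched shapes")


test_cat_nested_tensors_shape_mismatch()
```

In a pytest suite, the try/except would more idiomatically be written with `pytest.raises(RuntimeError)`; the plain form is used here only to keep the sketch free of extra dependencies.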

610v4nn1 merged commit 38d9143 into dev on Jul 24, 2023
610v4nn1 deleted the mw-offline-er-fix branch on July 24, 2023, 14:56