add tests for (Dataset|Image)Folder #3477
@pmeier I'm not sure I get this. Is this a bug in the dataset implementation or in the test? In principle, having an empty folder in the root dir shouldn't be a problem, so I would change the test to take this into account.
Depends. Do we allow entries in the |
Following the discussion we had for |
Blocked by #3496.
- empty_classes = available_classes - set(class_to_idx.keys())
+ empty_classes = set(class_to_idx.keys()) - available_classes
This was missed in #3496.
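To make the effect of the swapped operands concrete, here is a minimal sketch with made-up values. The contents of `class_to_idx` and `available_classes` below are assumptions chosen for illustration, not data from the actual test suite.

```python
# Hypothetical stand-ins for the values built while collecting samples.
class_to_idx = {"cat": 0, "dog": 1, "empty": 2}
# Classes for which at least one valid file was found (assumed).
available_classes = {"cat", "dog"}

# Old operand order: "classes with samples that are not known classes".
# Every available class is a key of class_to_idx, so this is always empty.
old_result = available_classes - set(class_to_idx.keys())

# Corrected operand order: "known classes without any sample".
empty_classes = set(class_to_idx.keys()) - available_classes

print(old_result)     # set()
print(empty_classes)  # {'empty'}
```

With the old order the check for empty classes could never trigger, which is presumably why the problem went unnoticed.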
This will close #963 since this adds tests for |
Thanks!
There is a conflict, can you look into it?
Codecov Report
@@            Coverage Diff             @@
##           master    #3477      +/-   ##
==========================================
- Coverage   79.48%   79.47%   -0.02%
==========================================
  Files         105      105
  Lines        9822     9822
  Branches     1582     1582
==========================================
- Hits         7807     7806       -1
  Misses       1527     1527
- Partials      488      489       +1
Summary:
* add tests for (Dataset|Image)Folder
* lint
* remove old tests
* cleanup
* more cleanup
* adapt tests
* fix make_dataset
* remove powerset
* readd import

Reviewed By: fmassa
Differential Revision: D27433923
fbshipit-source-id: 6ea3fb79f41e255045a642dcadedd8fa813e9dcc
The failing test is valid. We first call _find_classes() (vision/torchvision/datasets/folder.py, line 126 in 9846569), which scans the root directory for subdirectories (line 164 in 9846569), and afterwards collect all samples (line 127 in 9846569).

If a subdirectory contains no files, or no files that match, it is still listed in the classes attribute. A brute-force fix would be to iterate over the samples again and remove all classes that are not present. A more efficient way would be to flag all classes that contain at least one sample during make_dataset and simply remove the empty ones. But this would require changing the output of make_dataset, which is BC-breaking. Thoughts?
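The "flag classes during collection" idea could be sketched roughly as follows. This is a simplified, self-contained illustration and not torchvision's implementation: `find_classes` and `collect_samples` are hypothetical helpers, and the directory layout is created in a temporary folder purely for the demo.

```python
import os
import tempfile


def find_classes(root):
    """List every subdirectory of root as a class, even if it holds no files."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return classes, {name: idx for idx, name in enumerate(classes)}


def collect_samples(root, class_to_idx, extensions):
    """Collect (path, class_index) pairs and flag classes with at least one match."""
    samples, non_empty = [], set()
    for name in sorted(class_to_idx):
        for dirpath, _, filenames in sorted(os.walk(os.path.join(root, name))):
            for filename in sorted(filenames):
                if filename.lower().endswith(extensions):
                    samples.append((os.path.join(dirpath, filename), class_to_idx[name]))
                    non_empty.add(name)
    return samples, non_empty


with tempfile.TemporaryDirectory() as root:
    # Three class folders; only "cat" and "dog" receive a file.
    for name in ("cat", "dog", "empty"):
        os.mkdir(os.path.join(root, name))
    for name in ("cat", "dog"):
        open(os.path.join(root, name, "0.png"), "w").close()

    classes, class_to_idx = find_classes(root)
    samples, non_empty = collect_samples(root, class_to_idx, (".png",))
    empty_classes = set(class_to_idx) - non_empty

print(classes)        # ['cat', 'dog', 'empty']
print(empty_classes)  # {'empty'}
```

Note that `collect_samples` returns a second value here. That extra return value is exactly the kind of output change the comment above describes as BC-breaking; an alternative that keeps the signature would be to derive the non-empty classes from the returned samples afterwards, at the cost of a second pass.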