Supervised Reconstruction Dataset + bug fixes and formatting #25
Merged
5 commits
8fa5854 Supervised Semantic Segmentation Dataset (GabrielBG0)
42f5551 some bug fixes (GabrielBG0)
f8582b3 take out redundant __len__ (GabrielBG0)
1d33e4f corrections to supervised_dataset (GabrielBG0)
ffb620f Fix typos in documentation and add more docs (otavioon)
@@ -0,0 +1,90 @@
from typing import List, Tuple

import numpy as np
from base import SimpleDataset

from sslt.data.readers.reader import _Reader
from sslt.transforms.transform import _Transform


class SupervisedReconstructionDataset(SimpleDataset):
    """A simple dataset class for supervised reconstruction tasks.

    In summary, each element of the dataset is a pair of data, where the first
    element is the input data and the second element is the target data.
    Usually, both input and target data have the same shape.

    This dataset is useful for supervised tasks such as image reconstruction,
    semantic segmentation, and object detection, where the input data is the
    original data and the target is a mask or a segmentation map.

    Examples
    --------

    1. Semantic Segmentation Dataset:

    ```python
    from sslt.data.readers import ImageReader
    from sslt.transforms import ImageTransform
    from sslt.data.datasets import SupervisedReconstructionDataset

    # Create the readers
    image_reader = ImageReader("path/to/images")
    mask_reader = ImageReader("path/to/masks")

    # Create the transforms
    image_transform = ImageTransform()

    # Create the dataset
    dataset = SupervisedReconstructionDataset(
        readers=[image_reader, mask_reader],
        transforms=image_transform
    )

    # Load the first sample
    dataset[0]  # Returns a tuple: (image, mask)
    ```
    """

    def __init__(
        self, readers: List[_Reader], transforms: _Transform | None = None
    ):
        """A simple dataset class for supervised reconstruction tasks.

        Parameters
        ----------
        readers: List[_Reader]
            List of data readers. It must contain exactly 2 readers.
            The first reader for the input data and the second reader for the
            target data.
        transforms: _Transform | None
            Optional data transformation pipeline.

        Raises
        ------
        AssertionError
            If the number of readers is not exactly 2.
        """
        super().__init__(readers, transforms)

        assert (
            len(self.readers) == 2
        ), "SupervisedReconstructionDataset requires exactly 2 readers"

    def __getitem__(self, index: int) -> Tuple[np.ndarray, np.ndarray]:
        """Load data from sources and apply specified transforms. The same
        transform is applied to both input and target data.

        Parameters
        ----------
        index : int
            The index of the sample to load.

        Returns
        -------
        Tuple[np.ndarray, np.ndarray]
            A tuple containing two numpy arrays representing the data.
        """
        data = super().__getitem__(index)

        return (data[0], data[1])
SimpleDataset already implements code equivalent to this __getitem__, so you can omit this __getitem__ implementation.
It is worth noting that we cannot infer the return type (inside the tuple) yet, as it depends on the reader. The second element could be an int instead of an np.ndarray, representing the label, for instance.
The goal with this dataset is to be as specific as possible, hence the transform pipeline and the (exactly) two readers. The implementation is almost the same, but it uses only one transform for both data points and returns a known type as its output (the tuple of NumPy arrays), which is better for downstream code in the full pipeline. Sure, as it is implemented, I can't be sure what types the readers return, but if that's a problem I would rather add a check to enforce it than return Any.
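For illustration only, the kind of check mentioned in this comment might look like the sketch below. The helper name `as_array_pair` is hypothetical and not part of the PR or of sslt; it assumes the parent `__getitem__` returns a two-element sequence and rejects anything that is not a NumPy array.

```python
from typing import Any, Sequence, Tuple

import numpy as np


def as_array_pair(data: Sequence[Any]) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: validate that `data` is a pair of NumPy arrays."""
    if len(data) != 2:
        raise ValueError(f"expected exactly 2 elements, got {len(data)}")
    for position, item in enumerate(data):
        # Fail early instead of letting a non-array (e.g. an int label)
        # reach downstream code that expects arrays.
        if not isinstance(item, np.ndarray):
            raise TypeError(
                f"element {position} is {type(item).__name__}, "
                "expected np.ndarray"
            )
    return (data[0], data[1])
```

With a helper like this, `__getitem__` could return `as_array_pair(super().__getitem__(index))`, so a reader that yields an integer label would raise a `TypeError` at load time rather than silently breaking the declared return type.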
Hello @GabrielBG0. I did not see the class name, my bad. [...] SupervisedReconstructionDataset. What do you think? [...] __getitem__ implementation to a simple super().__getitem__(index). This would reduce code duplication and unit test cases.
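A minimal, self-contained sketch of the delegation suggested above follows. `StubSimpleDataset` is a hypothetical stand-in, not the real `SimpleDataset`: it assumes the base class reads one sample per reader and applies the same transform to each, which the PR does not show. Plain Python lists play the role of readers.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

import numpy as np


class StubSimpleDataset:
    """Hypothetical stand-in for SimpleDataset (behaviour is assumed)."""

    def __init__(
        self,
        readers: List[Sequence[Any]],
        transforms: Optional[Callable[[Any], Any]] = None,
    ):
        self.readers = readers
        self.transforms = transforms

    def __len__(self) -> int:
        return len(self.readers[0])

    def __getitem__(self, index: int) -> List[Any]:
        # Assumed behaviour: one sample per reader, same transform for all.
        samples = [reader[index] for reader in self.readers]
        if self.transforms is not None:
            samples = [self.transforms(sample) for sample in samples]
        return samples


class StubSupervisedDataset(StubSimpleDataset):
    """Subclass that only adds the two-reader constraint and a tuple return."""

    def __init__(self, readers, transforms=None):
        super().__init__(readers, transforms)
        assert len(self.readers) == 2, "exactly 2 readers are required"

    def __getitem__(self, index: int) -> Tuple[np.ndarray, np.ndarray]:
        # All loading and transform logic lives in the parent class.
        data = super().__getitem__(index)
        return (data[0], data[1])


images = [np.zeros((2, 2)), np.ones((2, 2))]
masks = [np.ones((2, 2)), np.zeros((2, 2))]
dataset = StubSupervisedDataset([images, masks], transforms=lambda x: x * 2)
image, mask = dataset[1]
```

Because the subclass contains no loading logic of its own, its tests only need to cover the reader-count constraint and the tuple return type.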
Sure, what do you think about it now?