[doc] Update docstrings/documentations of all the datasets #931

mthrok · 2020-10-01T22:13:40Z

Currently the documentation page does not list constructor arguments or overridden __getitem__ methods. This PR fixes that.

(Note: the annotation has been broken from the beginning and I cannot fix it.)

Example:

torchaudio/datasets/cmuarctic.py

mruberry · 2020-10-01T22:33:57Z

torchaudio/datasets/cmuarctic.py

@@ -76,9 +76,15 @@ def load_cmuarctic_item(line: str,


 class CMUARCTIC(Dataset):
-    """
-    Create a Dataset for CMU_arctic. Each item is a tuple of the form:


Do you really want to remove the information about the structure of these datasets here? I understand it's going in the getitem documentation.

It's moved to the __getitem__ method.

I think describing it in __getitem__ gives better locality, which is better for the developer experience. If something changes on the implementation side, developers only need to update the corresponding docstring, and there is less chance of a discrepancy between the docstring and the actual implementation than when the same information is repeated in the class description.
Of course, we have the current docathon to catch such discrepancies, so one could argue that it's okay to have them, but they would still confuse users on the master branch.
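
To illustrate the locality argument, here is a minimal sketch, not the actual torchaudio code. `ToyDataset` and its `items` argument are hypothetical names made up for this example. The class docstring documents only the constructor arguments, while the structure of a returned item is described once, next to the code that produces it.

```python
from typing import List, Tuple


class ToyDataset:
    """Create a toy dataset (hypothetical class, for illustration only).

    Args:
        items (list of (str, int)): Pre-loaded ``(utterance, sample_rate)`` pairs.
    """

    def __init__(self, items: List[Tuple[str, int]]) -> None:
        self._items = items

    def __getitem__(self, n: int) -> Tuple[str, int]:
        """Load the n-th sample from the dataset.

        Args:
            n (int): The index of the sample to be loaded.

        Returns:
            tuple: ``(utterance, sample_rate)``
        """
        # The item structure is documented only here, next to the code
        # that builds it, so a change to the return value touches one docstring.
        return self._items[n]

    def __len__(self) -> int:
        return len(self._items)


dataset = ToyDataset([("hello", 16000), ("world", 8000)])
first = dataset[0]
```

If the returned tuple later gains a field, only the `__getitem__` docstring needs to change; the class description cannot drift out of sync because it never repeats the item layout.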

torchaudio/datasets/commonvoice.py

torchaudio/datasets/gtzan.py

torchaudio/datasets/cmuarctic.py

mruberry · 2020-10-01T22:40:50Z

torchaudio/datasets/tedlium.py

@@ -140,7 +116,7 @@ def _load_tedlium_item(self, fileid: str, line: int, path: str) -> Tuple[Tensor,
            path (str): Dataset root path

        Returns:
-            Tedlium_item: A namedTuple containing [waveform, sample_rate, transcript, talk_id, speaker_id, identifier]
+            tuple: ``(waveform, sample_rate, transcript, talk_id, speaker_id, identifier)``


Nice fix here.

torchaudio/datasets/vctk.py

torchaudio/datasets/cmuarctic.py

torchaudio/datasets/speechcommands.py

mruberry

These changes look like a significant improvement.

In the future it may be interesting to enumerate all supported inputs for each param along with a brief description of what they do.

vincentqb · 2020-10-02T14:58:00Z

docs/source/datasets.rst

+  :members:
+  :special-members: __getitem__


nit: out of curiosity, does this change the output?
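
For context on this question: as far as I know, Sphinx autodoc's `:members:` option leaves out special methods such as `__getitem__` unless `:special-members:` names them, which matches the PR description's point that the overridden `__getitem__` methods were not listed. The sketch below shows the same two options set project-wide in a `conf.py` through `autodoc_default_options`. This is an assumed alternative for illustration only; the PR itself sets the options per class in `docs/source/datasets.rst`.

```python
# conf.py (sketch): project-wide equivalent of the per-class options in the diff.
# This is an assumption for illustration; the PR sets the options in datasets.rst.
extensions = ["sphinx.ext.autodoc"]

autodoc_default_options = {
    # Document public members of every autodoc'ed class.
    "members": True,
    # Also include __getitem__, which autodoc would otherwise skip.
    "special-members": "__getitem__",
}
```

Setting the options per directive, as the PR does, keeps the change limited to the dataset classes; the project-wide form would apply to every documented class.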

torchaudio/datasets/librispeech.py

vincentqb · 2020-10-02T15:00:30Z

torchaudio/datasets/libritts.py

+
+    Args:
+        root (str): Path to the directory where the dataset is found or downloaded.
+        url (str, optional): Type of the dataset to dowload. This is **NOT** the actual URL.


same about code path for url

torchaudio/datasets/speechcommands.py

torchaudio/datasets/commonvoice.py

torchaudio/datasets/cmuarctic.py

vincentqb

thanks for updating this, lgtm overall :)

Update docstrings/documentations of all the datasets

f87da36

mruberry reviewed Oct 1, 2020

torchaudio/datasets/cmuarctic.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/cmuarctic.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/commonvoice.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/gtzan.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/cmuarctic.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/vctk.py (resolved)

mthrok added 6 commits October 1, 2020 18:52

Revert removal of -> None

4460195

Update download help message

69ae2c5

list all the possible value for commonvoice

5b23324

Update gtzan subset

bb31eae

add cmuarctic urls

6be3f6f

Remove returm item description from VCTK_092 class description

80a97ed

mruberry reviewed Oct 1, 2020

torchaudio/datasets/cmuarctic.py (outdated, resolved)

mruberry reviewed Oct 1, 2020

torchaudio/datasets/speechcommands.py (resolved)

mruberry approved these changes Oct 1, 2020

vincentqb reviewed Oct 2, 2020

torchaudio/datasets/librispeech.py (outdated, resolved)

vincentqb reviewed Oct 2, 2020

torchaudio/datasets/speechcommands.py (outdated, resolved)

vincentqb reviewed Oct 2, 2020

torchaudio/datasets/commonvoice.py (outdated, resolved)

vincentqb reviewed Oct 2, 2020

torchaudio/datasets/cmuarctic.py (outdated, resolved)

mthrok added 3 commits October 2, 2020 11:09

Fill allowed values

fd99e1f

Update VCTK

189fcd5

Fix URL description

3a54ac0

vincentqb approved these changes Oct 2, 2020

mthrok merged commit e3d1d74 into pytorch:master Oct 2, 2020

mthrok deleted the doc-datasets branch October 2, 2020 16:50

mthrok changed the title ~~Update docstrings/documentations of all the datasets~~ [doc] Update docstrings/documentations of all the datasets Oct 14, 2020

mthrok added the dockathon label Jan 13, 2021

mpc001 pushed a commit to mpc001/audio that referenced this pull request Aug 4, 2023

Merge pull request pytorch#931 from jamesr66a/profiling_tracer

3970e06

Add profiling tracer example
