Add docs for to_tf_dataset #3175

Merged on Nov 3, 2021 (3 commits)

stevhliu
Member

This PR adds some documentation for new features released in v1.13.0, with the main addition being to_tf_dataset:

  • Show how to use to_tf_dataset in the tutorial, and move set_format(type='tensorflow'...) to the Process section (let me know if I'm missing anything @Rocketknight1 😅).
  • Add an example for loading dataset from multiple zipped CSV files to the Load section.
  • Add an example for removing columns for an IterableDataset.
  • Add graphic for visualizing streaming.
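
For readers skimming this thread, here is a rough sketch of two of the features listed above: loading a dataset from several zipped CSV files, and converting a dataset with to_tf_dataset. It is not the text added to the docs. The helper names (`write_zipped_csv`, `load_zipped_csvs`, `to_tf_batches`) and the column names are hypothetical, and the `to_tf_dataset` keyword arguments reflect my understanding of the v1.13.0 signature, so check them against the installed version. The `datasets` import is lazy, and `to_tf_dataset` additionally assumes TensorFlow is installed.

```python
import csv
import io
import zipfile
from pathlib import Path


def write_zipped_csv(zip_path, rows, header=("text", "label")):
    """Write rows into one CSV file stored inside a zip archive (stdlib only)."""
    zip_path = Path(zip_path)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w") as archive:
        # The CSV inside the archive is named after the archive itself.
        archive.writestr(zip_path.stem + ".csv", buffer.getvalue())
    return zip_path


def load_zipped_csvs(zip_paths):
    """Load several zipped CSV files as a single 'train' split.

    Assumption: the installed `datasets` version extracts zip archives
    passed through `data_files` for the "csv" builder.
    """
    from datasets import load_dataset  # lazy import: optional dependency

    data_files = {"train": [str(path) for path in zip_paths]}
    return load_dataset("csv", data_files=data_files)


def to_tf_batches(dataset, feature_cols, label_col, batch_size=16):
    """Convert a Dataset into a tf.data.Dataset of shuffled batches.

    Assumption: columns that are not fixed-length may also need a
    collate_fn (for example a padding collator) to form batches.
    """
    return dataset.to_tf_dataset(
        columns=feature_cols,
        label_cols=label_col,
        batch_size=batch_size,
        shuffle=True,
    )
```

A typical flow would be `load_zipped_csvs([...])["train"]` followed by tokenization, then `to_tf_batches(...)` to get something that can be passed to a Keras `fit` call.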

@stevhliu stevhliu added the documentation Improvements or additions to documentation label Oct 28, 2021
Member

@lhoestq lhoestq left a comment

Thanks for adding all this, this is very helpful !

And damn that gif is really nice

docs/source/loading.rst (outdated, resolved)
docs/source/process.rst (resolved)
Comment on lines +9 to +10
.. image:: /imgs/stream.gif
:align: center
Member

OMG I love it ! Awesome <3

docs/source/use_dataset.rst (outdated, resolved)
@Rocketknight1
Member

This looks great, thank you!

@lhoestq
Member

lhoestq commented Nov 2, 2021

Thanks !

For some reason the new GIF is 6MB, which is a bit heavy for an image on a website. The previous one was around 200KB, which is perfect. For a good experience we usually expect images to be less than 500KB; otherwise it takes too long to load for users with a poor connection. Could you try to reduce its size? Then I think we can merge :)
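
A size budget like this can be checked before pushing. The sketch below is only an illustration and not part of the repository: the function name `oversized_images`, the suffix list, and reading "500KB" as 500 * 1024 bytes are all assumptions.

```python
from pathlib import Path

MAX_BYTES = 500 * 1024  # assumed interpretation of the 500KB guideline
IMAGE_SUFFIXES = {".gif", ".png", ".jpg", ".jpeg", ".svg"}


def oversized_images(root, max_bytes=MAX_BYTES):
    """Return (path, size_in_bytes) for every image under root above max_bytes."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            size = path.stat().st_size
            if size > max_bytes:
                results.append((path, size))
    return results
```

Running it on the docs image folder would list the files that need compressing before merge.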

Member

@lhoestq lhoestq left a comment

Thanks !

@lhoestq lhoestq merged commit bf5aa9c into huggingface:master Nov 3, 2021
@stevhliu stevhliu deleted the to-tf-dataset-docs branch November 3, 2021 15:39
Labels
documentation Improvements or additions to documentation
3 participants