PyTorchKorea · 9bow · Sep 8, 2020 · Aug 13, 2020 · Aug 15, 2020 · Aug 28, 2020
diff --git a/recipes_source/recipes/loading_data_recipe.py b/recipes_source/recipes/loading_data_recipe.py
@@ -1,32 +1,33 @@
 """
-Loading data in PyTorch
+PyTorch에서 데이터 불러오기
 =======================
-PyTorch features extensive neural network building blocks with a simple,
-intuitive, and stable API. PyTorch includes packages to prepare and load
-common datasets for your model.
+PyTorch는 인공신경망을 만드는데 필요한 다양한 기본 요소를 간단하고 직관적이며
+안정적인 API로 제공합니다. PyTorch는 공용 데이터셋을 쉽게 사용할 수 있도록
+도와주는 패키지를 포함하고 있습니다.
 
-Introduction
+개요
 ------------
-At the heart of PyTorch data loading utility is the
+PyTorch 데이터 불러오기 기능의 핵심은
 `torch.utils.data.DataLoader <https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader>`__
-class. It represents a Python iterable over a dataset. Libraries in
-PyTorch offer built-in high-quality datasets for you to use in
-`torch.utils.data.Dataset <https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset>`__.
-These datasets are currently available in:
+클래스입니다. 데어터를 파이썬 iterable로써 접근할 수 있게 해주는 클래스입니다.
+또한, `torch.utils.data.Dataset <https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset>`__
+클래스를 통해 PyTorch에 내장된 다양한 고품질 데이터셋을 이용하실 수 있습니다.
+
+
+개별 데이터셋은 아래 패키지에서 확인하실 수 있으며, 데이터셋은 계속해서 추가될 예정입니다.
 
 * `torchvision <https://pytorch.org/docs/stable/torchvision/datasets.html>`__
 * `torchaudio <https://pytorch.org/audio/datasets.html>`__
 * `torchtext <https://pytorch.org/text/datasets.html>`__
 
-with more to come.
-Using the Yesno dataset from ``torchaudio.datasets.YESNO``, we will
-demonstrate how to effectively and efficiently load data from a PyTorch
-``Dataset`` into a PyTorch ``DataLoader``.
 
-Setup
+이번 레시피에서는 ``torchaudio.datasets.YESNO`` 데이터셋을 살펴보면서,
+PyTorch ``Dataset`` 에서 PyTorch ``DataLoader`` 로 데이터를 효과적이고 효율적으로
+불러오는 방법을 살펴보겠습니다.
+
+초기 설정
 -----
-Before we begin, we need to install ``torchaudio`` to have access to the
-dataset.
+시작하기 전에, 데이터셋이 포함된 ``torchaudio`` 패키지를 설치합니다.
 
 ::
 
@@ -42,37 +43,36 @@
 
 
 ######################################################################
-# Steps
+# 단계(Steps)
 # -----
 # 
-# 1. Import all necessary libraries for loading our data
-# 2. Access the data in the dataset
-# 3. Loading the data
-# 4. Iterate over the data
-# 5. [Optional] Visualize the data
-# 
+# 1. 데이터를 불러오는데 필요한 라이브러리 import하기
+# 2. 데이터 접근하기
+# 3. 데이터 불러오기
+# 4. 데이터 순회하기
+# 5. [선택 사항] 데이터 시각화하기
+#
 # 
-# 1. Import necessary libraries for loading our data
+# 1. 데이터를 불러오는데 필요한 라이브러리 import하기
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# 
-# For this recipe, we will use ``torch`` and ``torchaudio``. Depending on
-# what built-in datasets you use, you can also install and import
-# ``torchvision`` or ``torchtext``.
-# 
+#
+# 이번 레시피는 ``torch`` 와 ``torchaudio`` 를 사용합니다. 다른 내장 데이터셋이
+# 필요하다면 ``torchvision`` 혹은 ``torchtext`` 를 설치해서 사용해도 됩니다.
+#
 
 import torch
 import torchaudio
 
 
 ######################################################################
-# 2. Access the data in the dataset
+# 2. 데이터에 접근하기
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# 
-# The Yesno dataset in ``torchaudio`` features sixty recordings of one
-# individual saying yes or no in Hebrew; with each recording being eight
-# words long (`read more here <https://www.openslr.org/1/>`__).
-# 
-# ``torchaudio.datasets.YESNO`` creates a dataset for YesNo.
+#
+# ``torchaudio`` 의 YesNo 데이터셋은 한 사람이 히브리어로 yes 혹은
+# no를 녹음한 오디오 클립 60개로 구성되어 있습니다. 오디오 클립 각각의 길이는 단어 8개입니다.
+# (`더 알아보기 <https://www.openslr.org/1/>`__).
+#
+# ``torchaudio.datasets.YESNO`` 클래스를 사용하여 YesNo 데이터셋을 생성합니다.
 # 
 # ::
 #    
@@ -83,43 +83,40 @@
 #      download=False,
 #      transform=None,
 #      target_transform=None)
-# 
-# Each item in the dataset is a tuple of the form: (waveform, sample_rate,
-# labels).
-# 
-# You must set a ``root`` for the Yesno dataset, which is where the
-# training and testing dataset will exist. The other parameters are
-# optional, with their default values shown. Here is some additional
-# useful info on the other parameters:
-
-# * ``download``: If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again.
-# * ``transform``: Using transforms on your data allows you to take it from its source state and transform it into data that’s joined together, de-normalized, and ready for training. Each library in PyTorch supports a growing list of transformations.
-# * ``target_transform``: A function/transform that takes in the target and transforms it.
-# 
-# Let’s access our Yesno data:
-# 
-
-# A data point in Yesno is a tuple (waveform, sample_rate, labels) where labels 
-# is a list of integers with 1 for yes and 0 for no.
+#
+# 각각의 데이터 항목 (item)은 튜플 형태 (waveform: 파형, sample_rate: 샘플 속도, labels: 라벨)를 갖습니다.
+#
+# YesNo 데이터셋을 불러올 때 ``root`` 매개변수는 꼭 지정해주셔야 합니다. ``root`` 는
+# 학습(training) 및 테스트(testing) 데이터셋이 존재하는 위치를 가르켜야 합니다.
+# 그 외의 매개변수는 선택 사항이며, 위 예시에서 기본값을 확인하실 있습니다. 아래와
+# 같은 매개변수도 사용 가능합니다.
+#
+# * ``download``: 참인 경우, 데이터셋 파일을 인터넷에서 다운받고 root 폴더에 저장합니다. 파일이 이미 존재하면 다시 다운받지 않습니다.
+# * ``transform``: 데이터를 변환하여 학습에 사용할 수 있도록 이어붙이고 비정규화된 형태로 불러오실 수 있습니다. 라이브러리마다 다양한 transformation을 지원하고 있으며, 앞으로도 추가될 예정입니다.
+# * ``target_transform``: 타겟 데이터를 변환하기 위한 함수 혹은 transform입니다.
+#
+# 이제 YesNo 데이터를 확인해봅시다:
+
+# YesNo 안에 각각의 데이터 항목은 튜플 형태 (파형, 샘플 속도, 라벨)를 가지며,
+# 이때 labels는 0(no)과 1(yes)을 담은 리스트 형태로 되어 있습니다.
 yesno_data_trainset = torchaudio.datasets.YESNO('./', download=True)
 
-# Pick data point number 3 to see an example of the the yesno_data:
+# 실제 데이터에 접근해서 yesno_data의 형태를 확인합니다. 세 번째 항목을 예시로 살펴봅니다.
 n = 3
 waveform, sample_rate, labels = yesno_data[n]
 print("Waveform: {}\nSample rate: {}\nLabels: {}".format(waveform, sample_rate, labels))
 
 
 ######################################################################
-# When using this data in practice, it is best practice to provision the
-# data into a “training” dataset and a “testing” dataset. This ensures
-# that you have out-of-sample data to test the performance of your model.
-# 
-# 3. Loading the data
+# 실제 상황에서는 데이터를 "학습(training)" 데이터셋과 "테스트(testing)" 데이터셋으로 나누는 것이
+# 권장됩니다. 모델의 성능을 제대로 평가하려면 학습에 쓰이지 않은 out-of-sample
+# 데이터를 이용해야 하기 때문입니다.
+#
+# 3. 데이터 불러오기
 # ~~~~~~~~~~~~~~~~~~~~~~~
 # 
-# Now that we have access to the dataset, we must pass it through
-# ``torch.utils.data.DataLoader``. The ``DataLoader`` combines the dataset
-# and a sampler, returning an iterable over the dataset.
+# 데이터셋에 성공적으로 접근했으니, 이제 데이터셋을 ``torch.utils.data.DataLoader`` 로 넘겨줍니다.
+# ``DataLoader`` 는 데이터셋을 sampler와 조합시켜 데이터셋을 순회할 수 있는 iterable을 만들어줍니다.
 # 
 
 data_loader = torch.utils.data.DataLoader(yesno_data,
@@ -128,14 +125,13 @@
 
 
 ######################################################################
-# 4. Iterate over the data
+# 4. 데이터 순회하기
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 # 
-# Our data is now iterable using the ``data_loader``. This will be
-# necessary when we begin training our model! You will notice that now
-# each data entry in the ``data_loader`` object is converted to a tensor
-# containing tensors representing our waveform, sample rate, and labels.
-# 
+# 이제 ``data_loader`` 를 이용해서 데이터를 순회할 수 있습니다. 모델을 학습하려면 이처럼
+# 데이터를 순회할 수 있어야 합니다. 아래 예시를 보시면 ``data_loader`` 안에 있는 각각의
+# 데이터 항목이 파형, 샘플 속도, 라벨을 담은 텐서로 바뀌었음을 확인할 수 있습니다.
+#
 
 for data in data_loader:
   print("Data: ", data)
@@ -144,11 +140,10 @@
 
 
 ######################################################################
-# 5. [Optional] Visualize the data
+# 5. [선택 사항] 데이터 시각화하기
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 # 
-# You can optionally visualize your data to further understand the output
-# from your ``DataLoader``.
+# ``DataLoader`` 의 데이터를 시각화해서 더 자세히 확인해보실 수 있습니다.
 # 
 
 import matplotlib.pyplot as plt
@@ -160,12 +155,12 @@
 
 
 ######################################################################
-# Congratulations! You have successfully loaded data in PyTorch.
+# 축하드립니다! PyTorch에서 데이터를 불러오는데 성공하셨습니다.
 # 
-# Learn More
+# 더 알아보기
 # ----------
 # 
-# Take a look at these other recipes to continue your learning:
-# 
-# - `Defining a Neural Network <https://pytorch.org/tutorials/recipes/recipes/defining_a_neural_network.html>`__
-# - `What is a state_dict in PyTorch <https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html>`__
+# 다른 레시피를 둘러보고 계속 배워보세요:
+#
+# - :doc:`/recipes/recipes/defining_a_neural_network`
+# - :doc:`/recipes/recipes/what_is_state_dict`
diff --git a/recipes_source/recipes_index.rst b/recipes_source/recipes_index.rst
@@ -31,8 +31,9 @@ Recipes are bite-sized bite-sized, actionable examples of how to use specific Py
 .. Basics
 
 .. customcarditem::
-   :header: Loading data in PyTorch
-   :card_description: Learn how to use PyTorch packages to prepare and load common datasets for your model.
+   :header: PyTorch에서 데이터 불러오기
+   :card_description: PyTorch 패키지를 이용해서 공용 데이터셋을 불러오고 모델에 적용하는 방법을 알아봅니다.
+Learn how to use PyTorch packages to prepare and load common datasets for your model.
    :image: ../_static/img/thumbnails/cropped/loading-data.PNG
    :link: ../recipes/recipes/loading_data_recipe.html
    :tags: Basics