update docs
Fanchengyan committed Jul 28, 2024
1 parent b64f512 commit 165cc10
Showing 9 changed files with 119 additions and 36 deletions.
6 changes: 4 additions & 2 deletions docs/source/Tutorials/base/gpm.rst
Original file line number Diff line number Diff line change
@@ -1,11 +1,13 @@
.. _gpm_example:

==========================
GPM Data Download Tutorial
==========================

1. Finding Data
---------------

-You can find and download data on GES DISC: https://disc.gsfc.nasa.gov/datasets?keywords=GPM&page=1
+You can find and download GPM data on GES DISC: https://disc.gsfc.nasa.gov/datasets?keywords=GPM&page=1

Many datasets are now available on the GPM official website. To locate data quickly, you can use filtering options such as ``Measurement``, ``Project``, ``Spatial Resolution``, etc.

@@ -92,5 +94,5 @@ Create a Python file, copy the code below, change the ``folder_out`` and ``url_f
# Path of the file containing URLs
url_file = "/media/fancy/gpm/subset_GPM_3IMERGM_06_20200513_134318.txt"
-urls = parse_urls.from_urls_file(url_file)
+urls = parse_urls.from_file(url_file)
downloader.download_datas(urls, folder_out)
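After ``downloader.download_datas`` returns, it can be useful to verify that every URL was actually fetched. A minimal sketch, not part of ``data_downloader``, assuming the local filename is the last path segment of each URL:

```python
from pathlib import Path
from urllib.parse import urlparse


def missing_urls(urls, folder_out):
    """Return the URLs whose target file is not present in folder_out."""
    folder = Path(folder_out)
    missing = []
    for url in urls:
        # Assumption: the local filename is the last segment of the URL path
        name = Path(urlparse(url).path).name
        if not (folder / name).exists():
            missing.append(url)
    return missing
```

The returned list can be passed back to ``downloader.download_datas`` to fetch only what is still missing.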
45 changes: 39 additions & 6 deletions docs/source/Tutorials/base/sentinel1.rst
Original file line number Diff line number Diff line change
@@ -75,10 +75,8 @@ Create a Python file, copy the code below, and modify the ``folder_out`` and ``a

.. code-block:: python
-import pandas as pd
import geopandas as gpd
-import faninsar as fis
-from data_downloader import services
+from data_downloader import downloader
# Specify the folder to save the data
folder_out = "/Volumes/Data/sentinel1"
@@ -92,6 +90,41 @@ Create a Python file, copy the code below, and modify the ``folder_out`` and ``a
# Download data
downloader.download_datas(urls, folder_out)
-.. image:: /_static/images/sentinel2/download.png
-   :width: 95%
-   :align: center
+.. image:: /_static/images/sentinel1/download.png
+   :width: 100%

2.3 Retry Download
------------------

If your download is frequently interrupted, you can use the following code to automatically retry the download:


.. code-block:: python

   from pathlib import Path

   import geopandas as gpd
   from data_downloader import downloader

   # Specify the folder to save the data
   folder_out = Path("/Volumes/Data/sentinel1")

   # Load the ASF metadata
   asf_file = "/Volumes/Data/asf-datapool-results-2024-03-29_11-24-18.geojson"

   # get the sentinel-1 urls from the ASF metadata
   df_asf = gpd.read_file(asf_file)
   urls = df_asf.url

   # Download data
   while True:
       try:
           downloader.download_datas(urls, folder_out)

           # check if the download is completed
           files_local = list(folder_out.glob("*.zip"))
           if len(files_local) >= len(urls):
               print("Download completed.")
               break
       except Exception as e:
           print(e)
           print("Retry download...")
           continue
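The loop above retries indefinitely. If you prefer a bounded number of attempts, a plain-Python variant (the helper name and parameters are illustrative, not part of ``data_downloader``) could look like:

```python
import time


def download_with_retries(download_fn, max_retries=5, wait_seconds=10):
    """Call download_fn, retrying up to max_retries times on failure."""
    for attempt in range(1, max_retries + 1):
        try:
            download_fn()
            return True
        except Exception as e:
            print(f"Attempt {attempt}/{max_retries} failed: {e}")
            time.sleep(wait_seconds)
    return False


# Usage with data_downloader would be, for example:
# ok = download_with_retries(lambda: downloader.download_datas(urls, folder_out))
```

A short pause between attempts avoids hammering the server when the connection is flaky.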
4 changes: 3 additions & 1 deletion docs/source/api/parse_urls.rst
Original file line number Diff line number Diff line change
@@ -12,11 +12,13 @@ parse_urls
Functions
---------

-.. autofunction:: data_downloader.parse_urls.from_urls_file
+.. autofunction:: data_downloader.parse_urls.from_file

.. autofunction:: data_downloader.parse_urls.from_html

.. autofunction:: data_downloader.parse_urls.from_sentinel_meta4

.. autofunction:: data_downloader.parse_urls.from_EarthExplorer_order

+.. autofunction:: data_downloader.parse_urls.from_urls_file

2 changes: 1 addition & 1 deletion docs/source/api/tables/parse_urls.csv
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Functions ,Description
-:func:`.from_urls_file` ,parse urls from a file which only contains urls
+:func:`.from_file` ,parse urls from a file which only contains urls
:func:`.from_html` ,parse urls from html website
:func:`.from_sentinel_meta4` ,parse urls from a given JSON file
:func:`.from_EarthExplorer_order` ,parse urls from orders in earthexplorer
8 changes: 8 additions & 0 deletions docs/source/changelog.rst
Original file line number Diff line number Diff line change
@@ -2,6 +2,14 @@
Change Log
==========


Version 1.2 (2024-07-28)
------------------------

- Refine the documentation
- API change: from_urls_file -> from_file


Version 1.1 (2024-04-14)
------------------------

2 changes: 1 addition & 1 deletion docs/source/user_guide/index.rst
Original file line number Diff line number Diff line change
@@ -14,4 +14,4 @@ This is a quick start guide to get you started with ``DataDownloader``. You will

netrc
download
-parse_url
+parse_urls
25 changes: 0 additions & 25 deletions docs/source/user_guide/parse_url.ipynb

This file was deleted.

63 changes: 63 additions & 0 deletions docs/source/user_guide/parse_urls.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
===========================================
parse_urls: Parse URLs from various sources
===========================================

The :ref:`parse_urls` module provides basic functions to parse URLs from different sources. The available functions are summarized below:

.. csv-table:: Different functions to parse URLs
   :file: ../api/tables/parse_urls.csv
   :header-rows: 1

You can import ``parse_urls`` at the beginning:

.. code-block:: python

   from data_downloader import parse_urls

The following is a brief introduction to these functions.

from_file
---------

This function parses URLs from a given file that contains only URLs.

.. tip::

   This function is only useful when the file contains a single column of URLs.
   If the file contains multiple columns, consider using ``pandas`` to read the
   file instead.

Example:

.. code-block:: python

   from data_downloader import parse_urls, downloader

   # folder_out: the directory to save the downloaded files (defined beforehand)
   url_file = '/media/fancy/gpm/subset_GPM_3IMERGM_06_20200513_134318.txt'
   urls = parse_urls.from_file(url_file)
   downloader.download_datas(urls, folder_out)

See :ref:`gpm_example` for a complete use case.
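As the tip above notes, a file with extra columns is easier to handle with ``pandas``. A minimal sketch, where the file layout and the ``url`` column name are assumptions for illustration:

```python
import io

import pandas as pd

# A hypothetical two-column URL file (column names are made up for illustration)
text = """name,url
GPM_A,https://example.com/data/GPM_A.nc
GPM_B,https://example.com/data/GPM_B.nc
"""

# Read the file and extract the column holding the URLs
df = pd.read_csv(io.StringIO(text))
urls = df["url"].tolist()
print(urls)
```

The resulting list can then be passed to ``downloader.download_datas`` as usual.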

from_html
---------

This function parses URLs from a given HTML website (URL). It can filter the parsed URLs by suffix and search depth. The following example shows how to parse URLs with the suffix ``.nc`` and a suffix depth of 1.

Example:

.. code-block:: python

   from data_downloader import parse_urls

   url = 'https://cds-espri.ipsl.upmc.fr/espri/pubipsl/iasib_CH4_2014_uk.jsp'
   urls = parse_urls.from_html(url, suffix=['.nc'], suffix_depth=1)
   urls_all = parse_urls.from_html(url, suffix=['.nc'], suffix_depth=1, url_depth=1)
   print(f"Found {len(urls)} urls, {len(urls_all)} urls in total")

.. code-block:: none

   Found 357 urls, 2903 urls in total
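The parsed URLs are plain strings, so standard Python filtering applies afterwards. A small sketch with made-up filenames, keeping only the files for one year:

```python
# Hypothetical URLs, shaped like the ones from_html returns (made up for illustration)
urls = [
    "https://example.com/iasib/IASI_CH4_2014_01.nc",
    "https://example.com/iasib/IASI_CH4_2014_02.nc",
    "https://example.com/iasib/IASI_CH4_2015_01.nc",
]

# Keep only the files for 2014
urls_2014 = [u for u in urls if "_2014_" in u]
print(urls_2014)
```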
