Preparing release 2.11.0 (#885)

* Adding Data Api tutorial and fixing docs
* [skip ci] - Minor - Incorrect bump
* CB Tests 2.11.0
* Minor - Clarifying Glue Spark docs
aws · Sep 1, 2021 · e216d53
1 parent 11e58a8
commit e216d53

Showing 24 changed files with 198 additions and 87 deletions.
diff --git a/.bumpversion.cfg b/.bumpversion.cfg
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 2.10.0
+current_version = 2.11.0
 commit = False
 tag = False
 tag_name = {new_version}

diff --git a/CONTRIBUTING_COMMON_ERRORS.md b/CONTRIBUTING_COMMON_ERRORS.md
@@ -13,9 +13,9 @@ Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in ./.venv/lib/python3.7/site-
 Using legacy 'setup.py install' for python-Levenshtein, since package 'wheel' is not installed.
 Installing collected packages: awswrangler, python-Levenshtein
   Attempting uninstall: awswrangler
-    Found existing installation: awswrangler 2.10.0
-    Uninstalling awswrangler-2.10.0:
-      Successfully uninstalled awswrangler-2.10.0
+    Found existing installation: awswrangler 2.11.0
+    Uninstalling awswrangler-2.11.0:
+      Successfully uninstalled awswrangler-2.11.0
   Running setup.py develop for awswrangler
     Running setup.py install for python-Levenshtein ... error
     ERROR: Command errored out with exit status 1:

diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, Clo
 
 > An [AWS Professional Service](https://aws.amazon.com/professional-services/) open source initiative | aws-proserve-opensource@amazon.com
 
-[![Release](https://img.shields.io/badge/release-2.10.0-brightgreen.svg)](https://pypi.org/project/awswrangler/)
+[![Release](https://img.shields.io/badge/release-2.11.0-brightgreen.svg)](https://pypi.org/project/awswrangler/)
 [![Python Version](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9-brightgreen.svg)](https://anaconda.org/conda-forge/awswrangler)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
@@ -23,7 +23,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, Clo
 | **[PyPi](https://pypi.org/project/awswrangler/)**  | [![PyPI Downloads](https://pepy.tech/badge/awswrangler)](https://pypi.org/project/awswrangler/) | `pip install awswrangler` |
 | **[Conda](https://anaconda.org/conda-forge/awswrangler)** | [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/awswrangler.svg)](https://anaconda.org/conda-forge/awswrangler) | `conda install -c conda-forge awswrangler` |
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️ `pip install pyarrow==2 awswrangler`
 
 Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](https://arrow.apache.org/powered_by/)
@@ -42,7 +42,7 @@ Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](http
 
 Installation command: `pip install awswrangler`
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️`pip install pyarrow==2 awswrangler`
 
 ```py3
@@ -96,17 +96,17 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
 
 ## [Read The Docs](https://aws-data-wrangler.readthedocs.io/)
 
-- [**What is AWS Data Wrangler?**](https://aws-data-wrangler.readthedocs.io/en/2.10.0/what.html)
-- [**Install**](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html)
-  - [PyPi (pip)](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#pypi-pip)
-  - [Conda](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#conda)
-  - [AWS Lambda Layer](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#aws-lambda-layer)
-  - [AWS Glue Python Shell Jobs](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#aws-glue-python-shell-jobs)
-  - [AWS Glue PySpark Jobs](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#aws-glue-pyspark-jobs)
-  - [Amazon SageMaker Notebook](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#amazon-sagemaker-notebook)
-  - [Amazon SageMaker Notebook Lifecycle](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#amazon-sagemaker-notebook-lifecycle)
-  - [EMR](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#emr)
-  - [From source](https://aws-data-wrangler.readthedocs.io/en/2.10.0/install.html#from-source)
+- [**What is AWS Data Wrangler?**](https://aws-data-wrangler.readthedocs.io/en/2.11.0/what.html)
+- [**Install**](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html)
+  - [PyPi (pip)](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#pypi-pip)
+  - [Conda](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#conda)
+  - [AWS Lambda Layer](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#aws-lambda-layer)
+  - [AWS Glue Python Shell Jobs](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#aws-glue-python-shell-jobs)
+  - [AWS Glue PySpark Jobs](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#aws-glue-pyspark-jobs)
+  - [Amazon SageMaker Notebook](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#amazon-sagemaker-notebook)
+  - [Amazon SageMaker Notebook Lifecycle](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#amazon-sagemaker-notebook-lifecycle)
+  - [EMR](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#emr)
+  - [From source](https://aws-data-wrangler.readthedocs.io/en/2.11.0/install.html#from-source)
 - [**Tutorials**](https://github.com/awslabs/aws-data-wrangler/tree/main/tutorials)
   - [001 - Introduction](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/001%20-%20Introduction.ipynb)
   - [002 - Sessions](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/002%20-%20Sessions.ipynb)
@@ -136,22 +136,22 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
   - [026 - Amazon Timestream](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/026%20-%20Amazon%20Timestream.ipynb)
   - [027 - Amazon Timestream 2](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/027%20-%20Amazon%20Timestream%202.ipynb)
   - [028 - Amazon DynamoDB](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/028%20-%20DynamoDB.ipynb)
-- [**API Reference**](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html)
-  - [Amazon S3](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-s3)
-  - [AWS Glue Catalog](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#aws-glue-catalog)
-  - [Amazon Athena](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-athena)
-  - [Amazon Redshift](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-redshift)
-  - [PostgreSQL](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#postgresql)
-  - [MySQL](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#mysql)
-  - [SQL Server](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#sqlserver)
-  - [DynamoDB](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#dynamodb)
-  - [Amazon Timestream](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-timestream)
-  - [Amazon EMR](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-emr)
-  - [Amazon CloudWatch Logs](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-cloudwatch-logs)
-  - [Amazon Chime](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-chime)
-  - [Amazon QuickSight](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#amazon-quicksight)
-  - [AWS STS](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#aws-sts)
-  - [AWS Secrets Manager](https://aws-data-wrangler.readthedocs.io/en/2.10.0/api.html#aws-secrets-manager)
+- [**API Reference**](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html)
+  - [Amazon S3](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-s3)
+  - [AWS Glue Catalog](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#aws-glue-catalog)
+  - [Amazon Athena](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-athena)
+  - [Amazon Redshift](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-redshift)
+  - [PostgreSQL](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#postgresql)
+  - [MySQL](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#mysql)
+  - [SQL Server](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#sqlserver)
+  - [DynamoDB](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#dynamodb)
+  - [Amazon Timestream](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-timestream)
+  - [Amazon EMR](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-emr)
+  - [Amazon CloudWatch Logs](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-cloudwatch-logs)
+  - [Amazon Chime](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-chime)
+  - [Amazon QuickSight](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#amazon-quicksight)
+  - [AWS STS](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#aws-sts)
+  - [AWS Secrets Manager](https://aws-data-wrangler.readthedocs.io/en/2.11.0/api.html#aws-secrets-manager)
 - [**License**](https://github.com/awslabs/aws-data-wrangler/blob/main/LICENSE.txt)
 - [**Contributing**](https://github.com/awslabs/aws-data-wrangler/blob/main/CONTRIBUTING.md)
 - [**Legacy Docs** (pre-1.0.0)](https://aws-data-wrangler.readthedocs.io/en/0.3.3/)

diff --git a/awswrangler/__metadata__.py b/awswrangler/__metadata__.py
@@ -7,5 +7,5 @@
 
 __title__: str = "awswrangler"
 __description__: str = "Pandas on AWS."
-__version__: str = "2.10.0"
+__version__: str = "2.11.0"
 __license__: str = "Apache License 2.0"
diff --git a/awswrangler/athena/_read.py b/awswrangler/athena/_read.py
@@ -617,11 +617,11 @@ def read_sql_query(
 
     **Related tutorial:**
 
-    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/006%20-%20Amazon%20Athena.html>`_
-    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/019%20-%20Athena%20Cache.html>`_
-    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/021%20-%20Global%20Configurations.html>`_
 
     **There are two approaches to be defined through ctas_approach parameter:**
@@ -669,7 +669,7 @@ def read_sql_query(
     /athena.html#Athena.Client.get_query_execution>`_ .
 
     For a practical example check out the
-    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
     tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 
 
@@ -890,11 +890,11 @@ def read_sql_table(
 
     **Related tutorial:**
 
-    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/006%20-%20Amazon%20Athena.html>`_
-    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/019%20-%20Athena%20Cache.html>`_
-    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
       tutorials/021%20-%20Global%20Configurations.html>`_
 
     **There are two approaches to be defined through ctas_approach parameter:**
@@ -939,7 +939,7 @@ def read_sql_table(
     /athena.html#Athena.Client.get_query_execution>`_ .
 
     For a practical example check out the
-    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.10.0/
+    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.11.0/
     tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 
 

diff --git a/awswrangler/data_api/rds.py b/awswrangler/data_api/rds.py
@@ -139,6 +139,8 @@ def read_sql_query(sql: str, con: RdsDataApi, database: Optional[str] = None) ->
     ----------
     sql: str
         SQL query to run.
+    con: RdsDataApi
+        A RdsDataApi connection instance
     database: str
         Database to run query on - defaults to the database specified by `con`.
 

diff --git a/awswrangler/data_api/redshift.py b/awswrangler/data_api/redshift.py
@@ -189,6 +189,8 @@ def read_sql_query(sql: str, con: RedshiftDataApi, database: Optional[str] = Non
     ----------
     sql: str
         SQL query to run.
+    con: RedshiftDataApi
+        A RedshiftDataApi connection instance
     database: str
         Database to run query on - defaults to the database specified by `con`.
 

diff --git a/awswrangler/s3/_read_parquet.py b/awswrangler/s3/_read_parquet.py
@@ -788,7 +788,7 @@ def read_parquet_table(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     columns : List[str], optional
         Names of columns to read from the file(s).
     validate_schema:

diff --git a/awswrangler/s3/_read_text.py b/awswrangler/s3/_read_text.py
@@ -241,7 +241,7 @@ def read_csv(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs :
         KEYWORD arguments forwarded to pandas.read_csv(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.
@@ -389,7 +389,7 @@ def read_fwf(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs:
         KEYWORD arguments forwarded to pandas.read_fwf(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.
@@ -541,7 +541,7 @@ def read_json(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs:
         KEYWORD arguments forwarded to pandas.read_json(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.

diff --git a/awswrangler/s3/_write_parquet.py b/awswrangler/s3/_write_parquet.py
@@ -279,18 +279,18 @@ def to_parquet(  # pylint: disable=too-many-arguments,too-many-locals
     concurrent_partitioning: bool
         If True will increase the parallelism level during the partitions writing. It will decrease the
         writing time and increase the memory usage.
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
     mode: str, optional
         ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
         For details check the related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
     catalog_versioning : bool
         If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
     schema_evolution : bool
         If True allows schema evolution (new or missing columns), otherwise a exception will be raised.
         (Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
         Related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/014%20-%20Schema%20Evolution.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/014%20-%20Schema%20Evolution.html
     database : str, optional
         Glue/Athena catalog: Database name.
     table : str, optional

diff --git a/awswrangler/s3/_write_text.py b/awswrangler/s3/_write_text.py
@@ -174,18 +174,18 @@ def to_csv(  # pylint: disable=too-many-arguments,too-many-locals,too-many-state
     concurrent_partitioning: bool
         If True will increase the parallelism level during the partitions writing. It will decrease the
         writing time and increase the memory usage.
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
     mode : str, optional
         ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
         For details check the related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
     catalog_versioning : bool
         If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
     schema_evolution : bool
         If True allows schema evolution (new or missing columns), otherwise a exception will be raised.
         (Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
         Related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.10.0/tutorials/014%20-%20Schema%20Evolution.html
+        https://aws-data-wrangler.readthedocs.io/en/2.11.0/tutorials/014%20-%20Schema%20Evolution.html
     database : str, optional
         Glue/Athena catalog: Database name.
     table : str, optional

diff --git a/docs/source/api.rst b/docs/source/api.rst
@@ -183,6 +183,7 @@ Data API Redshift
 .. autosummary::
     :toctree: stubs
 
+    RedshiftDataApi
     connect
     read_sql_query
 
@@ -194,6 +195,7 @@ Data API RDS
 .. autosummary::
     :toctree: stubs
 
+    RdsDataApi
     connect
     read_sql_query
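
Several docstrings touched in this commit (`read_parquet_table`, `read_csv`, `read_fwf`, `read_json`) describe a partition filter: a callable that receives one partition's values and MUST return a bool, True to read the partition and False to ignore it. The sketch below applies the lambda quoted in those docstrings to a few hand-written partitions in plain Python. It does not call awswrangler or S3; `select_partitions` and the sample `partitions` list are hypothetical names used only for illustration, and the partition values are assumed to be strings because the docstring example compares against `"2020"` and `"1"`.

```python
from typing import Callable, Dict, List

# Filter copied from the docstring example in this commit.
partition_filter: Callable[[Dict[str, str]], bool] = (
    lambda x: True if x["year"] == "2020" and x["month"] == "1" else False
)


def select_partitions(
    partitions: List[Dict[str, str]],
    keep: Callable[[Dict[str, str]], bool],
) -> List[Dict[str, str]]:
    """Hypothetical helper: return the partitions for which `keep` is True."""
    return [p for p in partitions if keep(p)]


# Hand-written sample data (assumption: partition values are strings).
partitions = [
    {"year": "2020", "month": "1"},
    {"year": "2020", "month": "2"},
    {"year": "2021", "month": "1"},
]

selected = select_partitions(partitions, partition_filter)
print(selected)
```

Only the first sample partition satisfies both conditions, so it is the only one kept. Since the condition already evaluates to a bool, the filter could equally be written as `lambda x: x["year"] == "2020" and x["month"] == "1"`.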