Preparing release 2.9.0 (#751)

* Preparing release 2.9.0
* 3.6 compatibility fixes
aws · Jun 17, 2021 · 89b459d
1 parent c685abb
commit 89b459d

Showing 24 changed files with 112 additions and 91 deletions.
diff --git a/.bumpversion.cfg b/.bumpversion.cfg
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 2.8.0
+current_version = 2.9.0
 commit = False
 tag = False
 tag_name = {new_version}
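
Note on the hunk above: the bumped value can be read back programmatically to confirm the release number. This is a minimal sketch, assuming the file is plain INI that Python's standard `configparser` can parse. The helper name `read_current_version` and the inline `SAMPLE_CFG` text are hypothetical and only mirror the lines shown in the diff.

```python
import configparser

# Hypothetical copy of the post-change file contents shown in the hunk above.
SAMPLE_CFG = """
[bumpversion]
current_version = 2.9.0
commit = False
tag = False
tag_name = {new_version}
"""


def read_current_version(cfg_text: str) -> str:
    """Return the current_version value from a bumpversion-style INI text."""
    # interpolation=None keeps values literal, so no %-expansion is attempted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser["bumpversion"]["current_version"]


version = read_current_version(SAMPLE_CFG)
print(version)
```

A check like this could run before tagging, so that a release does not go out with a stale `current_version`.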

diff --git a/CONTRIBUTING_COMMON_ERRORS.md b/CONTRIBUTING_COMMON_ERRORS.md
@@ -13,9 +13,9 @@ Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in ./.venv/lib/python3.7/site-
 Using legacy 'setup.py install' for python-Levenshtein, since package 'wheel' is not installed.
 Installing collected packages: awswrangler, python-Levenshtein
   Attempting uninstall: awswrangler
-    Found existing installation: awswrangler 2.8.0
-    Uninstalling awswrangler-2.8.0:
-      Successfully uninstalled awswrangler-2.8.0
+    Found existing installation: awswrangler 2.9.0
+    Uninstalling awswrangler-2.9.0:
+      Successfully uninstalled awswrangler-2.9.0
   Running setup.py develop for awswrangler
     Running setup.py install for python-Levenshtein ... error
     ERROR: Command errored out with exit status 1:

diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, Clo
 
 > An [AWS Professional Service](https://aws.amazon.com/professional-services/) open source initiative | aws-proserve-opensource@amazon.com
 
-[![Release](https://img.shields.io/badge/release-2.8.0-brightgreen.svg)](https://pypi.org/project/awswrangler/)
+[![Release](https://img.shields.io/badge/release-2.9.0-brightgreen.svg)](https://pypi.org/project/awswrangler/)
 [![Python Version](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9-brightgreen.svg)](https://anaconda.org/conda-forge/awswrangler)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
@@ -24,7 +24,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, Clo
 | **[PyPi](https://pypi.org/project/awswrangler/)**  | [![PyPI Downloads](https://pepy.tech/badge/awswrangler)](https://pypi.org/project/awswrangler/) | `pip install awswrangler` |
 | **[Conda](https://anaconda.org/conda-forge/awswrangler)** | [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/awswrangler.svg)](https://anaconda.org/conda-forge/awswrangler) | `conda install -c conda-forge awswrangler` |
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️ `pip install pyarrow==2 awswrangler`
 
 Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](https://arrow.apache.org/powered_by/)
@@ -42,7 +42,7 @@ Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](http
 
 Installation command: `pip install awswrangler`
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#emr-cluster), [Glue PySpark Job](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️`pip install pyarrow==2 awswrangler`
 
 ```py3
@@ -96,17 +96,17 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
 
 ## [Read The Docs](https://aws-data-wrangler.readthedocs.io/)
 
-- [**What is AWS Data Wrangler?**](https://aws-data-wrangler.readthedocs.io/en/2.8.0/what.html)
-- [**Install**](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html)
-  - [PyPi (pip)](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#pypi-pip)
-  - [Conda](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#conda)
-  - [AWS Lambda Layer](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#aws-lambda-layer)
-  - [AWS Glue Python Shell Jobs](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#aws-glue-python-shell-jobs)
-  - [AWS Glue PySpark Jobs](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#aws-glue-pyspark-jobs)
-  - [Amazon SageMaker Notebook](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#amazon-sagemaker-notebook)
-  - [Amazon SageMaker Notebook Lifecycle](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#amazon-sagemaker-notebook-lifecycle)
-  - [EMR](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#emr)
-  - [From source](https://aws-data-wrangler.readthedocs.io/en/2.8.0/install.html#from-source)
+- [**What is AWS Data Wrangler?**](https://aws-data-wrangler.readthedocs.io/en/2.9.0/what.html)
+- [**Install**](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html)
+  - [PyPi (pip)](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#pypi-pip)
+  - [Conda](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#conda)
+  - [AWS Lambda Layer](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#aws-lambda-layer)
+  - [AWS Glue Python Shell Jobs](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#aws-glue-python-shell-jobs)
+  - [AWS Glue PySpark Jobs](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#aws-glue-pyspark-jobs)
+  - [Amazon SageMaker Notebook](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#amazon-sagemaker-notebook)
+  - [Amazon SageMaker Notebook Lifecycle](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#amazon-sagemaker-notebook-lifecycle)
+  - [EMR](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#emr)
+  - [From source](https://aws-data-wrangler.readthedocs.io/en/2.9.0/install.html#from-source)
 - [**Tutorials**](https://github.com/awslabs/aws-data-wrangler/tree/main/tutorials)
   - [001 - Introduction](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/001%20-%20Introduction.ipynb)
   - [002 - Sessions](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/002%20-%20Sessions.ipynb)
@@ -136,22 +136,22 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
   - [026 - Amazon Timestream](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/026%20-%20Amazon%20Timestream.ipynb)
   - [027 - Amazon Timestream 2](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/027%20-%20Amazon%20Timestream%202.ipynb)
   - [028 - Amazon DynamoDB](https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/028%20-%20DynamoDB.ipynb)
-- [**API Reference**](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html)
-  - [Amazon S3](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-s3)
-  - [AWS Glue Catalog](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#aws-glue-catalog)
-  - [Amazon Athena](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-athena)
-  - [Amazon Redshift](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-redshift)
-  - [PostgreSQL](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#postgresql)
-  - [MySQL](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#mysql)
-  - [SQL Server](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#sqlserver)
-  - [DynamoDB](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#dynamodb)
-  - [Amazon Timestream](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-timestream)
-  - [Amazon EMR](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-emr)
-  - [Amazon CloudWatch Logs](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-cloudwatch-logs)
-  - [Amazon Chime](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-chime)
-  - [Amazon QuickSight](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#amazon-quicksight)
-  - [AWS STS](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#aws-sts)
-  - [AWS Secrets Manager](https://aws-data-wrangler.readthedocs.io/en/2.8.0/api.html#aws-secrets-manager)
+- [**API Reference**](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html)
+  - [Amazon S3](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-s3)
+  - [AWS Glue Catalog](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#aws-glue-catalog)
+  - [Amazon Athena](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-athena)
+  - [Amazon Redshift](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-redshift)
+  - [PostgreSQL](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#postgresql)
+  - [MySQL](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#mysql)
+  - [SQL Server](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#sqlserver)
+  - [DynamoDB](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#dynamodb)
+  - [Amazon Timestream](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-timestream)
+  - [Amazon EMR](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-emr)
+  - [Amazon CloudWatch Logs](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-cloudwatch-logs)
+  - [Amazon Chime](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-chime)
+  - [Amazon QuickSight](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#amazon-quicksight)
+  - [AWS STS](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#aws-sts)
+  - [AWS Secrets Manager](https://aws-data-wrangler.readthedocs.io/en/2.9.0/api.html#aws-secrets-manager)
 - [**License**](https://github.com/awslabs/aws-data-wrangler/blob/main/LICENSE.txt)
 - [**Contributing**](https://github.com/awslabs/aws-data-wrangler/blob/main/CONTRIBUTING.md)
 - [**Legacy Docs** (pre-1.0.0)](https://aws-data-wrangler.readthedocs.io/en/0.3.3/)

diff --git a/awswrangler/__metadata__.py b/awswrangler/__metadata__.py
@@ -7,5 +7,5 @@
 
 __title__: str = "awswrangler"
 __description__: str = "Pandas on AWS."
-__version__: str = "2.8.0"
+__version__: str = "2.9.0"
 __license__: str = "Apache License 2.0"
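
Note on the hunk above: because the version string lives in several files, a release check might extract `__version__` from the metadata module text and compare it with the expected number. The sketch below is an illustration only. The regular expression, the helper name `extract_version`, and the inline `SAMPLE_METADATA` text are assumptions, not part of the repository.

```python
import re

# Hypothetical copy of the post-change lines shown in the hunk above.
SAMPLE_METADATA = '''
__title__: str = "awswrangler"
__description__: str = "Pandas on AWS."
__version__: str = "2.9.0"
__license__: str = "Apache License 2.0"
'''

# Matches `__version__ = "x"` with or without the `: str` annotation.
VERSION_RE = re.compile(r'^__version__(?::\s*str)?\s*=\s*"([^"]+)"', re.MULTILINE)


def extract_version(source: str) -> str:
    """Return the quoted value assigned to __version__ in the given source text."""
    match = VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


found = extract_version(SAMPLE_METADATA)
print(found)
```
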
diff --git a/awswrangler/athena/_read.py b/awswrangler/athena/_read.py
@@ -605,11 +605,11 @@ def read_sql_query(
 
     **Related tutorial:**
 
-    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/006%20-%20Amazon%20Athena.html>`_
-    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/019%20-%20Athena%20Cache.html>`_
-    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/021%20-%20Global%20Configurations.html>`_
 
     **There are two approaches to be defined through ctas_approach parameter:**
@@ -657,7 +657,7 @@ def read_sql_query(
     /athena.html#Athena.Client.get_query_execution>`_ .
 
     For a practical example check out the
-    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
     tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 
 
@@ -872,11 +872,11 @@ def read_sql_table(
 
     **Related tutorial:**
 
-    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/006%20-%20Amazon%20Athena.html>`_
-    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/019%20-%20Athena%20Cache.html>`_
-    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
       tutorials/021%20-%20Global%20Configurations.html>`_
 
     **There are two approaches to be defined through ctas_approach parameter:**
@@ -921,7 +921,7 @@ def read_sql_table(
     /athena.html#Athena.Client.get_query_execution>`_ .
 
     For a practical example check out the
-    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.8.0/
+    `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.9.0/
     tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 
 

diff --git a/awswrangler/s3/_read_parquet.py b/awswrangler/s3/_read_parquet.py
@@ -788,7 +788,7 @@ def read_parquet_table(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     columns : List[str], optional
         Names of columns to read from the file(s).
     validate_schema:
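
Note on the docstring above: the filter it describes receives one partition at a time and must return a bool. The sketch below shows that contract in plain Python, without calling the library. The partition dictionaries and the helper `select_partitions` are hypothetical, and the values are strings because the docstring example compares against `"2020"` and `"1"`.

```python
from typing import Callable, Dict, List

# Hypothetical partitions: partition column name -> partition value (strings).
PARTITIONS: List[Dict[str, str]] = [
    {"year": "2019", "month": "12"},
    {"year": "2020", "month": "1"},
    {"year": "2020", "month": "2"},
]


def keep_january_2020(partition: Dict[str, str]) -> bool:
    """Return True to read the partition, False to ignore it."""
    # Same condition as the docstring's lambda; the comparison already
    # yields a bool, so `True if ... else False` is not needed.
    return partition["year"] == "2020" and partition["month"] == "1"


def select_partitions(
    partitions: List[Dict[str, str]],
    partition_filter: Callable[[Dict[str, str]], bool],
) -> List[Dict[str, str]]:
    """Apply the filter to each partition and keep those it accepts."""
    return [p for p in partitions if partition_filter(p)]


selected = select_partitions(PARTITIONS, keep_january_2020)
print(selected)
```

A named function such as `keep_january_2020` can presumably be passed wherever the docstring's lambda is accepted, which keeps longer conditions readable.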

diff --git a/awswrangler/s3/_read_text.py b/awswrangler/s3/_read_text.py
@@ -241,7 +241,7 @@ def read_csv(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs :
         KEYWORD arguments forwarded to pandas.read_csv(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.
@@ -389,7 +389,7 @@ def read_fwf(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs:
         KEYWORD arguments forwarded to pandas.read_fwf(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.
@@ -541,7 +541,7 @@ def read_json(
         This function MUST return a bool, True to read the partition or False to ignore it.
         Ignored if `dataset=False`.
         E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
     pandas_kwargs:
         KEYWORD arguments forwarded to pandas.read_json(). You can NOT pass `pandas_kwargs` explicit, just add valid
         Pandas arguments in the function call and Wrangler will accept it.

diff --git a/awswrangler/s3/_write_parquet.py b/awswrangler/s3/_write_parquet.py
@@ -298,18 +298,18 @@ def to_parquet(  # pylint: disable=too-many-arguments,too-many-locals
     concurrent_partitioning: bool
         If True will increase the parallelism level during the partitions writing. It will decrease the
         writing time and increase the memory usage.
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
     mode: str, optional
         ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
         For details check the related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
     catalog_versioning : bool
         If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
     schema_evolution : bool
         If True allows schema evolution (new or missing columns), otherwise a exception will be raised.
         (Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
         Related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/014%20-%20Schema%20Evolution.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/014%20-%20Schema%20Evolution.html
     database : str, optional
         Glue/Athena catalog: Database name.
     table : str, optional

diff --git a/awswrangler/s3/_write_text.py b/awswrangler/s3/_write_text.py
@@ -175,11 +175,11 @@ def to_csv(  # pylint: disable=too-many-arguments,too-many-locals,too-many-state
     concurrent_partitioning: bool
         If True will increase the parallelism level during the partitions writing. It will decrease the
         writing time and increase the memory usage.
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
     mode : str, optional
         ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
         For details check the related tutorial:
-        https://aws-data-wrangler.readthedocs.io/en/2.8.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
+        https://aws-data-wrangler.readthedocs.io/en/2.9.0/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet
     catalog_versioning : bool
         If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
     database : str, optional

diff --git a/building/lambda/Dockerfile b/building/lambda/Dockerfile
@@ -19,7 +19,7 @@ RUN pip3 install -r /root/requirements.txt
 
 ADD requirements-dev.txt /root/
 # Removing "-e ." installation
-RUN head -n -2 /root/requirements-dev.txt > /root/temp.txt
+RUN head -n -3 /root/requirements-dev.txt > /root/temp.txt
 RUN mv /root/temp.txt /root/requirements-dev.txt
 RUN pip3 install -r /root/requirements-dev.txt
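
Note on the hunk above: with GNU `head`, a negative count such as `-n -3` prints everything except the last three lines, so the change from `-2` to `-3` trims one more trailing line from the requirements file. The Python sketch below reproduces that behaviour for illustration. The function name `drop_last_lines` and the sample file contents are hypothetical, since the real `requirements-dev.txt` is not shown in this diff.

```python
def drop_last_lines(text: str, count: int) -> str:
    """Return text without its last `count` lines, like GNU `head -n -count`."""
    if count <= 0:
        return text
    # keepends=True preserves each line's newline so the result joins cleanly.
    lines = text.splitlines(keepends=True)
    return "".join(lines[:-count])


# Hypothetical file contents; only the trailing "-e ." is suggested by the
# Dockerfile comment, the other entries are placeholders.
SAMPLE_REQUIREMENTS = "pkg-a\npkg-b\npkg-c\nextra-1\nextra-2\n-e .\n"

trimmed = drop_last_lines(SAMPLE_REQUIREMENTS, 3)
print(trimmed, end="")
```

Negative counts are a GNU coreutils extension, so the Dockerfile line depends on the base image shipping GNU `head`.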
 

diff --git a/docs/source/install.rst b/docs/source/install.rst
@@ -62,7 +62,7 @@ Go to your Glue PySpark job and create a new *Job parameters* key/value:
 
 To install a specific version, set the value for above Job parameter as follows:
 
-* Value: ``pyarrow==2,awswrangler==2.8.0``
+* Value: ``pyarrow==2,awswrangler==2.9.0``
 
 .. note:: Pyarrow 3 is not currently supported in Glue PySpark Jobs, which is why a previous installation of pyarrow 2 is required.
 
@@ -95,7 +95,7 @@ Here is an example of how to reference the Lambda layer in your CDK app:
             "wrangler-bucket",
             bucket_arn="arn:aws:s3:::aws-data-wrangler-public-artifacts",
         ),
-        key="releases/2.8.0/awswrangler-layer-2.8.0-py3.8.zip",
+        key="releases/2.9.0/awswrangler-layer-2.9.0-py3.8.zip",
       ),
       layer_version_name="aws-data-wrangler"
     )
@@ -190,7 +190,7 @@ complement Big Data pipelines.
         sudo pip install pyarrow==2 awswrangler
 
 .. note:: Make sure to freeze the Wrangler version in the bootstrap for productive
-          environments (e.g. awswrangler==2.8.0)
+          environments (e.g. awswrangler==2.9.0)
 
 .. note:: Pyarrow 3 is not currently supported in the default EMR image, which is why a previous installation of pyarrow 2 is required.