AWS Data Wrangler 2.12.0
Caveats
⚠️ For platforms without PyArrow 5 support (e.g. MWAA, EMR, Glue PySpark Job):
➡️ pip install pyarrow==2 awswrangler
New Functionalities
- Add support for OpenSearch #891 🔥 Check out the tutorial. Many thanks to @AssafMentzer and @mureddy19 for this contribution.
Enhancements
- redshift.read_sql_query - handle empty table corner case #874
- Refactor read parquet table to reduce file list scan based on available partitions #878
- Shrink lambda layer with strip command #884
- Enable DynamoDB endpoint URL #887
- EMR jobs concurrency #889
- Add feature to allow custom AMI for EMR #907
- wr.redshift.unload_to_files empties the S3 folder instead of overwriting existing files #914
- Add catalog_id arg to wr.catalog.does_table_exist #920
- Add endpoint_url for AWS Secrets Manager #929
Documentation
- Update docs for awswrangler.s3.to_csv #868
Bug Fix
- Fix wr.mysql.to_sql with use_column_names=True when column names are reserved words #918
Thanks
We thank the following contributors/users for their work on this release:
@AssafMentzer, @mureddy19, @isichei, @DonnaArt, @kukushking, @jaidisido
P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload them and run, or use them from our public S3 bucket!