AWS Data Wrangler 2.7.0

jaidisido released this 15 Apr 17:17


Caveats

⚠️ For platforms without PyArrow 3 support (e.g. MWAA, EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

Documentation

Updated documentation to clarify wr.athena.read_sql_query params argument use #609
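As a rough illustration of the params convention, the sketch below mimics how named placeholders written as `:name;` are replaced in the query text. `render_params` is a hypothetical helper written for this note, not a library function, and the table and column names are made up. It assumes plain string substitution, so string values are quoted by the caller.

```python
from typing import Any, Dict


def render_params(sql: str, params: Dict[str, Any]) -> str:
    """Illustrative stand-in for placeholder substitution (not library code)."""
    for name, value in params.items():
        # Each placeholder is written as ":name;" including the trailing semicolon,
        # so the semicolon disappears together with the placeholder.
        sql = sql.replace(f":{name};", str(value))
    return sql


# Hypothetical query: string values carry their own quotes in this sketch.
query = "SELECT * FROM orders WHERE region = :region; AND year = :year;"
rendered = render_params(query, {"region": "'emea'", "year": 2021})
print(rendered)
```

With the library itself, the corresponding call would look like `wr.athena.read_sql_query(sql=query, database="my_db", params={"region": "'emea'", "year": 2021})`, where `my_db` is a placeholder database name.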

New Functionalities

Support MySQL upserts #608
Enable prepending S3 Parquet files with a prefix in wr.s3.to_parquet #617
Add exist_ok flag to safely create a Glue database #642
Add "Unsupported Pyarrow type" exception #639
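Two of the additions above can be sketched as thin wrappers. This is a minimal sketch, not the library's own code: the wrapper names are hypothetical, the `catalog` and `s3` arguments are expected to be `wr.catalog` and `wr.s3`, and the keyword names `exist_ok` and `filename_prefix` are assumed to match the released signatures.

```python
from typing import Any


def ensure_database(catalog: Any, name: str) -> None:
    """Create a Glue database, tolerating the case where it already exists."""
    # exist_ok=True is the flag referenced in #642; without it, creating a
    # database that already exists is expected to raise an error.
    catalog.create_database(name=name, exist_ok=True)


def write_prefixed_dataset(s3: Any, df: Any, path: str, prefix: str) -> Any:
    """Write a Parquet dataset whose output file names start with `prefix`."""
    # The prefix (#617) is assumed to apply only in dataset mode.
    return s3.to_parquet(df=df, path=path, dataset=True, filename_prefix=prefix)
```

Typical usage would be `import awswrangler as wr`, then `ensure_database(wr.catalog, "analytics")` and `write_prefixed_dataset(wr.s3, df, "s3://my-bucket/orders/", "run1_")`. The database name, bucket, and prefix here are placeholders.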

Bug Fix

Fix chunked mode in wr.s3.read_parquet_table #627
Fix missing \ character from wr.s3.read_parquet_table method #638
Support postgres as an engine value #630
Add default workgroup result configuration #633
Raise exception when merge_upsert_table fails or data_quality is insufficient #601
Fix nested structure bug in athena2pyarrow method #612

Thanks

We thank the following contributors/users for their work on this release:

@maxispeicher, @igorborgest, @mattboyd-aws, @vlieven, @bentkibler, @adarsh-chauhan, @impredicative, @nmduarteus, @JoshCrosby, @TakumiHaruta, @zdk123, @tuannguyen0901, @jiteshsoni, @luminita.

P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload the one you need and run!
