Updated release notes for v0.2.1
Signed-off-by: Maroun Touma <touma@us.ibm.com>
touma-I committed Sep 24, 2024
1 parent ca8087a commit 41ca069
Showing 1 changed file with 36 additions and 0 deletions.
36 changes: 36 additions & 0 deletions release-notes.md
@@ -1,5 +1,41 @@
# Data Prep Kit Release notes

## Release 0.2.1 - 9/24/2024

### General
1. Bug fixes across the repo
1. Added the AI Alliance RAG demo, tutorials, notebooks, and tips for running on Google Colab
1. Added new transforms and a single package for transforms, published to PyPI
1. Improved CI/CD with targeted workflows triggered by changes to specific modules
1. Added enhancements for cutting a release


### data-prep-toolkit libraries (python, ray, spark)

1. Restructure the repository to distinguish/separate runtime libraries
1. Split data-processing-lib/ray into python and ray
1. Spark runtime
1. Updated pyarrow version
1. Defined the required `transform()` method as abstract in `AbstractTableTransform`
1. Enabled configuring the Makefile to use src or PyPI for data-prep-kit library dependencies
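
Making `transform()` abstract means a subclass that forgets to implement it fails at instantiation rather than at call time. The sketch below illustrates that behavior with Python's standard `abc` module. It is not the library's code: `AbstractTableTransformSketch`, `NoopTransform`, `IncompleteTransform`, and `can_instantiate` are hypothetical names, and plain lists of dicts stand in for real tables.

```python
from abc import ABC, abstractmethod
from typing import Any


class AbstractTableTransformSketch(ABC):
    """Hypothetical stand-in for a table transform base class (an assumption)."""

    def __init__(self, config: dict[str, Any]):
        self.config = config

    @abstractmethod
    def transform(
        self, table: list[dict[str, Any]]
    ) -> tuple[list[dict[str, Any]], dict[str, Any]]:
        """Return the transformed rows plus a metadata dictionary."""


class NoopTransform(AbstractTableTransformSketch):
    """Implements transform(), so it can be instantiated."""

    def transform(self, table):
        return table, {"rows": len(table)}


class IncompleteTransform(AbstractTableTransformSketch):
    """Does not implement transform(); instantiation raises TypeError."""


def can_instantiate(cls) -> bool:
    """Report whether cls can be constructed with an empty config."""
    try:
        cls({})
        return True
    except TypeError:
        return False
```

With this pattern, `can_instantiate(NoopTransform)` is true while `can_instantiate(IncompleteTransform)` is false, so a missing `transform()` surfaces as soon as the class is constructed.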


### KFP Workloads

1. Update KFP image version
1. Enable KFP in GitHub Actions to test a randomly selected workflow, and skip KFP tests for transforms that do not support them
1. Auto-generate KFP pipelines
1. Combine the common KFP support code in a shared library
1. Update K8s cluster deployment and remove creation of clusterrolebinding in kubeflow installation


### Transforms

1. Added 7 new transforms, including language identification, profiler, repo level ordering, doc quality, pdf2parquet, HTML2Parquet, and PII Transform
1. Added an ededup Python implementation and incremental ededup
1. Added fuzzy floating point comparison


## Release 0.2.0 - 6/27/2024

### General
