Delta Sharing Python 1.1.0

linzhou-db released this 02 Aug 01:54

We are excited to announce the release of Delta Sharing Python 1.1, which includes several new features and artifacts.

Delta Sharing Python to Support Sharing Tables with Deletion Vectors and Column Mapping

To support advanced Delta features such as Deletion Vectors and Column Mapping in the Delta Sharing OSS Python connector, this release introduces “Delta format sharing”. To read the Delta actions, Delta-Kernel-Rust (https://github.com/delta-incubator/delta-kernel-rs) is integrated with the Python connector. A Python wheel named “delta-kernel-rust-sharing-wrapper” is built from Rust using Maturin and imported into delta-sharing. The “delta-kernel-rust-sharing-wrapper” package is publicly available at: https://pypi.org/project/delta-kernel-rust-sharing-wrapper/

load_as_pandas() automatically resolves which format to use when reading a shared table, and it only uses the new code path if the table has the Deletion Vectors and Column Mapping features. The client can also explicitly pass a boolean parameter “use_delta_format” to load_as_pandas(). If “use_delta_format” is set to True, the new kernel code path is used, which can also be several times faster than the old code path when reading a shared table. If “use_delta_format” is set to False while reading a table with Deletion Vectors and Column Mapping, the query will error as before.
Restrictions: ColumnMapping.mode = ‘id’ is not supported.
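
As a rough illustration of the three modes described above, the sketch below wraps load_as_pandas() in a small helper. The helper names (table_url, read_shared_table) and the profile, share, schema, and table names in the usage comment are hypothetical and only serve the example. The only connector behaviour relied on is the “use_delta_format” parameter from this release and the usual `<profile-file>#<share>.<schema>.<table>` URL form.

```python
from typing import Optional


def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build a '<profile-file>#<share>.<schema>.<table>' URL (hypothetical helper)."""
    return f"{profile_path}#{share}.{schema}.{table}"


def read_shared_table(url: str, use_delta_format: Optional[bool] = None):
    """Read a shared table into a pandas DataFrame (hypothetical helper).

    use_delta_format=None  -> let the connector resolve the format itself
    use_delta_format=True  -> ask for the new Delta Kernel code path
    use_delta_format=False -> ask for the old code path
    """
    # Imported lazily so that this module can be imported even when the
    # connector is not installed yet.
    import delta_sharing

    if use_delta_format is None:
        return delta_sharing.load_as_pandas(url)
    return delta_sharing.load_as_pandas(url, use_delta_format=use_delta_format)


# Example usage (all names below are placeholders):
#   url = table_url("config.share", "my_share", "my_schema", "my_table")
#   df = read_shared_table(url, use_delta_format=True)
#   print(df.head())
```

Leaving “use_delta_format” unset is likely the safest default, since the connector then picks the code path on its own. Setting it to True mainly makes sense when the speedup of the kernel path is wanted.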

Delta Kernel Integration with Reference Server

Delta Kernel is now integrated with the reference server. This enables support for tables with Deletion Vectors and Column Mapping and minimizes the integration effort needed to support future Delta features.

Important Installation Notes

To use the new delta-sharing release, you MUST install the delta-kernel-rust-sharing-wrapper:

pip3 install delta-kernel-rust-sharing-wrapper

The delta-kernel-rust-sharing-wrapper has been released for Python 3.8+ on the following platforms: x86_64 and ARM-based macOS, x86_64 Windows, and x86_64 and ARM Linux with glibc 2.31+.

If you are running a Linux distribution with a glibc version lower than 2.31, you must either upgrade glibc to 2.31 or later, or follow the steps in the link below to build the wrapper locally on your own OS: https://github.com/delta-io/delta-sharing/blob/main/python/delta-kernel-rust-sharing-wrapper/README.md

If you see an error during installation:

  • Check that your python3 version is >= 3.8
  • Upgrade pip3 to the latest version
  • Check that your Linux glibc version is >= 2.31
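
The Python and glibc checks in the list above can be sketched with the standard library alone. This is a minimal, unofficial helper: the function names are hypothetical, it does not check the pip version, and platform.libc_ver() may return empty strings on non-glibc systems (for example macOS or Windows), in which case the glibc check is skipped.

```python
import platform
import sys
from typing import List, Sequence, Tuple

MIN_PYTHON = (3, 8)   # minimum Python version stated in the release notes
MIN_GLIBC = (2, 31)   # minimum glibc version stated in the release notes


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.31' into (2, 31)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def find_problems(python_version: Sequence[int],
                  libc_name: str,
                  libc_version: str) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python is older than 3.8")
    # Only compare glibc versions when glibc was actually detected.
    if libc_name == "glibc" and libc_version:
        if parse_version(libc_version) < MIN_GLIBC:
            problems.append("glibc is older than 2.31")
    return problems


def check_current_environment() -> List[str]:
    """Run the checks against the interpreter and C library in use."""
    libc_name, libc_version = platform.libc_ver()
    return find_problems(sys.version_info, libc_name, libc_version)
```

Calling check_current_environment() before running pip3 install gives a quick hint as to whether a prebuilt wheel is likely to match the machine or whether a local build is needed.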

(#494, #495, #497, #499, #503, #505, #510, #513, #516, #517, #518, #521, #522, #524, #525, #527, #528, #529, #530, #533, #535, #536, #539, #540, #542, #543, #544, #547, #548)

Credits: Pranav Sukumar, Lin Zhou